Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis and J. van Leeuwen
1884
3
Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Singapore Tokyo
ˇ Július Stuller Jaroslav Pokorný Bernhard Thalheim Yoshifumi Masunaga (Eds.)
Current Issues in Databases and Information Systems East-European Conference on Advances in Databases and Information Systems Held Jointly with International Conference on Database Systems for Advanced Applications, ADBIS-DASFAA 2000 Prague, Czech Republic, September 5-9, 2000 Proceedings
13
Series Editors Gerhard Goos, Karlsruhe University, Germany Juris Hartmanis, Cornell University, NY, USA Jan van Leeuwen, Utrecht University, The Netherlands Volume Editors ˇ Július Stuller Academy of Sciences of the Czech Republic, Institute of Computer Science Pod Vodárenskou vˇezˇ í 2, 182 07 Prague 8, Czech Republic E-mail:
[email protected] Jaroslav Pokorný Charles University, Faculty of Mathematics and Physics Malostranské nám. 25, 118 00 Prague 1, Czech Republic E-mail:
[email protected] Bernhard Thalheim Brandenburg University of Technology, Computer Science Institute P.O. Box 101344, 03013 Cottbus, Germany E-mail:
[email protected] Yoshifumi Masunaga Ochanomizu University, Faculty of Science, Department of Information Science 2-1-1 Otsuka, Bunkyo-ku, Tokyo 112-8610, Japan E-mail:
[email protected] Cataloging-in-Publication Data applied for Die Deutsche Bibliothek - CIP-Einheitsaufnahme Current issues in databases and information systems : proceedings / East European Conference on Advances in Databases and Information Systems held jointly with International Conference on Database Systems for Advanced Applications, ADBIS-DASFAA 2000, Prague, Czech Republic, September 5 - 9, 2000. J´ulius Stuller . . . (ed.). - Berlin ; Heidelberg ; New York ; Barcelona ; Hong Kong ; London ; Milan ; Paris ; Singapore ; Tokyo : Springer, 2000 (Lecture notes in computer science ; Vol. 1884 ISBN 3-540-67977-4
CR Subject Classification (1998): H.2, H.3, H.4, H.5 ISSN 0302-9743 ISBN 3-540-67977-4 Springer-Verlag Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law. Springer-Verlag Berlin Heidelberg New York a member of BertelsmannSpringer Science+Business Media GmbH © Springer-Verlag Berlin Heidelberg 2000 Printed in Germany Typesetting: Camera-ready by author, data conversion by PTP-Berlin, Stefan Sossna Printed on acid-free paper SPIN: 10722450 06/3142 543210
Preface
The East European Conference on Advances in Databases and Information Systems (ADBIS) is the successor of the annual International Workshops with the same title that during 1993–1996 were organized in Russia by the Moscow ACM SIGMOD Chapter. Initiated in St. Petersburg, Russia, in 1997, it continued in Poznan, Poland, in 1998 and in Maribor, Slovenia, in 1999. The ADBIS Conference became the premier database and information systems conference in Eastern Europe. It intended to increase interaction and collaboration between researchers from the East and the West, and to provide an internationally recognized tribune for the presentation of research results. The International Conference on Database Systems for Advanced Applications (DASFAA) was first held in Seoul, Korea, in 1989 to promote database research and development activities in Asian and Australasian countries. The Special Interest Group of Database Systems (SIGDBS) of the Information Processing Society of Japan (IPSJ) and the Special Interest Group of Data Base (SIGDB) of Korea Information Science Society (KISS) had important roles in the organization of DASFAA. Since that time the DASFAA has been held every two years: Tokyo in 1991, Daejon in 1993, Singapore in 1995, Melbourne in 1997, and Taiwan in 1999. The DASFAA became one of the most prestigious international conferences ever held in Asia or Australasia. The decision to organize the 2000 ADBIS-DASFAA Symposium was made just before the ADBIS’98. The organizers of the DASFAA Conference nominated Professor Yoshifumi Masunaga as the co-chair from DASFAA for this joint event in 2000. The goal of the symposium was to promote interaction and collaboration between researchers from Eastern Europe and Asia and Australasia, and to provide an international forum for the presentation of research on database theory, the development of advanced data base management systems technologies, and their advanced applications. It is planned that this symposium will be held bi-annually in the years when DASFAA conferences are not organized. In addition, the 2000 ADBIS-DASFAA Symposium is considered to be the regular, Fourth ADBIS annual Conference. The 2000 ADBIS-DASFAA Symposium took place on 5–8 September 2000, in Prague, Czech Republic. In addition to the usual program, a special session on mobile database technology was organized in collaboration with Sanjay Kumar Madria. Its main objective was to bring together researchers and developers interested in this topic and to enrich classical database themes. This year the 42 members of the 2000 ADBIS-DASFAA Symposium Program Committee, from 22 countries, evaluated 115 submissions. After a careful review process, followed by a very long discussion at the PC meeting held on 4–5 May 2000 in Cottbus, Germany, 51 papers were selected for presentation at the 2000 ADBIS-DASFAA Symposium: 18 regular papers (14 pages long) and 9 short papers (8 pages) were accepted to be published by Springer-Verlag and
VI
Preface
24 papers (10 pages) to be published in the “Proceedings of Challenges”. The paper selection for the mobile database technology session was done by its own program committee which accepted 6 papers (3 regular and 3 short) for this volume and 1 paper for the local proceedings. We obtained 17 submissions from Asia and Australasia from which 12 were either accepted for this volume (6) or for the “Proceedings of Challenges” (6). Thus, we consider the merge of the ADBIS and DASFAA conferences this year a good beginning. The selected contributions are completed by 3 invited lectures. They are on spatio-temporal databases, database systems as middleware, and advanced processing environment for managing the continuous and semistructured features of multimedia content. The staging of the 2000 ADBIS-DASFAA Symposium was made possible by the cooperation of many different organizations and people. We would like to express our thanks to all the institutions who actively supported this event, namely: – Charles University, Faculty of Mathematics and Physics – Institute of Computer Science (ICS), Academy of Sciences of the Czech Republic – Brandenburg University of Technology at Cottbus, Computer Science Institute (CSI BUT) We would also like to thank all members of the Program Committee and the Organizing Committee as well as the Action M Agency for their professional work. Special thanks go to Alfred Hofmann of Springer-Verlag for accepting these proceedings to the LNCS series, to Hana B´ılkov´ a from ICS who did an excellent job in the completion of the proceedings, to the staff of Bernard Thalheim at CSI BUT for realizing a submission and review system that worked perfectly and for organizing a smooth PC session at Cottbus, and, particularly, to Leonid Kalinichenko for his kind help and advice during the symposium preparation.
July 2000
Yoshifumi Masunaga Bernhard Thalheim Jaroslav Pokorn´ y ˇ J´ ulius Stuller
2000 ADBIS-DASFAA Organization The 2000 ADBIS-DASFAA Symposium on Advances in Databases and Information Systems was organized by the Charles University, Faculty of Mathematics and Physics, and the Institute of Computer Science of the Academy of Sciences of the Czech Republic, in cooperation with the Moscow ACM SIGMOD Chapter.
Executive Commitee Conference Chair: Program Co-chairs:
Jaroslav Pokorn´y (Charles University, Prague, Czech Republic) Yoshifumi Masunaga (Department of Information Science, Ochanomizu University, Tokyo, Japan) Bernhard Thalheim (Computer Science Institute, Brandenburg University of Technology at Cottbus, Germany) ˇ J´ ulius Stuller (Institute of Computer Science, Academy of Sciences of the Czech Republic, Prague, Czech Republic)
Program Committee Leopoldo Bertossi (Chile) Yuri Breitbart (USA) Omran Bukhres (USA) Albertas Caplinskas (Lithuania) Wojciech Cellary (Poland) Arbee Chen (Taiwan) Jan Chomicki (USA) Bogdan Czejdo (USA) Johann Eder (Austria) Remigijus Gustas (Sweden) Tom´aˇs Hruˇska (Czech Republic) Hannu Jaakkola (Finland) Leonid Kalinichenko (Russia) Michal Kopeck´ y (Czech Republic) Wolfgang Klas (Germany) Dik-Lum Lee (Hong Kong) Tok Wang Ling (Singapore) Yannis Manolopoulos (Greece) Yoshifumi Masunaga (Japan) Tomaz Mohoric (Slovenia) Tadeusz Morzy (Poland)
Pavol N´avrat (Slovakia) Dennis Ng (USA) Nikolay Nikitchenko (Ukraine) Boris Novikov (Russia) Maria Orlowska (Australia) Seog Park (Korea) Oscar Pastor (Spain) ˇ ıha (Czech Republic) Anton´ın R´ Colette Rolland (France) Ivan Rozman (Slovenia) Nand Lal Sarda (India) Klaus-Dieter Schewe (Germany) ˇ J´ ulius Stuller (Czech Republic) Kazimierz Subieta (Poland) Bernhard Thalheim (Germany) Gottfried Vossen (Germany) Benkt Wangler (Sweden) Tatjana Welzer (Slovenia) Viacheslav Wolfengagen (Russia) Vladimir Zadorozhny (USA) Alexander Zamulin (Russia)
VIII
Organization
Program Committee for Mobile Database Technology Bharat Bhargava (USA) Panos Chrysanthis (USA) Margaret H. Dunham (USA) Sumi Halal (USA) Abdelkader Hameurlain (France) Vijay Kumar (USA) Mikhail Kogalovsky (Russia) Sergei Kuznetsov (Russia) Wang-Chien Lee (USA)
Wee-Keong Ng Nanyang (Singapore) Evaggelia Pitoura (Greece) George Samara (Cyprus) Nand Lal Sarda (India) Krishnamurthy Vidyasankar (Canada) Radek Vingralek (USA) Gottfried Vossen (Germany) Ouri Wolfson (USA) Arkady Zaslavsky (Australia)
ADBIS Steering Commitee Chair: Members:
Leonid Kalinichenko (Russian Academy of Sciences, Russia) Andras Benczur (Hungary) Radu Bercaru (Romania) Albertas Caplinskas (Lithuania) Johann Eder (Austria) Janis Eiduks (Latvia) Hele-Mai Haav (Estonia) Mirjana Ivanovic (Yugoslavia) Mikhail Kogalovsky (Russia) Yannis Manopoulos (Greece) Rainer Manthey (Germany) Tadeusz Morzy (Poland) Pavol N´avrat (Slovakia) Boris Novikov (Russia) Jaroslav Pokorn´ y (Czech Republic) Anatoly Stogny (Ukraine) Tatjana Welzer (Slovenia) Viacheslav Wolfengagen (Russia)
DASFAA Steering Commitee Chair: Members:
Yahiko Kambayashi (Kyoto University, Japan) Arbee L.P. Chen (Taiwan) Tok Wang Ling (Singapore) Fred Lochovsky (Hong Kong) Yoshifumi Masunaga (Japan) Seog Park (Korea) Ron Sacks-Davis (Australia) Kunitoshi Tsuruoka (Japan) Kyhun Um (Korea) Kyu-Young Whang (Korea)
Organization
IX
Organizing Commitee Chair: Members:
Sponsoring Institutions INTAX, s.r.o. Hewlett Packard, s.r.o. KOMIX, s.r.o. Smart4U Ltd. DCIT, s.r.o.
Jaroslav Pokorn´ y (Charles University, Prague, Czech Republic) Martin Beran (Charles University, Prague) Hana B´ılkov´ a (ICS, Prague) Andrea Kutnarov´ a (Action M Agency, Prague) G¨ unter Millahn (CSI BUT, Cottbus) ˇ J´ ulius Stuller (ICS, Prague) Lucie Vachov´a (Action M Agency, Prague) Milena Zeithamlov´ a (Action M Agency, Prague)
Table of Contents
Invited Talks An Advanced Processing Environment for Managing the Continuous and Semistructured Features of Multimedia Content . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 S. Nishio, K. Tanaka, Y. Ariki, S. Shimojo, M. Tsukamoto, M. Arikawa, K. Tajima, K. Harumoto Database Systems as Middleware–Events, Notification, Messages . . . . . . . . . . . 21 H. Schweppe, A. Hinze, D. Faensen Regular Papers Managing Schemata for Semistructured Databases Using Constraints . . . . . . .23 A. Bergholz, J. C. Freytag Caching for Mobile Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 T. Carpenter, R. Carter, M. Cochinwala, M. Eiger Efficient Cache Management Protocol Based on Data Locality in Mobile DBMSs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 I.Y. Chung, J.H. Ryu, C.-S. Hwang Algorithms for Rewriting Aggregate Queries Using Views . . . . . . . . . . . . . . . . . . . 65 S. Cohen, W. Nutt, A. Serebrenik A New High-Dimensional Index Structure Using a Cell-Based Filtering Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 S.-G. Han, J.-W. Chang Active System for Heterogeneous ODBMS Using Mobile Rule Codes . . . . . . . 93 K. C. Kim, S. B. Yoo, K. W. Ko, S. K. Cha 2-D Spatial Indexing Scheme in Optimal Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 N. Kitsios, C. Makris, S. Sioutas, A. Tsakalidis, J. Tsaknakis, B. Vassiliadis Mining around Association and Representative Rules . . . . . . . . . . . . . . . . . . . . . 117 M. Kryszkiewicz A New Algorithm for Page Access Sequencing in Join Processing . . . . . . . . . .128 A. Lim, O. W. Chong, C.-H. Chi Data Organization Issues for Location-Dependent Queries in Mobile Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 S. K. Madria, B. Bhargava, E. Pitoura, V. Kumar
XII
Table of Contents
Towards Intelligent Information Retrieval Engines: A Multi-agent Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 C. Makris, A. Tsakalidis, B. Vassiliadis Design and Implementation of a Novel Approach to Keyword Searching in Relational Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 U. Masermann, G. Vossen A Rule-Oriented Architecture to Incorporate Dissemination-Based Information Delivery into Information Integration Environments . . . . . . . . . . 185 H. Mizuguchi, H. Kitagawa, Y. Ishikawa, A. Morishima Hierarchical Materialisation of Method Results in Object-Oriented Views . 200 T. Morzy, R. Wrembel, T. Koszlajda Finding Generalized Path Patterns for Web Log Data Mining . . . . . . . . . . . . . 215 A. Nanopoulos, Y. Manolopoulos Size Estimation of the Intersection Join between Two Line Segment Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229 E. Nardelli, G. Proietti Distributed Searching of k -Dimensional Data with Almost Constant Costs 239 A. Di Pasquale, E. Nardelli X2 QL: An eXtensible XML Query Language Supporting User-Defined Foreign Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251 N. Shinagawa, H. Kitagawa, Y. Ishikawa Decision Committee Learning with Dynamic Integration of Classifiers . . . . . 265 A. Tsymbal Multiversion Linear Quadtree for Spatio-Temporal Data . . . . . . . . . . . . . . . . . . 279 T. Tzouramanis, M. Vassilakopoulos, Y. Manolopoulos On the Analysis of On-Line Database Reorganization . . . . . . . . . . . . . . . . . . . . . 293 V. I. S. Wietrzyk, M. A. Orgun, V. Varadharajan Short Papers Systematization of Approaches to Equality–Generating Constraints . . . . . . . 307 A. Binemann–Zdanowicz Efficient Region Query Processing by Optimal Page Ordering . . . . . . . . . . . . . 315 D.-S. Cho, B.-H. Hong On Deploying and Executing Data-Intensive Code on SMart Autonomous Storage (SmAS) Disks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323 V. V. Dimakopoulos, A. Kinalis, E. Pitoura, I. Tsoulos An Efficient Storage Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331 D. G. Kapopoulos, M. Hatzopoulos, P. Stamatopoulos
Table of Contents
XIII
A Timeout-Based Mobile Transaction Commitment Protocol . . . . . . . . . . . . . . 339 V. Kumar Flexible Integration of Optimistic and Pessimistic Concurrency Control in Mobile Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346 K. A. Momin, K. Vidyasankar An Analysis of Alternative Methods for Storing Semistructured Data in Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354 I. Nekrestyanov, B. Novikov, E. Pavlova Main-Memory Management in Temporal Object Database Systems . . . . . . . 362 K. Nørv˚ ag Design Issues in Transaction-Time Temporal Object Database Systems . . . 371 K. Nørv˚ ag Mobile Transaction Management in Mobisnap . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379 N. Pregui¸ca, C. Baquero, F. Moura, J. Legatheaux Martins, R. Oliveira, H. Domingos, J. O. Pereira, S. Duarte How to Manage Evolution in OODB? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387 D. Tamzalit, M. Oussalah Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
An Advanced Processing Environment for Managing the Continuous and Semistructured Features of Multimedia Content Shojiro Nishio1,4 , Katsumi Tanaka2 , Yasuo Ariki3 , Shinji Shimojo4 , Masahiko Tsukamoto1 , Masatoshi Arikawa5 , Keishi Tajima2 , and Kaname Harumoto4 1
Department of Information Systems Engineering, Osaka University 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan 2 Department of Computer and Systems Engineering, Kobe University Rokkodai, Nada, Kobe 657-8501, Japan 3 Department of Electronics and Informatics, Ryukoku University 1-5 Yokotani, Ohe-cho, Seta, Otsu 520-2194, Japan 4 Cybermedia Center, Osaka University 5-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan 5 Center for Spatial Information Science (CSIS), The University of Tokyo 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904, Japan
Abstract. The remarkable advances made in computer and communication hardware/software system technologies have made it possible for us to use multimedia data for many purposes in various service areas. Considering the importance of developing an advanced processing environment for the acquisition, management, retrieval, creation, and utilization of large amounts of multimedia content, the “Advanced Multimedia Content Processing (AMCP)” research project has been started in Japan, focussing on the management of the continuous and semistructured features of multimedia content. In this paper, we will present a part of research results achieved by this project, along with the process from the first time the multimedia content is acquired/organized until it is delivered to and presented at the client side.
1
Introduction
With the remarkable advances made in computer and communication hardware/software system technologies, we can now easily obtain large volume of multimedia data through advanced computer networks, store and handle them in our own personal disk devices. Sophisticated and integrated multimedia content processing technologies, which are essential to building a highly advanced information-based society, are attracting ever-increasing attention in various service areas, including broadcasting, publishing, medical treatment, entertainment, and communications. The prime concerns of these technologies are how to acquire multimedia data from the real world, how to automatically organize and ˇ J. Stuller et al. (Eds.): ADBIS-DASFAA 2000, LNCS 1884, pp. 1–20, 2000. c Springer-Verlag Berlin Heidelberg 2000
2
Shojiro Nishio et al.
store these obtained data in databases for sharing and reuse, and how to generate and create new, attractive multimedia content using the stored data. Considering the importance of developing an advanced processing environment for the acquisition, management, retrieval, creation, and utilization of large amounts of multimedia content, we started in 1997 the five years research project titled “Advanced Multimedia Content Processing (AMCP)” under the support of a new farsighted program entitled “Research for the Future (RFTF).” This RFTF program, launched in 1996, is funded through a capital investment made by the Japanese government to promote and expand the frontiers of scientific research. Currently, our AMCP project is composed of fourteen members who are from four different universities. The AMCP project aims at covering the following particular areas: (1) Dynamic multimedia data modeling and intelligent structuring of content based on active, bottom-up, and self-organized strategies. (2) Access architecture, querying facilities, and distribution mechanisms for multimedia content. (3) Synthesis of virtual and augmented real environments using large amounts of multimedia data for the creation of multimedia content and the applications of such technology to mobile computing environments. Artistic and innovative applications through the active use of multimedia content are also subjects of the project. In this paper, we will present some major results of research obtained by this project. The rest of this paper will be organized as follows. In section 2, important properties of multimedia content are discussed, where its continuous and semistructured features are taken up, and then we briefly explain how research in this project manages these features. In section 3, we discuss the multimedia data organization by means of media integration. After that, in section 4, we discuss our approach in the organization and retrieval of Web data in WWW (World Wide Web) systems. Discussion on the algebraic filtering framework for continuous media will be given in section 5 followed by the discussion on access control for inline objects in Web data in section 6. After discussing our spatial querying and browsing environment in section 7, we give the conclusion of this paper in section 8.
2
Important Properties of Multimedia Content
When we think about storing, sharing, and retrieving multimedia content, “continuous” and “semistructured” are two key features that should get more serious consideration. This is because, without appropriate handling, these features could prohibit us from getting valuable information from the content. However, if we could manage them in an appropriate way, it is possible for us to perform more advanced operations including query and search on the multimedia content.
An Advanced Processing Environment
2.1
3
Continuity Feature
James Monaco (1981)1 said that “There are no basic units of meaning in film”. Although the above quote is describing his view as a film and media critic, the quote is exactly pointing at the continuity feature of film, and thus fit into our discussion here. For examples, in (digitized) video data, any video interval could become a result of searching. Therefore, any interval could become an information unit, thus it could be considered as one-dimensional (1-D) continuous media. Similar case also applies to audio data. However, the principle of continuity in multimedia is broader and not limited only to this obvious fact. In general, the continuity feature of multimedia data could be found at least in three aspects, i.e., spatial continuity, time continuity and quality continuity. In spatial data like two-dimensional (2-D) map or three-dimensional (3-D) virtual space, any space could become a result of searching. Any space, therefore, could become an information unit, thus it could be considered as 2-D or 3-D continuous media. Including that of video and audio data, we call this continuity as spatial continuity. Later in this paper, we will introduce the algebraic retrieval method of fragmentarily indexed video which can handle efficiently the spatial continuity feature in providing answers to queries on video data. Furthermore, we will also introduce our spatial querying and browsing environment proposed for presenting multimedia data, including video data, in 3-D context. Time continuity is closely related to the timeliness needed in presenting the content to the user. Timeliness and synchronization are imposed in order to ensure smooth and meaningful flow of contents to the user. In this paper, we will discuss WebQoS that exploits the feature of composite objects in Web data in the Quality of Service (QoS) control in order to ensure the time continuity (timeliness requirement) of Web data presented to the user. Quality continuity is related to the access management regarding the presentation quality of multimedia content. So far, research in access management assumes that users have knowledge about the schema of database, and the types of data involved are of simple text or numerical data. Since handling such data types costs almost nothing and it could be done both in the client and server, the access performance and security have been the main problems. Accordingly, the approaches of existing access management methods are based on all-or-nothing principle. However, the case is different when we are dealing with multimedia content. Besides the high cost of processing multimedia data, in general, users will determine if they want to access the multimedia data after viewing it at least once in any quality. When we are considering the commercialization of multimedia content, it is necessary to provide price based Level of Detail (LoD) management. Furthermore, when we are dealing with multimedia object in 3-D context, 1
James Monaco is author or editor of more than a dozen books on film and media, published in more than 35 editions, including American Film Now, The New Wave, and The Film Encyclopedia. He is the founder of Baseline, the worldwide information source for the entertainment business. He is also well-known as a pioneer of the electronic publishing industry.
4
Shojiro Nishio et al.
it is important to provide a smooth change in LoD according to the distance from the object, especially, if we want to support walk-through in virtual space. Accordingly, in handling multimedia content, quality continuity is among the important aspects to consider. Later, we will discuss our approach in controlling LoD in our spatial querying and browsing environment. 2.2
Semistructured Feature
In this paper, we mean the semistructured feature of data as the fact that the data do not have a unified schema like tables in a database, but elements that can be used in realizing schema-like organization exist in the data. By exploiting such an organization, to some degree, we can perform operations such as query and search which originally depend on the existence of schema. Below, we will explain this feature in more detail by using video and Web data as examples of semistructured data. Video is composed of frames or shots. For the purpose of editing or authoring, frames or video intervals (an interval is composed of one or more consecutive frames or shots) are assigned with keywords. In this point, each frame or interval already has a kind of structure (for example, grouping is possible by using keywords). Though any frame or interval could be assigned with multiple keywords, in general there is no particular rule in assigning keywords. Considering this fact, we could classify video as semistructured data. On the other hand, with some elaboration, it is possible to use these keywords in organizing frames and intervals in video data, and later use this organization in the structure (schema) dependent processing such as query and search. Considering this fact, we can see the video data with keywords as semistructured data. The case is also similar with Web pages. Web pages usually consist of text and graphics. In general, each page is written following a structure determined by the author. However, it is a rare case that all pages are designed following a specific structure (this is because the original goal of providing information in Web pages is generally to disseminate information on the Internet. Therefore, the clarity of information and the artistic aspect are the most important motivation behind the author work, and for that goal the structure of pages is not the necessary condition). Considering this fact, we could classify Web pages as semistructured data. On the other hand, Web pages generally contain also links to other pages for navigating the viewer to the related pages. Considering the existence of links in Web pages, we thus can see the Web data as a huge graph containing Web pages connected by links. By traversing nodes in this graph, we could perform search or query to find the information of our interest. According to this fact, we can see Web pages also as semistructured data. As we can see in the above mentioned examples, data which are involved in the multimedia content presentation are generally semistructured and we need to do some operations on these data before they can be used in processing queries or search requests. Related to the semistructured aspect of multimedia content,
An Advanced Processing Environment
5
in this paper, we will discuss our approaches in indexing multimedia content by using speech recognition technique, the organization and retrieval of Web and video data, and the visualization of multimedia data in 3-D context.
3 3.1
Multimedia Data Organization by Media Integration Organization of Continuous Media
Contents stored in digital libraries or museums can be accessed quickly using their manually constructed table of contents or indices. On the other hand, broadcast news, animation, drama and movies usually have no table of contents or indices, so that data have to be accessed sequentially. To deal with such a problem in handling continuous media, a content based access architecture is required. In video images, an index corresponds to an event that occurs in the video image. On the other hand, a table of contents corresponds to a hierarchical structure of topics, like chapters and sections of books. In order to access the continuous media, these indices and tables of contents have to be produced via automatic analysis on video and audio data. This is called organization of continuous media which mainly includes functions of indexing and topic segmentation. A typical example of such organization of continuous media is a news on demand (NOD) system. Since TV news programs are broadcast to all over the world through the digitization, TV viewers are required to select and watch the most interesting news in a short time. In order to satisfy this requirement, we are now developing an NOD system with functions such as automatic topic segmentation, retrieval of related topics and summarization. These functions can be realized based on the organization of continuous media. 3.2
Approach to the Organization
In order to realize the indexing and topic segmentation in an NOD system, indices such as words, objects and shots have to be extracted automatically from audio and video data at first. Then they will be meaningfully integrated into topics. By these indexing and topic segmentation, new functions, such as cross media retrieval which can retrieve the content across the different media, could be prepared. Fig. 1 shows the organization process of continuous media. Signal data are organized into contents through the processes of “Segmentation & classification”, “Recognition & indexing”, “Association in time and space”, and finally “Construction of topic thesaurus” as shown in the left hand side. According to these organization processes, the signal data are converted into concept through pattern, symbol and topic as shown in the right hand side. At each level of data presentation, the corresponding retrieval is available. For example, at the signal level, specific images, music or noises can be retrieved based on signal processing. At the pattern level, human faces and speech are
6
Shojiro Nishio et al.
retrieved using pattern extraction techniques. An example at the symbol level is a retrieval of the president’s face and speech based on pattern recognition techniques. At the topic level, the president’s speech concerning his/her economic policy or highlight scenes in TV sports program can be retrieved. Finally at the concept level, the upper or lower topics as well as similar or opposite topics will be retrieved. Here the upper topics mean not only the summaries but also the topics at abstract level such as “economy” to “bankruptcy”. In the same way, the lower topics mean not only the details but also the topics at concrete level. These retrieval of opposite, upper and lower topics based on automatic construction of topic thesaurus rather than summaries or detail is unique approach as compared to Informedia-II being developed at CMU. InformediaII focuses on the distillation of information across multiple video paragraphs and visualization over time and space to understand how events evolve and are correlated over time and geographically [23].
Fig. 1. Organization process of continuous media.
3.3
Media Integration
(a) Integration of Speaker Indexing and Speech Transcription We have developed a system which can automatically index and classify TV news stories into 10 topics based on a speech transcribing technique [3]. After transcribing the spoken news stories, pre-defined keywords are searched and the news stories are classified based on the keywords. In general, news speech includes reporter speech as well as announcer speech. The announcer speech is clear but the reporter speech is sometimes noisy due to wind or environmental noises so that the transcription accuracy for the reporter speech is lower than that for the announcer speech. Therefore, if the speech transcribing process is limited only to the announcer speech, the processing time can be reduced without decreasing the news classification accuracy.
An Advanced Processing Environment
7
From this point of view, our system can automatically divide the TV news speech into speaker sections at first and then perform indexing in real time who is speaking. This can be realized by using a technique of speaker verification. However, the speaker verification technique is sensitive to the time lapse. To solve this problem, speaker models are not prepared in advance but are constructed in a course of indexing in self-organization mode. We verified the effectiveness of our proposed methods by carrying out an experiment of extracting and transcribing only the announcer speech and then classifying the news stories into 10 topics. At present, 75% accuracy of news story classification is obtained. This system is applicable to meeting documents as shown in Fig. 2.
Fig. 2. Integration of speaker indexing and speech transcribing.
(b) Integration of Video OCR and Speech Transcription We have also developed a system of cross media retrieval for TV news as shown in Fig. 3. At first, the video caption is recognized through video OCR (Optical Character Recognition). Then a vector is constructed for a video caption using the recognized nouns. Next, speech transcription of a news story (called spoken document hereafter) is converted into a vector using extracted important words. The widely used vector space model counts only the number of common words between documents, but it does not take into consideration the similarity between non-common words. This causes a decrease of the retrieval performance in a case where the number of words included in video captions is small. To solve this problem, word distance has to be computed based on similarity between words. Though mutual information and co-occurrence are usually used as the similarity measure, they show the similarity between words in only 1-D space. On the other hand, word distance is defined as the distance between words in a 3-D space of mutual information, TF (Term Frequency) and IDF (Inverse Document Frequency). We call such a 3-D space as a word space.
8
Shojiro Nishio et al.
The method to compute the inner product of two vectors, a vector for video caption and a vector for spoken document, is formalized by incorporating this word distance into vector space model. Our experiments to retrieve the spoken documents by the video caption demonstrated 80% recall and 80% precision. This is extremely higher than the original vector space model.
Fig. 3. Integration of video OCR and speech transcribing.
4
Web Data Organization and Retrieval
One of the most popular and the most important example of multimedia content is Web data. Currently, a huge amount of information resources are stored as Web data, and in some sense, Web is the largest database that we have ever had. The Web data, however, have a property which makes them very different from data in traditional databases. In traditional databases, data are created in a well-organized form in the first place so that they can easily be retrieved by using queries. On the other hand, the Web data are usually organized mainly for good presentation in the browsers, and the creators of the Web data are not necessarily very conscious of the retrieval by queries. As a result, the structure of the Web data does not necessarily reflect the intended logical structure, and sometimes the necessary information for queries is not described as explicitly as in the traditional databases. The lack of explicit logical structure causes difficulty in retrieving Web data by using queries. We first need to extract the necessary logical structures from the given structures in the Web data, such as HTML (HyperText Markup Language) tags or link structures. Those structures are, however, very heterogeneous, and thus, Web data are semistructured data. This heterogeneity also causes difficulty in extracting necessary information. We need techniques to extract logical structure which is described only implicitly and heterogeneously. In this section, we discuss the techniques of extracting three kinds of logical structures: (1) hierarchical standard routes in Web sites, (2) logical data units in Web graphs, and (3) components correspondence within Web pages.
An Advanced Processing Environment
4.1
9
Hierarchical Standard Routes in Web Sites
Web data consist of pages and only one kind of links connecting them. Links are, however, used in various meanings. Some links jump to pages including related information (jump links) and some are meant to be the standard routes going through pages in a Web site (route links). Links go from a page back to some previous pages on the route to the page (the links are often assigned with anchor strings such as “back” or “top”) are also often used (back links). If we extract only route links, the graph structure of most Web sites are reduced to one or multiple hierarchies. We designed an algorithm that identifies hierarchical route links within a Web site [9]. Identifying such a hierarchical structure is useful for helping users when browsing, and furthermore for identifying logical data units as will be discussed later. Here, we explain only the outline of the algorithm. For each page, the algorithm selects, based on heuristics, one incoming link that is the most likely to be the route link to the page (or it selects no link if the page is recognized as one of the roots of the hierarchies). The concatenation of all selected links corresponds to the hierarchical structure of the Web site. We determine, incoming links in a page are likely to be the route link in the following priority order: 1. downward links (links from pages in ancestor directories), 2. intradirectory links (links from pages in the same directory), and 3. sibling links (links from pages in incomparable directories). We determine, neither upward links (links from pages in subdirectories) nor intersite links (links from pages on different Web sites) can be route links.
a
route links
route link
c
b back links
route link Fig. 4. Finding Route Links
When there are multiple candidates of the same priority, we select one of them so that there would be the largest number of links going back from the children pages to the ancestor pages of the selected link. For example, if the links denoted by bold arrows in Fig. 4 are candidates for the route link of the page c,
10
Shojiro Nishio et al.
we select the link from the page b. This decision is taken based on the fact that the existence of many back links over a link, in many cases, suggests that the link is certainly a route link. 4.2
Logical Data Units in Web
In most existing search engines for Web data, the data units for retrieval are individual pages. As mentioned before, however, Web pages are often organized not for expressing logical structure. As a result, one logical document discussing one topic is often organized into a hierarchy of pages connected via route links. In this way, a logical data unit in Web data is not a page but is a connected subgraph corresponding to one logical document. When we use individual pages as the data units, conjunctive queries with multiple keywords sometimes fail to retrieve an appropriate document if those keywords happen to appear in different pages of that document. To solve this problem, we developed a method to identify connected subgraphs corresponding to logical data units in Web data [16,17]. By using those subgraphs as the querying data units, we can avoid the problem of conjunctive queries above. Our method identifies such subgraphs in two steps. First, we extract hierarchical structure of Web sites by using the method explained above. After that, we divide those hierarchies into connected subgraphs based on the similarity of appearing words in neighboring pages. In the partitioning, our algorithm allows overlapping among subgraphs because logical data units in real Web data are often not disjoint. The approach mentioned above statically divides Web graph into logical data units. Another approach is to dynamically find minimal subgraphs that include all query keywords [5,8,18]. The advantage of this approach is that we can always get “best N results” [8]. Some of minimal subgraphs, however, may not be part of one logical data unit. To give priority to subgraphs that are more likely to be part of a single document, we proposed ranking methods based on the localities of query keywords [5,18]. If the localities of query keywords in a subgraph greatly overlap, this subgraph is estimated to have a higher rank. We examine both the locality of words across the pages and the locality of words within each page. 4.3
Components Correspondence within Web Pages
The Web browsing operation is easy for experienced computer users, but it is still difficult for inexperienced users and ordinary TV audience. Current Web browsers enforce users to have activeness and knowledge about computers. In order to acquire necessary information from Web data, users must read pages and navigate themselves through hyperlinks by clicking anchors. We call such a browsing interface a read and click interface. On the other hand, TV audience just need to watch and listen to the TV programs, and acquire necessary information in a passive manner. Our idea is to introduce a new passive Web browsing interface called a watch and listen interface.
An Advanced Processing Environment
11
We have implemented a prototype system of such an interface for Web pages [19]. Given an URL, this system analyzes the tag structures of the Web page of the URL, and transforms the page into a TVML (TV Program Making Language) script describing a TV-program, which is then executed by a TVML player [6,22]. In that TV program, CG (Computer Graphics) characters in 3-D space spell out the text portion of the page, and images appearing in that page are shown one by one on the panel in the same space. The appearance of images is synchronized with the characters’ speech, and CG characters explain the images by playing several performances (e.g., pointing out to the panel). In order to transform the data of ordinary HTML pages into TV-programs in TVML, we need the following techniques: – text summarization, – text conversion into speech with behaviors of animation characters, – discovery of components correspondence between images and text explaining them, – synchronized presentation of images and characters’ speech/behaviors, and – several presentation techniques such as camera movements. Even if the technologies mentioned above have been developed, the automatic transformation of HTML pages into TV-like programs could not be done completely. This is due to the fact that HTML and its successor XML (eXtensible Markup Language) are not invented to describe data that will be shown to the users in TV-program-like presentation. Therefore, in addition to the fullautomatic translation approach, we also consider a semi-automatic approach, where the page authors are provided with an authoring mechanism where the authors can partially use automatic translation, and can also explicitly specify necessary information [11]. We also proposed a new markup language called Scripting-XML (S-XML) which extends the XML to facilitate the description of Web data which are appropriate for passive browsing.
5
Algebraic Filtering Framework for Continuous Media
In this section, an algebraic framework for filtering fragmentarily indexed video data is discussed. We have been focussing on how to compute semantically meaningful video intervals dynamically from the fragmentarily indexed shots, when queries are issued. Our goal is to develop a new framework for logical query optimizations of video queries, which retrieve meaningful intervals from such fragmentarily indexed video data. We proposed algebraic operations, what we call glue join operations, to compose longer intervals from short indexed shots [13,14]. These operations compose as many answer intervals (i.e., the intervals that become the query results) as possible. However, answer intervals may be of very long ones, thus they may contain too many irrelevant shots as well. Therefore, we propose selection operations with several filter functions which can be applied to discard irrelevant intervals from the answer set. Both glue join and selection operations possess certain algebraic properties that are useful for logical
12
Shojiro Nishio et al.
query oprimization. Especially, we obtained a necessary and sufficient condition under which a class of selection operations can be pushed down in a given query. For I, a set of subintervals of a given video data, and for F , a predicate which maps a video interval into true or f alse, a selection from I by the predicate F is defined as a subset I 0 of I such that I 0 includes all and only intervals satisfying F . That is, the selection from I by F , denoted by σF (I), is defined below: σF (I) = {i|i ∈ I, F (i) = true} Hereafter, the predicate F is also called a filter of the selection σF . There are several kinds of practically useful filters. Among them, we show some basic filters in Table 1. By using the interval logic filters shown below, it is possible to represent Allen’s interval logic [1]. Table 1. Primitive Filters name keyword selection maximum length maximum noise length interval logic
notation keyword = k
meaning the interval has a keyword k length ≤ c the interval length ≤ c maxnoise(K) ≤ c the length of for a set K the longest of keywords noise ≤ c S(ka ) ≤ S(kb ) each interval for keywords ka , kb with ka etc. start earlier than all intervals with kb
The interval glue join is an operation to connect two intervals (together with an interval exists in between). Let x and y be subintervals of a given video v. Intuitively, the interval glue join of x and y, denoted by x 1 y, is a minimum subinterval of v that contains x and y. Given two sets X and Y of subintervals of a video v, pairwise glue join of X and Y , denoted by X 1 Y , is defined as a set of video intervals yielded by taking glue joins of every combination of an element of X and an element of Y in a pairwise manner. That is, X 1 Y = {x 1 y|x ∈ X, y ∈ Y } Given two sets X and Y of subintervals of a video v, powerset glue join of X and Y , denoted by X 1∗ Y , is defined as a set of video intervals that are yielded by applying glue join operation to an arbitrary number (but not 0) of elements in X and Y . More formally, it is defined as follows:
An Advanced Processing Environment
13
X 1∗ Y = {1 (X 0 ∪ Y 0 )|X 0 ⊆ X, Y 0 ⊆ Y, X 0 6=φ and Y 0 6=φ} where 1 {i1 , i2 , . . . , in } = i1 1 . . . 1 in . Fig. 5 illustrates an example of powerset glue join operation.
X Y
X *Y
x1 i1 i2 i5
y1
i3
x2
y2
i4
i6
Fig. 5. Powerset Glue Join between X and Y
For arbitrary interval sets X, Y, Z, the powerset glue join satisfies the following algebraic properties [20]: – – – –
Commutativity: X 1∗ Y = Y 1∗ X Associativity: (X 1∗ Y ) 1∗ Z = X 1∗ (Y 1∗ Z) X 1∗ X = X 1 X Pairwise glue reduction: X 1∗ Y = (X 1 Y ) 1 (X 1 Y )
If a filter F satisfies the following condition for any intervals sets X and Y , then we call F an optimizable filter: σF (X 1 Y ) = σF (σF (X) 1 σF (Y )) Then if F is optimizable, by the pairwise glue reduction property, we can process the following selection and powerset glue join expression more efficiently by using the following transformation: σF (X 1∗ Y ) = σF (σF (σF (X) 1 σF (X)) 1 σF (σF (Y ) 1 σF (Y ))) = σF (σF (σF (X) 1 σF (Y )) 1 σF (σF (X) 1 σF (Y ))) Since optimizable filter we defined can be incorporated with the glue join operations as the expression above, at the initial stage of query processing, we
14
Shojiro Nishio et al.
can discard intervals that will eventually become useless for computing relevant answer intervals. As for the optimizable filters. We have the following: A filter F is an optimizable filter if and only if the following holds: (∀i, i0 s.t. F (i) = true and i0 ⊆ i)(F (i0 ) = true) The three filters – maximum length, maximum noise length, and interval logic filters are easily shown to be optimizable filters. Furthermore, the class of optimizable filters is closed under conjunction and disjunction. On the other hand, the class of optimizable filters is not closed under negation.
6
Controlling QoS of Web Pages
For managing the multimedia content, the content delivering system is one of the important system components. Since the volume of multimedia content is usually very large and the network bandwidth is limited, a QoS control mechanism is required to meet the users’ requests under the available network resource. In this section, we discuss such a control mechanism for Web pages. In the WWW, impressive information can be provided by embedding a large number of inline objects such as still images, animated images, and background sounds in a Web page. However, if the available network bandwidth between the client and the Web server is insufficient, it takes a long time for a client to get all inline objects in a Web page. Users often terminate the downloading of the Web page before it is completed. To solve this problem, total page transmission time should be controlled so that users are not irritated. Controlling transmission order of inline objects is also important because inline objects in a Web page generally appear on Web browsers in the order of transmission. Therefore, we define a language to specify the total transmission time and transmission order of inline objects, and propose a new transfer protocol, HTSP (Hypertext Streaming Protocol), which is an extension of HTTP (Hypertext Transfer Protocol). In the proposed language, content providers can describe the total transmission time threshold and the transmission order of inline objects in the head section of the HTML document. More detailed specifications on how the quality of inline objects should be controlled can be described within the inline objects themselves [4]. HTSP is used to realize the transmission order control of inline objects [12]. The new method, MGET, is introduced in HTSP to retrieve multiple objects with specified order by a single request. Fig. 6 presents an example of the proposed specification. We show how the inline objects are transmitted based on the description. This WWW page contains one important image and one picture consisting of three interlaced images and one animated image. Moreover, a background sound and many other images are embedded. Based on the description, the important image is transferred first. Next, four partial images are transferred in parallel so that it gives the users a quick preview of the full picture. After that, the background sound, the rest of the frames of the animated image and other images whose transmission order is
An Advanced Processing Environment
15
<meta name="TransmissionTime" content="30"> <meta name="AllowAll" content="ToJpegIfLowQuality">
<seq> <xfer object="B"/> <par slice="10"> <xfer object="C" volume="1f"/> <xfer object="D"/> <xfer object="E"/> <xfer object="F"/> <xfer object="A"/> <xfer object="C" volume="rest"/> ...... ......