Communicating with Smart Objects
INNOVATIVE TECHNOLOGY SERIES INFORMATION SYSTEMS AND NETWORKS
Communicating with Smart Objects
edited by
Claude Kintzig, Gerard Poulain, Gilles Privat & Pierre-Noel Favennec
London and Sterling, VA
First published in France in 2002 by Hermes Science entitled 'Objets communicants'.
First published in Great Britain and the United States in 2003 by Kogan Page Science, an imprint of Kogan Page Limited.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licences issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned addresses:
120 Pentonville Road London N1 9JN UK www.koganpagescience.com
22883 Quicksilver Drive Sterling VA 20166-2012 USA
© France Telecom R&D and Lavoisier, 2002 © Kogan Page Limited, 2003 The right of Claude Kintzig, Gerard Poulain, Gilles Privat and Pierre-Noel Favennec to be identified as the editors of this work has been asserted by them in accordance with the Copyright, Designs and Patents Act 1988. ISBN 1 9039 9636 8
British Library Cataloguing-in-Publication Data A CIP record for this book is available from the British Library.
Library of Congress Cataloging-in-Publication Data Objets communicants. English Communicating with smart objects: developing technology for usable pervasive computing systems / edited by Claude Kintzig ... [et al.]. p. cm. -- (Innovative technology series. Information systems and networks) Includes bibliographical references and index. ISBN 1-903996-36-8 1. Electronic apparatus and appliances--Automatic control. 2. Telecommunication systems--Technological innovations. 3. Wireless LANs. 4. Remote control. I. Kintzig, Claude, 1948- II. Title. III. Series. TK7881.2.O2513 2003 621.3815'4-dc21 2003013875
Typeset by Kogan Page. Printed and bound in Great Britain by Biddles Ltd, Guildford and King's Lynn www.biddles.co.uk
Contents

Introduction: The Role of Smart Devices in Communication
Bruno Choquet

Part 1. Interaction

1. New Distributed and Active Tools and Narrative Activities
Francoise Decortis, Claudio Moderini, Antonio Rizzo and Job Rutgers

2. Smart Houses and Dependent People: Acceptability, Solvency and International Tendencies
Chantal Ammi

3. Towards Multimodal Human-computer Dialogue by Intelligent Agents
Patrice Clemente

4. Multimodal Interaction on Mobile Artefacts
Laurence Pasqualetti, Laurence Nigay, Moustapha Zouinar, Pierre Salembier, Guillaume Calvet, Mien Kahn and Gaetan Rey

5. The Voice as a Means of Humanising Man-machine Interfaces
Noel Chateau

Part 2. Software Infrastructure for Smart Devices/Ambient Intelligence
Gilles Privat

6. Introduction to a Middleware Framework
Vincent Olive and A. Vareille

7. A Model and Software Architecture for Location-management in Smart Devices/Ambient Communication Environments
Thibaud Flury, Gilles Privat and Naoufel Chraiet

8. A Software Infrastructure for Distributed Applications on Mobile Physical Objects
Mohammed Ada-Hanifi, Serge Martin and Vincent Olive

9. Integrating a Multimedia Player in a Network of Communicating Objects
Jacques Lemordant

10. Reverse Localisation
Joaquin Keller

Part 3. Networking Technologies for Smart Objects
Pierre-Noel Favennec

11. Wireless Techniques and Smart Devices
Jean-Claude Bic

12. Wireless Local Area Networks
Philippe Bertin

13. Radio Links in the Millimeter Wave Band
Nadine Malhouroux-Gaffet, Olivier Veyrunes, Valery Guillet, Lionel Chaigneaud and Isabelle Siaud

14. Propagation of Radio Waves Inside and Outside Buildings
Herve Sizun

15. Ad-Hoc Networks
Patrick Tortelier

16. INDEED: High Rate Infrared Communications in the "Indoor" Context
Jean-Christophe Prunnot, Adrian Mihaescu, Christian Boisrobert, Pascal Besnard, Pierre Pellat-Finet, Philippe Guignard, Frederique De Fornel and Fabrice Bourgart

17. Artificial Materials for Protected Communications
Frederique de Fornel, Rabia Moussa, Laurent Salomon, Christian Boisrobert, Herve Sizun and Philippe Guignard

18. Free-space Optical Communication Links
Olivier Bouchet and Herve Sizun

Part 4. Evolution of Smart Devices
Claude Kintzig

19. Mobile and Collaborative Augmented Reality
Laurence Nigay, Philippe Renevier, Laurence Pasqualetti, Pierre Salembier and Tony Marchand

20. Towards a Description of Information-seeking Tasks Contributing to the Design of Communications Objects and Services
Andre Tricot and Caroline Golanski

21. Making Context Explicit in Communicating Objects
Patrick Brezillon

22. Dynamic Links for Change-sensitive Interaction
Philip Gray and Meurig Sage

23. Communicating Devices, Multimode Interfaces and Artistic Creation
Guillaume Hutzler, Bernard Gortais and Gerard Poulain

24. Powering Communicating Objects
Didier Marquet

Conclusion: From "Things That Connect" to "Ambient Communication"
Gilles Privat

Index
Introduction
The Role of Smart Devices in Communication
Bruno Choquet
France Telecom R&D, France
Will communicating objects be the invaders of tomorrow? We speak of them and hear of them, but do we know what they will be, what they will represent and what they will do? What are "smart devices"? A device is commonly understood as a physical object made up of mechanical, hydraulic, electronic and data-processing components. Its capabilities depend on all or some of these components, which give it a certain degree of life and turn it into an operational tool. Communication between such objects consists of transfers of data; it takes on a conversational aspect when the data triggers an action by the receiving object, which in turn can emit data that make another object react. And since we are speaking of communication and interactivity, why not introduce the concept of the communicating entity into this system? Doing so opens up concepts such as multitude, synchronisation and time sharing, as well as human specificities and characteristics (emotion, mood, capacity for analysis and synthesis, intelligence, memory, adaptation, etc.), and therefore a wide range of parameters whose interest lies in serving as models for physical objects.
CyberMonde
CyberMonde is a research programme of France Telecom R&D, sponsored by the Scientific Direction. It is intended to master new technologies so as to secure the advances needed by France Telecom and its business units, and to be an engine of innovation by:
• co-ordinating research around some major changes in technology and usage;
• suggesting a vision of their impact on the future of services and networks;
• accelerating transfer to the market, and preparing and testing in good time the unexpected innovations arising from these changes.
The CyberMonde programme addresses the general theme of the 'virtual environment', characterised by these guidelines:
• to be able to communicate always and everywhere;
• to develop environments (and the associated interfaces) in which communication is based on smart physical devices (sensors, actuators);
• to be able to immerse oneself in real, augmented or virtual spaces;
• to be able to project oneself remotely in adapted forms (tele-presence, clones and avatars).
Two major objectives emerge:
• not to restrict CyberMonde to virtual environments only, but to treat as natural and real the fact that content, whether image, audio, interaction, etc., does and must play a part in any communication system;
• to keep at the centre of our concerns the communication dimension, which is the main goal of France Telecom, and to keep in sight the end-user, i.e. the human being (but not only), at whose disposal the functioning of these environments will be placed.
The two major objectives of the CyberMonde programme take up the traditional understanding of cyberspace, which combines virtual and telematic reality, and add ubiquity and quasi-permanent presence, real or simulated (teleportation):
• To implement virtual environments within the traditional context, i.e.: to share and bring to life combined virtual and real elements, to plan the modes of access to and distribution of shared information, to put in place the methods and tools suited to the preceding tasks, to develop suitable supporting technologies, and to adapt to material constraints (networks, terminals, etc.).
• To invent new fields of communication: to support all modes of interaction, to offer forms of presentation using the available equipment, and to free oneself from place and time.
Smart devices
Smart devices will support interaction between CyberMonde and the environment: this is their main task. But they will also facilitate the internal evolution of CyberMonde and, why not, of the environment itself. Indeed, CyberMonde, if it lives periodically in a closed area, needs to be built or rebuilt according to its own rules; the environment, too, is subject to conditions (constraints or enrichments), free or not, which modify its own parameters. Judging from the proposals (incomplete at the date of writing), the role of smart devices appears diverse, yet organised around a number of more or less unifying poles of interest:
• management of its primary needs: a very basic role, but necessary for the device to stay alive and to fulfil the external requests for which it is intended. The survival of the device should not depend on its capacity to feed itself (to replenish its energy) or to carry out its own maintenance, but what becomes of it must be decided if it becomes useless.
• letter box: the case of the passive device, or the device which always and unfailingly performs the same function. Use may increase, but this is known in advance. This category will probably include information sensors, displays and switches.
• at a higher level, the specialised storage tool classifies data and information, which requires greater software power and the setting up of protocols, dialogues and indexing procedures, with human supervision.
• decentralisation of information-processing capacity addresses other problems and even corresponds to a certain vision of how the system is organised; this route becomes possible thanks to the performance of memories, software developments in architecture and management, etc. This field is developing as technological progress and scientific projections allow, and will permit better adaptation to specifications.
• a step higher: attention paid to context. Here one finds the functions presented above, but also capacities for data fusion and interactivity. The standard example is the physical location of a user. But one can easily imagine the need to obtain from it the identification, and thus the identity, of the user. Access to local information also forms part of the context.
• more subtle will be the device which succeeds in perceiving the environment, and not only the physical environment or the data-processing links. Perceiving the environment requires being alert to an unspecified event, but also analysing data which predict relations between communicating entities and especially the way in which each perceives the other. This introduction of the relational field is not unrelated to the first steps of work on emotion.
• an additional element of the smart device will be its potential to carry out a behavioural analysis of the devices and environment which surround it. This analysis will bring an obvious advantage, since the aim is to communicate, co-operate, co-ordinate and connect, and therefore to interact. These topics come up more and more frequently, but undoubtedly not yet enough. There is surely something to be gained by taking single-agent and multi-agent models as a starting point.
• the device as companion of the communicating entity that is the individual takes on the characteristics of its master and sets up capacities for selecting information (at input) and copying it (at output) to assist and accompany the individual;
• finally, though it is undoubtedly not yet of prime importance, one will mention intelligence. But what intelligence for a device? Undoubtedly, software agents are one answer.
Some elements, in particular chapters dealing with semantic and emotional aspects, will appear elsewhere. Other fields deserve to be studied, for example the autonomy and decision-making power of an object, the takeover of an action, proactivity, etc. Is it wrong to think that current efforts are not centred on these last subjects? Or does this reflect ignorance of the work of laboratories? Is a smart device only physical?
Part 1. Interaction
Chapter 1
New Distributed and Active Tools and Narrative Activities
Francoise Decortis (1), Claudio Moderini (2), Antonio Rizzo (3) and Job Rutgers (4)
(1) FNRS, Universite de Liege; (2) Domus Academy; (3) Universita di Siena; (4) Philips Design
1. Introduction
The development of an invisible, distributed and ubiquitous technology, one of the technologies of the future, is the aim of a research project on which we are currently working within the framework of project POGO (European programme i3, Intelligent Information Interfaces, Exploring New Learning Future for Children). This technological orientation rests on the concepts of the affordances perceived in objects (Gibson, 1977) and of information appliances (Norman, 1998), and considers the way in which the instrument supports the task so that it becomes an integral part of it, as if it were a natural extension of the person and his or her work. That implies specialising the function so that it is in close agreement with the real needs of users, and offering great simplicity and transparency. According to this philosophy, instruments will no longer be recognisable as such, and so will disappear from people's sight and consciousness. The movement towards an invisible technology (Norman, 1998) is manifested in the instruments designed within the framework of project POGO: active instruments, new semiotic tools which should fit harmoniously alongside traditional instruments, and in which any centralised data-processing unit disappears from the sight and consciousness of the user.
1.1. New active tools
Six tools forming the POGO system were designed and evaluated.
The POGO beamer is a tool which makes it possible to visualise and capture objects or views of the physical world and to import them into the virtual world, in particular by projecting them directly on the screen. The beamer includes a touch-sensitive screen, which allows the children to draw on it, to write as if their finger were a pencil, and also to capture various types of images and record them on a card.
The cards are memories for elements and composed images. The children can record elements on them by acting on the beamer. Other cards contain predefined, prerecorded elements (sky, sea, landscape). There are also cards which contain sounds.
The flexible screen is used to display the background images contained in the cards. The screen has three small pockets holding card readers, which correspond to three different positions on the screen. The flexible screen can be fixed to the wall or used on the ground. There is also an integrated device which allows the children to change the colour of the projection.
The sound carpet allows the children to play with sounds contained in cards. By inserting a card in the card reader, the children activate a background sound of the environment, which is played in a loop, plus a series of specific sounds which are activated by pressing on the various zones of the carpet.
The voice tool uses a microphone to create distortions of the voice. At present it is possible to make the voice higher or lower in pitch.
The mumbo makes it possible to read an image contained in a card and to project its contents onto the flexible screen. It includes zoom and rotation functions, which make it possible to move the elements on the screen.
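The behaviour described for the sound carpet (a card selects a looping background sound, and each zone of the carpet triggers a specific sound) can be illustrated with a small sketch. This is only a toy model under stated assumptions: the class name `SoundCarpet`, the dictionary layout of a card and the sound names are all hypothetical and do not describe the actual POGO implementation, which the chapter does not detail.

```python
class SoundCarpet:
    """Toy model of the sound carpet: the inserted card decides which sounds exist."""

    def __init__(self):
        self.card = None  # no card in the reader yet

    def insert_card(self, card):
        """Insert a sound card and return the background sound to be looped."""
        self.card = card
        return card["background"]

    def press(self, zone):
        """Return the sound triggered by pressing a zone, or None if there is none."""
        if self.card is None:
            return None  # nothing plays without a card
        return self.card["zones"].get(zone)


# Hypothetical card contents, invented for this example.
seaside = {"background": "waves", "zones": {0: "seagull", 1: "splash"}}

carpet = SoundCarpet()
print(carpet.press(0))              # None: no card inserted
print(carpet.insert_card(seaside))  # waves
print(carpet.press(1))              # splash
```

The point of the sketch is that the carpet itself holds no content: swapping the card changes every sound, which matches the idea of memory being carried by physical objects rather than by a central unit.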
1.2. Active tools prototypes
The design philosophy of the POGO instruments calls for simple tools that stimulate the senses and react strongly to the environment. They aim at a flexible and harmonious integration of the physical and virtual worlds and, in line with the socio-cultural theoretical orientation, support the development of narration as a vehicle of meaning and the interaction between the world of the child and the images, models and meanings existing in the culture. The central question is which future instruments could offer advantages over what is already done in the school environment, where learning proceeds through the appropriation of meaning and through work on emotion, imagination, exploration and social externalisation. We are interested in the effects of introducing new active and distributed instruments on narrative activities in a school environment. How do POGO instruments transform the activity of children when they create stories?
[Figure: the new tools of POGO: the beamer, the mumbo, the assembler, the microcomputer, the mobile camera, the sound carpet and the sequences recorder.]
2. New potentialities
What new potentialities does the POGO system offer? The system allows the child to enter the story physically, through the use of the table camera and the camera. The camera also makes it possible for the children to see themselves in activity. Their reflection on the activity of narrative construction thus becomes possible and is encouraged. The beamer is also used by the children to photograph parts of their body and to modify the images using colour effects, morphing and distortion, which the children find genuinely fun. Like the camera, the beamer thus makes it possible to project the child's body into the image, and a fortiori into the story. Writing with their fingers seems to be particularly appreciated by the children. The POGO system integrates the traditional tools of narrative construction (paper and pencil, drawings, storytelling) while opening up new possibilities: disguises, sounds and vocal effects can be created, projected and combined. Introducing several instruments that can be used simultaneously to capture, manipulate and combine images and sounds increases the participation of all the children in constructing the content of the story. The simultaneous use of the tools also introduces a more individual dimension into the narration: each child can take part in the collective creation and, at the same time, make a personal contribution if he or she wishes. The construction of the story becomes multimodal: video, virtual and real images, sounds and voice can be combined, separated, and worked on separately or simultaneously. The introduction of sound as a narrative element increases the expressive possibilities of POGO: children can give their voice to the characters, improvise dialogues, recreate sound atmospheres, etc.
The system is user-friendly. There are analogies between the form of the tools, their effects and the way they are used. The functionalities of the system are distributed in a clear and simple way among the tools. The system is appropriate for children from 6 years of age, who quickly grasp how the tools operate. Constructing a story is not work for the children, but takes on a playful aspect. School becomes a space of play and discovery. The system encourages communication and co-operation among the children, which are necessary to co-ordinate all the elements of the story present at the same time. The system encourages the phases of inspiration and production, opening up an enormous field of possibilities in the style of the stories, the manner of building them and the various media supporting the activities. In addition, we observe that the use of the instruments increases the collective dimension of the creative process, and in particular the diversification of roles and participation. Finally, the instruments support the children in structuring the narrative so as to produce richer stories.
3. Perspectives
What are the development prospects for the tools? The results point to some shortcomings in the children's use of the instruments in class which seem to us to be of interest. It seems to us that the role of the teacher could be better supported by the POGO system, in particular in the exploration and production phases. Mobile, portable and wireless tools allowing moving images and sound to be captured outside the classroom would increase the quantity and quality of the experiences which the children can record and re-use in creating narratives. With regard to narrative structuring, the system seems to support and even improve the organisation of the story according to the Labov (1972) model. We believe, however, that tools evolving towards more open methods of use, such as a zoom function and a mobile camera, could enrich the expressive potential of the children. Compared with the previous prototypes, the possibility of zooming to obtain close-ups of elements of the scene, and a mobile camera allowing objects to be photographed from various points of view, offer the children the possibility of developing a richer, visually more varied narration. They can, for example, highlight the face of a character in close-up, or change the camera angle to match the point of view of the character, etc.
Our results also indicate that the use of the instruments does not seem to interfere with the activity, since they are integrated into the existing instruments: the beamer, for example, becomes a work surface, and ideas collected outside or produced by the children can be integrated into the system and thus developed. Moreover, the instruments are simple to use. Each action generates an immediately visible effect (for example, creating objects on the beamer is directly visible on the screen). The interactions are mediated by physical objects. These make for simple actions (avoiding screen menus, for example). These results bring us back to the concepts suggested by Norman (1998) concerning information appliances, and to the fact that the tool is considered in terms of how it supports the task so that it becomes an integral part of it, as if it were a natural extension of the person and his or her work. That implies specialising the function of the instrument so that it is in close agreement with the real needs of users, and offering great simplicity and transparency. Each tool is simple and has its own method of operation. Each one must be learned, and makes it possible to carry out a specialised and appropriate task. We come close to the idea that, in the long term, the instruments will no longer be recognisable as such because they will form part of the task, so much so that they will disappear from people's sight and consciousness.
The distribution of the instruments in space also seems interesting to us. The use of the tablet points towards a possible incorporation of memory units in physical objects and their handling in space (i.e. the possibility of transporting them and re-using them in another space and time). The handling of information is thus extended to space and is no longer confined to one centralised unit. The instruments also seem to us to move towards situated production, the space of design and recording being integrated into the context in which children handle and construct natural objects of the physical world. These points thus testify to a movement towards an invisible technology.
4. References
Eco, U. (1996) Six promenades dans le bois du roman et d'ailleurs. Grasset, Paris, France.
Gibson, J.J. (1977) The theory of affordances. In R.E. Shaw & J. Bransford (Eds.), Perceiving, acting and knowing. Hillsdale, New Jersey, USA.
Labov, W. (1972) Language in the inner city. University of Pennsylvania Press, Philadelphia, USA.
Norman, D.A. (1998) The invisible computer. MIT Press, Cambridge, Massachusetts, USA.
Chapter 2
Smart Houses and Dependent People: Acceptability, Solvency and International Tendencies
Chantal Ammi
Department of Management, National Institute of Telecommunications, Evry, France
1. Introduction
The appearance of new technologies has changed everyday life. The home is becoming an intelligent, open space, adapted to the people who live in it and who are able to accept new systems. The integration of new technologies can help dependent people to stay at home as long as they want, and can help to reduce their feelings of dependence.
2. Context
During the last twenty years, the number of disabled people has increased: in 2000 there were more than 23 million in Europe (all types of handicap), for two main reasons:
• the number of elderly people has grown; some of them, especially those aged 75 and over, have more or less the same needs as the disabled;
• progress in medicine and associated tools has made it possible to save people, especially young people, with severe injuries or neuro-muscular dystrophy.
These motor-disabled persons have, very often, retained their intellectual capacity. They could work, learn and live full lives if they had the chance to live in an adapted environment. Adaptation to an active and socially full life is very important from a number of points of view:
• Psychologically, to counter the tendency to stay alone and to withdraw from the real world, and to allow intellectual and physical capacities to be maintained;
• Economically, to help decrease the costs of dependence (hospitalisation, rehabilitation, etc.) and to obtain a regular income;
• Socially, to include disabled people in a normal way of life.
However, deficiencies exist and generate incapacity and disadvantages at home, at work and in the street. Rehabilitation technical aids help to compensate for these deficiencies. The use of new technological tools and means (smart house, telecommunications, computers, etc.) and their applications in everyday life can reduce or eliminate these disadvantages. Although also used by the general population, these products allow dependent people (disabled and elderly) to decrease their dependence and to be able to:
• open and close doors, windows, shutters, etc.;
• switch lights on and off;
• use appliances: TV, recorder, microwave, fridge, etc.;
• use means of communication: phones, fax, computers;
• use various aids.
We can distinguish five types of rehabilitation technical aids:
Mobility -> wheelchairs, adapted cars
Environment control -> smart house
Remote manipulator -> robotics
Communication -> devices, interfaces
Access to computers -> adapted computers
To be useful and therefore used, rehabilitation technical aids must be adapted to the user needs.
3. Method of analysis
The purpose is to evaluate, to modify and to develop existing or new products. These products must have the following characteristics:
• be useful;
• be adapted to needs;
• be financially acceptable.
To verify these different characteristics, it is necessary to list the functions of the products in order to analyse them:
• analysis of needs;
• analysis of costs;
• analysis of market.
3.1. Analysis of user needs
Technical assistive aids can be divided into three types according to special needs:
• need to control systems for greater comfort and increased security;
• need to manage equipment in the environment;
• need to communicate with home, surroundings and office, and to have access to outside services for greater comfort and fun.
The satisfaction of users depends on three variables: needs, adaptation and the accessibility of the products. Taking needs into account before products are developed requires a good knowledge of the world of disability, with its different types of handicap, the restrictions on what is possible, and the capacities for adaptation and acceptance. Only collaboration between all the actors (therapists, user associations, rehabilitation centres, engineers) can provide knowledge of all the variables and allow the best solution to be developed. Depending on the situation, a study of needs is made before any further development takes place, and is completed by frequent iterative steps which allow the results of the collected data to be integrated into the prototypes. In spite of this integration of special needs, we very often notice a mismatch between the results and the needs, and some difficulties in the acceptance of the developed products and systems. In the case of motor disability, end-users are obliged to use several types of assistive aid to compensate for lost functions such as mobility, remote manipulation of objects and communication. Because there is no standardisation between the different types of assistive aid on the market, users who are severely disabled find themselves confronted with many user interfaces. To reduce or avoid these problems and to permit a better fit with the needs of disabled people, an evaluation phase is necessary. Conducted by therapists and occupational therapists, the evaluation makes it possible to integrate valuable modifications (technical, ergonomic, etc.) into the final product and to find adequate access interfaces, such as computer devices, voice recognition, sensors, etc.
3.2. Analysis of costs
In spite of these new applications, which reduce the dependence of handicapped persons in their daily lives, it is necessary to consider cost and economic opportunity if users are to have real access to these products. To avoid superfluous and costly functions, we have to check, using qualitative and quantitative criteria:
• the duration of utilisation;
• the frequency of utilisation;
• the learning capacity.
For each function, we have to calculate the direct and indirect costs of the following aspects:
• development;
• ergonomics;
• evaluation;
• communication;
• dissemination;
• technological research;
• market aspects;
• fabrication.
The costs are compared with the needs and the utility of the function. Too high a cost can have two origins:
A cost that is too high in itself: it is then necessary to reduce direct and indirect charges, and all the actors must be involved in the new specification of the functions.
Unsuited utility of the function: we can decide either to cut the function or to offer it as an option.
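The per-function comparison of cost and utility described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's method: the class `ProductFunction`, the `review` helper, the utility score between 0 and 1, the budget and the 0.5 threshold are all assumptions introduced for the example, and the figures are invented.

```python
from dataclasses import dataclass, field

# Cost aspects listed in the text; the identifiers are our own spelling.
COST_ASPECTS = (
    "development", "ergonomics", "evaluation", "communication",
    "dissemination", "technological_research", "market", "fabrication",
)


@dataclass
class ProductFunction:
    name: str
    utility: float  # assumed score in [0, 1] derived from the needs analysis
    direct: dict = field(default_factory=dict)    # direct cost per aspect
    indirect: dict = field(default_factory=dict)  # indirect cost per aspect

    def total_cost(self):
        """Sum direct and indirect costs over every aspect."""
        return sum(self.direct.get(a, 0) + self.indirect.get(a, 0)
                   for a in COST_ASPECTS)


def review(function, budget, min_utility=0.5):
    """Classify a function following the two origins of an excessive cost."""
    if function.total_cost() <= budget:
        return "keep"
    if function.utility >= min_utility:
        return "reduce charges and respecify"  # useful but too expensive
    return "cut or offer as option"            # utility does not justify the cost


# Invented figures, in arbitrary currency units.
voice = ProductFunction("voice control", 0.9,
                        direct={"development": 40, "fabrication": 30},
                        indirect={"evaluation": 20})
print(voice.total_cost())        # 90
print(review(voice, budget=60))  # reduce charges and respecify
```

In practice the utility score would come from the qualitative and quantitative criteria listed earlier (duration of use, frequency of use, learning capacity); how those are combined is left open here.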
3.3. Analysis of the market
The relationship between the user and the technical aspects rests on a choice between two different strategies:
Standardisation: the dependent user is considered to be a new opportunity for an existing market. The same technology and main functions are maintained and only the types of access are modified. The size of the market is large, but the utility and the use are low;
Specialisation: the creation of new products for which the dependent user is the only and final consumer. The utility and the use are respected, but the market is too small to make development profitable.
4. Remaining problems
In spite of real market opportunities, two main problems remain and can explain the current difficulty that technical assistive aids have in truly emerging:
• acceptability to the users;
• affordability and economic rationale.
Leaving aside specific details, these types of products may be conceived, developed and proposed in a productive sequence as follows:
Specification of Needs → Determination of Target → Conception and Production of Products and Services → Potential Demand → (Problem of Acceptability) → Revealed Demand → (Problem of Solvency) → Real Demand → Purchase
4.1. Acceptability
To satisfy the demand of dependent people (the disabled and the elderly), products and services are more and more often conceived according to expressed needs, after long and expensive market studies. In spite of this analysis, part of the demand is still not satisfied or not addressed by the products. The phenomenon of acceptability can explain the differences between declared needs and the acceptance of products by the users. Many criteria can be involved:
• cultural, sociological and religious environment;
• level of social life;
• technical acceptance;
• age of the users;
• profession of the users;
• medical and dependence situation;
• the weight of user associations;
• the level of development;
• the role of the public institutions;
• the size of the market, etc.
The following three examples illustrate the situation:
• Quadriplegics living alone at home need assistive aids to prepare meals, to switch on the TV or recorder, etc. Some products are adapted to their handicap and can help them, but an individual's way of life, culture and personal history can lead him/her to reject the product.
• Older people, especially dependent ones with problems of mobility and memory, prefer to stay at home instead of going to special institutions. Special equipment such as alarm and surveillance services exists, but it can be rejected because it symbolises age and dependency.
• Religious aspects can explain a better acceptance of technologies. Protestants have a different conception of society from Catholics and in general they use technological products more readily.
4.2. Affordability
After the transformation of needs into real demand, a last obstacle must be overcome before products are purchased, i.e. the financial resources of users. Despite public and official promises, most countries have not integrated a financial contribution for technical assistive aids into their social insurance systems. These products and services have a cost, and a limited market does not permit a decrease in the selling price, i.e. "economies of scale". In Europe, the analysis of two different countries reveals the importance of the public financial contribution. In The Netherlands, technical assistive aids such as environment control and robots are supported by the public social insurance system, as are medical expenses or pharmaceutical products. Special sites¹ have been set up to fit these technical assistive aids, on the wheelchair for instance. The consequences for the market are extremely positive² and there is a real emergence of technical assistive aids for the benefit of disabled and older people. In France, by contrast, public social insurance is oriented towards medical and pharmaceutical expenses. Technical assistive aids do not benefit from public financial support. Users who cannot finance these products and services themselves must find private support, e.g. from user associations³. Users are left alone to manage the integration of all these products in order to counter discomfort, disruptions, compatibility problems, etc.

¹ Hetdorp in The Netherlands.
² Sales of Manus are very significant in The Netherlands, numbering in the hundreds.
4.3. The international market
Given the rising number of dependent people all over the world, and the current tendency of the industrial sectors to propose products and services not for a local market but for an international one, we can ask the following questions:
• Can we speak of an international or a local demand?
• Is the demand similar everywhere?
• How can the phenomenon of acceptability be measured?
• How can the problem of solvency be solved?
5. Conclusion
In spite of a rising number of dependent people, too many potential users cannot benefit from adapted tools or from a special environment which could decrease their dependency or improve their current way of life. The main reasons are the mismatch between supply and demand and the lack of a financial structure. Analysis of the international market can help to avoid, in some countries, errors already made elsewhere, and can provide orientation for researchers.
6. References
C. Ammi, The market of technical assistive aids, tendencies, problems, necessary adjustments, the French case, AAATE, Ljubljana, Slovenia, September 2001.
C. Ammi, Problems in technical assistive aids, RAATE, Birmingham, Great Britain, November 2001.
C. Ammi, Telecommunication and dependence, Hermes Sciences and FTR&D, Paris, April 2002.
³ AFM (French muscular dystrophy users association) and APF (French association for motor disabled), for instance.
Chapter 3
Towards Multimodal Human-computer Dialogue by Intelligent Agents
Patrice Clemente
France Telecom R&D, Lannion, France
1. Introduction
Mobile telephones, PDAs, GPS, communicating clothes, infra-red connections, Bluetooth technology, domestic networks, domestic robots, software agents, etc: the list is long. We have to face the facts: communicating objects have already started to invade our lives, and this will continue. Their increasing number foreshadows many problems of interaction, between humans and these objects and between the objects themselves. Moreover, the absence of communication standards between objects will lead to a multitude of protocols and to interoperability problems between communicating objects. The difficulties of coherence and cohesion between these objects, due to their number and their autonomy, will generate unexpected and undesirable behaviour in the systems or networks of objects. Respect for the free will and the integrity of the individual will be hard to guarantee. Communications are inevitably increasing in number, and information of all kinds will overwhelm users if they do not have intelligent and suitable mediators. If no precautions are taken, systems will thus become useless or unusable. To avoid these pitfalls, one has to keep control of objects and systems. This requires the latter to "understand" precisely the desires and needs of the user, an indispensable condition for satisfying them, whatever the media used or wished for by the user. Objects and systems have to answer the requests of the user and adapt their behaviour to his/her personal profile, to the context (situation, history, etc) and to the type of task.
Intelligent agents, which are autonomous software entities, are able to reason and act, and bring interesting solutions to those problems. An intelligent agent is able to perceive and act on its environment. Thus, it can control "unintelligent" communicating objects such as actuators. Moreover, an intelligent agent can communicate when provided with dialogue capacities. It can thus interact with other dialoguing agents, which for example deliver information or are integrated in communicating objects. The means of communication used is then an inter-agent communication language¹. Finally, an intelligent agent can converse in a natural way with humans. Throughout the dialogue, it can help people achieve their goals, deliver relevant information to them, carry out a certain number of actions (possibly on their behalf) and supervise their resources, all in a dynamic way and upon request. In this case, the means of interaction traditionally used is a natural language, such as English. A system into which an agent is introduced benefits from the agent's intelligence. The agent constitutes a comprehensible and co-operative interlocutor. It can act, for example, as an assistant or a personal secretary, and will then learn its owner's specificities and adapt to them. The agent can play the role of mediator and protect the user from all kinds of intrusions from his/her environment, such as undesired or low-priority information. For example, when entering a store, a user does not systematically want the items matching the shopping list to appear on the PDA. When approaching objects in this same store, he/she may not want the PDA to indicate prices either, although this function remains available. We will develop in more detail a particular application of intelligent agents: human-computer dialogue (HCD). More particularly, we will treat multimodal² HCD. First, we will review traditional approaches to HCD and current multimodal HCD.
We will then come to the core of this chapter: the phenomenon of multimodal referring to objects. After recalling the main problems of linguistic and multimodal referring, we will introduce our formalism for multimodal referring, made possible by an original representation of objects. We will show a theoretical model of a multimodal referring act, illustrated by a short example. We will conclude with technical remarks on our model and its implementation, and with general remarks on the systems that it will make it possible to develop.

¹ This language can be ACL (Agent Communication Language), proposed by the FIPA consortium. ACL is founded on the formal definition of communicative acts between agents, making it possible to carry out unambiguous interactions.
² i.e. using several communication modalities. The communication modalities are defined by the structure of the information which they convey (linguistic, graphic, haptic, etc) and by their intrinsic properties. As they are linked with communication modes (acquisitive and productive modes), it is possible to classify them into input and output modalities (see [BER 97] for a survey on representation modalities).
2. Human-computer dialogue
2.1. Various approaches
There are various points of view concerning the modelling and implementation of HCD systems³. Structural approaches assert the existence of an interaction structure, built on the regularity of the exchanges appearing in a dialogue. According to these approaches, this structure can be determined a priori and established in a finite way (see [SAD 99] for a review of these approaches). Differential approaches consider a dialogue as the realisation of one or more communicative acts. Based on the principle that to communicate is to act [AUSTIN, 1962], they start from the idea that communicative actions, like classical actions, are justified by goals and are planned accordingly. In particular, these goals relate to changing the mental states of the interlocutors, represented in terms of mental attitudes such as knowledge, intention and uncertainty. These approaches derive dialogue from more general models of action and mental attitudes (see [SAD 99] for a range of these approaches). Rational communicating agents fall under this approach, while insisting on the view that natural and user-friendly communication is an intrinsically emergent aspect of intelligent behaviour (see [COM 90], [SAD 91], [COH 94] for the primary works and [SAD 99] for a more exhaustive overview). This approach brings together the problems of designing flexible, co-operative interfaces and of designing intelligent artificial agents. Flexibility appears as an unconstrained dialogue in which the user can move freely through the interaction and deviate from the conventional course of conversation in order, for example, to signal a problem to the system and possibly remedy it. Co-operation appears in many forms: reaction to requests, adoption of the user's intentions, sincerity of the system, relevance of the answers, etc. The rational communicating agent constitutes the core of the human-computer dialogue system.
To establish the link with the user, the agent, which is nothing other than a program, uses a certain number of interfaces made up of more or less physical and tangible communicating objects. Figure 3.1 illustrates this with several layers of communicating objects. The "high" layer, that of the agent, communicates with the transition layer made up of recognition and synthesis systems. This layer is connected to a lower layer which represents the physical media of interaction (such as peripherals and physical communicating objects). The last layer is embodied by the user, since he/she can also be seen as a communicating object.

³ Conscious of the heaviness of 'a computer system whose human-computer interface is dialoguing', we will from now on use, even if it is an abuse of language, 'dialogue system' or 'human-computer dialogue system'.
2.2. Multimodal human-computer dialogue
The first HCD systems made it possible to dialogue in written natural language, for three major reasons:
• when those systems appeared, interaction devices were limited, typically to a keyboard and an alphanumeric screen;
• natural language was thought to be the prevalent means of dialogue, the most effective for exchanging information and understanding one another;
• since Turing ([TUR 50]), natural language had been seen as the ultimate demonstration of intelligence.
It is well known today that this is not true. Many other dimensions come into play during interaction and dialogue. Natural language in fact constitutes only one component of dialogue. Gestures, postures, gaze, facial gestures, prosodic cues, effects, the cultural dimension, micro-social proxemics and some others are just as important [COS 97].
Figure 3.1 Layered model of communicative objects involved in a multimodal human-computer dialogue
For example, gestures are involved in thinking as much as language is [MAC NEIL, 1992]. They can illustrate mental images in the scene of speech, which language cannot always do in a satisfying manner⁴. In other cases, they make it possible to replace word groups and/or word traits⁵, thus spreading the message over two communication modalities (natural language and gesture). New technologies bring new media of communication. They potentially open the way to other modalities of interaction with the computer. Thus, to the old (but still useful) keyboards, mice and screens are added voice recognition, gesture recognition, gaze tracking, haptic devices (tactile screens, data gloves, etc), natural language generation, graphic and image synthesis, voice synthesis, talking faces and virtual clones (see [BEN 98] and [LEM 01] for a broad range of these systems); it is already possible to convey fragrances on the WWW!
⁴ These gestures are called illustrative gestures.
⁵ These gestures can be called either illustrative gestures or emblematic gestures (see [COS 97] for a survey on communicative gestures).
In order to build HCD systems that are more user-friendly and more natural to use, much work over the past ten years has tried to exploit these new technological possibilities in the design of multimodal HCD. The stakes are numerous. The user should be able to converse naturally, using the modalities he/she wishes⁶, and to switch from one modality to another (transmodality). The emotions expressed by the prosody of his/her voice or by facial gestures should be understood and taken into account. Symmetrically, the system is expected to answer using the best modalities: the selected modalities should be the most effective for the type of information to convey and those preferred by the user, and the emotional dimension should also be present (see Chapter 5), etc.
3. Referring during dialogue
3.1. Problematics
3.1.1. Linguistic reference
Natural language HCD systems make it possible for the user to query a server or carry out a search. For example, the user can manage his/her diary or share portfolio, consult the weather forecast, etc. For all of these tasks, the dialogue relates to objects of the world⁷. Speaking about an object (e.g. using nominal groups) constitutes a linguistic reference to the object. In fact, in any dialogue, interlocutors talk about something, thus carrying out references. Searle, in his philosophical theory of language acts ([SEA 69]), goes further. He states that when an agent performs an illocutionary act⁸, he thereby performs referring and predicating acts. This means that each sentence consists of references (to objects in particular) and predications (for example, to specify properties of these objects). Many works deal with the comprehension and the generation of linguistic references in dialogue systems.

For the comprehension of referential expressions, vocal systems are first confronted with the problems of voice recognition. For mono-speaker systems, the recognition tool has to be trained on the user's voice. For multi-speaker systems, vocabulary size is limited and recognition performance is therefore restricted. This implies the development of strategies for the semantic completion of the recognised propositions [CAD 95], a technique which has its own limits. Beyond the possible speech recognition step (which disappears with written natural language), the system tries to understand the reference, i.e. to identify the referent. To do so, it traditionally proceeds by constraint satisfaction (see [HEE 95]). The object descriptor refers to a set of potential candidates. Each component of the expression brings a constraint which reduces this set. Ideally, this process converges towards a single candidate, but this is not always the case (no acceptable referent, or several, can remain). A clarification dialogue is then necessary. Recent works [SAL 01] adopt a wider approach by modelling the mental representations of the situation and the domain of reference.

For the generation of referential expressions, other problems are encountered: the choice of the descriptors to be used for the reference [APP 85], [DAL 87], [REI 90], [REI 92]; the calculation of an unambiguous description [CLA 86]; referential collaboration, which underlines the fact that a reference is often understood only after a succession of several references during a repair dialogue [HEE 91], [EDM 94]; the co-presence of agents and objects [COH 81], [HEE 95]; and the management of focus in dialogue [GRO 86], [REI 92].

Moreover, some works show the problems which dialogue systems have to solve to become a little more user-friendly. The majority of current dialogue systems do not take into account the evolution of the world in space and time and are based on rather fixed representations of the world. Evolving referents (in time, in space or in their own nature) are hard to model, and current technologies cannot manage them [PIE 97].

⁶ See [BEL 95] for an exhaustive definition of the various types of multimodalities.
⁷ Object in the broad sense, i.e. physical, conceptual or virtual entity. For example, an e-mail address is an object of the world, as is the car of the neighbour opposite.
⁸ An illocutionary act is the act achieved by the production of a succession of signs in a social relation context, which expresses an intention ("to inform" and "to request" are illocutionary acts).

3.1.2. Multimodal reference
We think that the assertion of Searle (cf. §3.1.1) applies to any type of communicative act, whether linguistic, monomodal or multimodal. Thus, pointing at an object with voice and gesture constitutes a multimodal reference to that object. Multimodal referring should therefore bring additional richness and expressiveness to HCD, in particular by exploiting the properties of the representation modalities.
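The constraint-satisfaction procedure recalled in §3.1.1 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the algorithm of [HEE 95]: the `Candidate` class, the property sets and the `WORLD` list are hypothetical, and each component of the referring expression is reduced to a single required property.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical object of the world, described by a set of properties."""
    name: str
    properties: frozenset


def resolve_reference(candidates, constraints):
    """Each component of the expression removes incompatible candidates."""
    remaining = list(candidates)
    for constraint in constraints:
        remaining = [c for c in remaining if constraint in c.properties]
    return remaining


def interpret(candidates, constraints):
    """Single survivor: referent identified. Otherwise a clarification
    dialogue is needed (no acceptable referent, or several)."""
    remaining = resolve_reference(candidates, constraints)
    if len(remaining) == 1:
        return ("resolved", remaining[0].name)
    return ("clarify", [c.name for c in remaining])


# Hypothetical world used only for the illustration.
WORLD = [
    Candidate("instance_auto_21", frozenset({"car", "red", "small"})),
    Candidate("instance_auto_22", frozenset({"car", "blue", "small"})),
    Candidate("instance_moto_3", frozenset({"motorbike", "red"})),
]
```

With this toy world, "the red car" (constraints `["car", "red"]`) converges on one candidate, whereas "the car" leaves two candidates and would call for a clarification dialogue.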
3.1.3. Examples
The two following examples underline the potential in expressiveness, comfort and ease of use that multimodal references can bring. The first example presents a sample dialogue in which the user is talking to a computer. The second example illustrates several turns of natural language HCD, in which the last statement, produced by the computer, is a multimodal one.

Comprehension of a multimodal reference by the communicating agent
Let us take the example of a multimodal statement composed of a natural language (NL) sentence and a gestural (G) designation on the screen. The HCD system allows the online purchase of vehicles from a car dealer. A user consults on a tactile screen the list of available automobiles, presented in the form of small photographs. He/she wants to know the price of a car (Figure 3.2). This multimodal utterance partly consists of references to an object of automobile type. This object is referred to, on the one hand, by a referential linguistic expression, the demonstrative nominal group 'this car', which refers to instance_auto_21, a particular instance of the automobile object category and, on the other hand, by a gestural deictic reference (gestural designation) which refers to the same object. For this object there is therefore a bimodal reference (or co-reference).
Figure 3.2 Multimodal utterance example. The modalities used are natural spoken language (NL) and designation gesture (G)
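The co-reference in the car example can be sketched as the intersection of two candidate sets, one per modality. The following Python fragment is a minimal sketch under stated assumptions: the `ScreenObject` class, the bounding boxes and the `SCREEN` layout are hypothetical, recognition is assumed to be perfect, and the linguistic expression is reduced to the category named by the noun.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenObject:
    """Hypothetical on-screen item: identifier, category, bounding box."""
    ident: str
    category: str
    box: tuple  # (x_min, y_min, x_max, y_max), in pixels


def referents_from_language(objects, category):
    """Objects compatible with the noun of a group such as 'this car'."""
    return {o.ident for o in objects if o.category == category}


def referents_from_gesture(objects, x, y):
    """Objects whose bounding box contains the designated point."""
    return {o.ident for o in objects
            if o.box[0] <= x <= o.box[2] and o.box[1] <= y <= o.box[3]}


def co_reference(objects, category, x, y):
    """Intersect both modalities; one survivor is the bimodal referent."""
    both = (referents_from_language(objects, category)
            & referents_from_gesture(objects, x, y))
    return next(iter(both)) if len(both) == 1 else None


# Hypothetical screen layout used only for the illustration.
SCREEN = [
    ScreenObject("instance_auto_20", "car", (0, 0, 100, 80)),
    ScreenObject("instance_auto_21", "car", (110, 0, 210, 80)),
    ScreenObject("logo_dealer", "logo", (0, 90, 210, 120)),
]
```

Here 'this car' alone is ambiguous between two photographs; the designation gesture alone identifies a screen region; together they single out instance_auto_21. When the two modalities disagree (the user says 'this car' while touching the logo), the function returns `None`, which corresponds to the contradictory-input case where task and context would have to arbitrate.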
Generation of a multimodal reference by the communicating agent
A human being is intrinsically limited by his/her means of expression, voice and gesture; he/she can at best extend these modalities to writing, drawing, using charts, etc. However, the time taken to exploit them is considerably longer than for natural modalities. The intelligent agent does not suffer from this kind of problem and benefits from very large processing and storage capabilities. In a situation where a person would try to make a gesture to illustrate the content of his/her speech, the agent can replace this illustrative gesture, which is difficult to build and sometimes to understand, by a graphic visual representation strongly analogue⁹ to the original object or concept in question (a photograph for an object and a diagram for a concept, for example). Figure 3.3 illustrates this phenomenon in a fictitious example of dialogue between a user (U) and his/her intelligent personal electronic assistant (S).

⁹ See [BER 97] for an overview of analogue modalities.
Figure 3.3 Fictitious example of a system multimodal utterance (white part of the HCD)

3.1.4. Difficulties
Multimodal dialogue systems, like natural language dialogue systems, are confronted with problems directly or indirectly involved in the referring phenomena of multimodal HCD. First of all, things are no simpler for multimodal references than for linguistic ones. Although intuition leads one to think that the possible redundancies in the system's input will allow a more robust interpretation and avoid ambiguities, the facts are very different. This is partly due to recognition systems, which do not ensure optimal recognition. Thus, when two parallel messages coming from two different modalities are contradictory, it is difficult to know which one is erroneous. Fortunately, task and context can be helpful in this process. Next, certain problems relating to linguistic references remain within the framework of multimodal ones. This is obviously the case for the calculation of unambiguous object descriptions, the identification of the referent (and referential collaboration), the co-presence of the agents and the management of focus (all the more difficult as modalities are numerous) [CSI 94]. Other problems appear at the level of reference comprehension, such as the confusion between command gestures and communicative gestures, or between unintentional gestures and deictic ones [STR 97]; the display metaphor of the real world (confusion between the object and its representation) [HE 97]; and the temporal synchronisation and scheduling of events.
The problems occurring in the generation of multimodal references are the choice of descriptors on the selected modalities (this selection is related to the choice of modalities¹⁰ [REI 97]); the internal inference of the attributes of one modality, starting from other attributes already known on other modalities or known from the categories of objects [AND 94]; the display metaphor; temporal synchronisation; etc.

¹⁰ It is known, for example, that a piece of geographical information is better conveyed graphically, whereas text is more appropriate for a piece of abstract information.

3.2. Multimodal mental representation
3.2.1. Psychological model
The phenomenon of referring should be approached in a general way, and the model of reference used by the agent has to be sufficiently close to the human one for the user to be able to anticipate the reactions of the computer just as he/she would with a human interlocutor, and so continue to refer as he/she is used to. In the same way, the software agent has to be able to anticipate how the user will interpret its reference, so as to build it accordingly. The interpretation of a reference is partly related to the knowledge of the recipient agent. This is why the object representation model (and its principle of dynamic construction) has to meet this need. Damasio [DAM 94] indicates that the mental representations of objects that humans build consist of perceptive elements acquired during sensory experiences of these same objects. Let us take the example of a person z "known by sight" by a person j. This person j keeps in memory a certain amount of perceptive information about z, such as visual information (e.g. his/her face, size, style of clothes) and auditory information (e.g. the sound of his/her voice). Using this information intelligently, j can build references to z in order to bring an interlocutor to identify z. The mental representations are also partly made up of linguistic object descriptors [PAI 69] that describe, on the one hand, the semantic category of the object and, on the other, its particular properties. Most of the sensitive and linguistic elements are redundant, because they are encoded twice by the phenomenon of dual coding [PAI 69], [PAI 86]. During perception, dual coding converts sensitive (resp. linguistic) elements into linguistic (resp. sensitive) equivalents and stores everything in memory.
3.2.2. Computing model
In order to formalise this organisation of object memory, we introduce, for an agent, the concept of multimodal mental representation (MMR). A MMR is a formal entity corresponding to the intuitive idea of the multimodal mental representation that an agent has of an object. A MMR consists of a set of acquisitive object representations (OR), which appeared during linguistic and sensitive perceptions by the agent, and of a set of productive OR, produced for linguistic and sensitive references. All the sensitive OR (acquisitive and productive) constitute the entire sensitive mental image of the object, and all the linguistic OR (also acquisitive and productive) constitute the entire linguistic mental image of the object. These two mental images constitute the entire MMR. OR make it possible to refer to the generic and particular properties of objects. Generic properties are in fact categorical descriptors of objects. Here is an example of linguistic categorical descriptors: "animal" → "mammal" → "dog". Specific properties make it possible to code the particular attributes of the object. For example: "dog" → "brown" → "Droopy", etc. Using these OR, the agent will be able to build references. Our model of mental representation is close to that of Appelt and Kronfeld [APP 87], [KRO 90], who used the term individuating set (IS) for mental representation. An IS is composed of intensional object representation(s) (IOR), which represent¹¹ the referred object, if it exists. Appelt and Kronfeld defined two types of IOR: speech IOR, which result from linguistic acts of referring in the discourse, and perceptive IOR, which result from perceptive acts of referring in the speech. We do not consider this model sufficiently precise for multimodal reference. The definition of IOR remains too vague and does not sufficiently detail the various possible natures of perceptive IOR (in terms of the various modalities of interaction).
¹¹ We take the term representation used by Maida [MAI 92], which we prefer to the original term denotation employed by Appelt and Kronfeld.

As we have just seen, we define two types of OR: acquisitive OR (input to the agent) and productive OR (produced by the agent). Acquisitive OR occur after an act of perception. This act of perception can be of two types. The first corresponds to the sensitive perception of real objects. Seeing an object, for example, makes people perceive its form, its size, its colour, its aspect, etc. These sensory descriptors of the object constitute acquisitive sensitive OR. The second corresponds to the perception of linguistic references in the speech. These linguistic descriptors of the object constitute acquisitive linguistic OR. Productive OR appear when the agent tries to build a reference. In this case, there are also two possibilities. The first is the computerised method of dual coding (cf. §3.2.1). There is, however, a small difference: this method is triggered only when needed, i.e. for reasoning or referring, and not during perception. To do this, we use a set of categorical and semantic associations between linguistic and sensitive descriptors. The second possibility makes it possible to generate, in a deductive way, new traits or properties of the object on the same modality, starting from generic knowledge either of the category of objects or of the domain. Both methods can generate linguistic and sensitive OR.
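The organisation of a MMR described above can be sketched as a small data structure. This is an illustrative sketch only, not the implementation used by the author: the class names, the `perceive` and `dual_code` methods and the association table are hypothetical, and on-demand dual coding is reduced to a dictionary lookup.

```python
from dataclasses import dataclass, field
from enum import Enum


class Origin(Enum):
    ACQUISITIVE = "acquisitive"   # obtained by perception
    PRODUCTIVE = "productive"     # built by the agent in order to refer


class Kind(Enum):
    LINGUISTIC = "linguistic"
    SENSITIVE = "sensitive"


@dataclass(frozen=True)
class ObjectRepresentation:
    descriptor: str
    kind: Kind
    origin: Origin
    modality: str


@dataclass
class MMR:
    """Multimodal mental representation: the set of OR of one object."""
    ors: list = field(default_factory=list)

    def perceive(self, descriptor, kind, modality):
        """An act of perception adds an acquisitive OR."""
        self.ors.append(
            ObjectRepresentation(descriptor, kind, Origin.ACQUISITIVE, modality))

    def image(self, kind):
        """Linguistic or sensitive mental image of the object."""
        return [o for o in self.ors if o.kind is kind]

    def dual_code(self, associations):
        """On-demand dual coding: derive productive OR of the other kind
        from a (hypothetical) table of descriptor associations."""
        known = {(o.descriptor, o.kind) for o in self.ors}
        for o in list(self.ors):
            target = associations.get((o.descriptor, o.kind))
            if target is None:
                continue
            descriptor, modality = target
            other = Kind.SENSITIVE if o.kind is Kind.LINGUISTIC else Kind.LINGUISTIC
            if (descriptor, other) not in known:
                self.ors.append(
                    ObjectRepresentation(descriptor, other, Origin.PRODUCTIVE, modality))
                known.add((descriptor, other))


# Illustration with the "Droopy" example of the text.
droopy = MMR()
droopy.perceive("dog", Kind.LINGUISTIC, "nat_lang")
droopy.perceive("brown", Kind.LINGUISTIC, "nat_lang")
ASSOCIATIONS = {("brown", Kind.LINGUISTIC): ("brown_colour_patch", "graphic")}
droopy.dual_code(ASSOCIATIONS)
```

After the call to `dual_code`, the sensitive image of the object contains one productive OR derived from the linguistic descriptor "brown"; calling it again adds nothing, which mirrors the idea that the conversion is triggered only when needed.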
3.3. Multimodal object representation
Our approach makes it possible to combine OR related to potentially different communication modalities into a new multimodal OR (MOR). This gives the agent the capacity to use several OR in one MOR to refer to an object in a multimodal way. Formally, this combination carries out the semantic sum of the OR components. Some rules specify certain characteristics of the MOR (e.g. the temporal layout of its OR components). In order to model this capacity of combination, we propose the formal predicate Mor_combine (Figure 3.4).

Mor_combine(mor, or1, ..., orn) is true if and only if the MOR mor is the combination of every OR from or1 to orn, which are all related to different modalities.
Figure 3.4 Multimodal OR combination predicate

3.4. Act of referring
The preceding prerequisites enable us to introduce a formal model of the act of referring which can be integrated into the theory of rational interaction [SAD 91] on which the dialoguing agents developed at FT R&D are based. This theory of rational interaction is founded on an integrated formal model of mental attitudes and rational action, which makes it possible to take into account the various components and capacities involved in communication. This is the case of rational balance, a relation established, on the one hand, between the various mental attitudes of an agent (belief, uncertainty, intention) and, on the other hand, between its mental attitudes and its actions. Communicative actions fit within this framework. They can be recognised and planned like traditional actions by the primitive principles of rational behaviour. Sadek proposed some models of communicative acts in his theory of rational interaction. They characterise, on the one hand, the reasons for which the act was selected (called rational effects) and, on the other, the feasibility preconditions which have to be satisfied for the act to be planned (we will return to these concepts below). In this theory, the various models of communicative acts (such as the inform act) are made operational using logical principles of rationality. These principles make it possible, for example, for the agent to select the actions which lead to its goals. The model of the act of referring that we propose fits into the theory of rational interaction. This act can be planned and carried out by the same principles of rational behaviour. To define this model of act, we partly take up the definition proposed in [BRE 95], adapted from the definitions of Appelt and Kronfeld [APP 87] and of Maida [MAI 92]: an agent refers to an object whenever the agent has a mental representation which represents what the agent believes to be a particular object, and when the agent intends the hearer to reach a mental representation which represents the same object. We define an act of referring for each type of relevant multimodal reference¹². The act presented below (Figure 3.5) makes it possible to produce a reference of the same type as the reference to the car in Figure 3.2, i.e. to refer to a MMR using a demonstrative nominal group and a deictic gesture. The heading of the act¹³ expresses that an agent i performs a referring act towards an agent j, using the MOR mor1 to refer to the MMR mmr1.
The preconditions of relevance to the context (ConP) of the act express the conditions depending on the context which must be true so that the act is accomplished. If they are false, the act, unrealisable, will not be selected by the agent. For example, if the system needs a physical connection to WWW it does not have, it will not seek to have one. The preconditions of relevance to the context presented in Figure 3.5 mean that to perform a linguistic (natjang) and gestural (gesture) reference, these two modalities must be available (Available(m)}. 12 These types of multimodality are so numerous that we eliminated those considered to be irrelevant or useless. They were built from a taxonomy of unimodal modalities, based on [BER 97]. 13 For reasons of clarity, the model presented here is simplified. In particular, the distinction between productive and acquisitive OR does not appear.
ConP: Modality_isa(m1, gesture) ∧ Modality_isa(m2, nat_lang) ∧ Available(m1) ∧ Available(m2)
CapP: Mor(mor1) ∧ Mmr(mmr1) ∧ Has_mmr(i, mmr1) ∧ Belong(mor1, mmr1) ∧   (1)
      Or(or1) ∧ Or(or2) ∧ Belong(or1, mmr1) ∧ Belong(or2, mmr1) ∧   (2)
      Modality_cat(or1, m1) ∧ SemioticFnct_isa(or1, deictic) ∧ Location(loc1, mmr1) ∧ Visible(loc1) ∧ Destination(or1, loc1) ∧   (3)
      Modality_cat(or2, m2) ∧ Txt_isa(or2, nom_dem_gr) ∧   (4)
      Mor_combine(mor1, or1, or2) ∧ Desc_ident(mor1, mmr1) ∧ φ(or2)   (5)
RE:   Has_mmr(j, mmr1') ∧ Represent_same_individual(mmr1, mmr1')
Figure 3.5 Logical model of the act of referring. Example taken with a multimodal OR made of a linguistic OR and a gestural OR
The preconditions of capacity (CapP) relate to the capacity of the agent to perform the act. If these conditions are false, the agent can plan the actions which will make them true. The preconditions of capacity presented here mean the following. Conditions (1) are present in every act of referring: the MOR used to refer to a MMR must belong to this MMR (Belong), and an agent i can refer only to one of its own MMR. The rational effect (RE) is also common to all acts of referring (we return to this point below). Conditions (2) specify that or1 and or2 are OR and must belong to the MMR mmr1. Conditions (3) express that the category of modality (Modality_cat) of or1 is gesture, that its semiotic function (SemioticFnct_isa) is deictic, that loc1 is the site (Location) of the object represented by the MMR mmr1, that loc1 is visible and that the destination of the deictic gesture is loc1 (the location of the object to be referred to). Conditions (4) mean that the category of modality of or2 is natural language (nat_lang) and that its textual category¹⁴ (Txt_isa) is a demonstrative nominal group (nom_dem_gr). Conditions (5) express that mor1 is the combination of or1 and or2. The Desc_ident predicate expresses that the MOR mor1 is an identifying description of the MMR mmr1. φ(or2) specifies additional properties of or2 which we do not detail here (see [PAN 96]).
To illustrate the fact that the preconditions of capacity can be planned, let us take the example of the visibility of the location loc1. If loc1 is not visible, the agent can plan a succession of actions to make it visible (e.g., depending on the context: physically moving the object, moving its representation, or moving within the scene of discourse).
The rational effect (RE), i.e. the expected effect, of this act is that its addressee, the agent j, will have a MMR mmr1' representing the same object as the one represented by mmr1.
¹⁴ See [PAN 96] for the definition of textual category.
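The difference between the two kinds of preconditions can be illustrated with a minimal Python sketch. It is not the logical machinery of the theory of rational interaction: facts are reduced to plain strings, and the `Act` class, the `plan` function and the `scroll_scene_to(loc1)` repair action are hypothetical names introduced only for this illustration. A false context precondition makes the act unrealisable, whereas a false capacity precondition triggers the planning of an action intended to make it true.

```python
from dataclasses import dataclass
from typing import Dict, List

State = Dict[str, bool]  # hypothetical: facts currently believed by the agent


@dataclass
class Act:
    name: str
    context_preconditions: List[str]   # ConP: must already hold
    capacity_preconditions: List[str]  # CapP: may be made true by planning
    rational_effect: str               # RE: the reason for selecting the act


def plan(act: Act, state: State, repairs: Dict[str, str]) -> List[str]:
    """Return the ordered actions ending with the act, or [] if it is unrealisable."""
    # A false context precondition means the act is not selected at all.
    if not all(state.get(p, False) for p in act.context_preconditions):
        return []
    steps: List[str] = []
    for p in act.capacity_preconditions:
        if not state.get(p, False):
            # A false capacity precondition can be planned for, if a repair is known.
            if p not in repairs:
                return []
            steps.append(repairs[p])
    steps.append(act.name)
    return steps


refer = Act(
    name="refer(i, j, mor1, mmr1)",
    context_preconditions=["Available(m1)", "Available(m2)"],
    capacity_preconditions=["Belong(mor1, mmr1)", "Visible(loc1)"],
    rational_effect="Represent_same_individual(mmr1, mmr1')",
)
# Hypothetical repair: an action assumed to make the location visible.
repairs = {"Visible(loc1)": "scroll_scene_to(loc1)"}

state = {
    "Available(m1)": True,
    "Available(m2)": True,
    "Belong(mor1, mmr1)": True,
    "Visible(loc1)": False,
}
print(plan(refer, state, repairs))
```

With the state above, the location is not visible, so the sketch schedules the repair action before the act of referring; if one of the two modalities were unavailable, it would return an empty plan instead.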
Let us illustrate our act with an example, partly taking up the one of Figure 3.2. The user asks: 'What is the price of this car?' but this time the user points at a spot on the screen where there is no vehicle. This spot is actually located between the photographs of two cars. Thanks to the model of the act of referring shown in Figure 3.5, the agent will note that the reference is a bimodal one, with a natural language OR and a gestural OR (white part of Figure 3.2). The natural language OR is of the demonstrative nominal group type and the gestural OR is of the deictic type. The agent will infer that there is a MOR combining these two OR. Thus, the two OR are the realisation of a single co-reference to a single object. The nominal group OR will make it possible for the agent to identify the category of the object as being an instance of the automobile object category. Unfortunately, since the deictic gesture points at no vehicle, the agent will not be able to identify the referent. A possible reaction of the agent (directed by the primitive principles of rational behaviour) is, for example, to undertake a clarification dialogue in order to obtain an identifying description of the object. The agent can also, thanks to its knowledge manager, select several possible candidates for this referent. It will then be able to answer the user by using the act of referring: 'This vehicle costs 12500 US Dollars...' (indicating the vehicle located above the spot indicated by the user) '...and this vehicle costs 15000 US Dollars' (indicating the vehicle located below the spot indicated by the user).
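The selection of candidate referents in this example can be sketched as follows. This is only an illustration under simplifying assumptions: screen objects are reduced to bounding boxes, the category supplied by the nominal group is a plain string, and the names `ScreenObject`, `candidate_referents` and the 80-pixel `radius` are hypothetical, not part of the knowledge manager mentioned above.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # left, top, right, bottom (pixels)


@dataclass(frozen=True)
class ScreenObject:
    name: str
    category: str
    box: Box


def distance_to_box(x: float, y: float, box: Box) -> float:
    """Distance from the point (x, y) to the box; 0.0 when the point is inside."""
    left, top, right, bottom = box
    dx = max(left - x, 0.0, x - right)
    dy = max(top - y, 0.0, y - bottom)
    return hypot(dx, dy)


def candidate_referents(x: float, y: float, category: str,
                        objects: List[ScreenObject],
                        radius: float = 80.0) -> List[ScreenObject]:
    """Objects of the named category that are hit, else those within `radius`."""
    # The linguistic OR restricts the search to one category of objects.
    same_category = [o for o in objects if o.category == category]
    hits = [o for o in same_category if distance_to_box(x, y, o.box) == 0.0]
    if hits:
        return hits
    # The gesture hit nothing of that category: fall back to nearby objects.
    near = [o for o in same_category if distance_to_box(x, y, o.box) <= radius]
    return sorted(near, key=lambda o: distance_to_box(x, y, o.box))


scene = [
    ScreenObject("car_above", "automobile", (100, 50, 300, 150)),
    ScreenObject("car_below", "automobile", (100, 250, 300, 350)),
    ScreenObject("help_button", "button", (180, 190, 220, 210)),
]
# The user says "this car" and points between the two photographs.
candidates = candidate_referents(200, 200, "automobile", scene)
if len(candidates) == 1:
    print("referent identified:", candidates[0].name)
elif candidates:
    print("answer for each candidate:", [o.name for o in candidates])
else:
    print("start a clarification dialogue")
```

In this sketch the pointed spot lies on a button, but the category given by the nominal group excludes it, so both cars remain as equally distant candidates. An empty list would correspond to the clarification dialogue described above.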
4. Concluding remarks
The model of the multimodal act of referring we propose is based on an original representation of objects, which makes it possible to integrate information related to different modalities. Thanks to this model, the agent can understand a multimodal reference produced by another agent and identify the referent. It can also produce a multimodal reference. One of the characteristics of this model is that it can easily be extended beyond reference to objects. Indeed, it also makes it possible to refer in a multimodal way to relations existing between objects, as well as to other properties of multimodal utterances, such as a facial gesture referring to illocutionary force.
Our model is currently being implemented within a dialogue system that is already operational¹⁵ in natural language. Once implemented, it should contribute to richer and more user-friendly dialogues. The number of media and modes of communication which could be used will diversify interactions. Users will be able to define their preferred modalities and to switch from one to another. User and system will be able to use several modalities at the same time, and to adapt the presentation to the content. The multimedia and multimodal dimensions of information processing systems will finally be exploited to their full extent, i.e. in an intelligent and adapted way.
¹⁵ This system integrates a dialoguing rational agent called ARTIMIS [SAD 97], founded on a theory of rational interaction [SAD 91].
5. References
[AND 94] E. Andre, T. Rist, Referring to World Objects with Text and Pictures, In Proceedings of COLING-94, pages 530-534, 1994.
[APP 85] D.E. Appelt, Planning English Referring Expressions, Artificial Intelligence, 26(1), pages 1-33, 1985.
[APP 87] D.E. Appelt and A. Kronfeld, A Computational Model of Referring, In Proceedings of the 10th IJCAI, pages 640-647, Milan, Italy, 1987.
[AUS 62] J.L. Austin, How to Do Things with Words, Harvard University Press, 1962.
[BEL 95] Y. Bellik, Interfaces Multimodales: Concepts, Modeles et Architectures, Thesis Dissertation, University of Paris-XI, France, 1995.
[BEN 98] C. Benoit, J-C. Martin, C. Pelachaud, L. Schomaker and B. Suhm, Audio-Visual and Multimodal Speech Systems, In D. Gibbon (Ed.), Handbook of Standards and Resources for Spoken Language Systems - Supplement Volume, 1998.
[BER 97] N.O. Bernsen, A Toolbox of Output Modalities, Representing Output Information in Multimodal Interfaces, The Maersk Mc-Kinney Moller Institute for Production Technology, Odense University, Denmark, 1997. http://www.nis.sdu.dk/publications/papers/toolbox_paper/index.html
[BRE 95] P. Bretier, F. Panaget, D. Sadek, Integrating Linguistic Capabilities into the Formal Model of a Rational Agent: Application to Cooperative Spoken Dialogue, In Proceedings of the AAAI Fall Symposium on Rational Agency, Cambridge, 1995.
[CAD 95] V. Cadoret, Determination d'actes de dialogue: une approche combinant representations explicites des connaissances et apprentissage connexionniste, Thesis Dissertation, University of Rennes I, 1995.
[CLA 86] H.H. Clark and D. Wilkes-Gibbs, Referring as a collaborative process, Cognition, 22, pages 1-39, 1986. Reprinted in (Cohen, Morgan and Pollack, 1990, pages 463-493).
[COH 81] P.R. Cohen, The need for referent identification as a planned action, In Proceedings of the Seventh International Joint Conference on Artificial Intelligence (IJCAI-81), pages 31-36, 1981.
[COH 90] P.R. Cohen, J. Morgan, M.E. Pollack (Eds.), Intentions in communication, MIT Press, 1990.
[COH 94] P.R. Cohen, H.J. Levesque, Preliminaries to a collaborative model of dialogue, In SPECOM'94, pages 265-274, 1994.
[COS 97] J. Cosnier, J. Vaysse, Semiotique des gestes communicatifs, Nouveaux actes semiotiques, 54, pages 7-28, 1997.
[CSI 94] A. Csinger, Cross-Modal and the Attention Problem, Technical Report from the Department of Computer Science at the University of British Columbia, 1994.
[DAL 87] R. Dale, Cooking up referring expressions, In Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics, pages 68-75, 1987.
[DAM 94] A.R. Damasio, Descartes' Error: Emotion, Reason and the Human Brain, New York: Grosset/Putnam Press, 1994.
[EDM 94] P.G. Edmonds, Collaboration on reference to objects that are not mutually known, In Proceedings of the 15th International Conference on Computational Linguistics (COLING-94), Kyoto, pages 1118-1122, 1994.
[HE 97] D. He, G. Ritchie, J. Lee, Referring to Displays in Multimodal Interfaces, In Workshop on "Referring Phenomena in a Multimedia Discourse and their Computational Treatment", ACL - SIGMEDIA, 1997.
[HEE 91] P. Heeman, A computational model of collaboration on referring expressions, Master's thesis, Department of Computer Science, University of Toronto, 1991.
[HEE 95] P. Heeman and G. Hirst, Collaborating on referring expressions, Computational Linguistics, 21(3), pages 351-382, 1995.
[GRO 86] B.J. Grosz and C.L. Sidner, Attention, Intentions, and the Structure of Discourse, Computational Linguistics, 12(3), 1986.
[GRO 95] B.J. Grosz, A.K. Joshi, S. Weinstein, Centering: A framework for modeling the local coherence of discourse, Computational Linguistics, 21(2), pages 203-225, 1995.
[KRO 90] A. Kronfeld, Reference and Computation, An Essay in Applied Philosophy of Language, Studies in Natural Language Processing, Cambridge University Press, 1990.
[LEM 01] P. Le Mer, Modele de communication homme-clone pour les environnements virtuels collaboratifs non-immersifs, Thesis Dissertation, University of Lille, 2001.
[MAI 92] A.S. Maida, Knowledge representation requirements for description-based communication, In Proceedings of Knowledge Representation'92, pages 232-243, 1992.
[MCN 92] D. McNeill, Hand and Mind: What Gestures Reveal about Thought, University of Chicago Press, 1992.
[PAI 69] A. Paivio, Mental Imagery in Associative Learning and Memory, Psychological Review, 76, pages 241-263, 1969.
[PAI 86] A. Paivio, Mental representation: A dual coding approach, New York, Oxford University Press, 1986.
[PAN 96] F. Panaget, D'un systeme generique de generation d'enonces en contexte de dialogue oral a la formalisation logique des capacites linguistiques d'un agent rationnel dialoguant, Thesis Dissertation, University of Rennes I, 1996.
[PIE 97] J-M. Pierrel, L. Romary, G. Sabah, J. Vivier, A. Vilnat, A. Nicolle, Machine, Langage et Dialogue, Collection Figures de l'interaction, Paris, L'Harmattan, 1997.
[REI 90] E. Reiter, The Computational Complexity of Avoiding Conversational Implicatures, In Proceedings of the 28th Meeting of the Association for Computational Linguistics (ACL-1990), pages 97-104, MIT Press, 1990.
[REI 92] E. Reiter, A fast algorithm for the generation of referring expressions, In Proceedings of the 14th International Conference on Computational Linguistics (COLING-92), pages 232-238, 1992.
[REI 97] E. Reiter, Discussion Paper: Choosing a Media for Presenting Information, Discussion paper for Electronic Transactions on Artificial Intelligence, 1997. http://www.dfki.de/etai/statements/reiter-nov-97.html
[SAD 91] D. Sadek, Attitudes mentales et interaction rationnelle: vers une theorie formelle de la communication, Thesis Dissertation, University of Rennes I, 1991.
[SAD 97] D. Sadek, P. Bretier, F. Panaget, ARTIMIS: Natural Dialogue Meets Rational Agency, In Proceedings of the 15th IJCAI, pages 1030-1035, Nagoya, Japan, 1997.
[SAD 99] D. Sadek, Design Considerations on Dialogue Systems: From Theory to Technology - The Case of ARTIMIS, In Proceedings of the ESCA TR Workshop on Interactive Dialogue for Multimodal Systems (IDS'99), Germany, 1999.
[SAL 01] S. Salmon-Alt, Reference et dialogue finalise: de la linguistique a un modele operationnel, Thesis Dissertation, University of Nancy I, 2001.
[SEA 69] J.R. Searle, Speech Acts, Cambridge University Press, New York, 1969.
[STR 97] M. Streit, Active and Passive Gestures - Problems with the Resolution of Deictic and Elliptic Expressions in a Multimodal System, In Workshop on "Referring Phenomena in a Multimedia Discourse and their Computational Treatment", ACL - SIGMEDIA, 1997.
[TUR 50] A. Turing, Computing machinery and intelligence, Mind, LIX, 1950.
6. Glossary
ACL (Agent Communication Language): Inter-agent communication language proposed by the FIPA consortium. ACL is founded on a formal definition of the communicative acts between agents, which makes unambiguous interactions possible.
Actuator: Apparatus, or component of an apparatus, acting on a computer so as to modify its state or its behaviour.
Adaptability: Capacity to adapt. For a human-computer interaction system, it denotes the capacity to modify its behaviour according to the profile of its user, the context and the task.
Agent: An agent is an autonomous and permanently active computing process. It is able to perceive and act on its environment, to communicate and even to move (to migrate).
Belief: Belief characterises what the agent holds to be true; it thus constitutes the agent's model of the world.
Bluetooth: Short-range radio technology which makes it possible to remove the wiring between electronic devices, computers and telephones by means of a short radio link (about ten metres) with low energy requirements.
Communication mode: Modes are split into two types: perceptive modes (related to the human senses, e.g. visual mode, auditory mode) and productive modes (related to the human capabilities for action, e.g. oral mode, gestural mode).
Communicative act: Behaviour intended to be observed by at least one agent other than its author, and thus to change that agent's mental state. An agent adopts a behaviour of this type to communicate an intention.
Deictic: Designation by pointing.
Dialoguing agent: Intelligent agent equipped with dialogue capacities. In a truly intelligent agent, the capacity for dialogue emanates naturally from its intelligence.
Dual coding: Process which, during linguistic or sensory perception, encodes and stores the perceived object in both linguistic and sensory form.
Gaze tracking: Processes which make it possible to determine the precise position at which the user is looking.
Gesture recognition: Very recent processes which make it possible to recognise certain categories of gestures and their meaning.
GPS (Global Positioning System): Satellite-based positioning system.
Haptic (interfaces): Emerging interfaces, of which tactile screens are one example, based on tactile or spatial interaction.
Illocutionary act: Act achieved by producing a succession of signs in a context of social relation, and which consists in expressing an intention (to inform and to request are examples of illocutionary acts).
Image synthesis: Process able to generate the image of an object from its semantic or visual description (form, size, colour, etc).
Intelligence: See Intelligent agent.
Intelligent agent: The intelligence of an agent shows in behaviour described as intelligent or rational, such as aptitude for reasoning, aptitude for comprehension, capacity to react, and autonomy.
Intention: An agent has the intention of a proposition when it wants this proposition to be true in the future. Intention thus maintains a strong link with action.
Interworking: Capacity of initially independent and different systems to function in interaction.
Knowledge (of the agent): See Belief.
Locutionary act: Act consisting in producing a succession of signs (phonemes, graphemes, etc) according to the grammar of a given language.
Media: Physical device of interaction. There are various types of media: mono-directional ones, at the input of the system (keyboard, mouse, tactile screen) or at its output (screen, loudspeaker), and bi-directional ones, which are most of the time only a combination of mono-directional media (telephone, PDA, etc).
Mediator: Intelligent intermediary able to moderate the actions of the interlocutors, to establish the link between them, etc.
Mental attitude: Various mental attitudes are classically modelled within intelligent agents, such as knowledge, desire, intention and uncertainty.
Modality: Structure of information used during a communication (linguistic, textual, graphic, tactile, etc). Modalities are more or less directly related to the modes of communication.
Multimedia: Use of several media.
Multimodal: Use of several modes of communication or several modalities.
Multimodal referring: Reference carried out by using several modalities during its production or its perception.
PDA (Personal Digital Assistant): Electronic organiser, portable electronic diary, etc.
Perlocutionary act: Act achieved by the fact of stating a succession of signs. "To convince", "to encourage", "to alarm" and "to frighten" are examples of perlocutionary acts. Perlocutionary acts are characterised above all in terms of perlocutionary effects, since they appear as effects of illocutionary acts and are not necessarily accomplished intentionally.
Predicating act: Act which describes properties of the world (such as the relations between objects).
Propositional act: Act made up of referring and predicating acts.
Prosody: Set of parameters involved in the intensity and the duration of the phonic units (phonemes) in a vocal signal.
Protocol: Preset sequence of events which guarantees the proper course of a communication (communications protocol) or of actions (operational protocol).
Rationality: The principles of rationality constitute one of the important aspects of what is regarded as characteristic of intelligence. Rationality is, in particular, what pushes an agent to act by selecting, in an optimal way, the actions which will lead it to its goals.
Referring act: Act which indicates an object, by referring either to its mental representation or to the object itself.
Service: Set of functionalities gathered into an overall offer intended to make a task easier or more pleasant for the user: information services, purchase/sale services, etc.
Software agent: See Agent.
Tactile screen: Screen equipped with a tactile matrix, on which one can indicate an area using a finger or a pen. The size of this area depends on the number of points of the matrix. However, the achievable precision generally remains rather low.
Talking heads: Computer-generated three-dimensional faces whose facial muscles and mouth can be articulated, and whose movements can be synchronised with the utterance of a vocal message.
Trait (of word): A trait of a word corresponds to one of the attributive or characteristic aspects of its meaning. Example: the verb "to walk" has several traits: the fact of advancing, on foot, slowly, etc.
Transmodality: The physical switch from one modality to another.
Uncertainty: Characterises the agent's uncertainty about the truth of a proposition.
Virtual clone: Human-like entity represented in two or three dimensions.
Voice recognition: Signal processing techniques which return a sequence of recognised words.
Voice synthesis: Production of a speech signal from a written text.
Chapter 4
Multimodal Interaction on Mobile Artefacts
Laurence Pasqualetti, Laurence Nigay, Moustapha Zouinar, Pierre Salembier, Guillaume Calvet, Julien Kahn and Gaëtan Rey
France Telecom R&D, University of Grenoble and University of Toulouse, France
1. Introduction
Recent progress in the miniaturisation of microprocessors and in wireless networks makes it possible to consider that the "grey box" of the personal computer is condemned to disappear, or at least to cease being the only place of interaction between the user and the digital world. This comes about as a result of a double movement: technological work on the concepts of ubiquitous computing and the disappearing computer, and the evolution of ideas in the field of models of interaction. Indeed, research is now gradually being directed towards models of interaction in which computing resources are distributed among a multitude of everyday objects with which the user interacts in an explicit or implicit way. The idea here is to use the environment as an interface, as a system for manipulating technical resources which are functionally limited but contextually relevant (concept of "tangible" interface). The device can be physically handled in a meaningful way, an action on the device corresponding to a preset function (which requires the definition of a semantics, and possibly of a syntax, of the interaction with the device). The concept of communicating objects covers very diverse technological and conceptual realities; among the specific properties generally attached to these objects one will retain in particular:
• digital augmentation rather than substitution: the guiding idea is to start from everyday objects, endeavouring to preserve the intrinsic advantages inherent in their material constitution (in particular properties of affordance and support of awareness) while associating additional functionalities with them (e.g. paper and the digital pen, the communicating refrigerator, etc).
• transportable character of devices: it provides on-board processing and communication resources and thus supports mobility and intellectual nomadism. These devices can take several forms (PDA, mobile telephone, communicating clothing, etc).
• capacity to communicate in an autonomous or controlled way: in addition to their traditional function of supporting communication between users, portable devices can detect the presence of a device of the same type (or of computing resources distributed in the environment) and exchange information according to possibly pre-established rules, but independently of any command given intentionally by the user.
In the HOURIA project we study how certain physical realisations of this concept of communicating object could be used easily and effectively by mobile individuals, and be integrated without disrupting their daily activities or the physical and social environments in which they find themselves at any moment. We justify our choice of multimodality as follows:
• These artefacts rely on physical devices with restricted capacities, outside the traditional framework (a large screen, a keyboard and a mouse). It is thus advisable to design methods of interaction relying on paradigms other than "screen-mouse" direct manipulation, such as tangible interfaces or the "embodied user interface".
• The context of use of these artefacts is by definition very variable. Indeed, the physical characteristics (noise, etc) and social characteristics (intimacy, intrusiveness, etc) of the environment determine a whole set of contextual constraints which require interactional adaptations; from this point of view, multimodality constitutes a type of answer suited to this requirement of adaptability and adaptivity.
The development of mobile computing thus represents a field highly suited to the application of multimodal interaction techniques.
2. Problems and aims of the study
Multimodality has given rise to much theoretical and empirical work. The theoretical work is mainly concerned with the definition of the concepts of modality and multimodality, and with the development of "design spaces". These spaces are conceptual frameworks providing a set of concepts with which to describe the modalities and the ways in which they can be combined in user-system interaction. For example, the TYCOON (Martin & Beroule, 1993) and CARE (Coutaz & Nigay, 1996) models propose a set of concepts which describe various types of theoretical relationships of composition or "cooperation" between modalities: assignment, complementarity, equivalence, redundancy, competition, etc. Empirical work has explored the actual use of the modalities and the real contribution (i.e. effectiveness) of multimodality in situations of interaction with more or less simulated multimodal systems. From the point of view of the user, these studies made it possible to show how the users combine various modalities to
interact with the systems and in which cases they use multimodality. Certain studies, such as that of Oviatt, DeAngeli & Kuhn (1997), showed that multimodality is not always used (only approximately 20% of the time in an interaction session); it is used when the users' commands describe spatial information (for example the location, size, orientation or shape of an object). In addition, several types of combination were observed (Guyomard, Le Meur, Poignonnec & Siroux, 1995; Mignot & Carbonel, 1996; Oviatt et al., 1997, etc.): combinations of complementarity (for example, the user supplements a verbal statement by pointing, by touch, at the intended object on the screen) and combinations of redundancy (for example, the user designates an object orally and explicitly and, at the same time, indicates it by touch on the interface). From the point of view of effectiveness, few studies have systematically approached this point, and the evaluation criteria used mostly concern the time saved by multimodal interfaces compared with monomodal interfaces.
Although it produced many interesting results on the use of multimodality, this work addressed only a limited set of input modalities (mainly speech, pointing or writing). Moreover, few of these studies analysed the phenomena of appropriation of multimodality over many sessions of interaction with the system. Lastly, the tasks given to the subjects were generally "new" for the users. A limit of this choice is that it leaves aside the study of the consequences of multimodality in carrying out tasks that are familiar from more "classical" systems. The general objective of the study presented here is to contribute to the design of multimodal systems and to the analysis of their use by addressing these various points, i.e.:
• To apprehend multimodality in situations of interaction which are meaningful for the users, i.e. contexts of tasks/activities which are familiar to them.
• To study the processes of appropriation of multimodality over several sessions of interaction.
• To explore the use of "new" or "original" modalities in order to see empirically what place they take in man-machine interaction, and the problems which they pose individually and in their relationship to other modalities.
In addition, through this study we aim to answer a number of precise questions related to the use of multimodality: how is multimodality put into practice, and what guides the choice to use it? How does the user choose and change modalities? What is the role of the properties of the modalities in this choice? Which criteria lead a user to choose a modality of interaction, or to give it up in favour of another? How do users combine the modalities? From a practical point of view, our objective is to derive ergonomic recommendations that are as generic as possible for the design of multimodal systems.
Following published work (Coutaz & Nigay, 1996), we consider a modality to be a means of communication which implements a physical device and a language of interaction. We consider multimodality from the user's point of view, and regard it as a means of expressing one's intentions through various modalities (gaze, gesture, speech, handling of the artefacts, etc). From this point of view, we regard as multimodal those systems which allow specialised uses (a modality exclusively dedicated to a command), equivalent uses (a modality can be used for all the commands) and complementary uses (several modalities can be combined to carry out the same command).
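These three kinds of use can be made concrete with a small Python sketch. It is an illustrative reading of the definitions above, not an implementation of the CARE model: the command names, the `Realisations` type and the `usage_types` function are hypothetical, and each way of issuing a command is simply represented as the set of modalities it involves.

```python
from typing import Dict, FrozenSet, List, Set

# command -> the ways of issuing it, each way being a set of modalities
Realisations = Dict[str, List[FrozenSet[str]]]


def usage_types(realisations: Realisations) -> Dict[str, Set[str]]:
    """Label each modality 'specialised', 'equivalent' and/or 'complementary'."""
    commands_alone: Dict[str, Set[str]] = {}
    labels: Dict[str, Set[str]] = {}
    for command, ways in realisations.items():
        for way in ways:
            for modality in way:
                labels.setdefault(modality, set())
                if len(way) > 1:
                    # Several modalities combined for one command.
                    labels[modality].add("complementary")
                else:
                    commands_alone.setdefault(modality, set()).add(command)
    for modality, commands in commands_alone.items():
        if len(commands) == 1:
            # Used on its own for a single command only.
            labels[modality].add("specialised")
        if commands == set(realisations):
            # Usable on its own for every command.
            labels[modality].add("equivalent")
    return labels


realisations: Realisations = {
    "open_message": [frozenset({"tactile"}), frozenset({"speech"})],
    "delete_message": [frozenset({"tactile"}), frozenset({"speech", "tactile"})],
    "next_message": [frozenset({"tactile"}), frozenset({"embodied"})],
}
for modality, kinds in sorted(usage_types(realisations).items()):
    print(modality, sorted(kinds))
```

In this invented example the tactile modality can carry out every command on its own and is also combined with speech, while the "embodied" modality is dedicated to a single command. Note that with a single command in the table, a modality would be labelled both specialised and equivalent.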
3. Method
In order to answer these questions, we chose to develop a partially simulated multimodal system (Wizard of Oz technique). Our objective was to constrain the interaction as little as possible while limiting the development cost of the system, and to free ourselves from current technological limitations for exploratory purposes centred on new modalities.
3.1. Subjects
Ten subjects took part in this experimental phase; they all had an e-mail account and were regular Internet users, but none was a computer specialist.
3.2. Task and nature of the modalities tested
In order to place the subjects in a familiar task situation, they were asked to use an e-mail consultation application on a pocket computer (PDA Jornada 540 series, Hewlett Packard) which was connected, for the purposes of the experiment, to their personal e-mail account. Four modalities were studied:
• the vocal modality, presented as a mode of interaction in natural language without particular constraints (no preset statements to be used by the subjects);
• the tactile modality, which requires a stylus to point at the command buttons;
• the gestural modality, which relies on a preset code and requires the subject to hold the PDA in one hand and to carry out the gesture with the free hand in front of the device;
• a modality loosely and provisionally termed "embodied", which consists of associating commands with certain preset ways of handling the artefact (changes of orientation, various movements, etc).
Only the tactile modality was actually implemented; the other modalities were simulated by the accomplice by means of an identical interface and a digital video system.
Figure 4.1 Implementation of the tactile, gestural and "embodied" modalities
Each modality allowed the same set of commands to be carried out, with the tactile modality serving as the reference.
3.3. Experimental procedure and platform
The activity of the subjects during the sessions (15 minutes average duration) was recorded by means of various sensors (cameras, system event loggers). The experimental sessions were preceded by a training and recall phase, and were spaced over several days. Each session was immediately followed by an autoconfrontation, which was video-recorded.
4. Data processing and analyses
Different data were collected during the study. During the interaction, the commands and the windows used by the subjects were time-stamped and logged automatically. The actions of the subjects, the modalities used, and the duration and content of the commands were added a posteriori from the video recording of the interaction. Lastly, the autoconfrontations were recorded in order to allow qualitative analyses based on the identification of the intentions and strategies of the subjects regarding the use of the modalities.
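As an illustration of how such time-stamped records could be exploited, the following Python sketch computes the share of each modality per session. The record layout (`CommandEvent` and its fields), the function name and the sample events are assumptions made for the example; they are invented and do not reproduce the logging format or the data of the study.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class CommandEvent:
    subject: str        # hypothetical subject identifier
    session: int        # session number
    timestamp_s: float  # time of the command within the session, in seconds
    command: str
    modality: str       # "tactile", "speech", "gesture" or "embodied"


def modality_share_by_session(
        events: Iterable[CommandEvent]) -> Dict[int, Dict[str, float]]:
    """For each session, the proportion of commands issued in each modality."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for event in events:
        counts[event.session][event.modality] += 1
    shares: Dict[int, Dict[str, float]] = {}
    for session, counter in counts.items():
        total = sum(counter.values())
        shares[session] = {m: n / total for m, n in counter.items()}
    return shares


# Invented sample log, for illustration only.
log = [
    CommandEvent("s1", 1, 12.0, "open_message", "tactile"),
    CommandEvent("s1", 1, 31.5, "next_message", "speech"),
    CommandEvent("s1", 1, 48.2, "delete_message", "gesture"),
    CommandEvent("s1", 1, 70.9, "next_message", "embodied"),
    CommandEvent("s1", 3, 9.4, "open_message", "tactile"),
    CommandEvent("s1", 3, 22.0, "next_message", "tactile"),
    CommandEvent("s1", 3, 40.3, "delete_message", "tactile"),
    CommandEvent("s1", 3, 55.1, "next_message", "speech"),
]
for session, share in sorted(modality_share_by_session(log).items()):
    print(session, share)
```

Comparing the shares from one session to the next is one simple way of observing how the use of the modalities evolves over time.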
5. Results
5.1. Global use
A first result is that all of the modalities were used by the subjects during the various sessions. In addition, the use of multimodality evolved over the sessions:
• During the first sessions, the modalities were used in roughly equal proportions, which corresponds to a phase of informal testing of the system. This is corroborated by the data from the autoconfrontation.
• Preferences in the use of the modalities quickly appear in the following sessions.
• These preferences vary from one subject to another.
At a very global level (all sessions and all subjects pooled), no modality appears to be specialised for carrying out one or several commands. If there is specialisation, it can therefore only be found at the individual level, which calls for a separate analysis of the data for each subject.
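The contrast between pooled and per-subject analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log rows, the command names, the 0.8 threshold and the function names are all invented here and do not come from the study.

```python
from collections import Counter, defaultdict

# Hypothetical log rows: (subject, session, command, modality).
# Invented for illustration; these are not data from the experiment.
LOG = [
    ("S1", 1, "open_message", "vocal"),
    ("S1", 1, "delete_message", "tactile"),
    ("S1", 2, "open_message", "vocal"),
    ("S1", 2, "open_message", "vocal"),
    ("S2", 1, "open_message", "gestural"),
    ("S2", 2, "open_message", "gestural"),
    ("S2", 2, "delete_message", "embodied"),
]


def count_uses(records):
    """Count modality uses for each (subject, command) pair."""
    counts = defaultdict(Counter)
    for subject, _session, command, modality in records:
        counts[(subject, command)][modality] += 1
    return counts


def specialisations(records, threshold=0.8, min_uses=2):
    """Return the (subject, command) pairs dominated by a single modality.

    The threshold and the minimum number of uses are arbitrary assumptions.
    """
    found = {}
    for key, counter in count_uses(records).items():
        total = sum(counter.values())
        modality, uses = counter.most_common(1)[0]
        if total >= min_uses and uses / total >= threshold:
            found[key] = modality
    return found


def pooled(records):
    """Erase subject identity so that all subjects are analysed together."""
    return [("all", session, command, modality)
            for _subject, session, command, modality in records]


per_subject = specialisations(LOG)
all_subjects = specialisations(pooled(LOG))
```

With these invented rows, each subject has a dominant modality for opening a message, but the pooled analysis finds none, because the two subjects prefer different modalities.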
5.2. Intra-individual specialisation
As mentioned above, many subjects show a tendency towards preferential use of modalities. These tendencies may be occasional (they appear during only one session of interaction) or recurring (they appear in several sessions), and individualised (the same modality is used to carry out one and the same command) or plural (the same modality is used to carry out several commands).
5.3. Influencing factors for the choices of modality
The activity graphs and the verbalisations of the subjects made it possible to identify several factors likely to guide the choice of modality and changes of modality. We present only the four principal ones here.
Context of the activity in progress (recurring operational procedure)
The tendencies towards preferential use generally show up as recurring sequences of actions carried out by the subjects. These sequences are strongly related to the context of the activity in progress.
Change of goal in the activity
Changes of modality are often associated with phases of local reorientation of the activity.
Properties of the modalities (implementation)
When evaluating a multimodal device, the inherent characteristics of a given modality should be clearly distinguished from the way that modality is implemented. The fact that a modality is little used does not necessarily mean that it is unsuited; its technical realisation may be inadequate or constraining. Thus, during the experiment the tactile modality was sometimes under-used by the subjects because of the constraint of holding the stylus. The stylus is perceived as an external appendage of the PDA, which one may end up putting down so as not to have to hold it. This limits its use or restricts it to specific cases (for example, when the other modalities fail to achieve particular actions such as closing error messages).
Error correction
Many changes of modality occurred in situations of malfunction, in order to correct errors caused by the subject, the system or the accomplice.
6. Conclusions and prospects
In this article we presented an exploratory study of the use of a portable multimodal system. The results tie in with those of preceding work: effective use of multimodality (all of the modalities were used), interindividual differences, the appearance of preferential tendencies, and changes of modality in situations of malfunction. Beyond this convergence, an original result of this study is that the subjects readily appropriate unusual modalities (such as the "embodied" modality). By their nature, communicating objects are likely to be used in very variable contexts. The difficulty therefore lies in using them in situations to which the modalities of interaction are not equally suited.
Multimodality then takes on crucial importance, insofar as it offers users a means of coping with contextual variations (environmental constraints, social constraints, etc.) by adopting the most suitable modality. This is what the study illustrates, for example with regard to recovery from malfunctions. Moreover, for each everyday object considered, it will be necessary to identify the modalities of interaction to be integrated, for input as well as for output. Likewise, in an environment containing multiple communicating objects, mechanisms will be needed to specify which object a command is addressed to. Multimodality could provide solutions here; it remains, however, to study the conceptual and practical viability of strategies that assign, by design, a particular modality to an object or a type of object within a whole range of communicating objects.
7. Bibliography
Calvet, G., Kahn, J., Nigay, L., Rey, G., Pasqualetti, L., Salembier, P. & Zouinar, M. (2001) HOURIA - Nouvelles interactions multimodales, Rapport final contrat de recherche FT-R&D, CLIPS-IMAG, GRIC-IRIT. FT-R&D DIH-UCE, Issy-les-Moulineaux.
Calvet, G., Kahn, J., Salembier, P., Zouinar, M., Pasqualetti, L., Nigay, L., Rey, G. & Briois, J.C. (2001) Etude empirique de l'usage de la multimodalite avec un ordinateur de poche. Actes de la conference "IHM-HCI 2001", 10-14 September, Lille, France.
DeAngeli, A., Wolff, F., Lopez, P. & Romary, R. (1999) Relevance and perceptual constraints in multimodal referring actions. In Proceedings of the Workshop on Deixis, Demonstration and Deictic Belief, Eleventh European Summer School in Logic, Language and Information (ESSLLI-99), 9-20 August, Utrecht, The Netherlands.
Guyomard, M., Le Meur, D., Poignonnec, S. & Siroux, J. (1995) Experimental work for the dual usage of voice and touch-screen for a cartographic application. Proceedings of the ESCA tutorial and research workshop on Spoken dialog systems, Vigso, Denmark, 30 May-2 June, pp. 153-156.
Martin, J.C. & Beroule, D. (1993) Types et buts de cooperations entre modalites dans les interfaces multimodales. Actes des 5emes journees sur l'ingenierie de l'interaction Homme-Machine, 19-20 October, Lyon, France.
Mignot, C. & Carbonell, N. (1996) Commande orale et gestuelle: etude empirique. Technique et Science Informatiques, 15, pp. 1399-1428.
Nigay, L. & Coutaz, J. (1996) Espaces conceptuels pour l'interaction multimedia et multimodale, Technique et Science Informatiques, special Multimedia et Collecticiel, AFCET & Hermes Publ., Vol. 15(9), pp. 1195-1225.
Oviatt, S.L. (1999) Ten myths of multimodal interaction, Communications of the ACM, Vol. 42, No. 11, November, pp. 74-81.
Oviatt, S.L., DeAngeli, A. & Kuhn, K. (1997) Integration and synchronization of input modes during multimodal human-computer interaction. In Proceedings of Conference on Human Factors in Computing Systems CHI '97 (22-27 March, Atlanta, GA). ACM Press, New York, pp. 415-422.
Chapter 5
The Voice as a Means of Humanising Man-machine Interfaces
Noel Chateau
France Telecom R&D, France
1. Introduction
Recent and expected technological progress in signal processing, algorithms and artificial intelligence suggests that communication between men and machines will become increasingly close to communication between two human beings, revealing a strong tendency towards anthropomorphism. In the field of telecommunications, the principal medium of communication is still the voice. Today, machines can speak (through voice synthesis) and hear (using word recognition). They even "reason" and are capable of holding a discussion by means of software agents for intelligent dialogue. However, although they carry out the principal functions of vocal communication, it is tempting to say that, pragmatically, machines are still deaf and dumb. We should remember that pragmatics is to a certain extent the "function of form", that is to say, all of the information borne by the form of the signal (vocal, in this instance) which may modify the initial sense of the semantic content. Irony is a good example: a certain tone of voice will lead to an altogether different understanding of a message like "The weather is still fine today", compared to a neutral tone, especially if it is raining when the sentence is uttered. This chapter presents the bases of studies concerned with the emotions carried by vocal interaction and attempts to illustrate to what extent these studies can lead to man-machine interfaces, especially for communicating objects, that are simpler and more pleasant to use.
2. Emotions
2.1. Affect, emotion and sentiment
According to Aboulafia et al. [ABO 01], three levels of emotion may be distinguished, which differ in intensity and duration. Firstly, affect is an intense emotional state of relatively short duration (a few seconds) which calls upon short-term memory. Secondly, emotion characterises emotional states which outlast the situation that triggered them. Consequently, they call upon episodic memory, since they exceed the storage duration of sensory memory. Finally, sentiments are associated with emotional states which relate to the history of the individual, his beliefs, social or religious choices, etc. and form part of his personality. They call upon long-term memory (what is commonly called memory). Thus, in the light of the precise definitions provided by Aboulafia et al., speaking of emotion in man-machine interfaces (MMIs) is a misuse of language: it would be more accurate to speak of affect in MMIs, since studies in this field are still in their infancy. However, as the term affect is less widely understood and less evocative, it is very little used and is discarded in favour of emotion.
2.2. The nature of emotions
From ancient Greek times, when nomos (rules or reason) and physis (nature, impulsiveness, emotions) were in conflict, up to the middle of the 20th century, the emotions were studied very little, since they constituted an uncontrollable variable in experimentation which was likely to distort subjective and behavioural data [RUSI 01]. Moreover, in popular language a significant number of proverbs or sayings invoke the impulsive and uncontrollable nature of the emotions [DESP 01], one of the most famous of which is doubtless that of Pascal: "The heart has its reasons which reason does not know". The trend is totally reversed today. Emotions have been studied frequently by psychologists since the middle of the 20th century, and psychologists such as Izard [IZR 77] even argue that the emotions form the physiological base on which are organised the perceptive processes (one does not see and hear the same thing, depending on one's emotional state), the cognitive processes (one does not remember the same elements of an image, depending on its emotional charge) and action (fear, for example, triggers action through the release of adrenalin). The "marketeers" also wish to know the emotional impact of advertising on consumers. As is often the case in psychology, studies on emotions have seen two opposing schools of thought, which we present briefly below under two neologisms: the "categorialists" and the "dimensionalists".
2.3. Categorialists and dimensionalists
For categorialists, emotions may be grouped into a finite number of categories, each of which may itself include more subtle sub-categories. Ekman [EKM 84] may be considered the principal representative of the categorialists. Through work on facial emotions, the categorialists have shown that it is possible to identify a limited number of facial expressions which express six basic emotions and which can be found in the majority of cultures around the world. These six emotions are represented in caricatural fashion in Figure 5.1. They are (from left to right and from bottom to top) anger, fear, surprise, disgust, joy and sadness. Ekman has produced a model of facial animation (the FACS) which provides for seventy units of action, each composed of a group of facial muscles, and which enables a synthetic face to express these six basic emotions.
Figure 5.1 The six basic emotions of Ekman's model
For their part, the dimensionalists consider that all emotions may be organised along a limited number of continuous dimensions. Russell and Mehrabian [RUSS 77] thus suggest three basic dimensions: the hedonic valence of the emotion (positive or negative), its intensity, and the degree to which the person experiencing it controls it. Figure 5.2 illustrates these three dimensions and places the six basic emotions of Ekman's model in Russell and Mehrabian's three-dimensional space.
Figure 5.2 The three dimensions of Russell and Mehrabian's model
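The relationship between the two views can be sketched as a small data structure: each basic emotion becomes a point in the three-dimensional space, and a measured state is mapped to the nearest category. This is a minimal illustration only. The coordinates below are placeholders invented for the example; they are not values from Russell and Mehrabian or from Figure 5.2, and the names `EmotionPoint` and `nearest_basic_emotion` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionPoint:
    valence: float    # negative (-1.0) to positive (+1.0)
    intensity: float  # calm (0.0) to intense (1.0)
    control: float    # little control (0.0) to full control (1.0)


# Placeholder coordinates, invented for illustration only.
BASIC_EMOTIONS = {
    "joy": EmotionPoint(0.8, 0.6, 0.7),
    "anger": EmotionPoint(-0.7, 0.8, 0.6),
    "fear": EmotionPoint(-0.8, 0.8, 0.1),
    "sadness": EmotionPoint(-0.6, 0.2, 0.2),
    "surprise": EmotionPoint(0.1, 0.8, 0.3),
    "disgust": EmotionPoint(-0.6, 0.5, 0.5),
}


def nearest_basic_emotion(point):
    """Map a point of the dimensional space to the closest basic category."""
    def distance(name):
        ref = BASIC_EMOTIONS[name]
        return math.dist(
            (point.valence, point.intensity, point.control),
            (ref.valence, ref.intensity, ref.control),
        )
    return min(BASIC_EMOTIONS, key=distance)


label = nearest_basic_emotion(EmotionPoint(-0.75, 0.9, 0.15))
```

The Euclidean distance used here is itself an assumption; nothing in the model prescribes how the three dimensions should be weighted against one another.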
Neither approach can be considered better than the other. The approach to select for one's own work depends on the objectives and on the measurement methods one wishes to use. For example, a post facto evaluation in which the emotion felt is designated by means of pictograms (as in PrEmo [DESM 00], in the field of automobiles) would suit a categorialist approach, whereas a real-time evaluation using electro-physiological measurement tools (electrodermal conductance, blood pressure, pulse rate, respiration or muscular activity) would be better suited to a dimensionalist approach [DET 01].
3. Emotion in the voice
3.1. Why have emotion in the voice?
In human communication, emotions in general, and emotions in the voice in particular, have two fundamental roles. The first is to regulate communication, in the same way as many non-vocal signs emitted by the interlocutors (in particular, facial expressions and gestures). For example, if one of the interlocutors is astonished or annoyed, he will express this through the intonation of his voice, which will immediately trigger a reaction in his interlocutor. Moreover, a particular intonation will signal that one of the interlocutors has finished speaking and wishes to give way to the other.
The second role of emotion in the voice is to enable the meaning of the verbal content to be refined by the pragmatic content, which gives access to connotation. As noted in the introduction, the intonation of the voice makes it possible to distinguish several ways of interpreting the same sentence: listeners can tell whether the sentence uttered is a question or an affirmation, and whether the affirmation is meant seriously or ironically. Emotion in the voice thus resolves a great deal of ambiguity in the interpretation of the message. What has just been said of communication between human beings applies equally to man-machine communication, provided the machines are able to interpret and generate emotion in the voice. In fact, a number of studies, including those on the CASA (computers are social actors) paradigm of Nass et al. [NAS 94], have in the last few years stressed that computer users in the wider sense frequently behave with these machines as they do with humans: they use emotional signs to express contentment or discontent, stress or fatigue, without receiving any reaction in return. This is a form of release, a gratuitous affect with no consequence for the machine, although it may affect the human context if other people are present. Given that man-machine communication has a strong tendency towards anthropomorphism, that the technologies which enable machines to speak (voice synthesis), hear (word recognition) and reason (dialogue agents) are reaching maturity and can be transferred to a wide range of software platforms, and that the CASA paradigm is a reality, it appears that future developments in interfaces and voice technologies will bear largely on the production and interpretation of emotional content.
Thus, one might envisage, in five or ten years' time, interfaces, material or not, which are sensitive to emotions in vocal commands and which react not only to the command itself but also to the way it has been uttered. Given the increasingly strong resemblance between human-human and man-machine communication evoked above, such a development would logically improve the usability of systems (in particular by reducing the number of errors of interpretation) while improving their user-friendliness (by making communication more natural), by drawing on the two fundamental roles of emotions in the voice described above. There are two major fields of study covering emotions in the voice. The first concerns the production, perception and analysis of emotion in the natural voice; the second concerns the generation, through voice synthesis, of the expressions of emotion found in the natural voice. We present these two fields briefly below.
3.2. Emotions in the natural voice
Studies on emotions in the natural voice have mainly centred on identifying the physical correlates of the various expressions of emotion in speech [DAV 64], [SCH 86], [FRA 00]. With this approach, the vocal signal is analysed with a view to explaining the emotional state of the speaker as it is perceived by his audience. In his detailed review of the literature, Scherer [SCH 86] put forward twelve basic emotions which may be distinguished in speech (happiness/pleasure, joy/gladness, displeasure/disgust, scorn/disdain, sadness/despondency, grief/despair, uneasiness/anxiety, fear/terror, irritation/icy anger, rage/temper, boredom/indifference and timidity/culpability) and reports for each of these the principal acoustic parameters identified as being strongly correlated with it. These studies have made it possible to model the relationship between the vocal signal emitted by a speaker and the emotion he expresses through his voice. In a recent study, Maffiolo and Chateau [MAF 01] have shown that the emotions perceived in a vocal signal depend closely on the semantic content of the sentence uttered. Thus, it may be imagined that by coupling such models with a word recognition system and a system of artificial intelligence (as used by dialogue agents), it would be possible to identify automatically the various emotions expressed by a speaker in a given semantic field.
3.3. Emotions in the synthetic voice
Once the emotional content of the user's voice has been correctly identified, the system must respond with appropriate emotional content. This raises a prior question of principle: does the person want symmetry, asymmetry, well-meaning neutrality, etc.? To use a synthetic voice, algorithms are needed which make it possible to "breathe into" the acoustic signal under construction the intonation characteristics of the human voice [PIN 89], [CAH 90], [MUR 93]. This might be done a priori, by using specific acoustic databases (for example, databases recorded with styles of elocution which call upon a variety of pragmatic content), or post facto, by applying particular prosodic patterns to an "emotionally neutral" signal. In the latter case, the models for analysing emotions in the natural voice might be re-used to supply the target values of the acoustic parameters which the synthetic voice has to reach if it is to imitate the natural voice. MIT has proposed the Kismet robot (http://www.ai.mit.edu/projects/humanoid-robotics-group/kismet/kismet.html), which uses a synthetic voice to express the six basic emotions of the Ekman model. However, these emotions are still prototypical, and a more subtle approach is required to obtain a more natural synthetic voice, with more realistic and less caricatural intonations.
4. Communicating objects, emotions and the voice
Most objects are created by man to fulfil a function, itself generated by a need. Only artistic creation may be considered exempt from the requirement to supply functional objects or tools, although works of art themselves meet a need, one which cannot be described as functional but is rather a need for expression and communication. All objects, however functional, incorporate aesthetic dimensions. A simple stroll past the hammer counter of a DIY shop will confirm this. Shapes, colours, materials and packaging are some of the aspects of form and design on which creators and marketing departments work to promote the object (the product) and set it apart from its previous version or from its competitors. The balance between function and aesthetics depends on the object as well as on fashion and trends. Even though the dichotomy between function and form may appear facile or even trivial (some consider form to be no more than a final coat of paint, the object being wholly expressed by its function), several creators and designers have questioned the relationship between function and form. The objects and buildings produced by the Bauhaus School are, among other things, an attempt to answer these questions. They show clearly that function and form are inextricably linked from the start of the design process and that there is no reason to separate them. Users have now developed expectations, and even requirements, which bring a new dimension into the creation of products. Products should no longer merely meet a need by being useful and simple to use; they must blend into the users' way of life by providing them with a certain pleasure. Designers must concentrate on what Pat Jordan [JOR 01] describes as affective design or emotional design.
Given that large numbers of products have similar prices, functions and ease of use, Jordan claims that what differentiates them in the market is the affective and emotional dimension they satisfy. The success of certain models of mobile phone or of the iMac appears to confirm Jordan's theses. All of the views set out above on objects in general obviously apply to communicating objects in particular. These objects, equipped with ever more subtle functions thanks to the new technologies they incorporate, are now capable of hearing (word recognition) and speaking (voice synthesis). However, as stated in the introduction, emotionally speaking, machines and objects are currently still deaf and dumb. In view of the preponderant role played by emotions in vocal communication (see section 3.1) and the increasing importance of pleasure in using products, as evoked by Jordan, it appears inevitable that the development of communicating objects will come to take emotions in the voice into account, whether at the input (voice commands by users) or the output (rendering by synthetic voice) of the system.
5. Conclusion
After having been deliberately ignored for a considerable period, the production, perception and interpretation of emotions have become huge fields of psychological investigation. Research in this field has even intensified in the last ten years, in particular through its association with various technologies: vocal, video (the automatic recognition of the facial expression of emotion) and electro-physiological (the capture and analysis of electro-physiological signals correlated with emotions). However, knowledge in this field, especially concerning emotion in the voice, is still in its infancy. A decisive first step would be to understand and correctly convey, in a non-caricatural fashion, the six basic emotions of Ekman's model, that is to say, joy, anger, fear, sadness, surprise and disgust. Subsequently, coupling with word recognition algorithms (access to the initial semantic content) and intelligent dialogue (access to the context) will be necessary in order to progress further and incorporate systems that are sensitive to emotions into material or immaterial interfaces.
6. Bibliography
Aboulafia, A., Bannon, L. & Fernstrom, M. (2001). "Shifting perspective from effect to affect: Some framing questions," Proceedings of the International Conference on Affective Human Factors Design, Asean Academic Press, London, 508-514.
Cahn, J.E. (1990). Generating expression in synthesized speech. Technical report, MIT.
Davitz, J.R. (1964). The communication of emotional meaning. New York: McGraw-Hill.
Desmet, P.M.A., Hekkert, P. & Jacobs, J.J. (2000). "When a car makes you smile: Development and application of an instrument to measure product emotions," in S.J. Hoch and R.J. Meyer (Eds.), Advances in Consumer Research, 27, 111-117.
Despret, V. (2001). "Le pouvoir des desirs: les emotions entre science et politique," actes du colloque Pouvoir, Desir et Emotion, UTC, Compiegne.
Detenber, B.H. (2001). "Measuring emotional responses in human factors research: Some theoretical and practical considerations," Proceedings of the International Conference on Affective Human Factors Design, Asean Academic Press, London, 124-130.
Ekman, P. (1984). "Expression and nature of emotion," in K. Scherer & P. Ekman (Eds.), Approaches to Emotion, Hillsdale, NJ: Lawrence Erlbaum Associates, 319-343.
France, D.J., Shiavi, R.G., Silverman, S., Silverman, M. and Wilkes, D.M. (2000). Acoustical properties of speech as indicators of depression and suicidal risk. IEEE Transactions on Biomedical Engineering, 47, 829-837.
Izard, C.E. (1977). Human Emotions. New York: Appleton-Century-Crofts.
Jordan, P. (2001). Proceedings of the International Conference on Affective Human Factors Design, Asean Academic Press, London, 342-348.
Maffiolo, V. & Chateau, N. (2001). "Speech's emotional quality in vocal services," Proceedings of the International Conference on Affective Human Factors Design, Asean Academic Press, London, 342-348.
Murray, I.R. and Arnott, J.L. (1993). Toward the simulation of emotion in synthetic speech: a review of the literature on human vocal emotion. Journal of the Acoustical Society of America, 93, 1097-1108.
Nass, C., Steuer, J. & Tauber, E.R. (1994). "Computers Are Social Actors," ACM CHI'94, 72-78.
Pinto, N.B., Childers, D.G. and Lalwani, A.L. (1989). Formant speech synthesis: Improving production quality. IEEE Transactions on Acoustics, Speech and Signal Processing, 37, 1870-1887.
Rusinek, S. (2001). "A la recherche d'une definition de l'emotion: ce que l'emotion fait faire par-devers soi," actes du colloque Pouvoir, Desir et Emotion, UTC, Compiegne.
Russell, J. & Mehrabian, A. (1977). "Evidence for a three-factor theory of emotions," Journal of Research in Personality, 11, 273-294.
Scherer, K.R. (1986). Vocal affect expression: A review and a model for future research. Psychological Bulletin, 99(2), 143-165.
Part 2
Software Infrastructure for Smart Devices/Ambient Intelligence
Gilles Privat
France Telecom R&D, France
1. Introduction
Software for smart networked devices could, not long ago, have been envisioned as a special case of embedded software, and would as such have adhered to the dominant solutions previously adopted in this domain. These solutions placed a strong emphasis on the hardware and power constraints of embedded platforms, for which software used to be streamlined and carefully optimised. Full advantage may now be taken of improved processor and memory capabilities to adopt standards-based, open solutions in lieu of vertically integrated software dedicated to specific hardware. The benefits of time-proven and widely adopted approaches inherited from the general-purpose computing field far outweigh the overhead of the corresponding additional software layers, for all but the most cost-constrained devices. When embedded devices with such a high level of processing capability become networked devices, a new domain opens up which is the ideal application field for the vast body of research performed over the years in the general fields of distributed software and distributed systems theory. Distribution is not a mere theoretical nicety added as an afterthought for obtaining performance improvements: it is inherent in the very idea of networked devices, and will be applied here on a much larger scale than anything previously done with classical distributed systems made up of coarse-grain computing nodes (e.g. servers as network hosts). Generic network and middleware infrastructures still have to be adapted to the specific constraints of smart networked devices, even if not in the way this was done previously: rather than hardware-constrained optimisation, what is required is optimisation for robustness of a large-scale distributed system of connected devices, in an environment where devices may appear or disappear dynamically, be
disconnected abruptly, and have to "discover" each other and interoperate without previous knowledge. Specific temporal constraints may also have to be taken into account for communication types which may be as far from bulk data transmission as from isochronous multimedia flows: hard real-time and strictly deterministic latencies may be required for the distribution of asynchronous, possibly bursty events.
2. Generic middleware
The idea of layering smart-device applications on top of generic middleware corresponds to moving up from low-level connectivity interfaces such as IP to higher-level ones such as HTTP, CORBA, RMI or IIOP. The dial-tone metaphor inherited from legacy POTS networks best captures this idea of a universal connectivity interface: the dial-tone for smart devices definitely needs to correspond at least to the level of one of these generic interfaces. The universal availability of web-based clients and corresponding connectivity has fostered the use of HTTP as the protocol of choice for a new generation of networked embedded devices. A complete HTTP stack is already offered as a standard addition by most embedded OS vendors. Actually, the benefits of HTTP come from its universal availability as a least-common-denominator application-level interface, not from its adaptation to the network-based control of smart device services. HTTP was originally conceived as a protocol for the retrieval of static documents, and it is still limited by these origins. Numerous proprietary or de facto standard extensions have been grafted onto HTTP to enable server-side interaction with scripts or programs. This interfacing with native server-side programs is essential for all smart device applications, where static document retrieval is in itself useless, HTTP acting merely as an entry point to direct interaction.
This confusing plethora of kludgy and mutually incompatible solutions (CGI, ISAPI, NSAPI, ASP, JSP, to name a few) has unfortunately compromised the very simplicity and universality that made HTTP useful and successful in the first place. It should by now be clear that higher-level, more general solutions have to be adopted to replace HTTP if smart networked devices are to take full advantage of the evolution of distributed software. For one, asymmetric client-server solutions are by themselves less general than peer-to-peer solutions such as those proposed in the framework of generic distributed software infrastructures. In this evolution, devices are not restricted to being software clients, or servers, for that matter: they can be both providers and requesters of software services (i.e., at this granularity, methods called from software objects) to and from one another. The relevant common-denominator abstraction is no longer a text-based document, but a full-fledged software object that can be invoked transparently throughout the network. CORBA, GIOP, IIOP and Java RMI are some of the relevant standards and corresponding software infrastructures providing this level of object-based connectivity. They are, however,
Software Infrastructure for Smart Devices
59
still a long way from spreading from general-purpose distributed computing, where they have not yet even reached mainstream status, to the domain of networked embedded computing.
3. Infrastructures for spontaneous networking
Beyond using such generic middleware layers, ubiquitous computing/ambient networking needs to address the higher-level problem of interoperating vast numbers of spontaneously networked, on-off, possibly mobile devices forming ad hoc, temporary federations as they get in touch with one another using whatever wireless network support is available, without any previous knowledge of the environment. For this, new generic services are needed on top of general-purpose middleware, to address the following needs specific to ambient networking environments:
• A universal bootstrap mechanism making it possible for a new device to discover a new, unknown network environment and its general services (specifically, which kinds of service are on offer).
• A (centralised or distributed) directory/lookup service making it possible for devices to advertise their services and for other devices to query them.
• A connectivity mechanism making it possible for devices to use the services of other devices once they have located them.
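The directory/lookup need listed above can be illustrated with a minimal in-memory sketch. It is not taken from any of the infrastructures discussed in this chapter: the class names, the attribute keys and the addresses are hypothetical, and a real service would also have to cope with the network, with distribution and with richer query languages.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceAd:
    """One advertised service (all field names are illustrative)."""
    name: str
    address: str
    attributes: dict = field(default_factory=dict)


class Directory:
    """Toy in-memory lookup service: devices advertise, others query."""

    def __init__(self):
        self._ads = []

    def advertise(self, ad):
        self._ads.append(ad)

    def query(self, **wanted):
        # Return every advertisement whose attributes match all requested values.
        return [ad for ad in self._ads
                if all(ad.attributes.get(k) == v for k, v in wanted.items())]


directory = Directory()
directory.advertise(ServiceAd("printer-1", "10.0.0.5",
                              {"type": "printer", "colour": True, "format": "A3"}))
directory.advertise(ServiceAd("printer-2", "10.0.0.6",
                              {"type": "printer", "colour": False, "format": "A4"}))
matches = directory.query(type="printer", colour=True)
print([ad.name for ad in matches])
```

Exact attribute matching is the simplest possible query model; it is used here only to keep the sketch short.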
These three kinds of services do already exist, replicated at several lower levels, in regular network environments (e.g. ARP, DHCP or BOOTP for bootstrap; DNS or RMID for directories; sockets, RPC or RMI for connectivity). These solutions are usually too low-level and need to be federated at a higher level to work in such highly heterogeneous environments as addressed here. The directory/lookup service especially needs to address the description of extremely varied types of services/devices, with support for sophisticated queries such as "where is the physically closest network printer with colour A3 capabilities?". Efforts currently undertaken by industry leaders in various fora and consortia indicate the importance of such service discovery protocols and APIs on the one hand, and service description languages on the other. A slew of solutions, which may in some cases be viewed as an uppermost middleware layer, have been proposed to handle these specific problems of dynamic distributed network configuration. Their pivotal interfaces go all the way from least-common-denominator, text-based data to the highest-level APIs. Basically all of these technologies use IP-multicasting on a shared medium (e.g. Ethernet). Some of these solutions take lower-level network connectivity for granted; others propose solutions to provide it if needed. To date, the most publicised among such technologies are: Universal Plug and Play (UPnP, www.upnp.org), initiated by Microsoft and currently championed by the UPnP Forum; Service Location Protocol (SLP, www.srvloc.org), which, as an IETF work item, has been jointly developed by researchers from both academia and industry as a widely accepted and usable Internet service for service discovery; Jini (www.sun.com/jini, www.jini.org), initiated and still controlled by Sun Microsystems, which provides an architecture for service discovery entirely based on Java RMI as the middleware platform; and Salutation, again a solution pioneered by an industrial forum (www.salutation.org). Less publicised open-source alternatives such as the customisable Jonathan ORB core (www.objectweb.org) could also be streamlined and fitted as infrastructures for embedded devices.
4. The digital divide for smart devices
Moving up to such a high-level connectivity interface as Jini, or the like, does, however, raise the bar by several orders of magnitude for the minimal amount of software a device has to integrate before being able to engage in a minimal dialogue on the same level of protocol as its fellow network citizen devices. Moore's law notwithstanding, it is not obvious that the finest-granularity smart devices, those with strict per-unit-cost constraints, will ever reach this threshold. The model that can be envisioned for the time being is that of a two-tier network, in which first-class network citizens are able to talk to one another at the higher level of protocols, enabling transparent, dynamic configuration and all the benefits that come with this more abstract level of discourse. Second-class devices will not be seen directly from the network at large. They will have to be represented by a proxy, which could be viewed as a software agent residing somewhere on a network server, or on another (first-class) device. The lowly device will be allowed to talk with its proxy using some closed proprietary protocol, provided this low-level communication is entirely hidden behind the proxy for other devices attached to the network at large.
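The two-tier model can be pictured with a small sketch in which a proxy exposes high-level method calls while hiding a closed device protocol. Everything here is hypothetical: the one-byte "protocol", the class names and the method names are invented for illustration and do not correspond to Jini or to any real product.

```python
class LowlyLamp:
    """Second-class device: only understands a one-byte proprietary command."""

    def __init__(self):
        self.is_on = False

    def receive(self, frame: bytes):
        # Hypothetical closed protocol: 0x01 switches on, 0x00 switches off.
        self.is_on = frame == b"\x01"


class LampProxy:
    """First-class stand-in that the rest of the network talks to."""

    def __init__(self, device):
        self._device = device

    def invoke(self, method: str):
        # Translate a high-level method call into the proprietary frame.
        frames = {"switch_on": b"\x01", "switch_off": b"\x00"}
        if method not in frames:
            raise ValueError(f"unsupported method: {method}")
        self._device.receive(frames[method])
        return self._device.is_on
```

Other devices would only ever see `LampProxy.invoke`; the byte frames never leave the proxy, which is the property the two-tier model requires.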
Chapter 6
Introduction to a Middleware Framework
Vincent Olive and A. Vareille
France Telecom R&D, France
1. Introduction
This paper is an introduction to middleware frameworks for distributed systems. The aim is not to cover the whole field, for which a complete book would not suffice, but to explain some of the recent architectures that are still considered new. These are not yet widely deployed, but they promise many success stories in the near future.
2. Need for middleware frameworks
Just as an operating system manages the collaboration of tasks on a given computer, a set of applications distributed over interconnected machines needs equivalent services, but with coherency guarantees: this is the role of middleware. Furthermore, the current trend is to offer most middleware services on standalone machines as well. For instance, the discovery and event services proposed by the OSGi framework can be used efficiently on standalone computers. Consequently, many current component-based approaches to the design of operating systems propose middleware services as their main foundations. The common method for building any software application, distributed or not, is to decompose it into components. The components export interfaces, which are the only way to use them (interfaces for controlling, emitting or receiving data, configuring, testing, securing, billing, debugging, etc.). Distribution should be considered as a process for optimising hardware resources, so the distributed implementations should be generated automatically.
3. Some middleware frameworks
3.1. The Jini framework
Jini middleware provides the main services of any framework for distributed applications. The same services are also provided in .NET (UPnP/SOAP) from Microsoft and in CORBA from the OMG, the only difference being in the way they are implemented. In order to facilitate the distribution of Java applications, Sun has developed a framework that extends remote method invocation (RMI). The service provider registers its interface (according to the language definition) for a given time period with a registry service, together with attributes specifying further application-level and non-functional properties (name, keywords, version, author, etc.). Queries to the registry service are performed by multicast messages, while code and attributes are transported by means of HTTP servers (Figure 6.1).
Figure 6.1 The Jini framework
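The idea of registering a service for a given time period, together with descriptive attributes, can be sketched as follows. This is a language-neutral illustration written in Python, not the Jini API: the class, its methods and the attribute names are assumptions made for the example, and the network, code download and multicast aspects are left out entirely.

```python
import time


class LeasedRegistry:
    """Toy registry: every registration carries a lease that must be renewed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so that expiry can be tested
        self._entries = {}       # service name -> (attributes, expiry time)

    def register(self, name, attributes, lease_seconds):
        # Registering again under the same name renews the lease.
        self._entries[name] = (dict(attributes), self._clock() + lease_seconds)

    def lookup(self, **wanted):
        now = self._clock()
        # Drop registrations whose lease has run out.
        self._entries = {n: e for n, e in self._entries.items() if e[1] > now}
        return sorted(n for n, (attrs, _) in self._entries.items()
                      if all(attrs.get(k) == v for k, v in wanted.items()))
```

A provider that disappears without warning simply stops renewing its lease, so its entry vanishes from later lookups without any explicit clean-up message.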
When a service is queried, with attributes and an interface as query parameters, the interface is downloaded in order to build a surrogate that manages remote method invocation via RMI for the implemented service. In the Jini technology, three different types of discovery mechanisms are used:
• The multicast query protocol is used by an entity needing the nearest available registry service, also called lookup service. This protocol is activated at the initialisation of all Jini services.
• The multicast announcement protocol is used by the registry service in order to announce its availability after initialisation. It is also used when restarting all the services after a break in the network.
• The unicast discovery protocol is used by entities for communicating with a registry service. It is used when services are on a different sub-network and multicast does not work.
3.2. The UPnP framework
First proposed by Microsoft and now supported by the Universal Plug and Play Forum consortium, the UPnP framework specifies descriptions of services and devices. With UPnP, a device can dynamically connect to a network, obtain an IP address, announce its properties and look up the availability and properties of other devices, all automatically. In other words, devices can communicate directly with other devices and establish a P2P architecture. UPnP relies on the standard TCP/IP protocol and on other Internet protocols. The basic building blocks of the UPnP framework are devices, services and control points. A control point sends actions, devices receive actions, and a node is either a device or a control point.
3.2.1. Service discovery
When connecting to a UPnP network, a device first tries to obtain an IP address using either DHCP or Auto-IP. Once it has an IP address, it can notify the control points of the availability of its services. Control points behave similarly when they are introduced into a network. In both cases, and also on withdrawal, multicast messages are generated (multicast HTTPMU for queries and unicast HTTPU for answers). Control points use HTTP to explore the URLs obtained during the registry phase.
3.2.2. Device control
Device control is performed through the SOAP protocol, which uses HTTP to transmit XML-encoded remote procedure calls. Each control point uses SOAP for sending control messages and receiving error reports. Because they rely on HTTP, SOAP messages can pass through firewalls and can also use secure sockets.
3.2.3. Events
The General Event Notification Architecture (GENA) has been specified in order to provide for sending and receiving notifications using HTTP over TCP/IP and multicast UDP. GENA specifies the notions of subscribers, notifying resources and subscription arbiters for routing the notifications, all of which handle the events. The models and headers of GENA are used in UPnP for notifying the availability or state change of services through SSDP (Simple Service Discovery Protocol), which also relies on HTTPU and HTTPMU. A control point interested in events sends a request containing the name or characteristics of the service, the return address for the events and the duration of the subscription. In comparison with Jini, UPnP is not language-oriented: it mainly uses HTTP and its variants HTTPU and HTTPMU as transport protocols, which has the advantage of passing through firewalls and of being lighter than using UDP/IP.
3.3. P2P framework
Peer-to-peer (P2P) frameworks are part of distributed architectures, and the services described in the previous sections are encompassed by this paradigm. Contrary to what the hype surrounding emerging services such as Napster, Scour, Gnutella and others might suggest, peer-to-peer technology has existed since the 1980s. It is the convergence of two trends:
1. A political one, aiming to decentralise applications fully in order to optimise availability and scaling.
2. A technical one, following the increasing power of computers and the high throughput of networks.
Usenet and FidoNet are old but still efficient examples, illustrating the robustness of this technology. The first, as a distributed application, provides transparent sharing of files, although it was conceived at a time when data synchronisation was performed at night using point-to-point modems. Therein lies the first main property.
Indeed, what was then a constraint is now a main characteristic: in P2P technology exchanges are peer to peer, each of the two partners having a priori the same resources (storage, communication, computing), and so on along all the nodes, each performing part of the tasks. A general function is thereby performed: for example, Napster offers music sharing, Gnutella is rather a general file-sharing service and Seti@home a computing-resource-sharing service; there are also many message-sharing services such as FidoNet, AIM, etc. Sun, well aware of this trend, already offers JXTA, a Java toolbox for P2P applications. JXTA is a collection of six protocols, not all of which are compulsory for building P2P applications. The heart of JXTA consists of mechanisms for managing and searching peer groups, data transmission, control and load balancing of channels, peer activity and networks. In addition, indexing, searching and sharing services are provided for developers.
3.3.1. Discovery service in P2P
The discovery service allows peers to find one another so that they can interact; this service is, in fact, fundamental. Here are some ways to realise such a service:
3.3.2. The explicit way
This method avoids implementing a discovery service altogether: each node knows a priori the peers it communicates with, though not all the peers in the network. This solution does not scale easily, because of the difficulty of maintenance and the lack of dynamic connection of new nodes. Nevertheless, it simplifies security by fixing the names of the peers (i.e. authentication is implicit and access rights are simplified).
3.3.3. The dynamic way
This method relies on the existence of nodes holding a directory in which other nodes can register and search for others. These directories can also be peers (the DNS is an excellent example of hierarchical directories). This model facilitates directory management and provides a better quality of service, since it is nearly centralised. However, as the case of Napster showed, such a service is also easy to shut down.
3.3.4. The networking way
As the name suggests, all the peers are dynamic and none knows the whole network. Each peer has a list of possible neighbours and asks them to accept it as a neighbour. Once all connections are established (each peer is normally accepted by many neighbours), the distributed algorithms can be deployed.
This list resembles the fixed list of the explicit way, except that it is renegotiated throughout the communication between peers, one of the background tasks being to retain the most efficient neighbours. This is how the list of the main active peers in the network is maintained, and how Gnutella has grown.
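The background task of retaining the most efficient neighbours can be sketched as follows. The chapter does not say how efficiency is measured, so measured latency is used here purely as an assumption; the class and method names are likewise hypothetical and are not taken from Gnutella or JXTA.

```python
class Peer:
    """Toy peer that keeps only its most efficient neighbours."""

    def __init__(self, name, max_neighbours=3):
        self.name = name
        self.max_neighbours = max_neighbours
        self.neighbours = {}  # neighbour name -> last measured latency (ms)

    def observe(self, neighbour, latency_ms):
        # Record (or refresh) a measurement, then renegotiate the list
        # by keeping the lowest-latency entries only.
        self.neighbours[neighbour] = latency_ms
        best = sorted(self.neighbours.items(), key=lambda kv: kv[1])
        self.neighbours = dict(best[: self.max_neighbours])
```

Each new measurement may evict the slowest known neighbour, so the list drifts towards the currently best-performing peers without any node needing a view of the whole network.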
3.3.5. The diffusion
When multicast is available, a broadcast message makes it possible to find the interesting and interested peers without knowing how many peers are available or who they are. This protocol is available in Sun's JXTA suite, but it is not suitable for the whole network, multicast usually being blocked by routers.
3.3.6. The security of P2P applications
Security is mandatory when two entities share resources without knowing each other and without any previous contact. Peer node authentication, together with the integrity and identity of shared resources, requires security to be guaranteed. This is a weakness of P2P applications. The security requirements are the same as in all distributed systems:
• authentication: this appears at two levels, authentication between peers and authentication of users;
• authorisation: in order to work on a resource after authentication, authorisation grants rights on this resource (rights to read, modify or run it);
• encryption: this plays several roles:
- it renders unintelligible the data flowing between peers over the Internet, which does not offer any security by itself;
- it guarantees data integrity and authentication through the signature of both clients.
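As a rough illustration of the integrity and authentication roles listed above, the sketch below attaches a keyed hash (HMAC-SHA256) to a message. This is a deliberate simplification: an HMAC relies on a secret shared by both peers, whereas the signatures mentioned in the text would normally use public-key cryptography, which the Python standard library does not provide. The function names and the key are illustrative only.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an authentication tag over the message with a shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; False means tampering or a wrong key."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"example-key-agreed-out-of-band"  # assumption: peers already share it
tag = sign(shared_key, b"chunk 42 of file.mp3")
print(verify(shared_key, b"chunk 42 of file.mp3", tag))
```

A receiver that recomputes the tag and finds a mismatch knows that the data was altered in transit or did not come from a holder of the key.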
4. Conclusion
A software infrastructure for building distributed applications is essential for controlling complexity and for keeping up with the increasingly dynamic nature of applications. Nevertheless, none of the infrastructures described above is predominant. For instance, there are at present very few hardware components compatible with UPnP or Jini, and other competing standards are arriving, such as HAVi for audio-video appliances. The architect's job consists in understanding the common features of the infrastructures, as well as identifying the interesting properties of each, in order to use the best of each.
Chapter 7
A Model and Software Architecture for Location-management in Smart Devices/Ambient Communication Environments¹
Thibaud Flury, Gilles Privat and Naoufel Chraiet
France Telecom R&D, France
1. Introduction
Location is the most obvious and easily exploitable piece of information upon which context-aware applications can rely. Based upon the narrowly defined requirements of specialised location-based services, a number of programmatic interfaces, text-based or binary formats and protocols have been proposed to handle it in a more or less ad hoc and piecemeal fashion [2], most of them failing to capture the more general theoretical framework in which the location problem could be set. The grand idea behind using physical location to address information is to bridge the gap between the informational and physical worlds, which prevailing desktop-centric interface metaphors have contributed to widening. This goes much beyond the cellular-network-location-based services already on offer, and even beyond the more general context-awareness idea. Such a far-reaching vision of "information in places" [3], "worldboard" [4] or "situated information spaces" [5] has been eloquently articulated, and partially illustrated in such projects as MapPlanet [6] and Confluence [7]. In this view, the physical world is the most compelling interface metaphor to cyberspace, and geo-location is used, not only as the user's own position, but as a unifying navigation anchor and an intuitive representational tool, to make sense of the overwhelming multidimensionality of the information space. As such, location information may be used in either an abstract or a concrete sense, from the smallest to the largest possible scale. What is located may be the user himself, the information he retrieves, the physical objects or the other people with whom he interacts.
1 Work partially funded under the ITEA Ambience Project.
This paper attempts to give a broad outline of a location architecture that could encompass the diverse requirements of all these potential applications. We begin by specifying the requirements for a location infrastructure, with different models of physical space corresponding to increasing levels of abstraction. From this we try to describe a general template for a location infrastructure that draws upon an analogy with the layered models of network protocols, and we examine a first example of a minimal implementation based on Jini.
2. Requirements for a location infrastructure
A location infrastructure should become a basic service of ubicomp environments, just as basic in fact as naming or directory services may be in regular network environments. It should establish a correspondence between location entities (e.g. regions of physical space), which we will, for the sake of this description, call loci (singular: locus), and locatable entities, which we will call locants. In a typical ubicomp environment, a locant may be either a passive object fitted with a tag, a physical networked device with information/communication services attached, or a human user (with or without a device). In a broader view of "situated information space", it might be any abstract entity, such as a purely informational service, or a piece of static content data (e.g. an entry in a bulletin board or yellow pages service), not necessarily attached to a physical device but linked to some more or less abstract instance of a location concept. Physical location may as such be viewed as a unifying addressing mechanism, whether used directly or indirectly. The proposed solution should be independent of location-sensing technologies [8]. There are two ways in which location information may be used. The first is for the location infrastructure to generate, when locants move through loci, location events forwarded to the interested subscribers.
These events may have to be filtered by an intermediate agent to retain only the relevant moves, depending on the space model and the application concerned. The other is to respond to either direct or inverse location queries, as detailed below. In both cases, information has to be provided in real time by the infrastructure. We will not try, for the time being, to fully integrate the temporal dimension in the infrastructure, which would amount to taking into account the complete trajectory of a locant through various loci.
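The event-based use of location information described above can be sketched as a small publish/subscribe mechanism with a filtering step. The class, its method names and the locus labels are hypothetical; the sketch only shows the flow from a reported move to a filtered notification and ignores distribution and real-time constraints.

```python
from collections import defaultdict


class LocationEvents:
    """Toy event source: notifies subscribers when a locant changes locus."""

    def __init__(self):
        self._where = {}                       # locant -> current locus
        self._subscribers = defaultdict(list)  # locant -> [(callback, filter)]

    def subscribe(self, locant, callback, relevant=lambda old, new: True):
        self._subscribers[locant].append((callback, relevant))

    def report(self, locant, locus):
        old = self._where.get(locant)
        if old == locus:
            return  # no move, no event
        self._where[locant] = locus
        for callback, relevant in self._subscribers[locant]:
            if relevant(old, locus):  # intermediate filtering step
                callback(locant, old, locus)
```

The `relevant` predicate plays the part of the intermediate agent: a subscriber interested only in rooms, for instance, can ignore moves into corridors.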
2.1. Direct location queries
The primary type of query a location infrastructure is required to support is: "where can I find something?" In our model, this amounts to providing the identification of a locant as input to the query (this identification might be incomplete, using properties or attributes), and expecting the identification of one locus (or a list of matching loci) as output. Both input and output are relevant to a particular model of space understood by the application that spawned the query, as explained in the following section.
2.2. Inverse location queries
The other kind of query a location infrastructure is required to support is: "what is there around here?" This amounts to providing the identification of a locus as input to the query, and expecting the identification of one or several locants as output. Again, this query may be formulated in a semantic, human-understandable way such as "what can I find in this building?", or in a lower-level metric way such as "what can I find within a 1 m radius of these WGS84 coordinates?".
2.3. Composite queries
Many application scenarios will correspond to direct location queries followed by inverse location queries, especially in the case of mobile users requesting first to be located by way of their mobile device, then to know which services of a given type are available in their neighbourhood. In the general case, a composite query is some composition of a set of elementary queries.
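The three kinds of query can be sketched over a purely set-based store, in which each locant is simply recorded as present in a set of loci. The class, its method names and the sample identifiers are assumptions made for this illustration; they are not part of the architecture described in this chapter.

```python
class LocationService:
    """Toy set-theoretic location store answering both kinds of query."""

    def __init__(self):
        self._loci_of = {}  # locant -> set of loci currently containing it

    def place(self, locant, *loci):
        self._loci_of[locant] = set(loci)

    def where_is(self, locant):
        # Direct query: locant in, loci out.
        return sorted(self._loci_of.get(locant, set()))

    def what_is_in(self, locus):
        # Inverse query: locus in, locants out.
        return sorted(l for l, loci in self._loci_of.items() if locus in loci)

    def around(self, locant):
        # Composite query: locate the locant, then list what shares a locus with it.
        found = set()
        for locus in self.where_is(locant):
            found.update(self.what_is_in(locus))
        found.discard(locant)
        return sorted(found)


svc = LocationService()
svc.place("alice-phone", "building-A", "room-12")
svc.place("printer-3", "building-A", "room-12")
svc.place("kiosk-1", "building-B")
print(svc.around("alice-phone"))
```

Here the composite query is a plain chaining of a direct query and inverse queries; filtering by service type would be a further step on the result.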
3. Location models
Various models of space always implicitly underlie location infrastructures [2], [9], yet they are rarely set in a proper theoretical framework by going back to the basics of what a space can be in pure mathematics. Though these models are purely abstract, they will be used in association with a particular location-sensing or ranging technology, from which they retain only relevant characteristics that can be mapped to a corresponding notion of space. They will serve to characterise both loci and locants.
3.1. Set-theoretic
In this strictly minimal model, no metric information whatsoever about the shape of a locus or the precise position of a locant is assumed to be available. All that can be known is the presence/absence of a locant in a locus, modelled as an element belonging or not to a subset of space.
3.2. Topological
In these models, location properties are defined on a point-based abstraction of a locant, a locus corresponding (loosely) to the mathematical concept of neighbourhood in a topological space. Location is defined relative to a point, with special properties attributed to a given neighbourhood of this point that may not be fully characterisable in metric terms. The spatial continuity property inherent in this topological notion of neighbourhood may itself be useful in some cases, yet it places strong constraints on the underlying physical location technology if taken absolutely. Other purely topological notions such as simple connectedness (absence of "holes" in an open set) may also be relevant. Quite different in mathematical terms, yet related in that it is an alternative potential abstraction of RF location technologies, would be a location model based on fuzzy set theory, on which we will not elaborate.
3.3. Metric
These models assume, at a minimum, that a distance between points can be quantified. The physical size of a locant itself may also be metrically bounded, making it possible to go beyond the point-based abstraction of a locant. A locus may also be a similarly bounded region.
3.4. Affine and affine-euclidean
Affine and affine-euclidean spaces are a richer and practically more significant case of metric space, where it is assumed that absolute location information of a locant may be defined with respect to a suitable coordinate system. Geodetic coordinates are practically the most important example, for which three main classes of coordinate systems may be used:
• Geocentric cartesian coordinates.
• Polar geographic coordinates (latitude, longitude, elevation).
• Planar projection coordinates.
A locus may be an arbitrary region of space defined with respect to such a coordinate reference system, while richer properties may also be attached to a locant, such as its precise geometrical shape or its orientation.
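To make the relation between two of these coordinate classes concrete, the sketch below converts polar geographic coordinates to geocentric cartesian coordinates on the WGS84 ellipsoid, using the standard closed-form conversion, and then applies a simple metric test. It assumes that "elevation" means height above the ellipsoid; the function names are illustrative.

```python
import math

# WGS84 ellipsoid constants.
WGS84_A = 6378137.0                 # semi-major axis, metres
WGS84_F = 1 / 298.257223563         # flattening
WGS84_E2 = WGS84_F * (2 - WGS84_F)  # first eccentricity squared


def geodetic_to_geocentric(lat_deg, lon_deg, height_m=0.0):
    """Convert latitude/longitude/ellipsoidal height to cartesian X, Y, Z in metres."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    # Prime vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z


def within_radius(p, q, radius_m):
    """Metric test: are two geodetic positions within radius_m of each other?"""
    return math.dist(geodetic_to_geocentric(*p), geodetic_to_geocentric(*q)) <= radius_m
```

Working in cartesian coordinates turns a question such as "what lies within a given radius of this position?" into a plain euclidean distance comparison.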
3.5. Locant graphs
This is a completely different mathematical species, where the vertices (nodes) of the graph are locants, and loci may correspond to various subsets of this graph (e.g. paths, walks, cycles, or arbitrary subgraphs). It may be used in conjunction with a metric model (yielding a valuated graph), with a location technology that is purely relative and bilateral between objects themselves, rather than related to more or less fixed loci. Such a model is also compatible with a purely distributed management of location, eschewing any fixed infrastructure for both the location devices and the software location infrastructure: each locant may manage its own neighbourhood of objects, namely those objects for which it has bilateral relative location information. General graphs may be used to model all kinds of bilateral or multilateral relationships between their nodes, besides relative location as put forward here. Of course, all purely network-based models of communication already use such models, and this is not what we are attempting to reinvent. Other relationships may be described for which location may still be used as a metaphor, by extracting topological properties from the graph itself. Semantic relationships may, for example, be described in a structural way, enabling inverse location queries similar to those that may be made in a physical location model.
3.6. Loci graphs
By contrast with the previous case, the vertices of these graphs are the loci and not the locants. These models may be seen as enrichments of a set-based or topological model, modelling not only loci as subsets or neighbourhoods of space but also their structural relationships. This is implicitly the kind of model underlying the cell pavings used in cellular networks, where adjacency relationships between cells are used for the handover of a locatable entity from one cell to another. Adjacency is but one particular case of relationship, and many other ways of structuring regions could be modelled: a hierarchical model of space (loosely underlying most of the semantic models used in directories) is another obvious case.
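An adjacency-based loci graph can be sketched with a plain dictionary, and a path between two loci found by breadth-first search. The cell names and layout are invented for the example, and breadth-first search is just one convenient way of extracting a path; nothing here is prescribed by the model itself.

```python
from collections import deque


def path_between(adjacency, start, goal):
    """Breadth-first search for a shortest chain of adjacent loci."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adjacency.get(path[-1], ()):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # goal is not reachable from start


# Hypothetical cell layout: A-B, B-C, C-D and A-D are adjacent; E is isolated.
cells = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"], "E": []}
print(path_between(cells, "A", "C"))
```

The same function works unchanged on a hierarchical graph (building, floor, room), since only the meaning of the edges differs.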
3.7. Semantic/symbolic models
In these models, spatial location may be defined implicitly rather than explicitly, by reference to more or less abstract concepts relevant to a given universe of discourse, i.e. a semantic frame of reference. Loci may correspond to such divisions as streets, precincts, municipalities, regions or states, as used in regular directories. At a lower level and a smaller scale, buildings, floors, rooms, or even shelves in a cupboard, cells on a given shelf, etc. could be used as loci providing a spatial reference for all kinds of locants, which will themselves be defined by some supposedly well-known characterisation rather than by their physical properties. These loci may themselves be mapped to one of the lower-level models described before, i.e. a hierarchical graph model, a topological model or a metric model. These characterisations may be compounded with other non-univocal high-level properties associated with a particular locus. These may correspond to a typing or profiling of a particular locus (e.g. authorisation, security constraints, electromagnetic compatibility). Though these models are purely abstract, they will be used in association with a particular location-sensing or ranging technology [8], from which they retain only relevant characteristics that are mapped to their model of space. They may be used to characterise both loci and locants, as detailed in the following table.
Table 7.1 Location models

| Model | Information provided | Locus concept | Locant concept |
|---|---|---|---|
| Set theory | Presence/absence of a locant in a locus | A subset of space | Usually abstracted to a point |
| Fuzzy set theory | A [0-1] degree of presence of a locant in a locus | A fuzzy subset of space | Usually abstracted to a point |
| Topological space | Presence/absence in a neighbourhood | An open set/neighbourhood | A point, or more generally a closed set |
| Metric space | Relative distance to locant | An open ball | A point or closed ball |
| Affine | Position of locant w.r.t. absolute coordinate reference system | Point or region of space, defined from absolute coordinates | Point or region of space, defined from relative coordinates |
| Affine-euclidean | Position + orientation, i.e. mapping (translation + rotation) from absolute to relative coordinates | Point or region of space, defined from absolute coordinates | Point or region of space, defined from relative coordinates |
| Locus graph | Path to locus | Node of the graph and locus previously defined | A locant for the underlying model |
| Locant graph | Path to locant | A sub-graph (e.g. a path, a tree, a cycle) | A node (vertex) of the graph |
| Semantic/symbolic models | Symbolic/semantic mapping to underlying models | Semantics of a locus defined in the previous models | Semantics of a locant defined in the previous models |
These models are to be used in combination, each providing a different kind of information based on a different conceptual view of location.
* The spatial continuity property inherent in the topological notion of neighbourhood may itself be useful in some cases, yet places strong constraints on the underlying physical location technology. Other purely topological notions such as simple connectedness (absence of "holes" in an open set) may also be relevant.
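To make the combination of models more concrete, here is a minimal Java sketch of three rows of Table 7.1 side by side: a set-theoretic locus (presence/absence), a metric locus (open ball, relative distance) and a fuzzy locus (degree of presence in [0, 1]). All class and method names, and the linear fading rule used for the fuzzy degree of presence, are assumptions made for this illustration; the chapter does not prescribe them.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: every name below and the linear fuzzy membership are assumptions.

/** A locant abstracted to a point of a 2-D metric space. */
final class Point2D {
    final double x;
    final double y;

    Point2D(double x, double y) {
        this.x = x;
        this.y = y;
    }

    double distanceTo(Point2D other) {
        return Math.hypot(x - other.x, y - other.y);
    }
}

/** Set-theoretic model: the locus is a subset; the answer is presence or absence. */
final class SetLocus {
    private final Set<String> present = new HashSet<String>();

    void enter(String locantId) { present.add(locantId); }
    void leave(String locantId) { present.remove(locantId); }
    boolean contains(String locantId) { return present.contains(locantId); }
}

/** Metric model: the locus is an open ball around a centre. */
final class BallLocus {
    private final Point2D centre;
    private final double radius;

    BallLocus(Point2D centre, double radius) {
        this.centre = centre;
        this.radius = radius;
    }

    double distanceTo(Point2D locant) { return centre.distanceTo(locant); }

    /** Open ball: a point exactly on the boundary is outside. */
    boolean contains(Point2D locant) { return distanceTo(locant) < radius; }
}

/** Fuzzy model: degree 1 inside the core, fading linearly to 0 at the support radius. */
final class FuzzyLocus {
    private final Point2D centre;
    private final double coreRadius;
    private final double supportRadius;

    FuzzyLocus(Point2D centre, double coreRadius, double supportRadius) {
        if (!(supportRadius > coreRadius)) {
            throw new IllegalArgumentException("supportRadius must exceed coreRadius");
        }
        this.centre = centre;
        this.coreRadius = coreRadius;
        this.supportRadius = supportRadius;
    }

    double degreeOfPresence(Point2D locant) {
        double d = centre.distanceTo(locant);
        double raw = (supportRadius - d) / (supportRadius - coreRadius);
        return Math.max(0.0, Math.min(1.0, raw)); // clamp to [0, 1]
    }
}
```

The same locant can thus be queried through several views at once, for instance registered by identifier in a `SetLocus` while its measured position is tested against a `BallLocus` or a `FuzzyLocus`.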
4. A generic location architecture
Fulfilling the above requirements calls for an architecture that jointly articulates these different models of location.
4.1. A layered architecture model for location
We propose a layered template for this architecture that draws inspiration from, and is conceptually similar to, the templates used in generic network services and protocols: the bottom layers are closest to the physical properties of space, and as we go higher they become more and more abstract and closer to concepts understandable by human users, much as we move up from physical connection, to MAC addressing, to IP, then to DNS and possibly UDDI addressing, in network-based identification protocols. This model juxtaposes two vertical categories of information, orthogonal to the layers, corresponding respectively to loci and locants.
Figure 7.1 General location architecture model
4.1.1. Physical layer
By analogy to the physical layer of network protocols, this is the lowest level of our architecture, directly related to the location-sensing and identification technologies used. Among the variety of technologies available, we can distinguish two (overlapping) categories: technologies that identify locants, and technologies that locate them in space. Only by combining the two kinds of sensor can a specific locant be tracked through space. The identification sensor itself may provide some minimal location information if it has a limited range, by asserting the presence of the identified locant within this range.
Figure 7.2 Physical layer
The relation between entities and sensors can be seen in both directions: with some technologies the sensor actively searches for objects, while with others the entity itself announces its presence, whether for identification or for location. The final aim of this layer is to provide a relative location for an identified entity. This location need not come from a single sensor (it may result from the combination of several), and it may be vague (near some identifying sensor) or more accurate. At this stage, however, it is only relative to the sensor(s); it has no real meaning in physical space.
4.1.2. Geometrical/topological layer
At this level, relevant loci are geometrically defined as sets or neighbourhoods that aggregate information coming from the physical layer, depending on the actual distribution of sensors through space. Position information for a sensor allows the relative position information it provides to be mapped to a more global coordinate reference system.
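The mapping from sensor-relative to global coordinates mentioned above can be sketched as follows, assuming a 2-D space and a sensor whose position and heading are known in the global frame (the translation + rotation of the affine-euclidean model). The class `SensorPose` and its methods are hypothetical names introduced only for this example.

```java
/** Hypothetical sketch: pose of a sensor in the global (absolute) 2-D reference frame. */
final class SensorPose {
    private final double x;          // global position of the sensor
    private final double y;
    private final double headingRad; // rotation of the sensor frame w.r.t. the global frame

    SensorPose(double x, double y, double headingRad) {
        this.x = x;
        this.y = y;
        this.headingRad = headingRad;
    }

    /** Relative reading to global coordinates: rotate by the heading, then translate. */
    double[] toGlobal(double relX, double relY) {
        double cos = Math.cos(headingRad);
        double sin = Math.sin(headingRad);
        return new double[] {
            x + relX * cos - relY * sin,
            y + relX * sin + relY * cos
        };
    }

    /** Inverse mapping: translate back, then rotate by the opposite angle. */
    double[] toRelative(double globalX, double globalY) {
        double cos = Math.cos(headingRad);
        double sin = Math.sin(headingRad);
        double dx = globalX - x;
        double dy = globalY - y;
        return new double[] {
            dx * cos + dy * sin,
            -dx * sin + dy * cos
        };
    }
}
```

For instance, a sensor placed at (10, 5) and turned by a quarter turn that reports a locant 2 units straight ahead would place that locant near (10, 7) in the global frame, up to floating-point rounding.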
Figure 7.3 (detail) Exchanges between physical and topological layer, loci part
Figure 7.4 (detail) Exchanges between physical and topological layer
4.1.3. Structural layer
In this layer, multiple locants and loci will be associated in complex ways to model the structural relationships between them. This corresponds to a graph-based model for which vertices (nodes) may be either loci or locants. In the first case, these models may be seen as enrichments of a set-based/topological model, modelling not only loci as subsets or neighbourhoods of space but also their structural relationships. This is implicitly the kind of model underlying the cell tilings used in cellular networks, where adjacency relationships between cells are used for the handover of a locatable entity from one cell to another. These models are also used in navigation systems where it is necessary not only to locate the user, but also to find a route for him or her to some destination. Adjacency is but one particular case of relationship. A complementary hierarchical model loosely underlies most of the semantic models used in directories, but is also an implicit model for the space within a building, as decomposed into floors, rooms, cabinets, etc. A much more abstract view of location corresponds to the case where the vertices of the graph are locants, and loci may correspond to various subsets of this graph (e.g. paths, walks, cycles, or arbitrary subgraphs). This may be used in conjunction with a metric model (yielding a weighted graph) with a location technology that is purely relative and bilateral between objects themselves, rather than related to more or less fixed loci. General graphs may be used to model all kinds of bilateral or multilateral relationships between their nodes, besides relative location as put forward here. Of course, all classical network models are based on graphs modelling their connectivity relationships, and this is not what we are attempting to reinvent. Other relationships may be described for which location may still be used as a metaphor, by extracting topological properties from the graph itself. Some semantic relationships may, for example, be described in a structural way, enabling inverse location queries similar to those that may be made in a physical location model.
4.1.4. Semantic layer
In these models, spatial location may be defined implicitly rather than explicitly, by reference to more or less abstract concepts relevant to a given universe of discourse, i.e. a semantic frame of reference. Loci may correspond to such divisions as streets, precincts, municipalities, regions and states, as used in regular directories. At a lower level and a smaller scale, buildings, floors, rooms, or even shelves in a cupboard, cells on a given shelf, etc. could be used as loci providing a spatial reference for all kinds of locants, which will themselves be defined by some supposedly well-known characterisation rather than by their physical properties. These symbolic locus descriptions will themselves be mapped to one of the lower-level models described before, i.e. a hierarchical graph model, a topological model or a metric model, and this correspondence has to be accounted for by the architecture. These characterisations may be compounded with other non-univocal high-level properties associated with a particular locus. These may correspond to a typing or profiling of a particular locus (e.g. authorisations, security constraints, etc.).
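As an illustration of the structural layer (section 4.1.3), the sketch below models loci as vertices of an undirected adjacency graph and finds a route between two loci with a breadth-first search, as a navigation system might. The class `LocusGraph` and the room names are hypothetical, and breadth-first search is merely one possible choice of routing technique, not one named by the chapter.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: loci as vertices, adjacency (e.g. a shared door) as undirected edges. */
final class LocusGraph {
    private final Map<String, Set<String>> adjacency = new LinkedHashMap<String, Set<String>>();

    /** Declares that two loci are adjacent. */
    void connect(String a, String b) {
        neighboursForUpdate(a).add(b);
        neighboursForUpdate(b).add(a);
    }

    private Set<String> neighboursForUpdate(String locus) {
        Set<String> set = adjacency.get(locus);
        if (set == null) {
            set = new LinkedHashSet<String>();
            adjacency.put(locus, set);
        }
        return set;
    }

    /** Loci adjacent to the given one (empty if the locus is unknown). */
    Set<String> neighbours(String locus) {
        Set<String> set = adjacency.get(locus);
        return set == null ? Collections.<String>emptySet() : Collections.unmodifiableSet(set);
    }

    /** Shortest route in number of hops, or an empty list if none exists. */
    List<String> route(String from, String to) {
        if (!adjacency.containsKey(from) || !adjacency.containsKey(to)) {
            return Collections.emptyList();
        }
        Map<String, String> parent = new HashMap<String, String>();
        Deque<String> queue = new ArrayDeque<String>();
        parent.put(from, null); // the start has no predecessor
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(to)) {
                LinkedList<String> path = new LinkedList<String>();
                for (String n = to; n != null; n = parent.get(n)) {
                    path.addFirst(n); // walk back to the start
                }
                return path;
            }
            for (String next : adjacency.get(current)) {
                if (!parent.containsKey(next)) {
                    parent.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

The `neighbours` method corresponds to queries such as "the room next door", while `route` yields the kind of structural path that can be returned as an answer to a location query.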
4.2. Direct location search queries
One of the main goals of the infrastructure described above is to respond to direct location queries. Such a query is of the type "Where can I find something?", where the something is a communicating entity and the response to "where" is a locus. The general path followed to answer the query therefore leads from the entity to the locus.
4.2.1. Targets of the query
In human-understandable terms, the something is likely to be expressed with a semantic definition (e.g. "where is the coffee machine?"). An entity can also be searched for through the relations it may have with other entities that together build a higher-level device (for example, for a streaming server that can be linked to a video client such as a giant flat-screen TV, the query would be "where are the TVs located?"). Finally, an entity can be searched for by its low-level identification (such as an Ethernet address or a unique URI), but this sort of request should be transparent to users and only be accessible for performance purposes (because it bypasses, in some way, the infrastructure model).
4.2.2. Path followed by the query
In the entity column, the request follows a top-down path, being transcoded from layer to layer. It may descend as far as the physical identification before being translated to the locus column, where it follows an upward path, stopping at the level required by the asker.
Figure 7.5 Direct location query
4.2.3. Type of answer
The response can also take many forms, depending on the needs of the request. It could be an absolute position in a specific reference frame (for example, GPS coordinates if you are outdoors). It could be the neighbourhood in which the object is located (for example, the coffee machine is inside room F107) or the structural path to reach it (the coffee machine is on the first floor of the building, section F, between rooms F105 and F109, opposite the corridor, etc.). Finally, it may be a semantic response, especially if the asker is human (the coffee machine is in the cafeteria of section F). Of course, these answers can also be combined to fulfil the requirements of an application (especially if you are using navigation software to guide you to the coffee machine).
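The path of a direct location query can be sketched in a few lines of Java: the semantic name is translated down the entity column to a low-level identifier, the identifier is translated across to a neighbourhood, and the answer then climbs the locus column to the level requested by the asker. The class `DirectLocationResolver`, the enum `AnswerLevel` and the three lookup tables are assumptions of this example; a real infrastructure would back each step with the corresponding layer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: the level at which the asker wants the answer expressed. */
enum AnswerLevel { NEIGHBOURHOOD, SEMANTIC }

/** Hypothetical sketch of a direct query: entity column down, locus column up. */
final class DirectLocationResolver {
    private final Map<String, String> semanticNameToId = new HashMap<String, String>();
    private final Map<String, String> idToNeighbourhood = new HashMap<String, String>();
    private final Map<String, String> neighbourhoodToLabel = new HashMap<String, String>();

    /** Semantic layer, entity column: a human-readable name for a low-level identifier. */
    void describeEntity(String semanticName, String entityId) {
        semanticNameToId.put(semanticName, entityId);
    }

    /** Lower layers: the identified entity has been detected in a neighbourhood. */
    void detect(String entityId, String neighbourhood) {
        idToNeighbourhood.put(entityId, neighbourhood);
    }

    /** Semantic layer, locus column: a human-readable label for a neighbourhood. */
    void labelNeighbourhood(String neighbourhood, String semanticLabel) {
        neighbourhoodToLabel.put(neighbourhood, semanticLabel);
    }

    /** Answers "where is X?" at the requested level, or empty if any step fails. */
    Optional<String> whereIs(String semanticName, AnswerLevel level) {
        String id = semanticNameToId.get(semanticName);        // down the entity column
        if (id == null) {
            return Optional.empty();
        }
        String neighbourhood = idToNeighbourhood.get(id);      // across to the locus column
        if (neighbourhood == null) {
            return Optional.empty();
        }
        if (level == AnswerLevel.NEIGHBOURHOOD) {
            return Optional.of(neighbourhood);                 // stop at the topological level
        }
        return Optional.ofNullable(neighbourhoodToLabel.get(neighbourhood)); // up to semantics
    }
}
```

With the coffee machine example of the text, the same query returns "F107" at the neighbourhood level and "cafeteria of section F" at the semantic level, provided those mappings have been registered.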
4.3. Inverse location search queries
As explained in the requirements, another useful type of query is the inverse location search, which answers questions like "what entities can I find there?", there being a locus. It can be seen as the opposite path, from locus to entities.
4.3.1. Targets of the query
The query may be addressed nearly anywhere in the locus column. At the semantic level, the most understandable for human beings, a typical question is "who is in the conference room?" (people being considered as communicating entities through their PDA or mobile phone). In this case, the place is semantically known to be a conference room, and semantic properties are attached to the entities searched for (humans, not coffee machines). The query can also target the structural relations between loci (answering queries of the type "what device can I use in the room next door?" or "what can I find inside this entire building?"). For proximate selection, many queries are of the type "what services can I use near me?". "Near me" can be understood as in the same set of space, in my neighbourhood, or near my absolute position. Neighbourhoods are also targeted for non-proximate selections (asking about entities in neighbourhoods or sets in which you are not located). Absolute positioning can also be used if you have no idea of the semantic identity or the set of space corresponding to the position you are asking about (this is particularly useful outdoors). Finally, although this should be avoided, the query can target the sensors directly (asking about the relative position or identification of located entities).
4.3.2. Path followed by the query
The path of the request is nearly symmetric to that of a direct location query: it goes down the locus column and up the entity column.
4.3.3. Type of answer
As in direct location queries, the response can be of different types, fulfilling the specific needs of the requesting application. It may be some basic identification of the entities found that match the criteria formulated in the request, or directly some high-level semantic definition of them, depending on the capabilities of the underlying layers.
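An inverse location query can be sketched as a lookup from a locus to the entities currently present in it, optionally filtered by a semantic property such as "person". The class `InverseLocationIndex`, its methods and the type labels are hypothetical names chosen for this illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of an inverse query: from a locus to the entities found there. */
final class InverseLocationIndex {
    // Sorted by entity identifier so that answers come back in a stable order.
    private final Map<String, String> locusOfEntity = new TreeMap<String, String>();
    private final Map<String, String> typeOfEntity = new HashMap<String, String>();

    /** Records (or updates) the locus in which an entity is currently present. */
    void place(String entityId, String semanticType, String locus) {
        typeOfEntity.put(entityId, semanticType);
        locusOfEntity.put(entityId, locus); // replaces any previous locus
    }

    /** Forgets an entity that is no longer detected anywhere. */
    void remove(String entityId) {
        typeOfEntity.remove(entityId);
        locusOfEntity.remove(entityId);
    }

    /** "What can I find there?"; pass null as the type to accept every entity. */
    List<String> whatIsIn(String locus, String semanticTypeOrNull) {
        List<String> found = new ArrayList<String>();
        for (Map.Entry<String, String> entry : locusOfEntity.entrySet()) {
            if (!entry.getValue().equals(locus)) {
                continue; // entity is in another locus
            }
            String type = typeOfEntity.get(entry.getKey());
            if (semanticTypeOrNull == null || semanticTypeOrNull.equals(type)) {
                found.add(entry.getKey());
            }
        }
        return found;
    }
}
```

A query such as "who is in the conference room?" then becomes `whatIsIn("conference room", "person")`, whereas passing `null` as the type returns basic identifiers for everything detected in that locus.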
Figure 7.6 Inverse location query
5. A Jini-based discrete location architecture
As a simple illustration of these ideas, we present previous work on the first Jini-based implementation of a location architecture. This project was intended to demonstrate small-scale location-based services within a room. As a user enters or leaves the room, or gets close to devices within the room, the environment reacts to his/her actions. The smart devices spontaneously propose their services to the user according to his/her position.
5.1. Goals of the project
The main idea was to use Jini to demonstrate the concept of inverse location search and to decouple the physical world of detection technologies (very low-level information) from the location data used at a higher level by the infrastructure, so that the data used by applications would be independent of the hardware used. In this first exploratory work, the location model used was a set-theoretic discrete location scheme using neighbourhoods (the world being divided into sets and neighbourhoods within which you are either located or not). Locatable entities were determined as belonging to these regions of space, and the mobile entity in the world was the user with a PDA. When the PDA (and so the user) enters a set of space, all services (linked to smart devices) available by proximity are shown to him/her via a graphical user interface (which also indicates the direct positioning of the PDA in the world).
5.2. Location technologies used
The detection technology used is mainly RFID, because it can output an accurate identification and, if adequately placed in space, offers reliable information on discrete location (for example, this locant enters this room). To illustrate this, the PDA carried by the user is bound to an RFID tag, and there are RFID tags on the phycons (such as the books in our library). Another discrete location technology has also been used: a sensitive pad that detects the presence of a locant by pressure.
5.3. Software infrastructure
This project has been implemented on top of the Java platform, for its portability and cross-platform advantages, with RMI for remote procedure calls through the network. For the service discovery infrastructure, we chose Jini because it was the most open and widely used middleware available at the beginning of the project. Staying in the Java world, we used JMF for multimedia streaming. In keeping with the use of Jini, we used JavaSpaces as a kind of concurrent-access database, offered as a standard service within the middleware.
5.4. Interaction demonstrations
The demonstration is for now constrained to a single room, and based on a single user equipped with a PDA and its wireless connection. Services are dynamically offered to the user depending on his/her location. When the user enters the room, a window automatically appears on the PDA screen, showing the direct location of the user (the room, the floor and the building) and a list of the services available in the room. For example, there is a robotic arm in our room that can be remotely controlled by the user with the PDA. Another example of using location as an interface: when the user gets close to the bookshelf, an inventory of it automatically appears on the PDA screen. Likewise, when the user takes a book from the bookshelf (in fact a locant, a phycon whose role is to represent the path to some multimedia content, for example a movie trailer or a song), relevant services are spontaneously offered. If it is a movie, and there is therefore video content, the flat-screen TV exports a control interface to the PDA screen and gets ready to receive video streaming. The Dolby Pro Logic speaker system does the same for audio. And, if possible, a multimedia server that can access the content is also offered. We can therefore consider that a higher-level entity, a distributed multimedia interface, is spontaneously created from the relations between some of the locants. But the issue of interfaces distributed through space, time and modality is another topic. Another extension of our system, not yet implemented, is to permit follow-me applications (for example, sound that follows the user without interruption if the rooms have some kind of smart audio devices), but this will not be possible until we have established the structural layer that links the loci.
5.5. Implementation of the location architecture
Our architecture is a simplified adaptation of the general model discussed before, mainly for study purposes. In this first work, the layers and blocks communicate through three main mechanisms. The first is the discovery and advertising facilities of Jini, together with remote or direct method calls (we have developed a shared set of basic interfaces, in the Java sense, so as to have a little a priori knowledge about how to communicate with the main blocks). The second is the event mechanism, used to trigger actions quickly (the Java events have been extended for this specific purpose). The last mechanism is shared access to some values and properties through a common space (a sort of basic database, usable specifically within Jini, known as JavaSpaces). Of course, these three ways of communicating are combined depending on the case (for example, the JavaSpace might throw an event if some specific value is changed, and that can trigger a remote method call).
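The combination of a shared space and events described in section 5.5 can be sketched in plain Java as follows. This is a deliberately simplified, in-process stand-in: `SharedSpace` and `SpaceListener` are hypothetical names and do not reproduce the JavaSpaces or Jini APIs; they only show a value change in a common space triggering a notification, which a listener could in turn translate into a (remote) method call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical callback, notified when a value in the shared space changes. */
interface SpaceListener {
    void valueChanged(String key, String oldValue, String newValue);
}

/** Hypothetical in-process stand-in for a shared space (not the JavaSpaces API). */
final class SharedSpace {
    private final Map<String, String> values = new HashMap<String, String>();
    private final List<SpaceListener> listeners = new ArrayList<SpaceListener>();

    synchronized void addListener(SpaceListener listener) {
        listeners.add(listener);
    }

    synchronized String read(String key) {
        return values.get(key);
    }

    /** Writes a value and notifies listeners only if the value actually changed. */
    void write(String key, String value) {
        String old;
        List<SpaceListener> snapshot;
        synchronized (this) {
            old = values.put(key, value);
            snapshot = new ArrayList<SpaceListener>(listeners);
        }
        if (Objects.equals(old, value)) {
            return; // same value written again: no event
        }
        for (SpaceListener listener : snapshot) {
            listener.valueChanged(key, old, value); // called outside the lock
        }
    }
}
```

Listeners are called outside the lock so that a slow or re-entrant listener cannot block other writers; this is a design choice of the sketch, not something stated in the chapter.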
Figure 7.7 An example of implementation of the general architecture
5.6. Description of this particular architecture
At the lower physical level, sensors detect locants in space. They relay their data to the concentration service, which centralises the data, filters them, and relays them to the identification service and the neighbourhoods concerned. The data relayed is, in this case, basic relative location information: the presence of an entity, grouped into a particular neighbourhood. The Identification Service links the low-level identifications provided by the detection services to higher levels (it identifies the locants, which themselves offer a service recognised by the Jini infrastructure). It is also designed to manage user profiles, knowing whether a client is allowed to use a particular service. Each Neighbourhood Service manages a physical zone and registers clients within it (clients of the neighbourhoods are usually the services associated with the locants). These services also relay (if asked) information about clients entering or leaving the controlled area.
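The flow from sensors to neighbourhoods can be sketched as follows. This is a toy, single-process approximation under stated assumptions: the class names echo the services of the text, but their methods are invented for the example, the identification service and user profiles are left out, and no Jini mechanism is involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: manages one physical zone and the clients registered in it. */
final class NeighbourhoodService {
    private final String zone;
    private final Map<String, String> clients = new LinkedHashMap<String, String>(); // locant id -> service

    NeighbourhoodService(String zone) {
        this.zone = zone;
    }

    String zone() { return zone; }
    void register(String locantId, String serviceName) { clients.put(locantId, serviceName); }
    void unregister(String locantId) { clients.remove(locantId); }

    /** What a kiosk could display for this zone. */
    List<String> availableServices() { return new ArrayList<String>(clients.values()); }
}

/** Hypothetical sketch: centralises detections, filters repeats, relays changes. */
final class ConcentrationService {
    private final Map<String, NeighbourhoodService> zoneOfSensor = new HashMap<String, NeighbourhoodService>();
    private final Map<String, NeighbourhoodService> currentZone = new HashMap<String, NeighbourhoodService>();

    /** Declares which neighbourhood a sensor covers. */
    void attachSensor(String sensorId, NeighbourhoodService zone) {
        zoneOfSensor.put(sensorId, zone);
    }

    /** Returns true if the detection moved the locant, false if it was filtered out. */
    boolean report(String sensorId, String locantId, String serviceName) {
        NeighbourhoodService detectedIn = zoneOfSensor.get(sensorId);
        if (detectedIn == null) {
            return false; // unknown sensor: ignore the reading
        }
        NeighbourhoodService previous = currentZone.get(locantId);
        if (previous == detectedIn) {
            return false; // repeated detection in the same zone: filtered
        }
        if (previous != null) {
            previous.unregister(locantId); // the locant leaves its former zone
        }
        detectedIn.register(locantId, serviceName); // and enters the new one
        currentZone.put(locantId, detectedIn);
        return true;
    }
}
```

In this sketch a kiosk would simply call `availableServices()` on the neighbourhood in which the user's PDA is registered.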
The Location Registration Service is the bridge between clients and the location architecture, because a client does not know a priori in which neighbourhood it will be registered. The Kiosk Service shows the different services available, using the clients registered with the neighbourhoods.
5.7. Limitations of the current implementation
This preliminary work, although it works quite correctly, lacks some very important concepts about location. In the first place, it is relevant only for a single and very specific understanding of space. At the lower level, we do not integrate any technology for continuous localisation, nor any more accurate relative positioning sensor. With only punctual and discrete information, we can only use set-theoretic and topological models. The structural layer has not been taken into account up to now; as a consequence, relationships between loci cannot be used, and data do not persist when the user moves and changes neighbourhood. Calling the top-level interfaces a semantic layer is for now an abuse of language, because they only really make sense in a programmatic way in a Jini world; we cannot base any request on a semantic definition of loci or locants (nor on semantic properties). Finally, the really big hole in our implementation is that it cannot really interoperate with heterogeneous applications or services, because it does not support any declarative interface for queries.
6. Conclusion: integration of location-management with service discovery infrastructures
Service discovery infrastructures such as SLP[15], Jini™[16], UPnP™[17], Salutation[18] and UDDI[19] attempt to move up from purely network-based addressing, to account for higher-level descriptions of networkable entities. They provide a bootstrap mechanism that makes possible the dynamic, spontaneous hookup of services and devices in ubicomp environments. They use, in a centralised or distributed fashion, a generalised lookup service that may build upon and subsume the more specialised naming, trading or directory services provided by underlying middleware and protocols. As such, direct, inverse and combined location queries should all be directly supported by these infrastructures. We are still far from this objective, if only for the lack of widely adopted representation standards for location information and associated data, at all the different levels described before. Programmatic interfaces for services and their attributes, such as those used by Jini™, are both highly expressive and close to implementation (which is why they were chosen in the implementation described above), but probably set too high a bar for interoperability. Declarative XML-based interfaces are a more pragmatic solution for minimal interoperability and could make it possible to interoperate at all levels of the architecture described above, provided standard DTDs (or, preferably, schemas) become widely adopted for each of these. There is already a substantial amount of work on this, which unfortunately has led to the definition of different and incompatible DTDs (or, worse, binary formats) by all the different consortia and standardisation bodies interested in this topic, from geographers to mobile telecom operators, to specialists of 3D graphics[2]. For the lower layers, where relevant ontologies are fairly straightforward and could be agreed on by all parties, this multiplication of overlapping and competing would-be standards could have been avoided. As usual, a shakeout will let one of these emerge, but this may take some time. For the structural and semantic layers, the nature of the concepts to be manipulated is much more open and an a priori definition more difficult. Here, as in the general issue of semantic service discovery [20], a meta-level specification of location ontologies should be possible: we could have, for example, different models for intra-building location based on different kinds of architectural entities, or culturally-dependent street/precinct hierarchy models for describing location in an urban area.
7. References
[1] www.extra.research.philips.com/euprojects/ambience/
[2] Mari Korkea-aho, Haitao Tang: "Experiences of Expressing Location Information for Applications in the Internet", Proceedings of the Workshop on Location Modeling for Ubiquitous Computing, Atlanta, Georgia, September 2001.
[3] J.C. Spohrer: "Information in Places", IBM Systems Journal, vol 38 n° 4.
[4] www.worldboard.org
[5] G. W. Fitzmaurice: "Situated information spaces and spatially aware palmtop computers", CACM, vol 36 n° 7.
[6] www.mapplanet.com
[7] www.confluence.org
[8] J. Hightower, G. Borriello: "Location Systems for Ubiquitous Computing", Computer, August 2001.
[9] Svetlana Domnitcheva: "Location Modeling: State of the Art and Challenges", Proceedings of the Workshop on Location Modeling for Ubiquitous Computing, Atlanta, Georgia, September 2001.
[10] Martin Bauer, Christian Becker, Kurt Rothermel: "Location Models from the perspective of Context-Aware Applications and Mobile Ad Hoc Networks", Proceedings of the Workshop on Location Modeling for Ubiquitous Computing, Atlanta, Georgia, September 2001.
[11] Thomas O'Connell, Peter Jensen, Anind Dey, Gregory Abowd: "Location in the Aware Home", Proceedings of the Workshop on Location Modeling for Ubiquitous Computing, Atlanta, Georgia, September 2001.
[12] Barry Brumitt, Steven Shafer: "Topological World Modeling Using Semantic Spaces", Proceedings of the Workshop on Location Modeling for Ubiquitous Computing, Atlanta, Georgia, September 2001.
[13] Natalia Marmasse, Chris Schmandt: "Location Modeling", Proceedings of the Workshop on Location Modeling for Ubiquitous Computing, Atlanta, Georgia, September 2001.
[14] www.autoidcenter.org
[15] www.srvloc.org
[16] www.sun.com/jini
[17] www.upnp.org
[18] www.salutation.org
[19] www.uddi.org
[20] R. McGrath, M. Mickunas, R. H. Campbell: "Semantic discovery for Ubiquitous Computing".
Chapter 8
A Software Infrastructure for Distributed Applications on Mobile Physical Objects
Mohammed Ada-Hanifi, Serge Martin and Vincent Olive
France Telecom R&D, France
1. Introduction
In this paper we present some ideas related to work started in the POPCORN project (MOA: DS) and pursued in the Cybermonde/Pervasif project (sub-project 4). The earlier results were exploited within the cocooning DIN project, in order to develop a generic remote command. In this context we deal with distributed services in a house. The technologies presented are equally able to cover a range from small things (a person or a sensor) to large things (national services). This paper is divided into three parts. In the first part, we explain the technology devoted to installing and updating the software services involved in the home automation field, through a framework specified by the OSGi consortium. The intended and unintended advantages and limitations of this technology are also pointed out. In the second part, we describe how this technology is extended, by implementing the Jonathan middleware, to allow suitable distributed applications to be used, and we underline the properties of this extension. Finally, in the third part, we present, via an example, the first practical results obtained for the generic remote command.
2. The OSGi infrastructure
Over the last decade, the concept of the global network has increased in significance with the widespread deployment of the Internet. It is no longer the monopoly of big companies, but is also deployed at the residential level. At the same time, major evolutions have occurred in equipment and terminals, which use processors and electronic chips that have improved in terms of both execution speed and memory size. Since these devices have become smarter, and are able to communicate and interact with each other, new prospects have arisen concerning interoperability between remote software/hardware environments. Thanks to these network-connected devices, it has become possible to read e-mail on the TV or on a cellular phone, or to control the state of home equipment.
Figure 8.1 Interconnection of a home network to the internet network through OSGi gateway
2.1. The OSGi framework
An industrial consortium of telecom operators, telecom equipment manufacturers and home equipment manufacturers was created in 1999 in order to define and promote an open standard (framework) allowing the development and deployment of e-services in the home, accessible and remotely controlled over the Internet. This standard, whose APIs are specified in the Java language, is named OSGi (Open Services Gateway initiative). Presently, there are more than five gateway products on the market; some of them are less than 300 Kbytes in size and are competitive candidates for embedded systems.
2.2. History and evolution
In the 1.0 release of the OSGi specification, dated May 2000, the emphasis was put on the definition of basic APIs, intended for implementation in the framework and available to service developers. These APIs were written in Java. This language was chosen for its portability, its code-downloading capability and its security properties. In the 2.0 release of the OSGi APIs, dated October 2000, security was strengthened and a special management bundle was added in order to fully define and control the security aspects of a bundle in real time. The minimal infrastructure of the gateway comprises:
• The Java environment: the set of packages and classes required for the framework to work.
• The Framework, which defines APIs that load, create and launch the services.
• The Log service, which collects information concerning the framework during execution.
• The Http service, which defines the APIs that allow the launch and use of an HTTP server.
This minimal infrastructure can be enhanced with more services: both infrastructure services (such as Jini, CORBA, UPnP, HAVi, etc.) and application services (Figure 8.2).
Figure 8.2 OSGi framework architecture
2.3. Functional description and properties
The OSGi framework architecture relies on some concepts that we describe below. An application installed in the framework is assumed to offer one or more services. These are encapsulated in a "bundle", that is, a .jar file that packages together the service and other files such as pictures, sounds, movies and all the resources the service(s) need in order to work. The bundle must also contain a manifest file, which describes the name and the contents of the bundle, and an activator, which manages the "starting" and "stopping" states of a bundle and/or a service.
Once a bundle is in the "installed" state, the framework is able to manage the services in the bundle. These are registered with their properties within the framework, and bundle states are dynamically managed (lifecycle management). The framework is informed about service dependencies and sends an event notification after each state change. Service launching is performed by a class that encapsulates the service and manages the starting and stopping states through two methods, start() and stop(), declared by the framework's BundleActivator interface and implemented in the bundle's activator.
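The role of the activator can be illustrated with the self-contained sketch below. To keep it runnable without the OSGi libraries, it uses two stand-in interfaces, `MiniBundleContext` and `MiniBundleActivator`, which only mimic the general shape of the framework's activator and context; they are assumptions of this example and not the real `org.osgi.framework` types (whose methods, for instance, may throw exceptions and accept service properties). `MiniFramework`, `LightService` and `LightActivator` are likewise hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-ins only: these interfaces mimic the shape of the OSGi activator and
// context, but they are NOT the real org.osgi.framework types.
interface MiniBundleContext {
    void registerService(String interfaceName, Object service);
    void unregisterService(String interfaceName);
}

interface MiniBundleActivator {
    void start(MiniBundleContext context);
    void stop(MiniBundleContext context);
}

/** Toy framework: a local service registry plus the start/stop lifecycle calls. */
final class MiniFramework implements MiniBundleContext {
    private final Map<String, Object> registry = new HashMap<String, Object>();

    public void registerService(String interfaceName, Object service) {
        registry.put(interfaceName, service);
    }

    public void unregisterService(String interfaceName) {
        registry.remove(interfaceName);
    }

    /** Local lookup only: a service not registered in this framework is not found. */
    Object getService(String interfaceName) {
        return registry.get(interfaceName);
    }

    void startBundle(MiniBundleActivator activator) { activator.start(this); }
    void stopBundle(MiniBundleActivator activator) { activator.stop(this); }
}

/** Hypothetical application service offered by the bundle. */
interface LightService {
    String switchOn(String room);
}

/** The activator registers the service on start and withdraws it on stop. */
final class LightActivator implements MiniBundleActivator {
    public void start(MiniBundleContext context) {
        context.registerService(LightService.class.getName(), new LightService() {
            public String switchOn(String room) {
                return "light on in " + room;
            }
        });
    }

    public void stop(MiniBundleContext context) {
        context.unregisterService(LightService.class.getName());
    }
}
```

In a real bundle, the activator class would be named in the bundle's manifest so that the framework can instantiate it and call it when the bundle is started or stopped.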
Figure 8.3 The lifecycle of a bundle
3. A middleware solution based on Jonathan
The OSGi architecture presented in the previous section offers infrastructure services that are accessible only within the execution space of one Java virtual machine. For example, a "trader service" that helps us to retrieve a service searches for it only in the local space of the framework. It is thus impossible to retrieve a service that has not already been loaded and registered locally within the framework. This working hypothesis is justified when all the locally installed services are considered as an autonomous collection, meaning that the set of imported services is self-consistent. Bearing this hypothesis in mind, what happens if we try to establish communication between several OSGi frameworks, or simply if the framework needs to handle communication functions other than loading? Formulated in terms of object-language concepts, we need to express the import and export of services (represented by their interfaces) between different OSGi frameworks (and therefore between different JVMs). A middleware based on an ORB such as CORBA or RMI allows us to extend the notion of a local OSGi service to that of a distributed OSGi service. In fact, we want to extend this approach and provide each OSGi framework with an ORB (Object Request Broker) that allows it to communicate with another framework. We define two steps for this approach:
1. implementation of the infrastructure services needed to connect distributed applications;
2. definition of a typical architecture for distributed applications.
3.1. Implementation of an ORB service
If we decompose ORB services into elementary services, we have two sets:
• A set of offered services:
  - Synchronous or asynchronous remote method calls with parameters passed by value and/or reference.
  - Management of object references.
  - Management of naming spaces which provide a translation between a name and its object reference.
  - Services related to trading, events, and memorisation are useful, but are considered to be additional services and are not strictly necessary.
• A set of internal services, which are not exported since they are not usually part of an ORB but are rather part of an operating system (memory management, session management, multithreading management, protocol stack, etc.).
Jonathan (http://www.objectweb.org) is open source and provides the possibility to share services.
The first experiment packaged the CORBA personality of Jonathan and its naming service as OSGi services. As a result, two new OSGi services are offered by the framework. They enable an application service to call another application service operating in a remote OSGi framework.
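The idea of calling a service that lives in another framework through a local stand-in object can be sketched with the JDK's dynamic proxies. This is not how Jonathan or CORBA is implemented: the "remote" framework is simulated inside the same JVM, and `ExportingFramework`, `StubFactory`, `TvControl` and `FlatScreenTv` are hypothetical names. Only `java.lang.reflect.Proxy`, `InvocationHandler` and `Method` are real JDK APIs.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical service interface shared by both frameworks. */
interface TvControl {
    String play(String title);
}

/** Hypothetical implementation living in the exporting framework. */
final class FlatScreenTv implements TvControl {
    public String play(String title) {
        return "playing " + title;
    }
}

/** Plays the role of the exporting side (skeleton): dispatches incoming calls. */
final class ExportingFramework {
    private final Map<String, Object> exported = new HashMap<String, Object>();

    void export(String name, Object service) {
        exported.put(name, service);
    }

    Object dispatch(String name, Method method, Object[] args) throws Exception {
        Object target = exported.get(name);
        if (target == null) {
            throw new IllegalStateException("no exported service named " + name);
        }
        // A real ORB would unmarshal a request received from the network here.
        return method.invoke(target, args);
    }
}

/** Plays the role of the importing side: builds a stub that forwards every call. */
final class StubFactory {
    private StubFactory() { }

    @SuppressWarnings("unchecked")
    static <T> T importService(final ExportingFramework remote, final String name, Class<T> type) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                // A real ORB would marshal the call and send it over the network here.
                return remote.dispatch(name, method, args);
            }
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] { type }, handler);
    }
}
```

The caller only sees the `TvControl` interface; whether the object behind it is local or reached through a stub is hidden, which is the property needed to turn a local service notion into a distributed one.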
Figure 8.4 Service A and service B call an ORB to manage linking objects (proxy, relay, stub, skeleton)
Some remarks can be made concerning the introduction of an additional framework for managing the distribution of services:
• Trading services are duplicated: one for the external services and another for local services. They provide identical functions (searching for services by their names and/or their properties), yet their programming interfaces are different.
• Do we need one naming service per framework for all non-local services, or only a simple relay (proxy) pointing to an external naming service?
• Could we link the naming service and its relays in order to propagate requests from one server to another until the requested services are found?
In fact, all these combinations are suitable. They correspond to different optimisations, each one taking into account physical constraints such as computing power, transmission speed and throughput, and memory size.
• A relay pointing to an external naming service can be exemplified by a mobile architecture with enough memory, which obtains the address of the remote naming service when it connects. This service is then connected, either by linking or directly, to the other services available in the network.
• A local naming service can be illustrated by a PDA that registers its own exported services. During a connection it makes its internal services accessible by exporting its own naming service.
• Conversely, linking external naming services to the PDA in import mode gives it access to all the available external services.
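The three configurations above (pure relay, local naming service, linked naming services) can be sketched with a single hypothetical class in which each naming service may forward unresolved requests to a parent. This is only an illustrative Python model: the class name, the `parent` link and the reference strings are assumptions, and the links are one-directional for simplicity.

```python
class NamingService:
    """Hypothetical naming service that can forward unresolved names to a parent."""

    def __init__(self, parent=None):
        self._bindings = {}
        self._parent = parent

    def bind(self, name, reference):
        self._bindings[name] = reference

    def resolve(self, name):
        if name in self._bindings:
            return self._bindings[name]
        if self._parent is not None:
            # Propagate the request from one server to the next.
            return self._parent.resolve(name)
        return None


# External naming service, e.g. hosted by the home gateway.
home = NamingService()
home.bind("video-recorder", "ref:gateway/video-recorder")

# A pure relay: no local bindings, everything is forwarded.
relay = NamingService(parent=home)

# A local naming service linked to the external one (import mode).
pda = NamingService(parent=home)
pda.bind("remote-command", "ref:pda/remote-command")
```

A relay costs almost no memory but needs the external service for every request, whereas the linked local service answers for its own bindings and forwards only the rest.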
3.2. Definition of an architecture for a distributed application
We explored different possibilities for organising our naming service; for the sake of homogeneity, the other application services are organised following the same exploration. The resulting distributed
application is organised on the basis of several services running in different frameworks. The criteria for distributing the services are outside the scope of this paper; they certainly take into account the optimisation of available resources and notions such as safety and security. This paper deals only with the technical solutions that make the distribution possible.
When we make a distributed application available, we start by searching for its component services. This search can be extremely complicated, depending on the number and the function of the services involved. Expressing the composition of services and handling requests on compound services is in any case a complete research field. Another service thus arises, the loader, since not all the services are necessarily reachable locally or remotely. This property leads to a dynamic configuration of the services: the configuration can be adapted according to criteria such as physical location, availability of the services, load sharing, autonomy, fault tolerance, etc.
Finally the distributed application is in place: some services are installed locally and others remotely, and the application can be activated. The architecture briefly described here will be applied and integrated in the implementation of a home gateway on a PDA within the DIN project, i.e. Cocooning. The architecture design principles developed here will be applied to the communication between an OSGi home gateway, installed on a PC, and a PDA from which all the home gateway services can be reached dynamically through their interfaces. The PDA can then drive the different multimedia services of the "familibrary project", such as the video recorder, picture viewer, photo library, hi-fi device, TV, etc. Starting from this context, the next section presents an application example that is simplified for pedagogical reasons.
4. Example of application: the generic remote command
A remote command service applies, on the one hand, the principles described previously concerning the deployment of services in an OSGi framework and, on the other hand, the services of the Corba middleware installed in an OSGi framework. The generic remote command is an application that allows equipment installed in a home local network to be controlled through a home gateway (OSGi framework). The remote command service is installed on a PDA and implements the command interface of a service present on a remote gateway. Because the application service is open, the remote command function itself remains open; this is what is meant by generic. The configuration of the remote command (the number and names of the commands) can be programmed according to, and by, the remotely controlled service.
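The notion of a remote command that is configured by the service it controls can be sketched as follows. This is a hedged Python illustration, not the chapter's implementation: the method names `describe_commands`, `execute`, `attach` and `press`, and the command list itself, are hypothetical.

```python
class VideoRecorderControl:
    """Illustrative controlled service that publishes its own command set."""

    def __init__(self):
        self.log = []

    def describe_commands(self):
        # Number and names of the commands are decided by the controlled service.
        return ["play", "stop", "record"]

    def execute(self, command):
        if command not in self.describe_commands():
            raise ValueError(f"unknown command: {command}")
        self.log.append(command)
        return f"video recorder: {command}"


class GenericRemoteCommand:
    """Remote command with no built-in buttons; it is programmed by its target."""

    def __init__(self):
        self._target = None
        self.buttons = []

    def attach(self, service):
        self._target = service
        self.buttons = list(service.describe_commands())

    def press(self, button):
        if button not in self.buttons:
            raise KeyError(button)
        return self._target.execute(button)


recorder = VideoRecorderControl()
remote = GenericRemoteCommand()
remote.attach(recorder)
result = remote.press("play")
```

Attaching the same `GenericRemoteCommand` to a different service would yield a different set of buttons without any change to the remote command itself.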
In the example, the remote command application is itself a service of an OSGi framework installed on a PDA. Through the middleware, it controls a video recorder present in a home network. This video recorder is connected to a HAVi network, which is managed by another OSGi service installed on a home desktop computer (PC). This is a typical example of communication between two OSGi gateways. We describe below the communication steps involved between the remote command and a home HAVi service (video recorder management).
4.1. Service presentation
The HAVi service (video recorder management) is installed as two bundles on an OSGi home gateway:
• the video recorder control service;
• the remote command relay service.
Two other bundles are installed on the PDA that hosts the remote command:
• the remote command with a graphical interface;
• the video recorder control relay service.
4.2. Description of the communication mechanism
We assume that this application works in a "Plug and Play" scenario. The video recorder control is the service that appears in the home network (because it has just been powered on, has just been installed in the framework, etc.). It initiates the exchanges necessary to find a command interface service. The sequence diagram (Figure 8.5) presents the synchronous remote method calls between, on the one hand, the application services (remote command and video recorder) and, on the other hand, the infrastructure services (trading and naming services). The four steps of the scenario are:
• Initialisation: each application service registers itself with the different infrastructure services; the address of the naming service is assumed to be known beforehand.
• Search by the video recorder for its complementary service, the remote command.
• The active steps of the remote command.
• Release of the remote command by the video recorder.
This scenario assumes that the remote command is an available service. If that is not the case, the video recorder service can poll regularly for its presence or, better, register with the event service so as to be informed when the remote command service appears. Finally, all the services are assumed to be present (but not necessarily running) on their respective frameworks. For reasons of space, updating or flexibility, these services could instead be loaded on demand. Starting a service then involves registration with a naming service, then with a trading service and finally with a loading service.
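Registering with an event service instead of polling can be sketched as below. The `EventService` class and its `subscribe` and `announce` methods are hypothetical names chosen for this Python illustration; they are not taken from the Corba event service or from OSGi.

```python
class EventService:
    """Hypothetical event service that notifies subscribers when a service appears."""

    def __init__(self):
        self._present = set()
        self._listeners = {}

    def subscribe(self, service_name, callback):
        if service_name in self._present:
            # The service is already there: notify immediately.
            callback(service_name)
        else:
            self._listeners.setdefault(service_name, []).append(callback)

    def announce(self, service_name):
        self._present.add(service_name)
        # Wake up everyone who was waiting for this service.
        for callback in self._listeners.pop(service_name, []):
            callback(service_name)


events = EventService()
seen_by_recorder = []

# The video recorder asks to be told when the remote command appears.
events.subscribe("remote-command", seen_by_recorder.append)
before = list(seen_by_recorder)

# Later, the remote command service starts and announces itself.
events.announce("remote-command")
after = list(seen_by_recorder)
```

Compared with polling, the video recorder sends one request and receives one notification, which matters on devices with limited energy and bandwidth.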
Figure 8.5 Sequence diagram between the video recorder (device) and the remote command. The diagram is derived from a UML specification
5. Conclusion
Starting from a framework that locally manages a set of services, we have demonstrated the feasibility of extending it with distributed software infrastructure services. This new distributed framework allows us to build distributed applications with greater static and dynamic architectural flexibility.
Given the low resource usage of our infrastructure, we think that this approach is an alternative to those currently proposed for mobile devices, provided that we guarantee other properties such as a high level of security in the framework, large-scale integration and portability. Portability and large-scale integration will be studied in Project Vision: Pervasif. The security aspect can draw on the work done in other projects (e.g. SPDA, THINK, etc.).
Chapter 9
Integrating a Multimedia Player in a Network of Communicating Objects
Jacques Lemordant
Gravir-Inria, France
1. Introduction
Heterogeneous groups of hardware and software components can be federated into a single network. We have developed a framework allowing MPEG-4 players to join a federation using Jini technology. MPEG-4 players are based on a classical client-server architecture. By joining a federation, an MPEG-4 player gains access not only to the predefined services associated with its application server, but also to the dynamic services provided by the federation. In return, the player offers devices in the federation its display capabilities and a connection to its application server. This work was carried out under the IST project SoNG (2000-2002).
2. Motivation and orientation
2.1. Sharing a world
In his article What does it mean to share a World?, Bob Rockwell listed ten technical requirements that will have to be met by any interpersonal/interoperable VRML environment. These included the ability to:
1. Insert/Delete objects (e.g. avatars) in scenes at run-time (more generally: to modify the structure of the scene graph).
2. Merge multiple sound streams from distributed sources into the shared scene's current ambient sound (e.g. voices over music).
3. Track and communicate the state/behaviour of objects in real time (this implies a database of "who needs to know what how often?").
4. Allow (sets of) objects to be "driven" by users in real time (i.e. provide a UI as well as an API to runtime object control).
5. Let imported objects become persistent (i.e. make them a permanent part of the scene).
6. Protect the scene from damage by imported objects (ultimately, this implies the whole range of data-integrity issues).
7. Assign objects to a series of different "owners" (to ensure control over access to object behaviour).
8. Support persistent roles (for people) and rules (for scenes) (i.e. a use-model for scene/object access controls).
9. Link objects dynamically to external data/functions (in particular, to support authentication certificates).
10. Support the free exchange of information among objects (from chat and business cards to arbitrary data containers/streams).
The concept of scene sharing includes not just multiple users but also multiple developers. It is the basis for all component-based applications, from the use of electronic cash to database access. One way to implement this concept is to share the scene graph by exporting it as a service in a federation, with code downloaded into the player acting as a client of the services provided by the federation.
2.2. Multimedia players in a federation of devices
In dataflow systems used in scientific visualisation, multimedia players are considered as sinks or terminal elements of the system. Here, we consider them as devices. In our terminology, a device is a software or hardware component that is part of a federation. An MPEG-4 device is a device capable, among other things, of playing mp4 files. We can imagine MPEG-4 devices using many kinds of networking technology (IP, Bluetooth, ...) and needing to speak to various kinds of devices (HAVi devices, ...). The Java 2 world is in a good position to be the glue between them, but multimedia players are often embedded in an internet browser such as IE and consequently have access to only a limited Java world, especially with respect to networking. For example, the SoNG 3D MPEG-4 player is a COM-ActiveX component.
We have developed a bridge, specific to multimedia players, between the COM and Java worlds. This bridge could be reused for other players running outside Internet Explorer, for example on mobile devices without a JVM. To build this bridge, we have used the Jini surrogate architecture and the MPEG-J scene graph API. The original goal of MPEG-J was to provide a parametric way to modify the multimedia content of the MPEG-4 scene, but it is equally important to be able to dynamically find and use external Java objects of all kinds (directories, agents, servers, UIs, ...). A Jini federation brings together many different types of devices with Java technology as the common base. With Jini connection technology, devices such as cell phones, pagers, PDAs and TV set-top boxes can speak a common language. There are three basic parts to the Jini network technology, which make it simple to use. They include:
• Lookup Service: where Jini technology enabled services announce their availability.
• Discovery Protocol: a process to find the required lookup service.
• Proxy Object: an interface to the Jini technology enabled service.
Figure 9.1 A federation of Jini services
These three simple parts are enough to understand how an entire Jini federation operates. Figure 9.1 gives an idea of such a federation. In the SoNG project we have built a theatre demo, described later, which shows how such a federation can work. The federation comprises Javacards, PocketPCs and MPEG-4 3D players, as shown in Figure 9.2. MPEG-4 players are clients of the smart cards, and PDAs are clients of specific display services offered by the player. For mobile users, a local Lookup Service can be found with the help of a general Lookup Service using geographical coordinates.
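How the three parts fit together can be modelled in a few lines. The sketch below is written in Python and deliberately does not use the Jini API: `LookupService`, `discover`, `DisplayProxy`, the group name and the service type string are all hypothetical, and discovery is reduced to a dictionary lookup instead of a network protocol.

```python
class LookupService:
    """Hypothetical lookup service: services register proxies, clients fetch them."""

    def __init__(self):
        self._proxies = {}

    def register(self, service_type, proxy):
        self._proxies.setdefault(service_type, []).append(proxy)

    def lookup(self, service_type):
        return list(self._proxies.get(service_type, []))


def discover(reachable_lookup_services, group):
    """Stand-in for the discovery protocol: find the lookup service of a group."""
    return reachable_lookup_services.get(group)


class DisplayProxy:
    """Proxy object handed to clients; it forwards calls to the player's display."""

    def __init__(self, screen):
        self._screen = screen

    def show(self, text):
        self._screen.append(text)


# The player side: publish a display service in the federation.
screen = []  # stands in for the player's display
theatre_lookup = LookupService()
theatre_lookup.register("display", DisplayProxy(screen))

# The PDA side: discover the lookup service, fetch a proxy, use it.
reachable = {"theatre": theatre_lookup}
found = discover(reachable, "theatre")
proxy = found.lookup("display")[0]
proxy.show("Seat 192")
```

The client only ever handles the proxy object, so the way the proxy reaches the real service can change without affecting the client.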
Figure 9.2 Jini services used in the theatre demo
3. Architecture specification
Our framework is made of two independent parts, as can be seen in the next figure:
• A graphic streaming channel or interconnect made of:
- SKELETONS inside the scene graph manager.
- STUBS implementing a remote MPEG-J.
A bi-directional application-level protocol has been defined so that the SKELETONS and the STUBS can be implemented independently.
• Components to join a Jini network:
- Surrogates. We have written two surrogates. One is used to publish the remote MPEG-J as a service; the other is specific to the theatre demo of the SoNG project and is a client of the Jini services published by the smart card.
- Monitor. Its role is to tell the surrogate host that the MPEG-4 device is still on and to pass to the surrogate host the URL of the surrogate(s) that will represent the MPEG-4 device inside the Jini federation.
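The role of the Monitor described above (telling the surrogate host that the device is still on and which surrogate represents it) can be illustrated with a small keep-alive model. This is a Python sketch under explicit assumptions: the `SurrogateHost` class, the timeout rule and the placeholder URL are invented for illustration and do not reproduce the protocol of the Jini surrogate architecture.

```python
class SurrogateHost:
    """Hypothetical surrogate host that keeps surrogates only for live devices."""

    def __init__(self, timeout):
        self._timeout = timeout
        self._devices = {}  # device id -> (surrogate URL, time of last keep-alive)

    def keep_alive(self, device_id, surrogate_url, now):
        # Called by the device's monitor: "I am still on, and here is my surrogate".
        self._devices[device_id] = (surrogate_url, now)

    def active_surrogates(self, now):
        # A surrogate stays active only while keep-alives arrive within the timeout.
        return sorted(
            url
            for url, last_seen in self._devices.values()
            if now - last_seen <= self._timeout
        )


SURROGATE_URL = "http://example.invalid/mpeg4-surrogate.jar"  # placeholder URL

host = SurrogateHost(timeout=10)
host.keep_alive("player-1", SURROGATE_URL, now=0)
while_on = host.active_surrogates(now=5)
after_silence = host.active_surrogates(now=20)
```

Time is passed in explicitly (`now`) so that the behaviour can be followed without real clocks or threads.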
Figure 9.3 Architecture of the MPEG-4 player showing the modules used for its incorporation in a federation of services
4. An example: the theatre demo
This example shows Jini smart card services being used by the SoNG MPEG-4 player. We thank Laurent Lagosanto from Gemplus Research for helping us to integrate this research product from Gemplus in the SoNG theatre demo. We also thank FT R&D for helping us to write the XML messaging system between the JavaScript user interface and the Java MPEG-J layer above the scene graph API.
Figure 9.4 An e-commerce application: the theatre demo
An electronic ticket is stored in the smart card if a valid PIN code is entered using the synthetic layer2D keyboard. Its XML description is shown below:
03435 42678 ONCE
<promise> Theatre de l'atelier <sName>La directrice et le financier <seat>192 <seat>189 <seat>398
Chapter 10
Reverse Localisation
Joaquin Keller
France Telecom R&D, France
1. Localisation and reverse localisation
1.1. Localisation
The term 'localisation', when applied to a physical communicating object, refers to the function that takes a given object and returns its physical location, i.e. the place where the object is situated. This is the function that underlies geolocation services, i.e. the services that depend on the object's physical location. Localisation of a physical object is indeed useful for services that rely on the material or immaterial resources located near the object (see footnote 1). But localisation is not in itself all-important: knowledge of the resources near the object is usually enough to deliver the service. Moreover, knowing the coordinates of an object does not guarantee the ability to identify the resources located near it and/or the resources related to its location. So, to implement geolocation-related services, localisation is neither necessary nor sufficient.
1.2. Reverse localisation
Reverse localisation is, as its name implies, the reverse of the localisation function: it takes a physical location as its input argument and returns the communicating objects located near, or associated with, this location.
Footnote 1: Not all geolocation services fit this scheme. In particular, it does not apply to services that rely more on the path followed by the object than on its actual location: management of vehicle fleets, or computation of an "optimal" path for a vehicle (car, plane, boat, etc.) or a pedestrian.
Among the "objects" associated with a given location, it is possible to include the databases that register the non-computational resources located at, or related to, this location. Reverse localisation completes and extends the localisation of communicating objects; if both are available, all localisation problems are probably covered. But are all these functions needed for most geolocation services?
1.3. Telelocalisation and local localisation
It is worth noting that the problems of localising an object from nearby and localising it from afar are not the same. For example, a physical communicating object equipped with a GPS receiver or, even better, a motionless physical communicating object is able to know precisely where it is located. Nevertheless, it is not certain that this object can be located remotely. Indeed, localisation and reverse localisation cover four different functions:
            Localisation                       Reverse localisation
From afar   (A) Telelocalisation               (C) Reverse telelocalisation
            Find out the position of a         Find out the objects that are
            remote object                      located in a given place
Local       (B) Local localisation             (D) Local reverse localisation
            An object knows its own            An object knows the objects in
            position                           its own surroundings
Some comments regarding these "localisation" functions:
These functions are related to one another and partly overlap. In particular, and trivially, if function (A) is available then, a fortiori, so is function (B). Similarly, if functions (C) and (B) are available, (D) is also available. Provided that some conditions are fulfilled (see further on), the availability of local reverse localisation (D) is enough to compute functions (B) and (C). So, depending on the kind of geolocation service, some functions may not be needed. Moreover, for privacy reasons, some functions, such as telelocalisation, might be unwanted or forbidden. Reverse telelocalisation could likewise be restricted to public spaces.
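The remark that (D) follows from (B) and (C) can be made concrete with a toy model. The Python sketch below assumes a hypothetical central `World` registry and a flat two-dimensional space with a fixed radius; none of these names or modelling choices come from the chapter.

```python
import math


class World:
    """Toy registry of object positions, used only to illustrate the four functions."""

    def __init__(self):
        self._positions = {}

    def place(self, obj, x, y):
        self._positions[obj] = (x, y)

    def locate(self, obj):
        """(A) Telelocalisation: position of a (possibly remote) object."""
        return self._positions[obj]

    def objects_near(self, x, y, radius):
        """(C) Reverse telelocalisation: objects located around a given place."""
        return sorted(
            obj
            for obj, (ox, oy) in self._positions.items()
            if math.hypot(ox - x, oy - y) <= radius
        )


def local_reverse_localisation(world, me, own_position, radius):
    """(D) obtained from (B), the object's own position, and (C)."""
    x, y = own_position
    return [obj for obj in world.objects_near(x, y, radius) if obj != me]


world = World()
world.place("lamp", 0, 0)
world.place("phone", 3, 4)
world.place("car", 100, 0)

# (B) Local localisation is (A) applied to oneself.
phone_position = world.locate("phone")
around_phone = local_reverse_localisation(world, "phone", phone_position, radius=6)
```

The sketch also shows why telelocalisation raises privacy questions: here any caller of `locate` can find any object.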
2. Implementing reverse localisation
2.1. Cellular reverse localisation
The naive approach to implementing a reverse localisation function is to cover the space with cells (see footnote 2). For each cell there is a server that is kept informed of the communicating objects located within (or associated with) the cell area. To know the objects situated in a given location, one just needs to request them from the server of the corresponding cell (for local reverse localisation, one's own cell server). This approach has many problems.
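A minimal sketch of the cell-based approach is given below, assuming square cells of a fixed, arbitrary size and a single Python object standing in for the whole set of cell servers. The names `cell_of`, `CellServers`, `report` and `objects_at` are hypothetical.

```python
import math

CELL_SIZE = 100.0  # arbitrary cell side, in metres (an assumption of this sketch)


def cell_of(x, y):
    """Map a position to the square cell that contains it."""
    return (math.floor(x / CELL_SIZE), math.floor(y / CELL_SIZE))


class CellServers:
    """Stands in for one server per cell, each tracking the objects in its area."""

    def __init__(self):
        self._objects_in_cell = {}  # cell -> set of objects
        self._cell_of_object = {}   # object -> cell it last reported from

    def report(self, obj, x, y):
        """An object informs the server of the cell it has entered."""
        new_cell = cell_of(x, y)
        old_cell = self._cell_of_object.get(obj)
        if old_cell is not None and old_cell != new_cell:
            self._objects_in_cell[old_cell].discard(obj)
        self._objects_in_cell.setdefault(new_cell, set()).add(obj)
        self._cell_of_object[obj] = new_cell

    def objects_at(self, x, y):
        """Reverse localisation: ask the server of the cell covering (x, y)."""
        return sorted(self._objects_in_cell.get(cell_of(x, y), set()))


servers = CellServers()
servers.report("lamp", 10, 10)
servers.report("phone", 50, 20)
servers.report("car", 250, 10)
first_cell = servers.objects_at(0, 0)

servers.report("phone", 150, 20)  # the phone moves into the next cell
first_cell_after_move = servers.objects_at(0, 0)
second_cell_after_move = servers.objects_at(199, 0)
```

Note that `report` needs the object's own coordinates, which is exactly the dependency on local localisation that the next subsection questions.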
2.1.1. Cell operating cycle
When a communicating object enters a cell, it must inform the corresponding server. How does the object know which cell it is in? Does it know its own position (in which case it needs local localisation) and the geometry of all the cells? Moreover, a network allowing communication with one's cell server is needed, and only the communicating objects that can use this telecommunication network will be taken into account. One solution relies on the fact that an object is probably already in communication with some of its neighbours (which means that local reverse localisation is partially available): through them, it can learn which cell it is in and make its presence known.
2.1.2. Deployment and coverage
Before most communicating objects adopt the system, the coverage (or the promise of coverage) must already be wide enough, and this without knowing exactly how to dimension the system:
• The coverage level depends on the system's success:
- Cells must be smaller if the number of communicating objects is large.
• And the system's success depends on the coverage level:
- The system will be adopted by a large number of communicating objects only if the coverage is wide enough and the cells small enough to handle the density of objects.
Footnote 2: "Cell" as in cell phone networks, e.g. CDMA or GSM.
In addition, complex financial mechanisms (billing, peering agreements, who pays for what, etc.) would have to be implemented to fund the cell servers, and all this without knowing how much income reverse localisation will generate.
2.1.3. Scalability
Since a cell can only handle a limited number of communicating objects, it can become overloaded at any moment (while other cells may be nearly empty). Therefore, the system that determines the cell areas and boundaries should be highly dynamic (particularly with respect to adding new cell servers) and oversized.
2.1.4. Conclusion
Although it is technically feasible, the cell-based approach is probably unrealistic unless the cells have other applications. For example, it might be interesting to add reverse localisation functions to GSM and/or UMTS equipment and networks. In this way it would be possible to list the GSM/UMTS devices that are present in a given cell.
2.2. Reverse localisation using a network of peers
Another approach, inspired by peer-to-peer, Gnutella-like systems or ad hoc networks, is conceivable. In this approach, the communicating objects participate and collaborate to provide the reverse localisation function. The communicating objects organise themselves into a peer network in which each participates according to its capabilities (energy, bandwidth, CPU, etc.). There is no need for dedicated resources: the resources of the system are provided by the end users.
2.2.1. How it works
Let us start with the hypothesis that each communicating object is aware of, and able to establish communication with, its nearby neighbours, and hence that it implements local reverse localisation. At first glance, this hypothesis may seem paradoxical, since it states that we need local reverse localisation before we can implement it. But the paradox is only apparent. The hypothesis really means that the property (knowledge of, and ability to communicate with, surrounding neighbours) is maintained, at least partially, most of the time by most of the communicating objects, and that the communicating objects must work constantly to maintain it.
This property (and with it local reverse localisation) can be extended to software-only communicating objects related to a location (e.g. a database or any software object related to a given location). Such an object can be considered to be virtually present at the location and to participate in the peer network like any object physically present there.
2.2.2. Maintaining the network of peers
Knowledge of the vicinity is maintained through neighbourhood collaboration: communicating objects at a given place inform one another of the approach and departure (the appearance and disappearance) of the communicating objects they know. Knowledge of, and communication with, the neighbours can be achieved using a short- or medium-range network (e.g. Bluetooth or an ad hoc network), a global network (e.g. UMTS), or both. It is worth noting that the more dynamic the network (the more mobile the communicating objects), the greater the number of messages exchanged, up to the extreme limit of a system that changes too fast to function. Peers that are permanently associated with a location (a kind of "server") help to stabilise the peer network around this location. Any non-moving physical object, or any logical object permanently associated with a location, may play this stabilising "server" role.
2.2.3. Reverse telelocalisation
The problem is the following: how can the local reverse localisation "service" be interrogated from afar? How can we know which communicating objects are located in a given place? Since every communicating object (element of the peer network) knows its neighbouring objects, it is enough to interrogate one of these objects to know all the communicating objects located at a place.
Also, as soon as a given object knows the communicating objects in a particular area (and if it is able to communicate with them), it is possible to consider that the object is virtually located in that area and to integrate it into the peer network. In this way it will participate in maintaining the peer network and help to provide the reverse telelocalisation function. The association with this area or place is not arbitrary: the object not only knows the place (that is, the objects located in it) but, by calling the reverse telelocalisation function, it has shown a clear interest in the place. The reverse telelocalisation system is a relational network (a graph) of logical or physical communicating objects. Each node or object knows its own, either
virtual or physical, position (at least approximately) and is able to communicate with its neighbours (it may also be able to communicate with more distant objects). To identify the communicating objects that are located in a given place (reverse localisation), one first finds one communicating object (physical or logical, mobile or motionless) that is located in that place and then asks this object about its neighbours (which may in turn be queried if necessary); step by step, all the objects in that area will eventually be known. The first object, or starting point, is found using distributed search techniques (à la Gnutella) within the peer network.
2.2.4. Conclusion
Reverse localisation of communicating objects is provided not by the deployment of servers or equipment but by the organisation of all the communicating objects concerned into a peer network (in which each object has, a priori, the same role) whose participants collaborate to provide the reverse localisation service to one another. However, this approach is not incompatible with the notion of a cell: if there is an underlying business logic, it is possible to deploy reverse localisation "servers", i.e. motionless communicating objects that are (totally or partly) dedicated to the reverse localisation of an area around their position. The advantage of using a peer network is that deployment costs are minimal and the available resources are used to the full, all with very low administration and maintenance costs. The main disadvantage is that no quality of service (in terms of response delay, spatio-temporal availability, accuracy, exhaustiveness, etc.) can be completely guaranteed. Users have to make do with the service that is available (more often than not nothing, but sometimes satisfactory).
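The step-by-step enquiry of section 2.2.3 amounts to a graph traversal that starts from one object known to be in the area. The Python sketch below uses a breadth-first search and makes several simplifying assumptions: the function name and its parameters are hypothetical, the neighbour lists and positions are held in plain dictionaries instead of being obtained by messages between peers, and the area is a disc.

```python
import math
from collections import deque


def objects_in_area(seed, neighbours, positions, centre, radius):
    """Ask the seed for its neighbours, then their neighbours, staying inside the area."""

    def inside(obj):
        x, y = positions[obj]
        return math.hypot(x - centre[0], y - centre[1]) <= radius

    found = set()
    queue = deque([seed])
    while queue:
        obj = queue.popleft()
        if obj in found or not inside(obj):
            continue  # already known, or outside the place of interest
        found.add(obj)
        # Enquire of this object about the neighbours it knows.
        queue.extend(neighbours.get(obj, ()))
    return sorted(found)


positions = {"a": (0, 0), "b": (1, 0), "c": (2, 1), "d": (10, 10), "e": (1, 1)}
neighbours = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": [],  # in the area, but not known to any peer
}

result = objects_in_area("b", neighbours, positions, centre=(1, 0), radius=3)
```

Object `e` lies inside the area but is never returned because no peer knows it, which illustrates why exhaustiveness cannot be guaranteed in the peer approach.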
3. Perspectives
Whether direct or reverse, local or distant, localisation functions are rich in potential applications. The geolocation services to be implemented in the near future will rely on them. But today, except for outdoor local localisation (GPS and, in the near future, Galileo), few technical solutions (maybe none) can implement localisation functions over large portions of the Earth's surface and for a great number of objects. And in the age of nanotechnologies and pervasive computing, most manufactured objects will probably be communicating objects that may be subject to reverse or local localisation.
Parts Networking Technologies for Smart Objects Pierre-Noel Favennec France Telecom R&D, France
The concept of smart objects rests on the connection that exists between two objects and enables them to communicate. This connection can be a real physical link between the objects, or a virtual link that is created for the transfer of information, and only for the duration of this exchange, from a transmitting object to one or more receiving objects and vice versa. A real link can be wired. It can be an electrical connection: copper wire transports information through the displacement of electric charges from one object to one or more other objects, and the intensity and modulation format then define the command passed between the objects. It can also be a guided optical link, made of optical fibres or, more generally, optical waveguides; information is then transported by guided photons characterised by their energy (their wavelength), their intensity and their modulation format. Modern smart objects do not need wired links between them: the only physical link can be electromagnetic waves connecting the objects through open space. In this book, we present the connection technologies that establish communication between objects, with the sole restriction that they are wireless. We have not considered the transmitters and receivers, which form an integral part of the object itself. Radio is the most common physical link used to communicate with and control smart objects. But this radio link, invisible and diffuse, can have completely different properties depending on its characteristics and on the desired mode of communication: range, data rates, quality of service, etc. The frequency of the radio wave is one of its first defining characteristics. Broadly, this frequency can range from a few hundred kilohertz up to a hundred gigahertz. These frequencies go from those normally used for
telecommunications between mobiles (the GSM, GPRS and UMTS standards) to higher frequencies specific to smart objects (Bluetooth) or, for higher data rates, to around 5 GHz (Hiperlan, IEEE 802.11). At even higher frequencies, not yet used for smart objects, applications are being studied, in particular around 60 GHz (known as millimetre waves because the wavelength is of the order of millimetres: 5 mm at 60 GHz). Generally, the communication frequencies are strongly related to the potential data rates: the higher the frequency, the greater the data rate can be. Bluetooth, working at 2.4 GHz, allows data rates of 700 kbit/s; 60 GHz allows data rates of more than 100 Mbit/s. Another useful parameter of these radio waves is the emission power. The greater the emission power, the further the wave carries information, and the power can also be adapted to the distances desired for the communications. But the emission powers used must remain reasonable if one wants to minimise disturbing side effects (electromagnetic compatibility, biochemical effects). The information bits can also be transported between the objects at much higher frequencies, which places us in the field of optics. Classically, one then works at wavelengths for which the transmitter and receiver technologies are mature, i.e. in the optical ranges corresponding to the visible and the infrared. These unguided optical beams, propagating in open space, allow very high data rates: the objects could communicate with one another at rates of the order of Gbit/s and even more. Optical beams propagate in a straight line through the atmosphere and are attenuated along their path, so they do not allow propagation over very long distances. One hopes, however, to be able to communicate outdoors over several kilometres at rates of a few Gbit/s. In a confined habitat, i.e.
in a room of a few tens of m2, communications between objects in direct line of sight will probably allow the very high bit rates that are so necessary. Without a direct line of sight, the photons are reflected by any obstacles on their path (walls, ceilings, floors, windows and pieces of furniture), which makes the usable bit rates lower, although they remain high compared with those obtained with traditional radio technologies. Optical source and transmitter technologies also make it possible to envisage beams of a few photons, and in particular of a single photon. The singular character of the photon, which is no longer the same once it has been detected, makes it possible to envisage single-photon communications whose contents are absolutely impossible to copy or to read. This technology for smart objects is quite singular and expensive, but certain very specific situations may require it. Both technologies, radio and optical, have advantages and disadvantages. Apart from performance itself (bit rate, distance, signal processing), an important aspect to consider is the relation between wave and user, or wave and man, in an
electromagnetic sea. Even if it has not been shown scientifically that a person immersed in an atmosphere in which multiple radio waves propagate suffers physical harm, one can expect that all (or part) of the population will demand, as an environmental requirement, a reduction of these radio waves in rooms, workplaces and the atmosphere. Communication between objects by optical beams meets this requirement; combined with its strong potential in bit rate, this suggests an attractive future for the technology. The future of communication by optical beams will also have to take the public health aspect into account. It will be essential to quantify, with the assistance of ophthalmologists, the limits (wavelength, distance and power) within which optical beams present no danger to the eyes. For the security of the information exchanged between smart objects, solid state physics proposes original solutions that will make it possible to confine communication between objects to a closed space. First attempts to block a single wavelength in optical flows, while leaving total transparency for the others, are in hand. They could be generalised to all frequencies and wavelengths, and in particular to smart objects communicating by radio waves. Except in very particular cases, communicating objects are not in direct line of sight of each other. The quality of the communications depends strongly on the properties of the environment in which the electromagnetic wave propagates. Pieces of furniture, people, buildings, trees or any other obstacle can attenuate, reflect, diffract or scatter the wave propagating in open space. Thus, the wave received by a receiving object is not only the attenuated and delayed wave emitted by the transmitting object, but the sum of a multitude of replicas of the emitted wave.
These replicas follow different paths, depending on the topography of the place at the moment of the exchange of information, and each is characterised by its own delay and attenuation. This multipath phenomenon can cause interference and degrade the quality of the communications. To make exchanges as reliable and faithful as possible, whatever the situation in which objects communicate (in a crowd, at the seaside, in a room, in a workshop), many studies of propagation, signal processing and physics are still necessary. Lastly, before closing this brief presentation of radio and optical technologies for wireless smart objects, one can dream of objects that communicate over long distances by using radio and/or optical ad hoc networks.
[Figure: electromagnetic spectrum. Frequency scale: 1 Hz, 1 kHz, 1 MHz, 1 GHz, 1 THz, 1 PHz. Corresponding wavelength scale: 300 000 km, 300 km, 300 m, 0.3 m, 300 µm, 0.3 µm. The field of millimetre and centimetre waves is marked on the scale.]
These diagrams illustrate the frequencies ν and the corresponding wavelengths λ (λ = c/ν, c being the speed of light in air) of the electromagnetic waves used, or foreseen in 2002, for communications between objects. Some frequencies (wavelengths) are more specifically used, and more precise details are given in the following chapters: * chapter 11, ** chapter 12, *** chapter 13, **** chapter 16, ***** chapter 18.
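The relation between frequency and wavelength quoted with the diagram can be checked numerically. The sketch below is only an illustration: the function name is hypothetical, and the speed of light in vacuum is used as an approximation of its value in air.

```python
# Speed of light in vacuum (m/s); assumed close enough to its value in air.
SPEED_OF_LIGHT = 299_792_458.0


def wavelength_m(frequency_hz: float) -> float:
    """Return the wavelength in metres for a frequency in hertz (lambda = c / nu)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_LIGHT / frequency_hz


if __name__ == "__main__":
    # Frequencies mentioned in the text: 1 GHz scale mark, Bluetooth, 60 GHz.
    for label, f in [("1 GHz", 1e9), ("2.4 GHz", 2.4e9), ("60 GHz", 60e9)]:
        print(f"{label}: {wavelength_m(f) * 1000:.1f} mm")
```

At 60 GHz the result is close to 5 mm, which is why this range is described as millimetre waves.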
Chapter 11
Wireless Techniques and Smart Devices
Jean-Claude Bic
Department of Communications and Electronics, GET/ENST, France
1. Introduction
Second generation mobile systems, GSM and DECT in Europe and IS-95 in the US, were originally designed essentially for voice and short message services dedicated to person-to-person communications. The 2.5G evolutions, without modifying the air interface, have introduced data services with bit rates of tens of kbit/s, paving the way to communications more appropriate to smart devices. Simultaneously, the first wireless LANs have been devoted to connections between machines, either as wireless Ethernet, such as Hiperlan or IEEE 802.11, or as cable replacement, such as Bluetooth. The number of foreseeable smart devices is at least an order of magnitude greater than the number of human beings, so the capacity, in terms of information rates, needed for the links between smart devices is a major challenge for the near future. The 3G UMTS, based on a new radio interface, will provide bit rates of hundreds of kbit/s, enabling new multimedia services connecting more sophisticated smart devices. Future WLANs will also offer higher bit rates. 4G, which is now generating a lot of prospective work, can be seen as the convergence of these two kinds of systems towards a global system optimising the radio resources depending on the environment. The air interface will play a major role in coping with a large growth of traffic, as the radio spectrum is a scarce natural resource which, unlike wired systems, cannot be extended. As in any radio problem, an air interface has to be designed according to the radio channel characteristics; spectrum efficiency, capacity, available frequency bands and quality of service will be at the core of the considerations.
2. Air interface
The purpose of the air interface is to adapt the signal to the transmission channel and to optimise the quantity of information transmitted from the source to the receiver.
At the transmit side it consists of wave shaping functions with forward error correction, often with interleaving, modulation, radiofrequency devices (filter, amplifier, antenna, frequency converter etc.) and multiple access techniques for the simultaneous transmission of several signals in the same transmission channel (Figure 11.1). At the receive side, after the antenna, amplification and frequency conversion, come demodulation and specific techniques, e.g. diversity, for correcting the channel distortions (Figure 11.2).
Figure 11.1 Air interface, transmission
Figure 11.2 Air interface, reception
2.1. Transmission channel
Adapting the signal to the channel assumes a good knowledge of the propagation mechanisms, which are defined by the laws of electromagnetism and the physical characteristics of the propagation media.
2.1.1. Characteristics of smart device links
i) Distances and environment
Links between smart devices may exist in different environments:
• cellular networks, with distances ranging from several hundred metres to several kilometres, in urban and suburban environments with many obstacles;
• radio LANs for indoor coverage and also outdoor picocells (campus), with distances of a few tens of metres;
• domestic or personal networks, with distances of a few metres, connecting terminals that are often, but not always, in line-of-sight conditions.
ii) Frequency bands
A very wide spectrum, depending on radio regulations, is possible for smart device links, from some hundreds of MHz to a few tens of GHz, and beyond in the infrared and optical bands. The lower limit is due to the size of the antennas and the upper limit to coverage problems, given the possible transmitted power. At present, the most used frequency bands are 806-960 MHz (GSM, AMPS), 1700-1900 MHz (GSM, DECT, PCS), 1920-1980/2110-2170 MHz and 1900-1920/2010-2025 MHz (UMTS FDD and TDD), 2.4-2.5 GHz and 5 GHz for RLAN, and 3.5, 26 and 40 GHz (40.5-43.5 GHz) for wideband fixed access. The 54.2-66 GHz band is considered for mobile broadband indoor access (MBS). Infrared networks also exist.
iii) Bandwidth occupancy
The trend is an increase in bit rates for most multimedia services. Bit rates of tens of Mbit/s are envisaged for video access. Generally, the greater the mobility, the lower the bit rate (see Figure 11.3). A macrocellular network will not provide Mbit/s rates to high velocity mobiles. The bandwidth occupancy is directly related to the bit rate and to the coding and modulation parameters.
Figure 11.3 Mobility and bit rate
2.1.2. Propagation effects
Path loss is the first effect of propagation. Free space loss, expressed in decibels, increases as 20 log d, where d is the distance, and as 20 log f, where f is the frequency. Using higher frequency bands is therefore costly, because compensating the propagation loss with directive antennas, which are easier to implement at higher frequencies, is not really possible in a mobile context. In a mobile cellular environment, propagation is rarely in line-of-sight (LOS) but more often in non-line-of-sight (NLOS) conditions, relying on reflection and diffraction. Roughly, the loss then increases as 35 log d. NLOS propagation is limited to lower frequencies, up to a few GHz, because the diffraction losses, which increase as 10 log f, become too large. So it is sensible to use the lower frequencies for cellular networks and the higher frequencies for short range communications. Reflections, on the other hand, still exist at high frequencies. Globally, in most cases multipath propagation occurs: the received signal is the sum of several replicas of the transmitted signal with different amplitudes and delays. Multipaths can be classified into two categories: small scale multipaths, or rapid fadings, due to obstacles very close to the receiver (within a few wavelengths), which are modelled by Rayleigh (NLOS) and Rice (LOS) processes; and large scale multipaths, corresponding to reflection and diffraction on distant obstacles, which are described by the impulse response or the power delay profile of the channel. The delay spread Dr, the second order moment of the power delay profile, or the coherence bandwidth Bc (satisfying Dr Bc > constant) are frequently used to characterise these selective fadings. The range of Dr is large, depending on the frequency band and the environment: from a few ns in a room at 60 GHz to several tens of µs in mountainous terrain. Channel impulse response models are
recommended, especially for the comparison of 3G air interfaces, and it is important that realistic models are available to estimate the performances. A rule of thumb is to compare the symbol duration 1/Ds, where Ds is the symbol rate, with Dr (or the signal bandwidth Bs with the coherence bandwidth Bc). If Dr > 1/Ds (or Bs > Bc), the channel will distort the signal.
2.1.3. Other channel distortions
The transmission channel in a wide sense, including the transmit and receive radio frequency equipment (antennas, amplifiers, frequency converters), is characterised by different kinds of noise (sky noise, thermal noise, low noise amplifier noise) and by interference caused by other signals (co-channel interference in the same frequency band from neighbouring cells, adjacent channel interference from adjacent frequency bands). In the simplest approach, the noise is modelled as additive white Gaussian noise of constant density No/2.
2.2. Performance of radio interface
The main criteria to evaluate the performances of the air interface are:
i) spectrum efficiency or, more generally, for cellular systems with frequency reuse in various cells, the cellular spectral efficiency;
ii) link quality: essentially the error probability (bit error rate or frame error rate), and other variables such as delay constraints for decoding; some services, such as voice, are strongly delay-constrained, contrary to data services, which can accommodate delays of several hundred ms;
iii) signal to noise ratio: the signal power S divided by the noise power N (in digital communications, usually the ratio Es/No or Eb/No of symbol energy or bit energy to noise density).
Note that these criteria can conflict: for example, a better spectral efficiency could be obtained with a higher interference level, at the expense of the BER. Optimisation choices will depend on the system itself.
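Two of the propagation rules quoted in section 2.1.2 lend themselves to a quick numerical check. The sketch below is illustrative only: the function names are hypothetical, the free-space loss uses the standard expression 20 log10(4πdf/c), which contains the 20 log d and 20 log f terms of the text, and the selectivity test assumes that the delay spread is compared with the symbol duration 1/Ds.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def free_space_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss in dB: 20 log10(4 * pi * d * f / c)."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT)


def is_frequency_selective(delay_spread_s: float, symbol_rate_hz: float) -> bool:
    """Rule of thumb (assumption stated above): the channel distorts the signal
    when the delay spread Dr exceeds the symbol duration 1/Ds."""
    return delay_spread_s > 1.0 / symbol_rate_hz


if __name__ == "__main__":
    # Doubling the distance or the frequency each costs about 6 dB.
    print(round(free_space_loss_db(100.0, 2.4e9), 1), "dB at 100 m, 2.4 GHz")
    print(round(free_space_loss_db(200.0, 2.4e9), 1), "dB at 200 m, 2.4 GHz")
    # Tens of microseconds of delay spread against a 1 Msymbol/s signal.
    print(is_frequency_selective(10e-6, 1e6))
```

The 6 dB step per doubling of distance is what makes the 35 log d behaviour of NLOS links (about 10.5 dB per doubling) so much more demanding for coverage.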
2.2.1. Modulation and coding
The maximum bit rate (or capacity per time unit) for which the error probability can be made arbitrarily small in an additive white Gaussian noise channel of bandwidth B is given by the famous Shannon formula: C (bit/s) = B log2 (1 + S/N). It sets a theoretical limit, and recent results in coding and modulation have shown that it can be closely approached. Forward error correction is obtained by introducing N-K redundancy bits computed from K information bits in order to make the codewords 'more distant' in terms of a given distance (often the Hamming distance, the number of positions in which two codewords differ). The coding rate R is the ratio K/N (R < 1). The code performance depends on the minimum distance; roughly, it improves when, for given N and R, this distance increases.
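As an illustration of the two notions introduced above, the following sketch evaluates the Shannon formula and the Hamming distance between two codewords. The function names and the numerical values (a 20 MHz channel at 20 dB signal to noise ratio, two 7-bit words) are assumptions chosen for the example, not figures from the text.

```python
import math


def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Capacity C = B log2(1 + S/N) of an additive white Gaussian noise channel."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def hamming_distance(word_a: str, word_b: str) -> int:
    """Number of positions in which two equal-length codewords differ."""
    if len(word_a) != len(word_b):
        raise ValueError("codewords must have the same length")
    return sum(a != b for a, b in zip(word_a, word_b))


if __name__ == "__main__":
    # Illustrative values: 20 MHz of bandwidth, S/N of 20 dB.
    print(f"{shannon_capacity_bps(20e6, 20.0) / 1e6:.1f} Mbit/s")
    # Two hypothetical codewords of length N = 7.
    print(hamming_distance("1011101", "1001001"))
```

With S/N = 0 dB the capacity equals the bandwidth in bit/s, a convenient sanity check for the formula.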
Figure 11.5 BER vs. Eb/No
Modulation is defined by the number M and the position of the states, or points, in a constellation, and by the waveform, which determines the power spectral density of the transmitted signal (important for interference considerations). For M ≤ 8, the classical modulations are the PSK (phase-shift keying) modulations BPSK, QPSK and 8-PSK, and for M > 8, QAM (quadrature amplitude modulation). A 16-QAM constellation is given in Figure 11.4. The in-phase carrier cos(2πf0t + φ0) and the in-quadrature carrier sin(2πf0t + φ0) are the two axes.
Like FEC, performances are related to a distance (often the Euclidean distance). For a given average power, the further apart the points in the constellation, the lower the error probability. This implies that the error probability increases with the number of points: for example, for M-PSK modulation, where all the points are equally distributed on a circle of the same radius, the distance between two adjacent points is smaller when the number of points is larger. BER is generally plotted versus the Eb/No ratio. The shape of these plots is given in Figure 11.5. In practice, BER objectives after FEC decoding on a radio link depend on the type of service, in the range of 10^-3 to 10^-10. Revisiting the formula giving the maximum bit rate C, below which an arbitrarily small error probability can be obtained, it can be derived that the theoretical limit for Eb/No is -1.6 dB. At present, the best coding schemes provide a BER of 10^-5 at about 1 dB. For a given bit rate Db, the symbol rate Ds is given by Ds = Db / (R log2 M). The signal bandwidth B varies as Ds. The spectrum efficiency ηE is therefore improved with modulations having a large number of points M and FEC of high coding rate R. For instance, EDGE, owing to 8-PSK, has a spectrum efficiency roughly 3 times better than the GMSK modulation of GSM, which behaves like a binary modulation. Combining modulations with an adjustable number of states and variable rate FEC is one of the key issues of link adaptation, which provides the best compromise between link quality and spectrum efficiency in given propagation conditions. One of the most significant examples is the HIPERLAN/2 air interface, which combines different codes and modulations enabling, in the same 20 MHz bandwidth, a low bit rate of 6 Mbit/s with R = 1/2 and M = 2 and a high bit rate of 54 Mbit/s with R = 3/4 and M = 64, according to the propagation conditions (see Chapter 12 by Bertin).
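The relation Ds = Db / (R log2 M) can be checked on the two HIPERLAN/2 modes quoted above. In this sketch the function names are hypothetical, and the coding rate 3/4 for the 54 Mbit/s mode is taken from the HIPERLAN/2 physical layer specification. Both modes turn out to use the same symbol rate, which is consistent with their sharing the same 20 MHz bandwidth.

```python
import math


def symbol_rate(bit_rate_bps: float, coding_rate: float, constellation_size: int) -> float:
    """Symbol rate Ds = Db / (R log2 M)."""
    return bit_rate_bps / (coding_rate * math.log2(constellation_size))


def info_bits_per_symbol(coding_rate: float, constellation_size: int) -> float:
    """Information bits carried by one modulation symbol: R log2 M."""
    return coding_rate * math.log2(constellation_size)


if __name__ == "__main__":
    # (bit rate, coding rate R, constellation size M) for the two extreme modes.
    modes = {"6 Mbit/s": (6e6, 1 / 2, 2), "54 Mbit/s": (54e6, 3 / 4, 64)}
    for name, (db, r, m) in modes.items():
        print(name, f"Ds = {symbol_rate(db, r, m) / 1e6:.1f} Msymbol/s,",
              f"{info_bits_per_symbol(r, m):.1f} information bits per symbol")
```

Link adaptation therefore changes only the number of information bits per symbol (from 0.5 to 4.5 here), not the occupied bandwidth.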
2.2.2. Multiple access
Multiple access techniques are designed to transmit several signals simultaneously in the same transmission channel. The basic principle is signal orthogonality. Frequency Division Multiple Access (FDMA) was used in the first generation of mobile systems. It has been drastically improved with OFDM (orthogonal frequency division multiplexing), which gives a better spectrum efficiency by saving the guard bands between carriers. Time Division Multiple Access (TDMA) is implemented in GSM on a single carrier; several carriers are frequency multiplexed in the allocated bandwidth. Frequency reuse in a cellular scheme is the key factor for cellular spectrum efficiency.
Code Division Multiple Access (CDMA) is used for IS-95. Every information bit of the signal of a given user is 'multiplied' by a digital sequence (the spreading code) made of 'chips'. This technique, also known as direct sequence spread spectrum, is a spin-off of Shannon's results on channel capacity. The UMTS air interface is based on an improved version, wideband CDMA (WCDMA). Another spread spectrum technique is frequency hopping, which is specified in the Bluetooth standard. For RLANs, the 802.11b standard is based on CDMA, while 802.11a uses OFDM access. Spectrum efficiency comparisons between all these techniques are usually tricky, as they depend on many parameters. Multiple access techniques also have specific properties with respect to channel distortions. For example, OFDM, by splitting the signal over several carriers, is robust to selective fadings but sensitive to non-linearities. Rake receivers exploit the CDMA spreading properties to combine multipaths and improve the BER. Spatial Division Multiple Access (SDMA) is another type of multiple access, based on antenna arrays. Beamforming allows the array to focus the antenna pattern simultaneously towards different mobiles located in distinct spatial directions.
2.2.3. Diversity techniques
Fadings are predominant features of radio channels. Diversity techniques are designed to correct the effects of fading.
i) Time diversity: FEC with interleaving spreads the information over several bits in different time slots in order to correct the error bursts due to rapid fadings. Bit interleaved coded modulations (BICM) provide a robust solution with a good spectrum efficiency for channels with and without fading.
ii) Frequency diversity: OFDM and CDMA spread the information over several frequencies or frequency bands and offer a better resistance to selective fadings than single carrier access.
iii) Spatial diversity: using several receive antennas to decorrelate rapid fadings is a well known technique.
Antenna arrays can be regarded as reconfigurable directive antennas which minimise the effects of multipaths and of interference from other users.
3. Perspectives
Enhancing the air interface characteristics is a prerequisite to offering higher capacities and allowing the growth of smart device links. In this respect, important evolutions are presently being investigated.
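Several of these evolutions build on the direct-sequence spreading described in section 2.2.2, so a minimal sketch of that principle may help. It is an illustration under stated assumptions, not the scheme of any particular standard: the two 4-chip codes are hypothetical orthogonal sequences, the channel is ideal (no noise, no multipath), and the function names are invented for the example.

```python
# Two orthogonal 4-chip spreading codes (illustrative assumption):
# their chip-by-chip product sums to zero.
CODE_A = [1, 1, -1, -1]
CODE_B = [1, -1, 1, -1]


def spread(bits, code):
    """Map each bit to +1/-1 and multiply it by every chip of the code."""
    return [(1 if bit else -1) * chip for bit in bits for chip in code]


def despread(chips, code):
    """Correlate each block of chips with the code and decide on the sign."""
    n = len(code)
    decoded = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        correlation = sum(x * c for x, c in zip(block, code))
        decoded.append(1 if correlation > 0 else 0)
    return decoded


if __name__ == "__main__":
    bits_a = [1, 0, 1]
    bits_b = [0, 0, 1]
    # Both users transmit at the same time in the same band: signals add up.
    received = [a + b for a, b in zip(spread(bits_a, CODE_A), spread(bits_b, CODE_B))]
    print(despread(received, CODE_A))  # bits of user A
    print(despread(received, CODE_B))  # bits of user B
```

Because the codes are orthogonal, the contribution of the other user cancels in the correlation; on a real channel this orthogonality is only approximate, which is the motivation for multi-user detection.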
i) Multiple access combinations of CDMA and OFDM in order to increase the spectrum efficiency: multi-carrier direct-sequence CDMA (MC-DS-CDMA), where data are spread with the same code on several sub-carriers; multi-carrier CDMA (MC-CDMA), where each sub-carrier is associated with a 'chip' of the spreading code; and multi-tone CDMA (MT-CDMA), where the spreading code is the same for each sub-carrier.
ii) Ultra Wide Band (UWB) transmission, using very short position-modulated pulses, multiple access being obtained by time-hopping in different time slots (the converse of frequency-hopping), not to improve the spectrum efficiency itself but to simplify the access to the channel and the deployment (no licences?).
iii) MIMO diversity (multiple input, multiple output), or spatio-temporal diversity: it is based on multiple antennas at the transmit and receive sides in a propagation medium favourable to multipath, creating several distinct propagation channels and giving a new degree of freedom. If the signals transmitted on each transmit antenna are independent (BLAST), they can be recovered at the receive side by appropriate channel estimation and digital processing. Another solution introduces a time dimension by using FEC codes before spreading the signal over several antennas. Information theory has shown that the Shannon capacity can grow linearly with the number of antennas.
iv) Multi-user detection: multiple-access signals are never fully orthogonal because of the distortions introduced by the channel. The resulting interference, added to the noise, reduces the performance and the spectrum efficiency. The principle is to consider this interference not as random noise, as in a usual receiver, but as other signals to be detected and subtracted from the useful signal. The processing is rather complicated but affordable in a base station.
v) Software Defined Radio: its rationale is found in the context of radio network interoperability and of air interfaces for seamless links.
Experience has shown that it is not sensible to expect a universal standard, which would be technically intractable due to the variety of situations. From a radio point of view, the terminal will be adaptively reconfigured in different frequency bands and for different combinations of access, modulation and coding, an improvement on multi-mode terminals with pre-programmed processing. One of the key technical issues is the analog to digital converter, which must be as close as possible to the receive antenna. High resolution (16 bits, 500 Msample/s), good dynamic range and sufficient bandwidth (100 MHz) are the important characteristics to be satisfied.
4. Bibliography
Correia L.M. "Wireless Flexible Personalised Communications", John Wiley & Sons, 2001.
Frodigh M. et al. "Future-Generation Wireless Networks", IEEE Personal Communications, October 2001, pp. 10-17.
Lehne P.H. "Wireless Future", Telektronikk, special issue, no. 1, 2001.
Morinaga N. "Wireless Communications Technologies, New Multimedia Systems", Kluwer, 2000.
Rappaport T.S. "Wireless Communications", Prentice-Hall, 1996.
Verdu S. "Wireless Bandwidth in the Making", IEEE Com Magazine, July 2001, pp. 53-58.
5. Glossary
BER: Bit Error Rate
Bluetooth: Wireless standard around 2.45 GHz
CDMA: Code Division Multiple Access
DECT: Digital European Cordless Telephony
Eb/No: Energy per bit to Noise density ratio, used for evaluating link performances
EDGE: Enhanced Data rates for the GSM Evolution
FDD: Frequency Division Duplex
FDMA: Frequency Division Multiple Access
FEC: Forward Error Correction
GMSK: Gaussian Filtered Minimum Shift Keying (GSM modulation)
GSM: Global System for Mobile
IEEE 802.11: Standards for WLAN
LMDS: Local Multipoint Distribution System
MC-CDMA: Multicarrier CDMA
MC-DS-CDMA: Multicarrier Direct Sequence CDMA
MIMO: Multiple Input Multiple Output
MT-CDMA: Multitone CDMA
OFDM: Orthogonal Frequency Division Multiplexing
PCS: Personal Communication System
PSK: Phase Shift Keying
QAM: Quadrature Amplitude Modulation
RLAN: Radio Local Area Network
SDMA: Space Division Multiple Access
TDD: Time Division Duplex
TDMA: Time Division Multiple Access
UMTS: Universal Mobile Telecommunication System (3G standard)
UWB: Ultra Wide Band
WCDMA: Wideband CDMA (UMTS access technique)
WLAN: Wireless Local Area Network
Chapter 12
Wireless Local Area Networks
Philippe Bertin
France Telecom R&D, France
1. Introduction
Wireless Local Area Network (WLAN) technologies have recently taken off technologically and commercially. From the first corporate WLAN standards (802.11) to personal area networks built with Bluetooth technology and very high bit rate HiperLAN networks, radio technologies are emerging in various user domains (commercial and public enterprises, personal networks etc.). Initially designed for computing and telecom applications, these technologies are being integrated into mass market terminals (mobile phones, audio-visual terminals) and naturally provide a communication support adapted to the needs of communicating objects: high bit rates, multi-terminal connectivity, data and voice support, relatively restricted range, manageable power consumption and easy integration. Uses can then develop in this kind of domain, probably in surprising and unexpected applications, and the potential in terms of communicating services is therefore huge. In this contribution, we concentrate on the description of WLAN standards. After positioning them with respect to each other in terms of technology and maturity, we introduce generalities to be considered for technology design and describe more precisely the mechanisms of the 802.11 and HiperLAN/2 standards. To complete this state of the art review, the technological options of the Bluetooth standard are considered in the last section.
2. Positioning the WLAN standards
The recent and rapid growth of wireless technologies, as well as the multiplicity of standards and interest groups, make it difficult to position the various solutions. In this section we therefore try to position the main initiatives. The 802.11 standards are defined by the IEEE. They target the specification of WLANs. Their initial domain of use is the enterprise; however, they can easily be used or adapted in other contexts (domestic, campus, public "hot spot").
Targeted terminals are PCs (mostly laptops) and PDAs. 802.11 specifies a core standard which is complemented by a number of extensions, each identified by an extension letter (802.11, 802.11a, 802.11b). Most current WLAN products conform to the 802.11b extension and carry the "Wi-Fi" label, which is intended to guarantee their interoperability. HomeRF is a "voice/data" standard dedicated to domestic applications. It is based on a scheme derived from 802.11 for data support and a scheme derived from DECT for voice support. The first products, data oriented only, have appeared on the market. However, it appears that most HomeRF promoters are turning to other technologies. HiperLAN/2 is a European standard specified by ETSI in order to deploy very high bit rate WLANs. Initially targeting the support of Wireless ATM networks, the standard has been opened to support other types of networks (Ethernet, IEEE 1394), which makes it adaptable to different environments that may be demanding in terms of quality of service (enterprise, audio-visual etc.). Bluetooth had an initial scope of cable replacement in the vicinity of mobile terminals, for example of the GSM type (connection to a PC, personal digital assistant, headset, printer etc.). This type of use should bring the technology to large developments, in particular for the support of mobile Internet services. It makes it possible to foresee the dynamic set-up of true "personal" networks connecting the various terminals of a given person. These initial aims made it a short range (from one to several metres) and low consumption technology. Since the first announcements, industrial developments have extended the technology by using higher emitting power. These evolutions directly target the introduction of Bluetooth in WLAN products, used at the scale of a home or an enterprise with bit rates much lower than those supported by 802.11 and HiperLAN/2 products.
3. Wireless networks generalities
Before describing existing WLAN standards, it is necessary to introduce some generalities applicable to the different systems.
3.1. Functions defined in WLAN standards
As for Local Area Networks (e.g. Ethernet), Wireless Local Area Network standards specify layers 1 and 2 of the OSI model. Layer 1, the physical layer, supports the radio transmission service. It defines the transmitted signal (frequency band, channel bandwidth, modulation, filter, framing) as well as the channel coding needed to ensure the robustness of the radio transmission. Layer 2, the data link layer, is sub-divided into two sub-layers:
• The MAC sub-layer supports the media access service for frame transmission. Depending on the standard, this access can be supported with contention based or contention free schemes.
• The link control sub-layer is responsible for handling logical connections and the interface with upper layers. Depending on the standard, the link control sub-layer may support error detection and retransmission using an ARQ (Automatic Repeat Request) algorithm, admission control functions, connection set-up and handling functions, radio resource control functions etc.
Hence, layer 2 supports a transport service for data units delivered by the higher layer, i.e. layer 3 (the "network layer" of the OSI model). WLAN technologies are thus commonly used to deliver IP datagrams over the radio link. However, in order to simplify implementation, current products offer the radio transport of Ethernet frames. This delivers a fully equivalent service to the higher layers, whether over a classical wired LAN or a WLAN: the terminal protocol stack (e.g. the TCP/IP stack in a PC) uses the same internal interfaces (drivers) whatever the medium.
3.2. WLAN architectures
Two types of architecture are supported for WLANs (depicted in Figure 12.1):
• In "centralised" (or infrastructure) architectures, wireless access is provided through an access point which manages the radio resources in a given cell. It gives access to the rest of the local network through a "bridge" function implemented between the wireless and the wired LAN.
• In "ad-hoc" architectures, the WLAN is built over a set of wireless terminals within radio range of each other, which form a completely distributed system. This type of architecture allows a network to be set up dynamically, depending on which terminals are in the vicinity of each other. It does not preclude connectivity to a wired network, as this service can be provided by a terminal supporting the two types of interface, coupled with a bridge or a router function between the two networks.
Generally, WLAN standards are designed to operate in either of the two types of architecture.
Figure 12.1 WLAN architectures
3.3. Wireless terminals
A large number of types of wireless terminal are foreseen: electronic pen, headset, cellular phone, personal digital assistant, laptop, printer, web pad, digital camera and recorder etc. The first WLAN applications are being developed in enterprise networks; WLAN products are currently oriented towards this market segment and support PC interfaces, essentially based on the PCMCIA and PCI formats. With the arrival of lower power consumption technologies such as Bluetooth, integration into mass market products might appear.
3.4. Frequency bands
Globally, two frequency bands are identified for WLAN use: the 2.45 GHz and 5 GHz bands which, depending on the continent and country, are subject to different regulatory constraints.
3.4.1. The 2.45 GHz band
This is the frequency band used by most current WLAN products. The total bandwidth is about 80 MHz (2400 to 2483.5 MHz). It is an "ISM" (Industrial, Scientific and Medical) band that can be used by any equipment conforming to electromagnetic compatibility standards. It is therefore not exclusively reserved for network operations, which implies that the system has to cope with significant interference generated by objects of different types (microwave ovens, for example). This band is available worldwide, with some local restrictions in terms of emitted power or uses, as summed up in the table below.
Table 12.1 The 2.45 GHz band: emitted power and uses
North America: indoor EIRP 100 mW; outdoor EIRP 500 mW; other restrictions: none.
Europe: indoor EIRP 100 mW; outdoor EIRP 100 mW; other restrictions: limitations for public access in some countries.
France 2001: indoor EIRP 100 mW (2446.5-2483.5 MHz only), 10 mW (full band); outdoor EIRP 100 mW (2446.5-2483.5 MHz only), 2.5 mW (full band); other restrictions: outdoor use at 100 mW is authorised only in private areas with prior authorisation from the Defence Ministry.
France 2004: indoor EIRP 100 mW (full band); outdoor EIRP 100 mW (2446.5-2483.5 MHz), 10 mW (full band); other restrictions: outdoor use restrictions at 100 mW to be clarified.
Concerning system channelling, two types of wireless techniques are foreseen:
• DSSS (Direct Sequence Spread Spectrum) technology uses 14 channels of 22 MHz with 5 MHz spacing (i.e. adjacent channels overlap);
• FHSS (Frequency Hopping Spread Spectrum) technology uses 79 channels of 1 MHz each.
Bluetooth technology also uses a frequency-hopping transmission technique.
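The overlap between DSSS channels follows directly from the figures quoted above. The short Python sketch below is an illustration only: it assumes uniform 5 MHz spacing across all 14 channels (a simplification), and the function names are hypothetical. It shows that two channels are disjoint only when their numbers differ by at least five, which leaves three mutually non-overlapping channels when starting from channel 1.

```python
from typing import List

# Figures taken from the text: 14 channels, 22 MHz wide, 5 MHz apart.
CHANNEL_WIDTH_MHZ = 22
CHANNEL_SPACING_MHZ = 5
NUM_CHANNELS = 14


def channels_overlap(a: int, b: int) -> bool:
    """True if channels a and b occupy overlapping spectrum."""
    return abs(a - b) * CHANNEL_SPACING_MHZ < CHANNEL_WIDTH_MHZ


def non_overlapping_set(first: int = 1) -> List[int]:
    """Greedily pick channels, starting at `first`, that do not overlap."""
    chosen = [first]
    for channel in range(first + 1, NUM_CHANNELS + 1):
        if not channels_overlap(chosen[-1], channel):
            chosen.append(channel)
    return chosen


print(non_overlapping_set())  # [1, 6, 11]
```

Under these assumptions, co-located access points would be set to channels such as 1, 6 and 11 to avoid mutual interference.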
3.4.2. The 5 GHz band
The following sub-bands are identified for use by future WLAN systems operating at 5 GHz: 5150-5350 MHz (worldwide use), 5470-5725 MHz (open only in Europe) and 5725-5825 MHz (open only in North America). Overall, this permits the use of up to 455 MHz in Europe. However, the regulation allows only part of the band to be opened, which should be at least 330 MHz. Several WLAN systems (HiperLAN/2, 802.11a) target this band; they are based on very similar physical layers in order to permit economies of scale in chipset production. The effective opening of these frequency bands is subject to local regulation in each country. At the European level, the following CEPT recommendations are foreseen:
• Full band allocation to HiperLAN/2 systems operating indoors (max EIRP of 200 mW).
• Possible outdoor use exclusively in the higher sub-band (max EIRP of 1 W).
• Sharing with radar and satellite systems using the 5 GHz band, supported by the implementation of DFS (Dynamic Frequency Selection) and TPC (Transmit Power Control), which guarantee that the HiperLAN system generates only a limited level of interference for the other systems.
Currently, in France, only the lower sub-band is open, and for indoor use only. The following table sums up the worldwide situation:
Table 12.2 The 5 GHz band: power and use
North America: indoor EIRP 200 mW (full band); outdoor EIRP 1 W (5250-5350 MHz), 4 W (5725-5825 MHz).
Europe: indoor EIRP 200 mW (full band); outdoor EIRP 1 W (5470-5725 MHz); other restrictions: Dynamic Frequency Selection and Transmit Power Control.
France: indoor EIRP 200 mW (5150-5350 MHz); no outdoor use.
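The bandwidth figure quoted in section 3.4.2 can be checked by summing the sub-bands open in each region. The following Python sketch is a simple arithmetic illustration; the dictionary layout and names are assumptions made for the example, and only the sub-band limits come from the text.

```python
# Sub-bands quoted in section 3.4.2, as (start, end) in MHz.
SUB_BANDS = {
    "lower": (5150, 5350),   # worldwide use
    "middle": (5470, 5725),  # open only in Europe
    "upper": (5725, 5825),   # open only in North America
}

# Which sub-bands are identified for each region, according to the text.
REGION_SUB_BANDS = {
    "Europe": ["lower", "middle"],
    "North America": ["lower", "upper"],
}


def total_bandwidth_mhz(region: str) -> int:
    """Sum the width of every sub-band identified for the given region."""
    return sum(
        SUB_BANDS[name][1] - SUB_BANDS[name][0]
        for name in REGION_SUB_BANDS[region]
    )


print(total_bandwidth_mhz("Europe"))         # 455
print(total_bandwidth_mhz("North America"))  # 300
```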
Compared with the 2.45 GHz band, the 5 GHz band provides the following advantages:
• Higher bandwidth availability, permitting wider channels (20 MHz) and the coexistence of several networks with a limited level of interference.
• Spectrum sharing between a limited number of standardised systems, and indoor use specifically dedicated to WLAN types of systems, which considerably limits inter-system interference.
This makes the 5 GHz band the most attractive for applications needing high bit rates and guaranteed Quality of Service. However, the current competition between the different standards, as well as the European regulatory constraints, may delay worldwide market stabilisation for WLANs operating at 5 GHz.
3.5. Range and capacity
The typical range of WLAN systems is about 20 to 40 m in a typical office environment and up to 100 or 200 m in line-of-sight conditions. Ranges are thus relatively short, for two main reasons:
• The emitted power is restricted for both practical reasons (battery consumption) and regulatory ones (the power restrictions seen in section 3.4); moreover, local area network services require high peak data rates, which are only possible with a good link budget. This limits the acceptable transmission attenuation and hence the range.
• These technologies were defined primarily for indoor private applications, i.e. not to cover extended outdoor areas.
In terms of capacity, current WLAN products support bit rates of 11 Mbit/s over the radio link, which in practice gives a useful bit rate of about 5 Mbit/s at the IP layer. Emerging standards in the 5 GHz band aim to support maximum bit rates of 54 Mbit/s at the physical layer. Lastly, WLAN systems use time division schemes to share the radio resource: when a terminal transmits, it uses the complete channel bandwidth and therefore the associated peak rate.
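To make these capacity figures concrete, the Python sketch below computes the fraction of the radio bit rate that remains at the IP layer and the average throughput per terminal when the channel is time-shared. The equal-share model and the function names are assumptions made for illustration; real sharing depends on traffic and on the access scheme.

```python
def mac_efficiency(phy_rate_mbps: float, useful_rate_mbps: float) -> float:
    """Fraction of the radio bit rate that is left at the IP layer."""
    return useful_rate_mbps / phy_rate_mbps


def average_share_mbps(useful_rate_mbps: float, active_terminals: int) -> float:
    """Average throughput per terminal, assuming equal time shares."""
    if active_terminals < 1:
        raise ValueError("at least one active terminal is required")
    return useful_rate_mbps / active_terminals


def transfer_time_s(size_mbytes: float, rate_mbps: float) -> float:
    """Time to move `size_mbytes` megabytes at `rate_mbps` Mbit/s."""
    return size_mbytes * 8 / rate_mbps


print(round(mac_efficiency(11, 5), 2))  # 0.45
print(average_share_mbps(5, 4))         # 1.25
print(transfer_time_s(10, 5))           # 16.0
```

A terminal still transmits each frame at the full peak rate; it is only the long-term average that is divided among the active terminals.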
3.6. Mobility
Terminal mobility between WLAN Access Points (in a centralised architecture) is managed by the terminals themselves, which select the Access Point to associate with depending on transmission conditions. The handoff from one Access Point to another is much closer to a cell "re-selection" scheme than to a network-controlled handover as performed in cellular networks. During this handoff, layer 2 connectivity is re-established. When both Access Points are connected to the same local infrastructure (typically the same IP sub-net in a TCP/IP network), network layer connectivity is maintained. However, when the Access Points belong to different sub-networks, network connectivity cannot be maintained, as the terminal needs to change its IP address and then restart its ongoing applications. In this case, supporting a mobility service requires specific network schemes, such as the implementation of the Mobile IP protocol. This mobility problem is not critical in enterprise networks, which are generally based on a switched Ethernet architecture and use routers only at the boundary with the external Internet. However, the problem may become more significant for a campus-size deployment, where routed network architectures may be encountered.
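The sub-net condition described above can be expressed in a few lines. The Python sketch below uses the standard `ipaddress` module; the function name and the example addresses are hypothetical, and the check is deliberately simplified (it ignores DHCP leases, VLANs and similar practical details).

```python
import ipaddress


def handoff_keeps_ip(terminal_ip: str, old_ap_subnet: str, new_ap_subnet: str) -> bool:
    """True if the terminal's address is valid behind both Access Points.

    When this holds, only layer 2 connectivity has to be re-established;
    otherwise the terminal needs a new IP address (or a scheme such as
    Mobile IP) to keep its sessions alive.
    """
    address = ipaddress.ip_address(terminal_ip)
    old_net = ipaddress.ip_network(old_ap_subnet)
    new_net = ipaddress.ip_network(new_ap_subnet)
    return address in old_net and address in new_net


# Both Access Points bridged onto the same sub-net: connectivity is kept.
print(handoff_keeps_ip("192.168.1.42", "192.168.1.0/24", "192.168.1.0/24"))  # True
# Access Points on different sub-nets: the address is no longer valid.
print(handoff_keeps_ip("192.168.1.42", "192.168.1.0/24", "192.168.2.0/24"))  # False
```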
3.7. Security
Even if it is possible to implement security schemes in higher layers (e.g. by using the IPsec protocol at the network layer or end-to-end security at the application layer), the wireless link should not introduce security weaknesses into the communication system. Indeed, concerns about protecting data against eavesdropping and protecting the network against unauthorised access hold back the development of WLAN systems, particularly since radio propagation does not confine waves to the user's private premises. The security functions defined for WLAN systems are:
• authentication, preventing non-authorised terminals from accessing the network;
• encryption, preventing radio eavesdropping.
One of the main issues is to support a secure system for key generation and exchange between terminals and the security manager (the Access Point or a centralised network server). This is necessary because ciphering keys may be broken if they are not revoked regularly. Two approaches are possible:
• Use of secret keys: each terminal owns a secret key, known only to itself and the network, which is used as the basis for authentication and ciphering. The issue is then to provide a secure scheme for distributing secret keys to the terminals and to all the network elements involved in the authentication and ciphering mechanisms.
• Use of a combination of public and secret keys: this approach allows a terminal or network element to "publish" a key that another party can use to encrypt data, which only the publisher of the key can decrypt with an associated secret key. Since it requires more CPU, this type of mechanism is generally used only for authentication protocols and for exchanging the secret keys used for data encryption. This type of scheme supports regular revocation of temporary secret keys.
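The second approach can be illustrated with a toy example in which a published key protects the exchange of a temporary secret key, and the secret key then protects the data. The Python sketch below is purely illustrative and insecure: it uses textbook RSA with very small numbers and a hash-based XOR keystream, neither of which is specified by any WLAN standard, and all names are hypothetical.

```python
import hashlib

# Toy RSA key pair built from textbook primes (far too small for real use).
P, Q = 61, 53
N = P * Q    # public modulus, 3233
E = 17       # public exponent, "published" by the key owner
D = 2753     # private exponent: (E * D) % ((P - 1) * (Q - 1)) == 1


def wrap_session_key(session_key: int) -> int:
    """Sender side: encrypt a temporary secret key with the published key."""
    return pow(session_key, E, N)


def unwrap_session_key(wrapped: int) -> int:
    """Owner side: recover the secret key with the private exponent."""
    return pow(wrapped, D, N)


def xor_cipher(data: bytes, session_key: int) -> bytes:
    """Cheap symmetric step: XOR data with a keystream derived from the key.

    Applying the function twice with the same key returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{session_key}:{counter}".encode()).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))


session_key = 1234                       # temporary secret key (must be < N)
wrapped = wrap_session_key(session_key)  # sent over the air
recovered = unwrap_session_key(wrapped)  # only the key owner can do this
ciphertext = xor_cipher(b"hello WLAN", session_key)
print(xor_cipher(ciphertext, recovered))  # b'hello WLAN'
```

The costly public-key operation is performed once per session key, while the bulk data uses the cheap symmetric step, which is the division of labour described above. Revoking a temporary key simply means wrapping and sending a new one.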
4. Standards
4.1. IEEE 802.11 standards
As already mentioned, IEEE WLAN standards are specified by the 802.11 technical group. The initial 802.11 standard was published in 1997 [IEEE 802.11, 1997] to operate at 1 or 2 Mbit/s in the 2.4 GHz ISM band. Since 1997, several extensions have been published: 802.11b specifies higher bit rates, allowing transmission at up to 11 Mbit/s, whereas 802.11a specifies a new physical layer able to operate in the 5 GHz band. Other sub-groups are working on extensions for specific issues such as the introduction of Quality of Service policies.
4.1.1. Architecture
Both types of architecture introduced in section 3 are supported by the 802.11 standards:
• In "Infrastructure" mode, each cell is managed by an Access Point, which controls the association and disassociation of the terminals belonging to the cell. Access Points also perform filtering and bridging between the wireless and the wired domains.
• In "ad-hoc" mode, the terminals in a given radio environment share the physical medium.
The 802.11 standard defines the following subsystems (see Figure 12.2):
• The Basic Service Set (BSS) is the basic building block of an 802.11 network. It can be seen as the equivalent of a cell in a cellular network. In ad-hoc mode, the wireless network is built as an "Independent BSS". In infrastructure mode, each BSS is connected to the infrastructure through an Access Point.
• The Distribution System is the wired infrastructure that interconnects Basic Service Sets. It allows stations to move between BSSs and can rely on any wired LAN technology. The 802.11 standards do not describe the protocols to be used between Access Points through the Distribution System.
When a station enters a Distribution System, it has to associate itself with an Access Point before any data transfer. It is also recommended to use an authentication scheme to authenticate stations before any association. Once associated, the station can move through the different BSSs of the given Distribution System using a re-association scheme with the relevant Access Points (the "roaming" function described in the 802.11 standards).
Figure 12.2 Architecture of the Distribution System IEEE 802.11
4.1.2. The IEEE 802.11 MAC layer
The 802.11 MAC layer provides an asynchronous data service compatible with classical LLC entities, as in wired LANs. Security services are also provided by the WEP (Wired Equivalent Privacy) mechanism. Whatever the underlying physical layer (802.11, 802.11a or 802.11b), the MAC layer remains the same.
Three types of MAC frames are defined:
• Control frames are used to control the transmission (e.g. acknowledgements, short signalling preceding a frame transmission, polling).
• Management frames are used for signalling purposes (e.g. Beacon, Authentication/Deauthentication, Association, Reassociation, Disassociation).
• Data frames are used to deliver data payloads.
The basic medium access scheme allows the stations to access the medium in a distributed way with contention resolution. It is standardised as the Distributed Coordination Function (DCF). Optionally, an Access Point can regularly use the Point Coordination Function (PCF) to pre-empt the medium temporarily in order to provide access in a controlled and contention-free manner. Medium access in the DCF relies on a specific scheme named CSMA/CA (Carrier Sense Multiple Access/Collision Avoidance). This scheme uses a "listen before talk" mechanism in order to wait for an idle channel before starting any transmission. It is derived from the CSMA/CD scheme used in Ethernet and 802.3. The main difference is that, unlike wired Ethernet stations, wireless stations cannot detect collisions in real time, as they can only either send or receive at any one time (half-duplex scheme). CSMA/CA has therefore been designed to avoid collisions as far as possible. The scheme can be divided into different phases (see Figure 12.3):
• While a station is transmitting, the other stations that have pending frames defer access until the end of the current transmission plus a given inter-frame time. They can then enter the contention resolution mode in order to compete for access to the physical medium.
• In the contention resolution phase, each station that has frames to transmit senses the medium for a given slotted Backoff Window. The first time a station enters the contention phase for a given MAC frame, the Backoff Window is calculated through a random function. Once its Backoff Window reaches zero, the station enters the transmission phase. The other stations, whose Backoff Windows are not yet zero, notice that the medium is in use and stop decrementing their backoff, deferring access until the next contention phase.
Data frames have to be acknowledged once they are correctly received. This is done through short ACK frames sent just after the reception of a valid frame. When a data frame is not acknowledged, the sending station considers that a collision has occurred. In a system with hidden nodes (e.g. stations from the same BSS that are not able to hear each other), collisions may occur very frequently. In order to reduce this problem, the standard optionally supports the exchange of short control frames before the transmission of a complete data frame. Once the sender gets access to the medium, it first sends an RTS (Request To Send) frame and waits to receive a CTS (Clear To Send) frame before starting the transmission of the data frame.
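The contention resolution phase of the DCF, in which backoff counters are decremented while the medium is idle and frozen while it is busy, can be illustrated with a small deterministic sketch. The Python code below is a simplification under stated assumptions: the backoff values are given explicitly instead of being drawn at random, stations whose counters expire in the same slot are simply reported as a collision (a real station would draw a new backoff and retry), and the function name is hypothetical.

```python
from typing import Dict, List


def contention_order(backoffs: Dict[str, int]) -> List[str]:
    """Return the order in which stations gain access to the medium.

    `backoffs` maps a station name to its initial backoff, in slots.
    In each contention phase the smallest counter expires first; the
    other stations keep (freeze) whatever remains of their counters
    and resume counting in the next phase.
    """
    remaining = dict(backoffs)
    order = []
    while remaining:
        elapsed = min(remaining.values())
        winners = sorted(s for s, slots in remaining.items() if slots == elapsed)
        # Several winners in the same slot means their frames collide.
        order.append("+".join(winners))
        for station in winners:
            del remaining[station]
        for station in remaining:
            remaining[station] -= elapsed  # frozen residual backoff
    return order


print(contention_order({"A": 3, "B": 5, "C": 9}))  # ['A', 'B', 'C']
print(contention_order({"A": 4, "B": 4, "C": 7}))  # ['A+B', 'C']
```

In the first call, station B only has two slots left to wait after A's transmission, which shows how deferring stations are not penalised for having lost an earlier contention phase.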
To enter the centralised PCF coordination mode, the Access Point first has to pre-empt the medium, which it does by using a shorter waiting interval than the one used in DCF mode. It then polls the stations that are authorised to transmit. Once the point coordination period ends, the system returns to DCF mode. In order to support the mechanisms described above, three "InterFrame Spaces" are defined in the standard:
• The DIFS (DCF InterFrame Space) is the interval to be waited at the end of a frame transmission before considering the medium "idle";
• The PIFS (PCF InterFrame Space) is waited by the Access Point before pre-empting the medium to enter PCF mode;
• The SIFS (Short InterFrame Space) is the interval waited between the reception of a data frame and the emission of the corresponding acknowledgement. SIFS is also waited between the reception of the RTS frame and the sending of the CTS answer.
It can be noted that SIFS < PIFS < DIFS.
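The ordering SIFS < PIFS < DIFS follows from the way the 802.11 standard builds the two longer intervals from the SIFS and the slot time: PIFS is one slot longer than SIFS and DIFS is two slots longer. The Python sketch below illustrates this; the numerical values (SIFS of 10 µs and slot time of 20 µs, as used by the DSSS physical layer) are not given in this chapter and are quoted here as an assumption for the example.

```python
from typing import Dict


def interframe_spaces(sifs_us: int, slot_us: int) -> Dict[str, int]:
    """Derive PIFS and DIFS from the SIFS and the slot time (microseconds)."""
    return {
        "SIFS": sifs_us,
        "PIFS": sifs_us + slot_us,      # lets the Access Point pre-empt the medium
        "DIFS": sifs_us + 2 * slot_us,  # waited by ordinary stations in DCF mode
    }


# Assumed DSSS values: SIFS = 10 us, slot time = 20 us.
ifs = interframe_spaces(sifs_us=10, slot_us=20)
print(ifs)  # {'SIFS': 10, 'PIFS': 30, 'DIFS': 50}
```

Because an acknowledgement or a CTS is sent after only a SIFS, it always wins the medium over a polling Access Point (PIFS) or a contending station (DIFS).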
Figure 12.3 CSMA/CA access scheme
The WEP (Wired Equivalent Privacy) algorithm is defined in the 802.11 standard to ensure the confidentiality of data exchanges. It can be complemented by a Shared Key Authentication scheme, which is also standardised. The WEP algorithm is based on a secret key which, concatenated with an Initialisation Vector, gives a seed used as the input of a pseudo-random number generator (the RC4 algorithm) to generate a key sequence of the same length as the data to be transferred. This key sequence is then used to generate the encrypted text, which is transmitted concatenated with the initialisation vector. The way keys are distributed is not standardised; most existing products support manual configuration, but enhanced schemes to generate and distribute session keys dynamically have recently been proposed by different manufacturers.
4.1.3. IEEE 802.11 physical layers
As already mentioned, the IEEE 802.11 group has specified several physical layers for the 2.45 and 5 GHz frequency bands:
• the FHSS (Frequency Hopping Spread Spectrum) physical layer is specified in the initial IEEE 802.11 standard to operate at 1 and 2 Mbit/s in the 2.45 GHz band;
• the DSSS (Direct Sequence Spread Spectrum) physical layer is specified in the initial IEEE 802.11 standard to operate at 1 and 2 Mbit/s, and extended in the IEEE 802.11b specification to operate also at higher bit rates of 5.5 and 11 Mbit/s. It also uses the 2.45 GHz band;
• the multi-carrier COFDM (Coded Orthogonal Frequency Division Multiplexing) physical layer is defined in the IEEE 802.11a standard to permit bit rates between 6 and 54 Mbit/s using the 5 GHz frequency band.
The FHSS specification is based on frequency channels of 1 MHz. Up to 79 channels are available depending on regional regulation (e.g. only 35 channels are authorised in France). Each wireless station and Access Point belonging to a given Basic Service Set uses a standardised frequency hopping sequence, hopping from one frequency channel to the next every 0.4 second. 2GFSK and 4GFSK modulations are used in the 1 Mbit/s and 2 Mbit/s modes respectively.
In the DSSS specification, each bit is encoded as a sequence of 8 or 11 chips, which is then spread over a frequency channel of 22 MHz. Up to 14 frequency channels are identified in the standard. The specified modulation schemes are DBPSK for the 1 Mbit/s rate, DQPSK for the 2 Mbit/s rate and CCK for 5.5 and 11 Mbit/s.
The COFDM specification introduced in the 802.11a standard allows communication at rates of 6, 9, 12, 18, 24, 36, 48 and 54 Mbit/s (only support of the 6, 12 and 24 Mbit/s rates is mandatory). A harmonisation effort between IEEE 802.11 and ETSI BRAN led to the specification of very close physical layers for the 802.11a and HiperLAN/2 standards (see section 4.2.3).
4.1.4. 802.11 standards evolution
The IEEE 802.11 working group is carrying on the standardisation effort through different sub-groups. The main issues concern: harmonisation efforts permitting cooperation or even interworking between North American and European standards; liaison with international spectrum regulation bodies; MAC layer enhancements to introduce quality of service management schemes; specification of an inter-Access Point protocol; specification of a new high-rate physical layer based on COFDM in the 2.45 GHz band (recently approved); the introduction of the dynamic frequency selection and transmit power control algorithms required for use in the 5 GHz band in Europe; and enhanced security schemes.
4.2. ETSI HiperLAN standards
ETSI published the first WLAN standard operating in the 5 GHz frequency bands in 1997. Named HiperLAN (High Performance Radio LAN) Type 1, it enables operation at 23.5 Mbit/s, well above the maximum 2 Mbit/s throughput supported by the 802.11 standards at that time. In spite of its technical assets, the standard did not receive the industrial support necessary for commercial products to be released, and only Dassault Electronique developed advanced prototypes, which demonstrated its efficiency [BERTIN, 2001]. Since 1997, the ETSI BRAN (Broadband Radio Access Networks) project has focused on the standardisation of HiperLAN Type 2 systems [TR101031, 1999]. The first complete set of HiperLAN/2 standards was published in mid-2000 [TS101475, 2000] [TS101761-1, 2000] [TS101761-2, 2000] [TS101493-1, 2000] [TS101493-2, 2000]. Further enhancements are still under discussion in ETSI BRAN, particularly through partnerships with IEEE to carry on standards harmonisation and with 3GPP for interworking between HiperLAN/2 and UMTS third-generation cellular networks, but also within European research programmes [BRAIN, 2000].
4.2.1. HiperLAN/2 reference model and architecture
The HiperLAN/2 reference model includes a PHY (Physical) layer, a DLC (Data Link Control) layer and a Convergence Layer (see Figure 12.4). The role of the Convergence Layer is to adapt the requirements of the protocols above it to the services of the DLC layer. In this way, several Convergence Layers make it possible to support several higher layer protocols while the DLC layer remains the same. In practice, three types of Convergence Layer are already specified:
• the Ethernet convergence layer which, through a packet based sub-layer, adapts Ethernet and 802.3 types of frames to the underlying DLC layer;
• the IEEE 1394 convergence layer, which uses the same packet based sub-layer to support IEEE 1394 types of frames;
• the ATM convergence layer which, through an internal cell based sub-layer, adapts ATM cells.
Later, new Convergence Layers could be standardised in order to support other types of protocols over the radio.
Figure 12.4 HiperLAN/2 protocol stack
The DLC protocol provides a connection-oriented data transport service. It supports the transmission over the radio of fixed-size protocol units of 54 bytes (including the DLC header). A control plane is specified to manage connections, resources and association control. In the lower part of the DLC layer, control and data units are distributed within a MAC frame structure of fixed length (2 ms). MAC frames are transported by the PHY layer through a COFDM (Coded Orthogonal Frequency Division Multiplexing) scheme with link adaptation, permitting the selection of the most adequate bit rate.
Figure 12.5 Comparison between centralised and direct modes transmission
The HiperLAN/2 standard supports centralised architectures in which the Access Point controls the allocation of the radio resource between the active connections with the Mobile Terminals operating in the cell. However, an option has been integrated to permit direct data transfer between terminals belonging to the same cell, which avoids data frames being relayed by the Access Point when this is unnecessary. Even in direct mode, the Access Point keeps its role of central controller and signalling units are still exchanged in the centralised way. Nevertheless, direct mode permits the support of ad-hoc types of networks.
4.2.2. HiperLAN/2 DLC layer
As depicted in Figure 12.4, the DLC layer is sub-divided into: a control plane based on the RLC (Radio Link Control) protocol; a user plane that supports Error Control schemes; and a MAC (Medium Access Control) lower part providing the transport of user and control plane information.
The MAC sub-layer
The MAC protocol is based on a centralised TDD/TDMA scheme. The Access Point controls medium access by allocating time within the frame to active stations in order to manage and serve connections. The MAC sub-layer relies on a fixed-duration frame structure that makes it possible to concatenate different types of information:
• control information broadcast by the Access Point to the Mobile Terminals belonging to its cell (network and Access Point identifiers, MAC frame structure description, radio resource allocation, etc.);
• downlink, i.e. information sent by the Access Point to one or several stations (data and control);
• uplink, i.e. information sent by stations to the Access Point (resource requests, data, etc.).
Figure 12.6 HiperLAN/2 MAC frame structure
The turnaround time between the downlink and uplink phases is fixed dynamically by the Access Point, which allows very flexible resource allocation depending on the needs of asymmetric traffic. The different types of information are grouped in "transport" channels, which correspond to a standardised physical structure. Above the MAC layer, logical channels are introduced to differentiate the types of information coming from the different sub-layers. Logical channels are mapped onto transport channels in a specified way.
The RLC sub-layer
The RLC (Radio Link Control) sub-layer provides three functions: Association Control Function (ACF), Radio Resource Control (RRC) and DLC User Connection Control (DCC). Association control relies on an association procedure initiated by the mobile when it "enters" the network. During this procedure, the Access Point allocates to the mobile a temporary MAC Identifier, to be used only as long as the mobile remains under the control of that Access Point. A mutual authentication scheme between the terminal and the Access Point is provided, as well as encryption functions. Mutual authentication validates not only the identity of the mobile but also that of the Access Point, in order to protect against traffic eavesdropping by false Access Points. The disassociation procedure can be initiated either by the terminal or by the Access Point. The radio resource control function specifies the following procedures:
• Quality measurements recorded at the terminal side and, where appropriate, reported to the Access Point.
• Three types of handover controlled by the terminal: the Sector Handover procedure is used when the terminal moves from one sector to another of a multi-sector Access Point; the Radio Handover is specified for the case where the mobile terminal moves from one Access Point transmitter to another belonging to the same Access Point; the Network Handover supports movement between two Access Points.
• Dynamic Frequency Selection, performed by the Access Point in order to select a free frequency channel, depending on the radio quality measurements carried out by itself as well as those reported by the mobiles associated with it.
• Power Control, which adapts the transmit power to the transmission situation between each terminal and the Access Point (e.g. depending on the distance).
• Power Saving, specified to reduce power consumption when the terminal has no data to send or receive.
The DLC user connection control supports setting up, maintaining, renegotiating and closing DLC connections between a wireless station and an Access Point. These procedures may be initiated by the station as well as by the Access Point. Each connection is given an identifier that is unique in the cell. Multicast and broadcast types of connections can also be established.
The error control protocol
For error control, three modes of operation are provided: the acknowledged mode is based on a selective repeat scheme for unacknowledged protocol units; the repetition mode allows repetition of protocol units (usable for multicast transmissions); the unacknowledged mode provides no repetition of lost or corrupted data. In acknowledged mode, the error control protocol relies on an SR-ARQ (Selective Repeat Automatic Repeat Request) scheme: the sender buffers unacknowledged data packets and, when it receives a negative acknowledgement, retransmits only the corrupted packets. The acknowledgement scheme, named "partial bitmap acknowledgement", is based on the sending of bitmaps in which each bit corresponds to a given protocol unit and indicates whether it is positively (1) or negatively (0) acknowledged.
4.2.3. The HiperLAN/2 physical layer
As introduced in section 4.1.3, the IEEE 802.11a and HiperLAN/2 physical layers were harmonised during the standardisation process and are therefore similar. The specific features of each layer lie in its adaptation to the different MAC approaches, in particular the centralised channel access scheme for HiperLAN/2 and the distributed one for 802.11.
Table 12.3 HiperLAN/2 PHY layer parameters
Useful symbol part duration TU: 64*T = 3.2 µs
Cyclic prefix duration TCP: 16*T = 0.8 µs (mandatory); 8*T = 0.4 µs (optional)
Symbol interval TS (TU+TCP): 80*T = 4.0 µs with the mandatory prefix; 72*T = 3.6 µs with the optional prefix
Number of data sub-carriers NSD: 48
Number of pilot sub-carriers NSP: 4
Total number of sub-carriers NST: 52 (NSD+NSP)
Sub-carrier spacing Δf: 0.3125 MHz (1/TU)
Spacing between the two outmost sub-carriers: 16.25 MHz (NST*Δf)
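The values of Table 12.3 are all derived from a single sampling period T. Since 64*T is given as 3.2 µs, T is 50 ns, i.e. a sampling rate of 20 MHz. The Python sketch below recomputes the table from that value as a consistency check; the function and key names are hypothetical, and exact fractions are used to avoid rounding effects.

```python
from fractions import Fraction
from typing import Dict

SAMPLE_RATE_MHZ = 20                 # implied by 64*T = 3.2 us
T_US = Fraction(1, SAMPLE_RATE_MHZ)  # sampling period: 0.05 us (50 ns)
N_SD, N_SP = 48, 4                   # data and pilot sub-carriers
MAC_FRAME_US = 2000                  # 2 ms MAC frame


def ofdm_parameters(cyclic_prefix_samples: int = 16) -> Dict[str, float]:
    """Recompute the Table 12.3 values for a 16-sample or 8-sample prefix."""
    t_u = 64 * T_US                      # useful symbol part
    t_cp = cyclic_prefix_samples * T_US  # cyclic prefix
    t_s = t_u + t_cp                     # symbol interval
    n_st = N_SD + N_SP                   # total number of sub-carriers
    delta_f = 1 / t_u                    # sub-carrier spacing, in MHz
    return {
        "useful_part_us": float(t_u),
        "cyclic_prefix_us": float(t_cp),
        "symbol_interval_us": float(t_s),
        "subcarrier_spacing_mhz": float(delta_f),
        "outer_spacing_mhz": float(n_st * delta_f),
        "symbols_per_mac_frame": float(MAC_FRAME_US / t_s),
    }


print(ofdm_parameters()["symbol_interval_us"])     # 4.0
print(ofdm_parameters()["outer_spacing_mhz"])      # 16.25
print(ofdm_parameters()["symbols_per_mac_frame"])  # 500.0
```

With the mandatory prefix, the 2 ms MAC frame spans exactly 500 symbol intervals of 4 µs.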
The HiperLAN/2 physical layer provides a transport service for "protocol data unit trains". These trains are formed through the concatenation of transport channels that are either broadcast to all terminals (generic control data for the cell and the frame, broadcast user data) or individual (data and signalling destined for each mobile). In the uplink, protocol unit trains transport the data and signalling of each mobile as well as the random channel access requests. The PHY layer is based on multi-carrier COFDM modulation. Each OFDM symbol contains data and pilot carriers: of the 64 sub-carriers, 48 are reserved for data and 4 for pilots. The remaining sub-carriers (11 guard sub-carriers and the centre sub-carrier) carry no signal. Numerical values for the OFDM parameters are given in Table 12.3. The emission of a complete MAC frame (2 ms) corresponds to 500 OFDM symbols, excluding guard intervals and turnaround time between uplink and downlink transmission.
Table 12.4 PHY layer transmission modes
(NBPSC: coded bits per sub-carrier; NCBPS: coded bits per OFDM symbol; NDBPS: data bits per OFDM symbol)
Modulation   Coding rate R   Nominal bit rate [Mbit/s]   NBPSC   NCBPS   NDBPS
BPSK         1/2             6                           1       48      24
BPSK         3/4             9                           1       48      36
QPSK         1/2             12                          2       96      48
QPSK         3/4             18                          2       96      72
16QAM        9/16            27                          4       192     108
16QAM        3/4             36                          4       192     144
64QAM        3/4             54                          6       288     216
As already introduced, seven different modes are supported in order to adapt dynamically to the radio situation of each mobile terminal (interference level, distance to the Access Point, etc.). These modes and the corresponding parameters are given in Table 12.4.
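The nominal bit rates of Table 12.4 follow from the modulation, the coding rate and the 4 µs symbol interval of Table 12.3. The Python sketch below is an illustrative recomputation; the names are hypothetical and exact fractions are used so that the results can be compared with the table without rounding.

```python
from fractions import Fraction

DATA_SUBCARRIERS = 48   # N_SD from Table 12.3
SYMBOL_INTERVAL_US = 4  # T_S with the mandatory cyclic prefix

# (modulation, coded bits per sub-carrier, coding rate) for the seven modes.
MODES = [
    ("BPSK", 1, Fraction(1, 2)),
    ("BPSK", 1, Fraction(3, 4)),
    ("QPSK", 2, Fraction(1, 2)),
    ("QPSK", 2, Fraction(3, 4)),
    ("16QAM", 4, Fraction(9, 16)),
    ("16QAM", 4, Fraction(3, 4)),
    ("64QAM", 6, Fraction(3, 4)),
]


def nominal_rate_mbps(bits_per_subcarrier: int, coding_rate: Fraction) -> Fraction:
    """Nominal PHY bit rate in Mbit/s (bits per microsecond)."""
    coded_bits_per_symbol = DATA_SUBCARRIERS * bits_per_subcarrier  # N_CBPS
    data_bits_per_symbol = coded_bits_per_symbol * coding_rate      # N_DBPS
    return data_bits_per_symbol / SYMBOL_INTERVAL_US


for name, bits, rate in MODES:
    print(name, rate, nominal_rate_mbps(bits, rate))
# Last line printed: 64QAM 3/4 54
```

Link adaptation then amounts to choosing, for each terminal, the fastest of these modes that the current radio conditions can sustain.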
4.3. Bluetooth
Bluetooth technology was developed in the Bluetooth SIG (Special Interest Group). As introduced in section 2, Bluetooth was initially specified for network applications limited in terms of capacity (1 Mbit/s), range (10 m) and power consumption. This results in a standard intended more for WPAN (Wireless Personal Area Network) applications than for WLAN. However, in order to position this technology better with regard to Wireless Local Area Networks, its main technological options are introduced below. Bluetooth uses the 2.45 GHz frequency band. In order to protect against interference from other systems operating in the same band, a "fast" frequency hopping physical layer technology is adopted (up to 1600 hops per second). The GFSK type of modulation supports a bit rate of 1 Mbit/s. The protocol uses time division duplex and a centralised architecture in which a "master" station shares the use of the radio resource in time between "slave" stations. Thus, a slave station cannot access the medium until it receives a "polling" packet from the master. The master station and its associated slaves constitute a Bluetooth network, called a "piconet". Several packet formats are defined: ACL (Asynchronous Connection-Less) packets are suited to data transfer, whereas SCO (Synchronous Connection-Oriented) packets support synchronous streams such as voice. Each packet can use 1, 3 or 5 time slots of 625 µs each, the system changing frequency between the emission of two packets. Figure 12.7 illustrates the principle of resource sharing in "point-to-multipoint" mode between SCO and ACL packets.
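The slot duration and the hop rate quoted above are two views of the same figure: 1600 hops per second corresponds to one hop every 625 µs. The Python sketch below illustrates this relationship and its consequence for multi-slot packets, assuming, as stated in the text, one frequency change between two packets; the function names are hypothetical.

```python
HOPS_PER_SECOND = 1600
SLOT_US = 1_000_000 / HOPS_PER_SECOND  # 625.0 us per time slot
ALLOWED_SLOTS = (1, 3, 5)              # packet lengths defined in the text


def packet_duration_us(slots: int) -> float:
    """Air time reserved for a packet occupying 1, 3 or 5 slots."""
    if slots not in ALLOWED_SLOTS:
        raise ValueError("a packet uses 1, 3 or 5 time slots")
    return slots * SLOT_US


def effective_hop_rate(slots: int) -> float:
    """Hops per second if every packet has the given length."""
    return 1_000_000 / packet_duration_us(slots)


print(SLOT_US)                # 625.0
print(packet_duration_us(5))  # 3125.0
print(effective_hop_rate(5))  # 320.0
```

The maximum rate of 1600 hops per second is therefore only reached when every packet occupies a single slot.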
Figure 12.7 Resource sharing in a Bluetooth piconet
Lastly, two transmission powers are supported: 100 mW, allowing a range of about 15 metres, and 1 mW, limiting the range to a maximum of 5 metres. Within a piconet, up to 8 nodes can operate and a maximum of three voice and seven data communications can be supported.
5. Conclusion
Having appeared on the market in recent years, WLAN products are undergoing growing technical and commercial development. It is however necessary to face the emergence of several standards, which may be positioned as competing or complementary. For personal network applications built around a cellular terminal, Bluetooth technology is well positioned to develop quickly, and a large number of mass-market terminals have already been announced (phones, digital recorders, digital cameras, PCs, printers, etc.). For local area network applications, 802.11b products are already widespread, but new generations such as 802.11a and HiperLAN/2 are being implemented by manufacturers and should lead to the first very high bit rate products early in 2002. Currently, these technologies are confined to the professional sphere, but they could develop quickly in domestic networks and above all as high-rate wireless Internet access points in public places (already widely used in the United States). They are also a natural support for the needs of communicating objects, either in personal ad-hoc networks or as an interface with telecommunication networks. Hence, with the integration of new types of interfaces and software solutions for service discovery and selection, objects and users will be able to access, in total transparency, communication services dynamically adapted to the environment (connectivity, exchanges and automatic synchronisation with terminals present in the vicinity). We should then witness, in the coming years, the appearance of a large number of uses of these technologies, the most surprising probably still to be invented.
6. References
[BERTIN, 2001] Philippe Bertin and Regis Cady, "A trial of Home Applications over Hiperlan type 1", PIMRC 2001, London.
[BRAIN, 2000] IST project BRAIN, Deliverable 3.1, "Technical requirements and identification of necessary enhancements for HIPERLAN Type 2", Sept. 2000.
[IEEE 802.11, 1997] IEEE 802.11, "Standard for Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specification", 1997.
[TR101031, 1999] ETSI TR 101 031 V2.2.1 (1999-01), "Broadband Radio Access Networks (BRAN); High PErformance Radio Local Area Network (HIPERLAN) Type 2; Requirements and architectures for wireless broadband access".
[TS101475, 2000] ETSI TS 101 475 V1.1.1 (2000-04), "Broadband Radio Access Networks (BRAN); HIPERLAN Type 2; Physical (PHY) layer".
[TS101761-1, 2000] ETSI TS 101 761-1 V1.1.1 (2000-04), "Broadband Radio Access Networks (BRAN); HIPERLAN Type 2; Data Link Control (DLC) Layer; Part 1: Basic Data Transport Functions".
[TS101761-2, 2000] ETSI TS 101 761-2 V1.1.1 (2000-04), "Broadband Radio Access Networks (BRAN); HIPERLAN Type 2; Data Link Control (DLC) Layer; Part 2: Radio Link Control (RLC) sublayer".
[TS101493-1, 2000] ETSI TS 101 493-1 V1.1.1 (2000-04), "Broadband Radio Access Networks (BRAN); HIPERLAN Type 2; Packet Based Convergence Layer; Part 1: Common Part".
[TS101493-2, 2000] ETSI TS 101 493-2 V1.1.1 (2000-04), "Broadband Radio Access Networks (BRAN); HIPERLAN Type 2; Packet Based Convergence Layer; Part 2: Ethernet Service Specific Convergence Sublayer".
7. Glossary
ACL: Asynchronous Connection-Less. Asynchronous packets transmitted in non-connected mode in Bluetooth systems (support).
AP: Access Point. Wireless access point; it generally implements a bridge function between a WLAN radio cell and a classic wired local network.
ARQ: Automatic Repeat Request. Retransmission scheme for lost protocol units.
ATM: Asynchronous Transfer Mode.
BRAN: Broadband Radio Access Networks. ETSI group specifying broadband radio systems.
BSS: Basic Service Set. Radio cell in the IEEE 802.11 standard.
CSMA/CA: Carrier Sense Multiple Access/Collision Avoidance. Channel access method based on a listen-before-talk mechanism combined with a collision avoidance method. Used in IEEE 802.11 WLAN.
CSMA/CD: Carrier Sense Multiple Access/Collision Detection. Channel access method based on a listen-before-talk mechanism with a collision resolution method. Used in Ethernet local area networks.
CTS: Clear To Send. Short signalling frame sent in answer to the reception of an RTS frame in an IEEE 802.11 system.
DCC: DLC Connection Control. Connection control function in the DLC layer of a Hiperlan/2 system.
DCF: Distributed Coordination Function. IEEE 802.11 transmission mode giving each station the same probability of medium access.
DECT: Digital European Cordless Telecommunications.
DIFS: Distributed (DCF) Interframe Space. Minimum time interval between the emission of two data frames in an IEEE 802.11 system.
DLC: Data Link Control. Link control layer.
Domestic network: communication network used at the scale of a home.
EIRP: Effective Isotropic Radiated Power.
Ethernet: communication protocol used in local area networks.
ETSI: European Telecommunications Standards Institute.
Hiperlan: High PERformance Radio Local Area Network. European WLAN standards specified by ETSI.
IEEE: Institute of Electrical and Electronics Engineers.
IEEE 802.11: IEEE WLAN standardisation group.
ISM: Industrial, Scientific and Medical.
LAN: Local Area Network.
MAC: Medium Access Control.
OFDM: Orthogonal Frequency Division Multiplex. Multi-carrier modulation technique.
OSI: Open Systems Interconnection.
OSI layer: layer defined in the ISO model.
PCF: Point Coordination Function. IEEE 802.11 WLAN transmission mode in which an access point controls the sharing of the communication medium.
PIFS: Polling Interframe Space. Time interval specified in IEEE 802.11 systems between the transmission of a frame and the preemption of the radio medium by an access point working in PCF mode.
RLAN: Radio Local Area Network. Another name for WLAN.
RLC: Radio Link Control. Radio control function.
RRC: Radio Resource Control. Radio resource control functions in Hiperlan/2 systems.
RTS: Request To Send. Short signalling frame that can be sent before the transmission of a data frame in IEEE 802.11 systems in order to verify radio connectivity.
SCO: Synchronous Connection Oriented. Synchronous packet transmission in Bluetooth systems (support of voice).
SIFS: Short Interframe Space. Time interval specified in IEEE 802.11 systems between the transmission of a data frame and the corresponding acknowledgement.
TDD: Time Division Duplex.
TDMA: Time Division Multiple Access.
UMTS: Universal Mobile Telecommunication System. Third-generation cellular networks.
WEP: Wired Equivalent Privacy. IEEE 802.11 security functions.
WLAN: Wireless Local Area Network.
WPAN: Wireless Personal Area Network.
Chapter 13
Radio Links in the Millimeter Wave Band Nadine Malhouroux-Gaffet, Olivier Veyrunes, Valery Guillet, Lionel Chaigneaud and Isabelle Siaud France Telecom R&D/DMR, Belfort, France and France Telecom R&D/DMR, Rennes, France
1. Introduction
Communicating objects attract much interest, both for the creation of new services and for the new techniques they implement. The latter are based on wireless communication systems that combine simple set-up with great ease of use. Wireless local area networks (WLAN) are indispensable for communicating objects, and their expansion creates a growing need for research and development on these systems, in particular to increase data rates. This growing need for data exchange, together with pressure on scarce spectrum resources, led the manufacturers of WLAN systems to investigate the millimeter frequency bands (30 to 300 GHz). At such frequencies, the available bandwidth allows high data rates of about 120 Mb/s to be reached. However, outdoor links above 10 GHz are affected by atmospheric and weather conditions, which weigh significantly on the link budget. In this context, FTR&D initiated a study to simulate the loss due to hydrometeors (rain, snow, etc.) above 30 GHz, which is the validity limit of the ITU-R models. For local area networks, the band around 60 GHz has interesting characteristics for optimising the deployment of indoor systems. Strong attenuation in open space, which reduces interference between cells, together with a large available bandwidth, will be a major asset for future high data rate indoor systems.
2. Outdoor radio links in the millimeter wave band
At frequencies above 10 GHz, the achievable transmission rate is appreciably higher, but some natural factors, such as atmospheric gases and precipitation, strongly affect propagation. In particular, interactions between electromagnetic waves and hydrometeors, such as rain, hail and snow, produce attenuation through energy absorption in the particles and energy scattering in all directions. The designers of high frequency communication systems therefore need propagation specialists to predict these attenuation effects, in order to determine adequate fading margins and to guarantee reliable, predetermined signal levels under different weather conditions. An overestimation of the propagation effects can result, on the one hand, in a very expensive system design, because of the high price of the devices used to limit the disturbing effects of propagation and, on the other hand, in interference with other services. Conversely, an underestimation of these propagation effects can lead to unreliable systems. Propagation studies should therefore be as accurate as possible in order to satisfy, in a statistical sense, the quality criteria of operational radio links. For some years, the international community has made significant efforts to update the radio climatic databases from which global propagation predictions are established for terrestrial radio links in the millimeter wavelength band (30-300 GHz). This work was the subject, among others, of the CLIMPARA conferences organised by Commission F of URSI, and of European (COST) and Canadian projects. Many countries combined their efforts to establish recommendations on the propagation of radio waves in non-ionised media, under the aegis of the International Telecommunication Union, Radiocommunication Sector (ITU-R).
However, these global prediction techniques are known to suffer from a serious lack of propagation observations in various parts of the world. This is why FTR&D decided to establish an experimental setup (Figure 13.1) near Belfort in France. This experiment should contribute to a detailed knowledge of the physical mechanisms of atmospheric and meteorological phenomena and of their interactions with electromagnetic waves. The purpose of the setup is to supply propagation data over a short range for which the weather conditions are almost constant.
KEY: pluie = rain; neige = snow; grele = hail; spectropluviometre = spectro-pluviometer (disdrometer); capteur ... = identification hydrometeor, identification and temperature sensors; station meteorologique = meteorological station; pluviometre a auget = tipping bucket rain gauge
Figure 13.1 Belfort experimental station set up to supply propagation data over short range
The experimental device consists of multifrequency transmission links over an 800-meter terrestrial path. The propagation path is horizontal, in direct visibility, about 5 m above the ground. Four dynamic narrow band links operate at 30, 50, 60 and 94 GHz, with vertical polarisation. Meteorological equipment, set out all along the path, includes three tipping bucket rain gauges to measure rainfall rates and their equivalent for other hydrometeors, two disdrometers to measure raindrop size distributions, two identification sensors to characterise the kind of precipitation and a meteorological station to measure temperature, atmospheric pressure, humidity, wind speed and wind direction. Analysis of this large radioelectrical and meteorological database led to the development of models for the attenuation due to hydrometeors, and thus meets the new needs of system designers.
3. Rain attenuation
One of the most fundamental aspects of rain attenuation is the relationship between the linear (specific) attenuation $A$ (dB/km) and the rainfall rate $R$ (mm/hour). The ITU-R rain attenuation model used in prediction methods (Rec. ITU-R P.838, 1999) recommends the following relationship:

$$A = k R^{\alpha}$$

where the coefficients $k$ and $\alpha$ are functions of the frequency. This relationship is attractive for its simplicity and its accuracy over the attenuation ranges most common in practical applications. The experiment showed that the values of $k$ and $\alpha$ recommended by the ITU-R are respectively underestimated and overestimated above 30 GHz for the climate of the north-east region of France. Other authors have already observed these results in other locations of the world. The ITU-R model tends to overestimate the rain attenuation above 50 GHz for rainfall rates exceeding approximately 20-25 mm/hour. Figure 13.2 compares the linear attenuation (dB/km) at 30, 50, 60 and 94 GHz, modelled from the measurements, as a function of the rainfall rate (mm/hour). It clearly shows that rain attenuation increases with the rainfall rate. Furthermore, for a given rainfall rate, rain attenuation is greater at higher frequencies, at least in the 30-100 GHz band.
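As an illustration of the power-law relationship between linear attenuation and rainfall rate, the short Python sketch below evaluates $A = k R^{\alpha}$ and scales it to a path length. The function names and the coefficient values are assumptions made for this example only; they are placeholders, not the ITU-R coefficients nor the values fitted to the Belfort measurements.

```python
def specific_rain_attenuation(rain_rate_mm_h: float, k: float, alpha: float) -> float:
    """Linear attenuation A = k * R**alpha, in dB/km."""
    if rain_rate_mm_h < 0:
        raise ValueError("rain rate must be non-negative")
    return k * rain_rate_mm_h ** alpha


def path_rain_attenuation(rain_rate_mm_h: float, k: float, alpha: float,
                          path_km: float) -> float:
    """Total attenuation in dB, assuming uniform rain along the path."""
    return specific_rain_attenuation(rain_rate_mm_h, k, alpha) * path_km


# Placeholder coefficients (hypothetical): NOT taken from Rec. ITU-R P.838
# and NOT the values derived from the experiment described in the text.
K_EXAMPLE = 0.2
ALPHA_EXAMPLE = 1.0

if __name__ == "__main__":
    for rate in (5.0, 10.0, 25.0):
        loss = path_rain_attenuation(rate, K_EXAMPLE, ALPHA_EXAMPLE, path_km=0.8)
        print(f"R = {rate:4.1f} mm/h -> {loss:.2f} dB over 0.8 km")
```

The uniform-rain assumption is reasonable only for short paths such as the 800 m link described above, where weather conditions are almost constant.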
Figure 13.2 Comparison of modelled rain attenuation (dB/km) at 30, 50, 60 and 94 GHz
These studies represent only a sample of the many conducted in this domain. This vast programme gave an understanding of the complexity of propagation phenomena in natural environments and made it possible to establish the statistical distributions of the parameters involved in propagation, as needed by designers. These statistical studies provide reliable transmission support for high data rate systems over short radio links.

4. Millimeter wave indoor LAN

4.1. Introduction
In recent years, the increasing use of applications requiring high-speed data exchange has led to significant development of local area networks (LAN). At the same time, progress in wireless communication systems has led to the development of WLANs, which are very attractive because of their layout flexibility and the portability of their terminals. Currently, the two most advanced standards (Hiperlan 2 and IEEE 802.11), working at frequencies of about 2.4 or 5 GHz, offer data rates of about 20 and 50 Mbit/s, which remain lower than those of cable networks. In order to exceed these limits and reach data rates higher than 100 Mb/s, the 60 GHz band is currently being studied. This frequency band is particularly well suited to local area networks. The strong attenuation of the waves at this frequency ensures relative isolation between rooms and from the outside. This makes it possible to use cells delimited by the partitions of the rooms, and it facilitates deployment engineering. The small size of antennas working at 60 GHz is also an advantage compared with lower frequencies. Moreover, the bandwidth available at 60 GHz is a major asset compared with lower frequencies and makes it possible to reach rates higher than 100 Mb/s.

4.2. Study of the propagation channel
Because of multipath (multiple reflections from the ground, the walls, furniture, etc.), the received signal is composed of several differently attenuated and delayed components.
This behaviour translates, in the frequency domain, into a different attenuation of the various spectral components: the propagation channel behaves like a frequency-selective filter. It is thus necessary to optimise the system according to the propagation conditions. The deployment of a wireless indoor system must therefore obey two major constraints:
• To ensure controlled radio coverage of the building, in order to optimise the use of the frequency band and the cost of the infrastructure.
• To know the selectivity of the channel, which allows the optimisation of the system parameters (modulation, coding, etc.) in order to reach the target rate with a minimal error rate. The preliminary estimation of the data rates and error rates required by the envisaged services is an essential stage.
4.3. Statistical study of the propagation channel
The first stage consists in establishing the link budget of a typical connection as a function of the transmitter-receiver distance and of the environment. The statistical study of the channel was carried out from measurements taken in various kinds of environments: offices for the RLAN and residential environments for home applications (multimedia). Line-of-sight (LOS) measurements (Figure 13.3) show a loss that is proportional to the logarithm of the distance. In non-line-of-sight (NLOS) conditions, on the other hand, this law of loss as a function of distance is not followed (Figure 13.4).
Figure 13.3 Propagation loss in LOS condition [GUILET, 2001]
Figure 13.4 Propagation loss in NLOS condition [GUILET, 2001]
On the other hand, taking into account the number of partitions crossed gives significant results. Figure 13.5 presents the measured loss in excess of free-space loss as a function of the number of partitions crossed. The values are about 19 dB for one (plasterboard) partition and 27 dB for two partitions. These high values show that a propagation model integrating the number of partitions crossed will perform better.
Figure 13.5 Influence of the number of partitions crossed by the direct path on the propagation loss [1]
The estimated link budget therefore cannot be based on the direct path alone: it needs to take into account additional elements such as the interior architecture and the propagation phenomena (reflection, transmission, diffraction) that a ray model, based on geometrical optics, can simulate.
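The partition losses reported in this section can be folded into a very simple link-budget sketch. The Python example below adds the measured excess loss for zero, one or two plasterboard partitions to the free-space loss at 60 GHz. The function names, the table structure and the choice of the standard free-space formula $20\log_{10}(4\pi d/\lambda)$ as the baseline are assumptions made for illustration; this is not the authors' ray model, and it ignores reflections and diffraction.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

# Excess loss over free space, in dB, per number of plasterboard partitions
# crossed by the direct path (values quoted in the text for 60 GHz).
PARTITION_EXCESS_DB = {0: 0.0, 1: 19.0, 2: 27.0}


def free_space_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space loss 20*log10(4*pi*d/lambda), in dB."""
    wavelength_m = C / frequency_hz
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength_m)


def indoor_loss_db(distance_m: float, n_partitions: int,
                   frequency_hz: float = 60e9) -> float:
    """Direct-path loss: free-space loss plus tabulated partition excess loss."""
    if n_partitions not in PARTITION_EXCESS_DB:
        raise ValueError("no measured excess loss for this number of partitions")
    return free_space_loss_db(distance_m, frequency_hz) + PARTITION_EXCESS_DB[n_partitions]


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(f"{n} partition(s), 5 m: {indoor_loss_db(5.0, n):.1f} dB")
```

Refusing partition counts outside the measured range, rather than extrapolating, keeps the sketch within what the measurements support.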
4.4. The propagation model
Modelling the propagation at 60 GHz in an indoor environment has two distinct interests:
• To help deploy such systems in different environments and to estimate the cost of their coverage.
• To generate a channel model that can be used in the simulated transmission system and allows the parameter setting of the system to be optimised.
A 3D ray-tracing model has been developed and adjusted at 60 GHz for an indoor environment. Coverage examples for different reception heights (antennas on a PC, or mounted high up) are shown in Figure 13.6.
Figure 13.6 Radio coverage with 2m (a), 1.5m (c) and 0.5m (d) above the ground for the configuration (b)
A study of the model parameters allows the number of reflections to be computed to be optimised when simulating indoor propagation at 60 GHz [CHAIGNEAUD, 2001]. With this tool, a propagation channel model can be generated in order to test various link configurations without having to carry out the corresponding measurements.

4.5. System study
The simulated system study was carried out together with the characterisation of the channel, in order to adapt the system to the propagation conditions in this band. The objective is to define the physical layer best suited to reaching data rates higher than 120 Mb/s.

4.5.1. Multi-carrier techniques
Multi-carrier transmission techniques such as COFDM were developed for various wireless systems such as DAB and HIPERLAN. Different multi-carrier techniques were used to evaluate the performance of the physical layer for 60 GHz RLAN systems. MC-CDMA techniques, introduced in 1993, were compared with COFDM modulation. MC-CDMA, in particular, combines the advantages of COFDM modulation (multipath robustness) with those of spread spectrum techniques (high capacity). On the other hand, it increases the complexity of the receiver.

4.5.2. Results
The first results in a multi-carrier configuration in "office" and "residential" environments are promising, in spite of the unoptimised parameter setting. The results in Figure 13.7 show the error rate obtained as a function of the useful bit energy, which is proportional to the system's signal-to-noise ratio. These curves were simulated for two configurations:
• LOS: transmitter and receiver are in visibility.
• NLOS: transmitter and receiver are masked by one or more elements of the interior architecture (partitions, furniture, etc.).
For each of these configurations (LOS and NLOS), the tests were made with "typical" files corresponding to propagation channels with average wideband characteristics. These first results show that, for an error rate of $10^{-2}$, the signal-to-noise ratio needs to be twice as high in non-visibility (NLOS) as in visibility (LOS). The optimisation of the parameter setting of the COFDM chain, currently in progress, will improve these results. Moreover, these first curves suggest that the use of multi-input/output techniques in non-visibility should be considered. The degradation induced by the RF stages is particularly penalising for COFDM modulation, and the need to evaluate a broad range of system solutions also led to the study of single-carrier solutions integrating turbo-equalisation algorithms at reception [HELARD, 2001].
Figure 13.7 Evaluation of the error rate according to various favourable and unfavourable positions in residential environment [SIAUD, 2001]
5. Conclusions
These studies gave an understanding of the complexity of propagation phenomena in the millimeter bands. Thanks to the outdoor studies, prediction models for the losses due to hydrometeors were developed in order to offer reliable support for high data rate, short-range transmission in the millimeter band. The characterisation and modelling of the indoor propagation channel at 60 GHz make it possible to simulate systems working at data rates higher than 100 Mb/s. With systems working in the millimeter bands, the prospect of doubling or even tripling the data rates of current systems offers new service possibilities in many fields (multimedia, professional, medical, video) for communicating objects.
6. References
[GUILET, 2001]: V. Guillet, "Narrowband and wideband characteristics of 60 GHz radio propagation in residential environment", Electronics Letters, 2001, Vol. 37, No. 21.
[CHAIGNEAUD, 2001]: L. Chaigneaud, V. Guillet and R. Vauzelle, "A 3D ray tracing tool for broadband wireless systems", VTC 2001, Atlantic City, October 2001.
[SIAUD, 2001]: I. Siaud, R. Legouable and M. Helard, "On Multicarrier Transmission Techniques over Recorded Indoor Propagation Channel Models for Future Broadband RLANs at 60 GHz", PIMRC 2001, 30 September-October 2001, San Diego, California.
[HELARD, 2001]: M. Helard, I. Siaud and C. Langlais, "Principles of turbo equalisation: application to indoor radio transmissions at 60 GHz", to appear in the fourth study days "Electromagnetic propagation in the atmosphere from decametric to angstrom wavelengths", March 2002, Rennes.
7. Glossary
COFDM: Coded Orthogonal Frequency Division Multiplexing.
Hydrometeors: Rain, snow, fog.
MC-CDMA: Multi-Carrier Code Division Multiple Access.
Millimeter wave: Frequency band from 30 to 300 GHz.
Chapter 14
Propagation of Radio Waves Inside and Outside Buildings Herve Sizun France Telecom R&D, Belfort, France
1. Introduction
Wireless communications between radioelectric systems and communicating objects require a good knowledge of the propagation channel. It is first necessary to evaluate the behaviour of the waves in the environment considered (rural, suburban, urban, dense urban, indoor) in order to set the emission power and the polarisation and to choose the antennas, the modulation, the transmission protocol, etc. The first part of this chapter briefly defines what a radio wave is and then presents the various phenomena that can affect it (reflection, transmission, diffraction, diffusion, guidance). The second part details the transmission difficulties that result from the interactions of the waves with one another and with the environment, outside and inside buildings (shadowing, interference, fading, etc.). The last part covers the techniques available for the study of propagation between communicating objects, mainly propagation models for various environments (rural, mountainous, suburban, urban, inside buildings, etc.): macro cell, small cell, micro cell and pico cell models, and ray launching and ray tracing models. The last two are particularly well suited to understanding and illustrating the various phenomena inside and outside buildings.

2. Electromagnetic waves
The parameters characterising electromagnetic wave propagation are the electric field $\mathbf{E}$, the magnetic field $\mathbf{H}$, the electric flux density $\mathbf{D}$ and the magnetic induction $\mathbf{B}$. Only the $\mathbf{E}$ and $\mathbf{B}$ vectors produce actions by which it is possible to measure the electromagnetic field. The vectors $\mathbf{D}$ and $\mathbf{B}$ are connected to the $\mathbf{E}$ and $\mathbf{H}$ vectors by the following linear relations:

$$\mathbf{D} = \varepsilon \mathbf{E}, \qquad \mathbf{B} = \mu \mathbf{H}$$
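The coefficients $\varepsilon$ and $\mu$ are respectively the permittivity and the magnetic permeability of the medium. In the absence of charged particles, the fields are linked by Maxwell's equations:

$$\operatorname{rot}\mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad \operatorname{rot}\mathbf{H} = \frac{\partial \mathbf{D}}{\partial t}, \qquad \operatorname{div}\mathbf{D} = 0, \qquad \operatorname{div}\mathbf{B} = 0$$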
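Solving them leads to the Helmholtz equation for each of the vectors $\mathbf{E}$, $\mathbf{B}$, $\mathbf{D}$ and $\mathbf{H}$. The vector $\mathbf{V}$ (where $\mathbf{V}$ is any one of $\mathbf{E}$, $\mathbf{B}$, $\mathbf{D}$ or $\mathbf{H}$) then satisfies the following equation:

$$\Delta \mathbf{V} - \varepsilon\mu \frac{\partial^2 \mathbf{V}}{\partial t^2} = 0$$

where $\Delta$ is the Laplacian operator: $\Delta = \operatorname{grad}(\operatorname{div}) - \operatorname{rot}(\operatorname{rot})$.
In sinusoidal mode, the oscillations of the vectors $\mathbf{E}$ and $\mathbf{H}$, which are perpendicular to each other, propagate in space in the form of a wave (Figure 14.1) at velocity

$$v = \frac{1}{\sqrt{\varepsilon\mu}}$$

The time interval between two successive elongations that are equal in direction and size is called the period. The distance travelled by the wave during this time is the wavelength. It is given by the relation:

$$\lambda = vT = \frac{v}{f} = \frac{2\pi v}{\omega}$$

where: $f$ is the wave frequency, $\omega$ is the pulsation (angular frequency), $T$ is the period.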
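As a small numerical illustration of the relation between velocity, frequency and wavelength, $\lambda = v/f$ with $v = 1/\sqrt{\varepsilon\mu}$, the Python sketch below computes the wavelength in a medium described by its relative permittivity and permeability. The function names and the default of a vacuum-like medium are assumptions made for this example.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m
MU_0 = 4.0e-7 * math.pi       # vacuum permeability, H/m (classical value)


def wave_velocity(eps_r: float = 1.0, mu_r: float = 1.0) -> float:
    """Propagation velocity v = 1 / sqrt(epsilon * mu), in m/s."""
    return 1.0 / math.sqrt(eps_r * EPSILON_0 * mu_r * MU_0)


def wavelength(frequency_hz: float, eps_r: float = 1.0, mu_r: float = 1.0) -> float:
    """Wavelength lambda = v / f, in metres."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return wave_velocity(eps_r, mu_r) / frequency_hz


if __name__ == "__main__":
    for f_ghz in (0.9, 1.8, 60.0):
        print(f"{f_ghz:5.1f} GHz -> {wavelength(f_ghz * 1e9) * 1e3:.2f} mm")
```

At 60 GHz the free-space wavelength is about 5 mm, which is why the corresponding band is called the millimeter wave band.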
Figure 14.1 Illustration of the propagation of an electromagnetic wave
The Poynting vector describes the amplitude and direction of the transported power flux. Electromagnetic waves are generated by oscillating electric circuits, by electric vibrations of dipoles or by electronic tubes. Their spectrum is very broad: we have ELF ($\lambda > 100$ km), VLF ($10 < \lambda < 100$ km), LF ($1 < \lambda < 10$ km), MF ($100 < \lambda < 1000$ m), HF ($10 < \lambda < 100$ m), VHF ($1 < \lambda < 10$ m), UHF (10
irregularity $d$, met during propagation, compared to the wavelength (connected to the frequency) plays an important role. The three following cases are considered:
• $d/\lambda \ll 1$: irregularities are very small compared to the wavelength; statistical methods are then applied. The physical phenomena concerned are primarily attenuation resulting from absorption and scattering.
• $d/\lambda \approx 1$: irregularities are of the same order as the wavelength; approximations are then not possible (resonance zone).
• $d/\lambda \gg 1$: irregularities are very large compared to the wavelength; asymptotic methods based on optics and ray theory are applicable. The physical phenomena concerned are primarily reflection, transmission, diffraction, etc.
3. Mechanisms of propagation
Propagation paths (Figure 14.2) are of different natures: direct, transmitted, reflected, scattered, diffracted or guided.
Figure 14.2 Various mechanisms of propagation
3.1. Direct paths
A path is known as direct when the transmitter and the receiver are in radioelectrical visibility (in line of sight): the first Fresnel ellipsoid is not obstructed. This ellipsoid delimits the region of space through which nearly all the energy passes. It is the locus of the points M which fulfil the following equation (E indicates the site of the transmitter, R that of the receiver, $\lambda$ is the wavelength):

$$EM + MR = ER + \frac{\lambda}{2}$$
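The first Fresnel ellipsoid is bounded by the points M for which $EM + MR = ER + \lambda/2$, so a point lies inside it when $EM + MR \le ER + \lambda/2$. The Python sketch below applies this test to a candidate obstacle position. The function name and the coordinate convention (Cartesian coordinates in metres) are assumptions made for this example.

```python
import math
from typing import Sequence


def inside_first_fresnel_ellipsoid(transmitter: Sequence[float],
                                   receiver: Sequence[float],
                                   point: Sequence[float],
                                   wavelength_m: float) -> bool:
    """True if `point` satisfies EM + MR <= ER + lambda/2."""
    em = math.dist(transmitter, point)   # distance E -> M
    mr = math.dist(point, receiver)      # distance M -> R
    er = math.dist(transmitter, receiver)
    return em + mr <= er + wavelength_m / 2.0


if __name__ == "__main__":
    e, r = (0.0, 0.0, 0.0), (100.0, 0.0, 0.0)
    lam = 0.005  # about 60 GHz
    for offset in (0.0, 0.2, 1.0):
        m = (50.0, offset, 0.0)
        print(offset, inside_first_fresnel_ellipsoid(e, r, m, lam))
```

A path can then be treated as direct when no obstacle point passes this test; how obstacles are sampled is left open here.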
3.2. Transmitted paths
Most of the time the direct path does not exist: it is blocked by obstacles (masks). Transmission is the phenomenon which allows waves to cross an obstacle (wall, building, vegetation, etc.).

3.3. Reflected paths
Reflection occurs when the wave meets a surface whose dimensions are large in comparison with the wavelength (ground, wall, façade of a building, etc.). The reflection characteristics of a given surface depend on several factors: the nature of the surface of the material (smooth or rough), the wavelength and the incidence angle. The roughness of a surface relative to the wavelength is an important parameter for the shape of the reflection diagram. A smooth surface reflects the incident radiation in only one direction, like a mirror (specular reflection). A rough surface, on the contrary, reflects the incident radiation in all directions. A surface is regarded as rough, according to the Rayleigh criterion, if the following relation is satisfied:

$$\xi > \frac{\lambda}{8\cos\theta_i}$$

where: $\xi$ is the maximal height of the surface irregularities, $\lambda$ is the wavelength of the incident radiation, $\theta_i$ is the incidence angle.
For infrared radiation with a wavelength of 1550 nm, under normal incidence, a surface is rough if the maximum height of the irregularities $\xi$ is greater than 0.19 µm. This result shows that the majority of surfaces met inside buildings are rough with respect to infrared radiation; the reflection diagram then presents a diffuse component (diffuse reflection).
3.3.1. Specular reflection
Specular reflection, a phenomenon common to all frequencies, is due to a perfectly plane, homogeneous surface. The path loss induced by such reflections follows from the Fresnel relations and depends on the dielectric characteristics of the reflective surface (conductivity $\sigma$, permittivity $\varepsilon$). For example, the majority of surfaces inside buildings have a reflection coefficient, in the infrared band, ranging between 0.4 and 0.9 [GFELLER, 1979]. Various reflection coefficients have been reported in the literature for different materials [YANG, 2000].

3.3.2. Diffuse reflection
Diffuse reflection is due to reflection by surfaces which are not plane but rough; such surfaces present irregularities. Thus, an incident wave is not reflected in only one direction but diffused in multiple directions. Two models are usually used to represent reflection in the infrared band: the Lambert model and the Phong model.

Lambert model
Some surfaces are very irregular and reflect infrared radiation in all directions, independently of the incident radiation. Such surfaces are known as diffuse and can be represented by the Lambert model. The model is very simple and very easy to implement in software. It is described by the following equation:

$$R(\theta_o) = \rho\, R_i\, \frac{\cos\theta_o}{\pi}$$

where: $\rho$ is the surface reflection coefficient, $R_i$ represents the incident optical power, $\theta_o$ is the observation angle.

Phong model
The reflection diagram of many rough surfaces is well represented by the Lambert model, except around the specular direction, where the experimental diagram presents a significant component. The Phong model considers the reflection diagram as the sum of two components, one diffuse and one specular. The percentage of each component, which depends mainly on the surface characteristics, is a parameter of the model. The diffuse component is given by the Lambert model. The specular component is given by a function which depends on the angle of incidence $\theta_i$ and the angle of observation (angle of reflection) $\theta_o$. The Phong model is described by the following relation:

$$R(\theta_i, \theta_o) = \rho\, R_i \left[ r_d\, \frac{\cos\theta_o}{\pi} + (1 - r_d)\, \frac{m+1}{2\pi} \cos^m(\theta_o - \theta_i) \right]$$
where: $\rho$ is the surface reflection coefficient, $R_i$ represents the incident optical power, $r_d$ is the fraction of the ray which is considered in diffuse form (a value ranging between 0 and 1), $m$ is a parameter which controls the directivity of the specular component of the reflection, $\theta_i$ is the incidence angle, $\theta_o$ is the observation angle.
Note that the Phong model reduces to the Lambert model when $r_d$ is equal to one. The Phong model, which depends on both the observation and incidence angles, is more complex than the Lambert model, and computing times are longer. The reflection pattern presents a principal lobe centred on the direction of specular reflection.

3.4. Diffracted paths
Diffraction occurs when a wave meets an edge (hill, building, vegetation, roof, corner of a building, road structure, etc.) whose dimensions are large compared to the wavelength. It is one of the most important factors in the propagation of radio waves [BOITHIAS, 1983]. The geometrical theory of diffraction (GTD) makes it possible to represent this phenomenon in the form of rays [McNAMARA, 1990]. Approximate expressions have been proposed for evaluating the attenuation by diffraction [BOITHIAS, 1983]:
$$A(v) = 6.9 + 20\log_{10}\left(\sqrt{(v - 0.1)^2 + 1} + v - 0.1\right) \quad \text{(dB)}$$

for $v > -0.7$; this expression is usable in the vicinity of 0.

$$A(v) = 13 + 20\log_{10} v \quad \text{(dB)}$$

valid for $v > 1.5$

where:

$$v = h\sqrt{\frac{2}{\lambda}\left(\frac{1}{d_1} + \frac{1}{d_2}\right)}$$

$h$ is the height of the edge and $d_1$ and $d_2$ are respectively the distances from the edge to the transmitter and to the receiver. As an example, some typical values of attenuation are given below at four frequencies (20, 40, 60 and 100 GHz) for edges (of height 1 and 10 m) located 1 km from the transmitter and receiver.

Table 14.1 Typical values of attenuation by diffraction (in dB)
Diffraction attenuation (dB)   20 GHz   40 GHz   60 GHz   100 GHz
d = 1 km, h = 1 m              10       14       16       18
d = 1 km, h = 10 m             30.3     33.3     35       37
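A minimal Python sketch of a knife-edge diffraction calculation is given below. It assumes the standard single knife-edge approximations, $6.9 + 20\log_{10}(\sqrt{(v-0.1)^2+1} + v - 0.1)$ for $v > -0.7$ and $13 + 20\log_{10} v$ for $v > 1.5$, together with the usual definition $v = h\sqrt{(2/\lambda)(1/d_1 + 1/d_2)}$. These forms, the function names and the example geometry are assumptions of this sketch; it is not claimed to be the procedure used to produce Table 14.1.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def diffraction_parameter(h_m: float, d1_m: float, d2_m: float,
                          frequency_hz: float) -> float:
    """Dimensionless parameter v = h * sqrt((2/lambda) * (1/d1 + 1/d2))."""
    wavelength_m = C / frequency_hz
    return h_m * math.sqrt((2.0 / wavelength_m) * (1.0 / d1_m + 1.0 / d2_m))


def knife_edge_loss_db(v: float) -> float:
    """Approximate knife-edge diffraction loss in dB for v > -0.7."""
    if v > 1.5:
        # Asymptotic form, valid for v > 1.5.
        return 13.0 + 20.0 * math.log10(v)
    if v > -0.7:
        # General approximation, usable in the vicinity of v = 0.
        return 6.9 + 20.0 * math.log10(math.sqrt((v - 0.1) ** 2 + 1.0) + v - 0.1)
    raise ValueError("approximation not valid for v <= -0.7")


if __name__ == "__main__":
    # Hypothetical geometry: 5 m edge, 500 m from each terminal, 60 GHz.
    v = diffraction_parameter(5.0, 500.0, 500.0, 60e9)
    print(f"v = {v:.2f}, loss = {knife_edge_loss_db(v):.1f} dB")
```

At grazing incidence ($v = 0$) the first expression gives about 6 dB, the classical result for an edge that just touches the direct path.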
3.5. Guided paths
Certain environments (street canyons, corridors, tunnels, etc.) behave like true waveguides for radio waves, as a result of multiple successive reflections on the walls (propagation modes and ray theory apply, particularly when the wavelength is very small compared to the transverse section of the tunnel, for example).

4. Propagation channel properties
These various paths, of different amplitude and phase, interfere on arriving at the receiver. The interference is either constructive (the different paths arrive in phase), which reinforces the signal, or destructive, in which case the signal fades. Moreover, the mobile moves through this interference pattern. It successively sees bright and dark patches (interference fringes), which cause fading of the signal. Because of the presence of multiple paths and the displacement of the receiver, the propagation channel has three fundamental properties: attenuation, variability and frequency selectivity. In analogue communication, the attenuation concept was sufficient to study the propagation channel. In digital communication, fading due to variability and selectivity degrades the quality of the communication independently of attenuation.
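4.1. Attenuation
We generally distinguish the free-space attenuation from the attenuation relative to free space (excess attenuation). Free-space attenuation (transmission loss) is due to the dispersion of energy as the wave moves away from the transmitter in a vacuum. Expressed in decibels, it is given by the following relation, in which c denotes the speed of light in a vacuum:
$$A_0 = 20\log_{10}\left(\frac{4\pi d}{\lambda}\right) = 20\log_{10}\left(\frac{4\pi d f}{c}\right)$$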
where: d is the distance between the transmitter and the receiver, λ is the wavelength, f is the frequency.
Example: for d = 1 km, A0 ≈ 91.5 dB at f = 900 MHz and A0 ≈ 97.6 dB at f = 1800 MHz.
This supposes a single radio path whose first Fresnel ellipsoid is not obstructed. The received power then follows the 1/d² law (energy conservation), which means that 6 dB are lost each time the distance is doubled.
The attenuation relative to free space is the difference, expressed in decibels, between the basic transmission loss and the free-space basic transmission loss; it is due to absorption by gases, hydrometeors, walls and vegetation, to diffraction, etc.
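As an illustration, the free-space attenuation can be evaluated numerically. The sketch below assumes the usual decibel form 20 log10(4πd/λ); the function name `free_space_loss_db` is ours and is not taken from the chapter.

```python
import math

C = 299_792_458.0  # speed of light in a vacuum (m/s)


def free_space_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space attenuation A0 = 20*log10(4*pi*d/lambda), in dB."""
    wavelength_m = C / frequency_hz
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength_m)


if __name__ == "__main__":
    for f_mhz in (900, 1800):
        a0 = free_space_loss_db(1_000.0, f_mhz * 1e6)
        print(f"d = 1 km, f = {f_mhz} MHz: A0 = {a0:.1f} dB")
    # Doubling the distance adds 20*log10(2), i.e. about 6 dB.
    step = free_space_loss_db(2_000.0, 900e6) - free_space_loss_db(1_000.0, 900e6)
    print(f"extra loss when the distance doubles: {step:.2f} dB")
```

For d = 1 km this gives roughly 91.5 dB at 900 MHz, and doubling either the distance or the frequency adds about 6 dB.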
4.2. Variability
The propagation environment fluctuates: the passage of vehicles and people, the wind in the trees and the opening of doors alter the radio paths and generate fast variations of the observed signal. Combined with the movement of vehicles, these phenomena make the propagation variable in space and time. The variations of the signal are random in nature. Statistical analysis makes it possible to evaluate the impact of multipath on the transmission of mobile radio systems. The distribution of the received field and the average fade duration, for example, are essential data for dimensioning the transmission equipment and its ability to combat fading (interleaving, diversity).
A model of the fast variations was proposed by Clarke [CLARKE, 1968]. Supposing that the mobile moves in an interference pattern generated by the superposition of a great number of plane waves of independent, random amplitudes, phases and directions, the received field (the complex envelope) is, by application of the central limit theorem, a complex Gaussian variable. The envelope of the narrowband signal then follows a Rayleigh distribution. One speaks of Rayleigh fading. When one path is predominant (line of sight, open environments such as suburbs, railway stations, rural areas, etc.), the envelope of the narrowband signal follows a Rice distribution. The Rice distribution reduces to a Rayleigh distribution when r = 0 (absence of a direct path) and makes it possible to identify a direct path and its preponderance. It is also characterised by the K coefficient (Rice parameter) defined by the relation:
The K parameter is the ratio between the power of the direct path and the power carried by the secondary paths, which follow a Rayleigh distribution. The larger K is, the more the direct path dominates the multiple paths and the clearer the link is. Conversely, if the predominant path is weak (K < -5 dB), the Rice distribution is considered equivalent to a Rayleigh distribution [CCIR, 1990]. The K value depends on the environment (dense urban, suburban, rural, etc.). Other distributions, such as those of Weibull and Nakagami [BRAUN, 1991], also make it possible to characterise the envelope of the mobile radio signal. To determine the distribution followed by the signal, one applies statistical tests, for example the Kolmogorov-Smirnov test [BARBOT, 1992], to the fast variations resulting from the interference of the received waves. Knowing the statistical characteristics of the signal (probability density, distribution function) makes it possible to determine the relevant operating parameters of a radio system, among which:
• probability of going below a certain level. When the signal power is lower than the noise threshold tolerated by the receiver (thermal noise, jamming, industrial interference, etc.), the signal is drowned in the noise. The receiver is then unable to interpret the transmitted information correctly.
• statistical fade duration. During a fade, information packets are lost. The slower the mobile moves, the longer the cut-off times.
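The probability of going below a given level can be illustrated with a small simulation in the spirit of the plane-wave model described above. This is a minimal sketch under stated assumptions: the K factor is written K = r²/(2σ²), with r the amplitude of the direct path and 2σ² the mean power of the scattered paths (a common notation, assumed here), and the function names and default values are ours.

```python
import cmath
import math
import random


def rice_k_db(direct_amplitude: float, sigma: float) -> float:
    """K = r^2 / (2*sigma^2), in dB (assumed notation)."""
    return 10.0 * math.log10(direct_amplitude ** 2 / (2.0 * sigma ** 2))


def rayleigh_outage(margin_db: float) -> float:
    """P(power < mean power - margin_db) for Rayleigh fading.

    The power of a Rayleigh envelope is exponentially distributed.
    """
    return 1.0 - math.exp(-(10.0 ** (-margin_db / 10.0)))


def simulate_outage(margin_db, direct_amplitude=0.0, n_waves=16,
                    n_trials=5000, seed=1):
    """Monte Carlo estimate with a sum of plane waves of random phase."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_waves)  # scattered mean power normalised to 1
    mean_power = 1.0 + direct_amplitude ** 2
    threshold = mean_power * 10.0 ** (-margin_db / 10.0)
    below = 0
    for _ in range(n_trials):
        field = complex(direct_amplitude, 0.0)
        for _ in range(n_waves):
            field += scale * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
        if abs(field) ** 2 < threshold:
            below += 1
    return below / n_trials


if __name__ == "__main__":
    print("Rayleigh, theory    :", round(rayleigh_outage(10.0), 4))
    print("Rayleigh, simulation:", simulate_outage(10.0))
    print("Rice, simulation    :", simulate_outage(10.0, direct_amplitude=3.0))
```

With no direct path, the simulated fraction of time spent 10 dB below the mean power should be close to the Rayleigh value; adding a strong direct path makes such deep fades much rarer.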
4.3. Selectivity
When the differences between the delays of the multiple paths are large, the transfer function is not constant over the whole width of the spectrum: the path loss depends on the frequency used. The channel is said to be frequency selective. Wideband modelling of the channel is then essential to evaluate the performance of a complete transmission chain, to design new systems and to ensure the transmission quality of digital signals. The radio channel is represented by its time-variant impulse response h(t, τ), τ being the delay and t indicating the dependence on time (and thus on space, since the vehicle moves). A function of two variables, it expresses the three characteristics of the channel: the attenuation, the variability (t) and the selectivity (τ). The dual variables of τ and t by Fourier transform are respectively the frequency and the Doppler frequency [PARSONS, 1992], [KATTENBACH, 1995]. The output signal y(t), as a function of the input signal x(t), is given by a convolution equation whose kernel varies in time:
$$y(t) = \int_{-\infty}^{+\infty} h(t,\tau)\,x(t-\tau)\,d\tau$$
where h(t, τ) is the response of the channel at time t to a radio impulse emitted at time t - τ. It completely defines the propagation channel and makes it possible to distinguish the various echoes according to their propagation delays. It is useful to quantify the selectivity of the channel by parameters deduced from the average power profile of the impulse response. The most commonly used are the mean delay, the rms delay spread, the delay interval, the delay window and the coherence bandwidth of the channel [FAILLY, 1989].
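The time-variant convolution can be sketched in discrete time as a tapped delay line whose taps change from one sample to the next. This is only an illustration: the sampling of h(t, τ) into a list of tap vectors and the function name are our assumptions.

```python
def time_variant_convolution(x, h):
    """Discrete form of y(t) = integral of h(t, tau) * x(t - tau) d(tau).

    x    : list of input samples
    h[n] : list of taps at time index n; h[n][k] weights the echo delayed
           by k samples
    """
    y = []
    for n in range(len(x)):
        acc = 0j
        for k, tap in enumerate(h[n]):
            if n - k >= 0:  # ignore samples before the start of x
                acc += tap * x[n - k]
        y.append(acc)
    return y


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0]
    # Static channel: direct path plus one echo delayed by two samples.
    static = [[1.0, 0.0, 0.5]] * 4
    print(time_variant_convolution(impulse, static))
    # Time-variant channel: the echo changes sign from sample to sample.
    moving = [[1.0, 0.0, 0.5 * (-1) ** n] for n in range(4)]
    print(time_variant_convolution([1.0, 1.0, 1.0, 1.0], moving))
```

With constant taps the result reduces to an ordinary convolution; letting the taps vary with n reproduces the variability in t, while the spread of the taps along k reproduces the selectivity in τ.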
Figure 14.3 Schematic representation of the temporal evolution of the impulse response of the propagation channel
4.3.1. Average delay profile
One defines the average delay profile P(τ) of the impulse response from h(t, τ) by the relation:
$$P(\tau) = \frac{1}{T}\int_{T} |h(t,\tau)|^{2}\,dt$$
It corresponds to an average over a certain duration T (Figure 14.3), selected so that the measured impulse responses can be treated as a stationary and ergodic random process [LAVERGNAT, 1997].
4.3.2. Average delay
The average delay is the average of the delays weighted by their power. It is given by the first moment of the impulse response:
where: τLOS is the delay of the line of sight path, t3 is the instant when P(t) exceeds the cut-off level for the last time, Pm is the total energy of the impulse response, defined by the following relation:
$$P_m = \int_{t_0}^{t_3} P(t)\,dt$$
where: P(t) is the power density of the impulse response, τ is the excess delay, t0 is the instant when P(t) exceeds the cut-off level for the first time.
4.3.3. The rms delay spread
The rms delay spread is the power-weighted standard deviation of the excess delays. It characterises the dispersion of the delays about the mean delay. It is given by the square root of the second central moment of the impulse response:
The rms delay spread indicates the risk of intersymbol interference and the disturbing effects that remote, powerful echoes are likely to generate.
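The average delay profile, the mean delay and the rms delay spread can be estimated from sampled impulse responses as sketched below. The sketch assumes that the delays are already excess delays and that only samples above the cut-off level are supplied; the function names are ours.

```python
import math


def average_delay_profile(snapshots):
    """Average |h(t, tau)|^2 over the snapshots recorded during T.

    snapshots: list of impulse responses, each a list of complex taps
    sampled on the same delay grid.
    """
    n = len(snapshots)
    taps = len(snapshots[0])
    return [sum(abs(h[i]) ** 2 for h in snapshots) / n for i in range(taps)]


def mean_delay(delays, powers):
    """Power-weighted mean of the delays (first moment of the profile)."""
    total = sum(powers)
    return sum(t * p for t, p in zip(delays, powers)) / total


def rms_delay_spread(delays, powers):
    """Power-weighted standard deviation of the delays."""
    total = sum(powers)
    mu = mean_delay(delays, powers)
    variance = sum((t - mu) ** 2 * p for t, p in zip(delays, powers)) / total
    return math.sqrt(variance)


if __name__ == "__main__":
    delays_us = [0.0, 1.0, 2.0]
    powers = [1.0, 0.5, 0.25]  # linear scale, hypothetical profile
    print("mean delay (us):", mean_delay(delays_us, powers))
    print("rms delay spread (us):", rms_delay_spread(delays_us, powers))
```

Two echoes of equal power separated by 1 µs give a mean delay of 0.5 µs and an rms delay spread of 0.5 µs, which is a convenient check of the definitions.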
4.3.4. The delay interval
The delay interval at X dB is defined as the time interval between the time t1 when the amplitude of the impulse response first exceeds a given threshold, X dB below the peak value, and the time t2 when this amplitude falls below this threshold for the last time (Figure 14.4).
Figure 14.4 Example of power delay profile: highlighting of the delay interval at X dB
Figure 14.5 Example of power delay profile: highlighting of the delay window containing y% of the total energy found in the impulse response
4.3.5. The delay window
The delay window is the duration of the central portion (t2 - t1) containing y% of the total energy found in the impulse response. Times t1 and t2 are defined by the relation (Figure 14.5):
$$\int_{t_1}^{t_2} P(t)\,dt = \frac{y}{100}\int_{t_0}^{t_3} P(t)\,dt = \frac{y}{100}\,P_m$$
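The delay interval and the delay window can both be read off a sampled power delay profile. The sketch below is an approximation under our own assumptions: the profile is sampled finely enough for threshold crossings to be taken at sample instants, and the energy outside the delay window is split equally between its two sides.

```python
def delay_interval(delays, powers_db, threshold_db):
    """Time between the first and last samples within threshold_db of the peak."""
    level = max(powers_db) - threshold_db
    above = [t for t, p in zip(delays, powers_db) if p >= level]
    return above[-1] - above[0]


def delay_window(delays, powers, fraction):
    """Duration of the central portion holding `fraction` of the energy.

    powers are on a linear scale; fraction is y/100.
    """
    total = sum(powers)
    lower = (1.0 - fraction) / 2.0
    upper = (1.0 + fraction) / 2.0
    cumulative = 0.0
    t1 = t2 = None
    for t, p in zip(delays, powers):
        cumulative += p / total
        if t1 is None and cumulative >= lower:
            t1 = t
        if t2 is None and cumulative >= upper:
            t2 = t
    if t2 is None:  # guard against rounding when fraction is close to 1
        t2 = delays[-1]
    return t2 - t1


if __name__ == "__main__":
    delays_us = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    profile_db = [-30.0, -5.0, 0.0, -8.0, -20.0, -10.0, -40.0]  # hypothetical
    for x_db in (9.0, 12.0, 15.0):
        print(f"delay interval at {x_db} dB:",
              delay_interval(delays_us, profile_db, x_db), "us")
    linear = [10.0 ** (p / 10.0) for p in profile_db]
    for y in (50, 75, 90):
        print(f"delay window for {y}%:",
              delay_window(delays_us, linear, y / 100.0), "us")
```

Note how an isolated late echo (here at 5 µs) is ignored by the 9 dB delay interval but widens the 12 dB one: the two parameters react differently to remote echoes.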
4.3.6. The correlation bandwidth
The correlation bandwidth (Bc) of the channel is defined in the following way. Let C(t, f) be the autocorrelation of the transfer function (the Fourier transform of the power of the impulse response).
The correlation bandwidth is defined as the frequency for which |C(t, f)| equals X% of C(t, f=0) [PARSONS, 1992]. It indicates the extent of the selective fading as a function of frequency separation. The correlation bandwidth is thus the frequency separation at which the autocorrelation function of the transfer function falls below a given threshold. For the analysis of experimental data, the ITU-R recommends the use of delay intervals for thresholds of 9, 12 and 15 dB below the peak value, delay windows for 50%, 75% and 90% of the energy, and correlation bandwidths for 50% and 90% correlation. The correlation bandwidth is linked to the delay spread by the following relation [LEE, 1993]:
The propagation channel is more or less selective according to the environment, and the shape of the impulse response also differs. In pico cell environments (inside buildings), the power decreases exponentially with delay, whereas in small and micro cells distinct echoes with long delays are clearly visible. Note that, in Figure 14.6, the time scales are not the same for the pico cell and for the small and micro cell environments.
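The correlation bandwidth defined in section 4.3.6 can be estimated from a sampled delay profile by taking its Fourier transform and scanning in frequency until the normalised correlation drops below the chosen level. This is a sketch under our own assumptions (discrete echoes, a simple linear scan, hypothetical function names), not the procedure used by the authors.

```python
import cmath
import math


def frequency_correlation(delays_s, powers, freq_hz):
    """|C(f)| / C(0), with C(f) the Fourier transform of the delay profile."""
    total = sum(powers)
    c = sum(p * cmath.exp(-2j * math.pi * freq_hz * t)
            for t, p in zip(delays_s, powers))
    return abs(c) / total


def correlation_bandwidth(delays_s, powers, level=0.5,
                          step_hz=1e3, max_hz=1e9):
    """Smallest scanned frequency at which the correlation falls below level."""
    f = 0.0
    while f <= max_hz:
        if frequency_correlation(delays_s, powers, f) < level:
            return f
        f += step_hz
    return None  # the correlation never dropped below the level


if __name__ == "__main__":
    # Two echoes of equal power separated by 1 microsecond (hypothetical).
    bc = correlation_bandwidth([0.0, 1e-6], [1.0, 1.0], level=0.5)
    print(f"correlation bandwidth at 50%: {bc / 1e3:.0f} kHz")
```

For two equal echoes separated by Δ, the correlation is |cos(πfΔ)| and falls to 50% at f = 1/(3Δ), about 333 kHz for Δ = 1 µs: the longer the delay spread, the narrower the correlation bandwidth.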
From left to right: inside building, small cell and micro cell environments [Retard relatif = relative delay]
Figure 14.6 Examples of impulse responses measured in various environments

5. Propagation modelling
Faced with ever-increasing demand, operators have had to improve their networks: they have increased the number of base stations and reduced the cell size. The cell is the area covered by a base station. One generally distinguishes four kinds of cells (Figure 14.7). Each kind corresponds to particular physical attributes of the base station antenna and of its geographical area: its characteristics are related to the power used, to the position and height of the base station antenna and to the geographical environment.
Figure 14.7 Various types of cells
The largest cell is the macro cell. The surrounding environment is generally rural or mountainous and the base station antenna is positioned at a very high point. The distance between the base station and the mobile can be greater than ten kilometers. This sparsely urbanised geographical area produces a certain number of paths with long
delays (up to 30 µs). Moreover, the scatterers, being distant and few in number, do not generate very marked fast fading.
Faced with the increasing number of users, mainly in city centres, it was necessary to reduce the cell size in order to shorten the re-use distance of the allocated frequencies. The most common city-centre cell is the small cell. It covers a radius of less than a few kilometers and the base station antenna is located above roof level. The maximum duration of the impulse response is 10 µs. In a very dense urban zone, the small cell is replaced by a micro cell, which has a radius of a few hundred meters. For antennas below roof level, the wave propagation is guided by the streets. The maximum duration of the impulse response is 2 µs. Finally, the pico cell, with a radius of a few tens of meters, corresponds to communication inside buildings in which the base station antennas are placed. The maximum duration of the impulse response is 1 µs.
To each environment corresponds a certain number of models (theoretical, empirical or statistical, or semi-empirical). The theoretical models are based on the fundamental laws of physics with adequate approximations. The empirical models are based on the statistical analysis of a great number of experimental measurements carried out as a function of various parameters such as the frequency, the distance, the effective heights of the transmitting and receiving antennas, etc. The semi-empirical models combine an analytical formulation of the physical phenomena (reflection, transmission, scattering, diffraction) with an adjustment of the variables using experimental measurements. Propagation models are used when designing the radio interface to optimise performance, and during system deployment to determine the radio coverage.
We detail here only the models most often applied to the deployment of communicating objects: the microcell model, the ray launching model, the penetration model and the ray tracing model, inside as well as outside buildings.
5.1. Microcell model
Microcell modelling rests on a distinction related to the short distances considered (a few hundred meters at most): a line of sight (LOS) calculation and a non line of sight (NLOS) calculation. When the transmitter is located in the street below roof level, the propagation is guided along the streets: either in the street where the transmitter is located (LOS), or in the adjacent streets after turning at least one corner (NLOS). We find these two kinds of calculation in the majority of the
2D (profile) analytical models suggested in the literature, in particular the double slope model [XIA, 1993], which considers an attenuation in d² (20 log10 d) for line of sight (LOS) at short distances and an attenuation in d⁴ (40 log10 d) for non line of sight (NLOS) or for line of sight at long distances (combination of the direct path and the path reflected on the ground). The breakpoint is generally given by the following relation:
where hb and hm are respectively the heights of the base station and of the mobile. In practice, it is very difficult to estimate the slope value in NLOS because it strongly depends on the angle α between the streets. Work based on the uniform theory of diffraction (UTD) [KELLER, 1962], [KOUYOUMJIAN, 1974] leads to a precise analytical formulation of the propagation phenomena in urban zones [BERTONI, 1994], [BERG, 1995], [JAKOBY, 1995]. An analysis of the propagation at 900 MHz and 1800 MHz around a street corner, by simulations carried out using electromagnetic methods (UTD), led to simplified analytical expressions for the line of sight, reflection and diffraction mechanisms. These expressions depend on the widths W1 and W2 of the origin and destination streets, the distance from the transmitter to the corner (D), the distance from the corner to the receiver (X), the angle α formed by the two streets and the emission frequency (Figure 14.8). The attenuation (Aff) at the receiver after the corner breaks down in the following way:
where: AffVis is the line of sight attenuation at distance D
AffRef is attenuation due to the reflection [WIART, 1993].
with
f is the frequency used, D is the distance between the transmitter and the street corner, S is the slope of the power decrease after the street corner, W1 and W2 are the widths of the line of sight and non line of sight streets, X is the distance from the receiver to the street corner, f(α) is a function of the angle α between the streets. AffDif is the attenuation due to diffraction [WIART, 1993].
where:
Figure 14.8 Standard case of the passage at a crossroads in microcell model
To determine attenuation between a transmitter and a receiver, the model seeks all possible paths between the emission point and the reception point. The contributions delivered by the various paths are totalled in Watts, and the resultant is converted into decibels.
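Two ingredients of the microcell model lend themselves to a short numerical sketch: the double slope law and the summation of path contributions in watts. The breakpoint is taken here as 4·hb·hm/λ, the expression usually associated with the two-ray (direct plus ground-reflected) geometry; this expression, like the function names, is our assumption.

```python
import math

C = 299_792_458.0  # speed of light in a vacuum (m/s)


def free_space_loss_db(distance_m, frequency_hz):
    return 20.0 * math.log10(4.0 * math.pi * distance_m * frequency_hz / C)


def breakpoint_distance_m(h_base_m, h_mobile_m, frequency_hz):
    """Assumed breakpoint: 4 * hb * hm / lambda."""
    return 4.0 * h_base_m * h_mobile_m * frequency_hz / C


def dual_slope_loss_db(distance_m, h_base_m, h_mobile_m, frequency_hz):
    """20*log10(d) slope up to the breakpoint, 40*log10(d) beyond it."""
    d_break = breakpoint_distance_m(h_base_m, h_mobile_m, frequency_hz)
    if distance_m <= d_break:
        return free_space_loss_db(distance_m, frequency_hz)
    return (free_space_loss_db(d_break, frequency_hz)
            + 40.0 * math.log10(distance_m / d_break))


def combine_paths_db(attenuations_db):
    """Sum the contributions of several paths in watts, return dB."""
    total_linear = sum(10.0 ** (-a / 10.0) for a in attenuations_db)
    return -10.0 * math.log10(total_linear)


if __name__ == "__main__":
    hb, hm, f = 10.0, 1.5, 900e6  # hypothetical antenna heights and frequency
    print("breakpoint (m):", round(breakpoint_distance_m(hb, hm, f), 1))
    for d in (50.0, 100.0, 200.0, 400.0, 800.0):
        print(d, "m ->", round(dual_slope_loss_db(d, hb, hm, f), 1), "dB")
    print("two equal 100 dB paths:", round(combine_paths_db([100.0, 100.0]), 2), "dB")
```

Before the breakpoint the loss grows by about 6 dB per doubling of distance, beyond it by about 12 dB; two paths of equal attenuation combine to an attenuation 3 dB lower than either one.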
5.2. Ray launching model
Ray launching is a particularly promising deterministic technique. Based on very precise geographical data bases and on physical theory, it is particularly well suited to urban environments and yields computation results rich in information, such as impulse responses. For these reasons, many models using this technique have recently appeared [KURNER, 1993], [LAWTON, 1994], [LIANG, 1998]. The RAY model developed by France Telecom R&D is based on ray launching and is designed to carry out a systematic search for the paths connecting the transmitter to the receiver using a combination of physical phenomena: reflections, horizontal and vertical diffractions, penetration and crossing of vegetation. All these phenomena are computed solely from their theoretical values, given by the uniform theory of diffraction (UTD) for diffractions and by the Fresnel and Beckmann formulae for reflections. The traditional method is to launch a significant number of guide rays in all directions from the transmitter. Each ray represents the part of the electromagnetic wave contained in a cone around its guide ray. To save computation, some assumptions simplify the geometry of the problem. Figure 14.9 describes the steps actually followed [ROSSI, 1991], [ROSSI, 1992]. Initially, only the horizontal plane is considered, i.e. the base station E and all the obstacles are represented by their vertical projection on this plane. The rays are then sent from the transmitter E at regular angular intervals in the plane. At this stage, in 3D space, a ray no longer represents a cone around the guide ray but a vertical slice of space. When a ray meets a building face (point 1 in Figure 14.9), it can either pass over it or be reflected.
Both cases are explored by creating two branches from this point: one continues in a straight line, the other follows the direction of specular reflection. The same duplication occurs each time the ray meets an obstacle, until the ray is too attenuated or leaves the area covered by the terrain data. When the ray meets a receiver R, the third dimension is then considered by unfolding the ray along its curvilinear co-ordinate to obtain a vertical profile (Figure 14.9); the reflecting buildings are represented there by a simple vertical bar. The path followed by the ray is the shortest path between the base station and the mobile that passes above the diffracting buildings. Once this profile is obtained, a rigorous calculation of the attenuation on the link is possible. The average power received by the receiver is the sum of the powers of all the rays arriving at this point. Figure 14.10 gives an example of all the rays connecting a transmitter and a receiver whose power lies within a dynamic range of 20 dB.
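The elementary operations of the horizontal-plane step can be sketched as follows: launching rays at regular angular intervals, duplicating a ray at a building face, and summing the powers of the rays that reach the receiver. This is a toy illustration with hypothetical function names, not the RAY model itself; in particular, discarding rays more than 20 dB below the strongest one is our own assumption, inspired by the dynamic range mentioned for Figure 14.10.

```python
import math


def launch_directions(n_rays):
    """Unit vectors at regular angular intervals in the horizontal plane."""
    step = 2.0 * math.pi / n_rays
    return [(math.cos(i * step), math.sin(i * step)) for i in range(n_rays)]


def reflect(direction, normal):
    """Specular reflection of a 2D direction on a wall of unit normal."""
    dx, dy = direction
    nx, ny = normal
    dot = dx * nx + dy * ny
    return (dx - 2.0 * dot * nx, dy - 2.0 * dot * ny)


def branch(direction, wall_normal):
    """At a building face: one branch goes straight on, one is reflected."""
    return [direction, reflect(direction, wall_normal)]


def received_power_dbm(ray_powers_dbm, dynamic_db=20.0):
    """Sum in watts the rays lying within dynamic_db of the strongest one."""
    strongest = max(ray_powers_dbm)
    kept = [p for p in ray_powers_dbm if p >= strongest - dynamic_db]
    return 10.0 * math.log10(sum(10.0 ** (p / 10.0) for p in kept))


if __name__ == "__main__":
    rays = launch_directions(360)  # one ray per degree
    print(len(rays), "rays launched")
    # A ray travelling down and to the right hits a wall whose normal points up.
    print(branch((1.0, -1.0), (0.0, 1.0)))
    print(round(received_power_dbm([-80.0, -80.0, -120.0]), 2), "dBm")
```

Because every obstacle doubles the number of branches, the number of rays to follow grows quickly, which is why a ray is abandoned once it is too attenuated.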
Figure 14.9 Principle of a ray launching model
Figure 14.10 Diffracted and reflected rays between a transmitter (e) and a receiver (r) (ray model)
5.3. Penetration model
The penetration attenuation in a building is defined as the power loss undergone by the electromagnetic field between the outside of the building and one or more positions inside it. It is calculated by comparing the external field with the field in the parts of the building where the receiving mobile is situated. The attenuation models, integrated into the field prediction tools, must take into account the environment close to the building studied. The parameters affecting the penetration attenuation are many, and their effects are intermingled most of the time. Among them, one generally distinguishes the following:
• neighbouring environment. One distinguishes districts of tower blocks standing more or less clear of one another from more traditional districts of average-height buildings.
• depth of reception in the building. The field decreases when the mobile moves from the façade towards the back of a room inside the building. The effect of the façade inhomogeneities decreases as one approaches the back of the room. The waves penetrate the building much more easily through glazed parts than through brick walls, and consequently certain paths are more or less attenuated or even blocked. Glazed façades generally show a penetration attenuation about 6 dB lower than unglazed façades [RAPPAPORT, 1994]. At the back of the building the attenuation is greater but also much more homogeneous.
• angle of incidence. It acts on the reflection and transmission coefficients of a surface.
• reception height, more commonly called the "floor effect". In general, this floor effect is expressed as a reduction in the penetration attenuation, or as a relative power gain compared with the floor below, in a small cell context (transmitting antenna located above roof level). The reference is thus the penetration attenuation at ground floor, determined relative to the external field. In a small cell, the power gain observed is typically about 2 to 3 dB per floor at 900 and 1800 MHz. However, the great diversity of situations causes a dispersion of the gain per floor; values of about 4 to 7 dB have been measured in practice [GAHLEITNER, 1994]. The lower floors are illuminated by rays that have undergone a number of reflections and diffractions on the roofs and in the street, whereas the upper floors benefit from a much stronger, sometimes even direct, illumination. The effects on the penetration are thus varied [RAPPAPORT, 1994], [WALKER, 1983].
• distance between the transmitter and the receiver, when the building housing the mobile is visible from the transmitting antenna. The penetration attenuation then depends on the distance through the free-space propagation law.
• height of the emission antenna.
• frequency.
• type of material crossed. Crossing a material attenuates the electromagnetic wave by about 4 dB (wood) to 10 dB (concrete) [COST231, 1999].
Various measurement techniques have been developed to characterise the penetration attenuation due to building materials. One can cite in particular the method using two reverberation chambers [FOULONNEAU, 1996]. Most traditional models derive from the Motley-Keenan model [MOTLEY, 1988] used for propagation inside buildings. They take into account the penetration attenuation according to parameters such as: the distance between the transmitter and the external wall of the building where the receiver is located, the distance between the external wall and the receiver, the number of internal walls cut by the profile, the floor effect, the attenuation of the building's external wall and the attenuation of the internal walls. The path loss (L) is expressed as the sum of the free-space loss (L0), the losses due to the obstacles crossed by the direct ray (floors, walls, doors, windows) and a constant (Lc) [MOTLEY, 1988]. The data base can differentiate the various
obstacles, with each of which a particular attenuation value is associated. It is the most widely used model.
$$L = L_0 + L_c + \sum_{J=1}^{N} N_J L_J + N_f L_f$$
where: NJ is the number of crossed walls of type J, LJ is the loss due to a wall of type J, N is the number of different wall types, Nf is the number of crossed floors, Lf is the loss per floor.
5.4. Indoor modelling
Propagation inside buildings depends on the kind of environment: dense (office-type buildings), open (office-type buildings with large offices able to accommodate several people), large (buildings having very large rooms such as warehouses, airports, stations) and corridor (transmitter and receiver located in the same corridor). It is also multipath: the predominant mechanisms are reflection, transmission, diffraction and scattering [HASHEMI, 1993b], [VALENZUELA, 1997].
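The path loss of the Motley-Keenan type described in section 5.3 reduces to simple bookkeeping of the walls and floors crossed by the direct ray. The sketch below assumes the sum L = L0 + Lc + Σ NJ·LJ + Nf·Lf; the wall losses reuse the orders of magnitude quoted in the text (4 dB for wood, 10 dB for concrete), while the 15 dB floor loss and the function names are purely hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in a vacuum (m/s)


def free_space_loss_db(distance_m, frequency_hz):
    return 20.0 * math.log10(4.0 * math.pi * distance_m * frequency_hz / C)


def motley_keenan_loss_db(distance_m, frequency_hz, walls,
                          n_floors=0, floor_loss_db=0.0, constant_db=0.0):
    """L = L0 + Lc + sum_J(N_J * L_J) + N_f * L_f.

    walls: list of (N_J, L_J) pairs, one per wall type, i.e. the number of
    walls of that type crossed by the direct ray and the loss per wall in dB.
    """
    wall_term = sum(count * loss_db for count, loss_db in walls)
    return (free_space_loss_db(distance_m, frequency_hz)
            + constant_db + wall_term + n_floors * floor_loss_db)


if __name__ == "__main__":
    # Direct ray crossing two wooden partitions, one concrete wall and one floor.
    walls = [(2, 4.0), (1, 10.0)]
    loss = motley_keenan_loss_db(10.0, 1800e6, walls,
                                 n_floors=1, floor_loss_db=15.0)
    print(f"path loss: {loss:.1f} dB")
```

In practice the per-type losses LJ and Lf come from the data base of obstacles, and the constant Lc is fitted to measurements.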
5.4.1. Empirical models
One distinguishes the distance-based profile models, the models of the Motley-Keenan type [KEENAN, 1990] and the multi-ray models known as corridor models. In the distance-based profile models, the parameters taken into account are the frequency and the distance between the transmitter and the receiver. Two models are proposed: the first [COST231, 1999] supposes a logarithmic dependence of the attenuation on the distance, the second a linear dependence [COST231, 1999]. Both models can be used in non line of sight situations. The models of the Motley-Keenan type apply to non line of sight cases in a dense environment (offices). The path loss (L), as for penetration, is expressed as the sum of the free-space loss (L0), the losses due to the obstacles crossed by the direct ray (floors, walls, doors, windows) and a constant (Lc) [MOTLEY, 1988]. The data base can differentiate the various obstacles, with each of which a particular attenuation value is associated. It is the most widely used model.
$$L = L_0 + L_c + \sum_{J=1}^{N} N_J L_J + N_f L_f$$
where: NJ is the number of crossed walls of type J, LJ is the loss due to a wall of type J, N is the number of wall types, Nf is the number of crossed floors, Lf is the loss per floor.
5.4.2. Deterministic models
In the ray launching technique, the models suggested are generally three-dimensional because the propagation medium is dense. Developed by many authors [RAPPAPORT, 1994], [SEIDEL, 1994], [CICHON, 1994], they lead to encouraging results [RAPPAPORT, 1994]. They constitute good references for tuning the multipath model but remain of restricted use in engineering because of their substantial computing time. The ray tracing technique consists of building rays using image theory. It makes it easy to take into account the propagation mechanisms (multiple reflections, diffractions, etc.). This technique has been used by many authors [MC KOWN, 1991], [COST231, 1999], [VALENZUELA, 1994], [JENVEY, 1994], [LAURENSON, 1993]. The two other approaches, the finite difference method [MURCH, 1994], [LAUER, 1994] and the Rayleigh-Gans approximation [LU, 1993], use relations derived from Maxwell's equations. The propagation medium is meshed. The radio field propagates, iteration after iteration, through the space surrounding the transmitter. The computing time is very long because the mesh step must be smaller than the wavelength (about λ/8). However, these methods being very recent, representative results on their performance are not yet available.
5.5. Ray models
Ray models are deterministic models. They rely on precise knowledge of the real environment and require geographical data bases of the "contour" type or of building interiors. They make it possible to predict the various propagation paths in a given configuration.
Once adjusted to the frequency band considered, these models also make it possible to carry out parametric studies by simulation, for example to analyse the influence of the antenna radiation pattern or of the material characteristics, which is much more economical than carrying out multiple series of measurements. We give below, as an illustration, two examples of ray tracing: one outside buildings (Figure 14.11) and the other inside buildings [CHAIGNEAUD, 2001a, 2001b, 2002] (Figure 14.12).
Figure 14.11 Example of ray tracing outside buildings
Figure 14.12 Example of ray tracing inside buildings
6. Conclusion
After briefly describing electromagnetic waves, we presented the various propagation mechanisms (reflection, transmission, diffraction, scattering, etc.) as well as the properties of the channel (attenuation, variability, selectivity). The selectivity of the channel is defined from parameters deduced from the average delay profile (average delay, delay spread, delay interval, delay window and correlation bandwidth). These parameters make it possible to characterise the propagation medium, which is necessary for the definition and deployment of communicating objects. The last part presented the various models for the deployment of mobile systems and communicating objects in various environments (rural, mountainous, suburban, urban, interior of buildings) and for various types of cell (macrocell, small cell, microcell and picocell), limited primarily to narrowband models. More details on these models can be found in the literature [LAGRANGE, 2000]. The broadband models, such as path models (COST 207, ATDMA [RACE, 1994], ITU-R, CSELT), representation models (channel model recorded in deterministic propagation) and geometrical models, were not covered.

7. Bibliography
[BARBOT, 1992] Barbot J.P., Levy A.J., Bic J.C., "Estimation of fast fading distribution functions", Com. URSI Commission F Open Symposium, 1992.
[BERG, 1995] Berg J-E., "A recursive method for street microcell path loss calculations", PIMRC'95, Toronto, Canada, pp. 140-143, 1995.
[BERTONI, 1994] Bertoni H.L., Honcharenko W., Maciel L.R., Xia H.H., "UHF propagation prediction for wireless personal communications", Proceedings of the IEEE, Vol. 82, N° 9, pp. 1333-1359, 1994.
[BOITHIAS, 1983] Boithias L., "Propagation des ondes radioelectriques dans l'environnement terrestre", Dunod, 1983.
[BRAUN, 1991] Braun W.R., Dersch U., "A physical mobile radio channel", IEEE Transactions on Vehicular Technology, Vol. 40, N° 2, pp. 472-482, 1991.
[CHAIGNEAUD, 2001a] Chaigneaud L., Guillet V., Vauzelle R., "3D ray tracing method for indoor propagation modelling at 60 GHz", European Conference on Wireless Technology, London, September 2001.
[CHAIGNEAUD, 2001b] Chaigneaud L., Guillet V., Vauzelle R., "A 3D ray tool broadband wireless system", Vehicular Technology Conference, Atlantic City, October 2001.
[CHAIGNEAUD, 2002] Chaigneaud L., Guillet V., Vauzelle R., "Methode de trace de rayon 3D pour la modelisation de la propagation en interieur a 60 GHz", Propagation electromagnetique dans l'atmosphere du decametrique a l'angstrom, Rennes, March 2002.
[CICHON, 1994] Cichon D.J., Wiesbeck W., "Indoor and outdoor propagation modelling in pico cells", PIMRC'94, Personal Indoor Mobile Radio Communications, September 1994.
[CLARKE, 1968] Clarke R.H., "A statistical theory of mobile-radio reception", B.S.T.J., pp. 957-1000, August 1968.
[CNET/CSELT Cooperation, 1998] CNET/CSELT Cooperation, Data transmission on DECT standard, "Definition of common propagation models, regeneration scheme and performance evaluation criteria for the alignment of the two radio link simulators", March 1998.
[COST 259] COST 259, Web information: http://www.lx.it.pt/cost259.
[COST231, 1999] COST 231, "Evolution of land mobile radio (including personal) communications", Final report, Information, Technologies and Sciences, European Commission, 1999.
[CROCHIERE, 1981] Crochiere R.E., Rabiner L.R., "Interpolation and decimation of digital signals - a tutorial review", Proceedings of the IEEE, Vol. 69, N° 3, pp. 300-331, March 1981.
[FAILLY, 1989] Failly M., "Final Report of COST 207, Digital Land Mobile Radio Communications", CEE Luxembourg, 1989.
[FOULONNEAU, 1996] Foulonneau B., Gaudaire F., Gabillet Y., "Measurement method of electromagnetic transmission loss of building components using two reverberation chambers", Elect. Letters, 7, Vol. 32, N° 23, pp. 2130-2131, 1996.
[GAHLEITNER, 1994] Gahleitner R., Bonek E., "Radio waves penetration into urban buildings in small cell and microcells", Technische Universitat Wien, Vienna, Austria, Proceedings Vehicular Technology Conference, Stockholm, pp. 887-891, June 1994.
[GFELLER, 1979] Gfeller F.R., Bapst U.R.S., "Wireless in-house data communication via diffuse infrared radiation", Proceedings of the IEEE, Vol. 67, N° 11, 1979.
[HASHEMI, 1993b] Hashemi H., "The Indoor Radio Propagation Channel", Proceedings of the IEEE, Vol. 81, N° 7, pp. 943-968, 1993. [HATA, 1980] Hata M., "Empirical formula for propagation loss in land mobile radio service", IEEE Transactions on Vehicular Technology, Vol. 29, pp. 317-325, 1980. [IUT-R, 1996] International Telecommunication Union Study Groups "Guidelines for evaluation of radio transmission technologies for IMT-2000/FPLMTS", FPLMTS.REVAL, Question ITU-R, document 8/29-E, June 1996. [JAKOBY, 1995] Jakoby R., Liebenow U., "Modelling of radiowave propagation in microcells", Proc. Intern. Conference on Antennas and Propagation, ICAP, Eindhoven, The Netherlands, pp. 377-380, 1995. [JENVEY, 1994] Jenvey S., "Ray optics modelling for indoor propagation at 1.8 GHz", Proceedings of the IEEE 44th Vehicular Technology Conference, Stockholm Sweden, June 1994. [KATTENBACH, 1995] Kattenbach R., Fruchting H., "Calculation of system and correlation functions for WSSUS channels from wideband measurements", Frequenz 49, 3^, pp. 42-47, 1995. [KEENAN, 1990] Keenan J.M., Motley A.J., "Radio coverage in Buildings", British Telecom Technol. J., Vol 8, N° 1, January 1990. [KELLER, 1962] Keller J.B., "Geometrical theory of diffraction", JOSA Vol. 52, pp. 116-130, 1962. [KOUYOUMJIAN, 1974] Kouyoumjan R.G., Pathak P.H., "A uniform geometrical theory of diffraction for an edge in a perfectly conducting surface", Proceedings of the IEEE, Vol. 62, N° 11, pp. 1448-1461, November 1974. [KURNER, 1993] Kurner T., Cichon D.J., Wiesbeck W., "Concepts and Results for 3D Digital Terrain Based Wave propagation Models: an overview", IEEE Trans. Selected Areas in Com., Vol. SAC 11, N° 7, September 93, pp. 1002-1012. [LAGRANGE, 2000] LAGRANGE X., "Les reseaux mobiles"; Chapitre 2: Propagation radioelectrique (SIZUN H., BIC J.C.), Reseaux et Telecoms, Information-Commande-Communication, HERMES, 2000. 
[LASPOUGEAS, 2000] Laspougeas P., Pajusco P., Bic J.C., "Radio propagation in urban small cells environment at 2 GHz: Experimental spatio-temporal characterisation and spatial wideband channel model", Proceedings of the IEEE Vehicular Technology Conference (VTC'2000), Boston, 2000.
[LAUER, 1994] Lauer A., Bahr A., Wolff I., "FDTD simulations of indoor propagation", Proceedings of the 44th Vehicular Technology conference, Stockholm, Sweden, June 1994. [LAURENSON, 1993] Laurenson D.I., McLaughlin S., Sheikh A.U.H., "The application of ray tracing and the GTD to indoor channel modelling", IEEE Conf. GLOBECOM'93, Houston, USA, December 1993. [LAVERGNAT, 1997] Lavergnat J., Sylvain M., "Propagation des ondes radioelectriques", Collection Pedagogique de Telecommunication, MASSON, 1997. [LAWTON, 1994] Lawton M.C., Macgeehan J.P., "The application of a deterministic ray launching for the prediction of radiochannel characteristics in small cell environment", IEEE Transactions on Vehicular Technology, Vol 43, N° 4, pp. 955-969, November 1994. [LIANG, 1998] Liang G., Bertoni H.L., "A new approach to 3D ray tracing for propagation prediction in cities", IEEE Transactions on Antennas and Propagation, Vol. 46, N° 6, June 1998. [LU, 1993] Lu Y.E., "Site precise radio wave propagation simulations by time domain finite difference methods", Proceedings of the 43th Vehicular Technology Conference, Meadowlands, USA, May 1993. [Me KOWN, 1991] Mc Kown J.W., Hamilton R.L., "Ray tracing as a design tool for radio networks", IEEE Network Magazine, November 1991. [McNAMARA, 1990] McNAMARA D.A., PISTORIUS C.W.I., MALHERBE J.A.G.; The Uniform Geometrical Theory of diffraction, Artech, House, London, 1990. [METAMORP, 2000] Metamorp Project, "Description of the modeling method", Deliverable C2/1, 2000 (http://www.nt.tuwien.ac.at/mobile/projects/METAMORP/ en/. [MOTLEY, 1988] Motley A.J., Keenan J.M., "Personnal communication radio coverage in building at 900 MHz and 1700 MHz", Electronics Letters, Vol. 24, N° 12, June 1988. [MURCH, 1994] Murch R.D., Cheung K.W., Fong M.S., Sau J.H.M., Chuang J. CL. "A new approach to indoor propagation prediction", Proceedings of the 44th Vehicular Technology conference, Stockholm, Sweden, June 1994. 
[PARSONS, 1992] Parsons J.D., "The mobile radio propagation channel", Pentech Press Publishers, 1992.
[RACE, 1994] RACE ATDMA Project, "Channel models Issue 2", R084/ESG/ CC3/DS/029/bl, Ed. R. GOLLREITER, May 1994. [RAPPAPORT, 1994] Rappaport T.S., Sandhu S. "Radio Wave Propagation for Emerging Wireless Personal Communication Systems", IEEE Antennas and Propagation Magazine Vol. 36, No. 5, pp. 14-23, October 1994. [ROSSI, 1991] Rossi J.P., Bic, J-C., Levy A.J., Gabillet Y., Rosen M., "A ray launching method for radio-mobile propagation in urban area", IEEE Antennas and Propagation Symposium, London (Ont.), Vol. 3, pp. 1540-1543, June 1991. [ROSSI, 1992] Rossi J.P., Levy A.J., "A ray model for decimetric radio-wave propagation in an urban area", Radio Science, Vol. 27 N° 6, pp 971-979, November-December 1992. [ROSSI, 1997] Rossi J-P., Barbot J-P., Levy A.J., "Theory and measurement of the angle of arrival and time delay of UHF radiowaves using a ring array", IEEE Transactions on Antennas and Propagation, Vol. 45, N° 5, pp. 876-884, May 1997. [SIAUD, 1996] Siaud I., "A digital signal processing approach for the mobile radio propagation channel simulation with time and frequency diversity applied to an indoor environment at 2.2 GHz", Personal Indoor Mobile Radio Communications Conference, PIMRC'96, Taiwan, 15-18 October 1996. [SIAUD, 1997] Siaud I., "A mobile propagation channel model with frequency hopping based on a digita signal processing and statistical analysis of wideband measurements applied in micro and small cells at 2.2 GHz", IEEE Vehicular Technology Conference, Phoenix, AZ, Vol. 2, pp. 1084-1088, 4-7 May 1997. [SIAUD, 1997] Siaud I., "Simulation du canal de propagation radiomobile en environnement urbain pour 1'etude des performances des systemes de communication de 3ieme generation avec diversite de frequence", 3 iemes journees d'etude, "Propagation electromagnetique dans 1'atmosphere du decametrique a 1'angstrom, pp. 277-282, 7-9 October 1997. 
[SEIDEL, 1994] Seidel S.Y., Rappaport T.S., "Site-specific propagation prediction for wireless in building personal communication system design", IEEE Transactions on Vehicular Technology, Vol 43, N° 4, November 1994. [VALENZUELA, 1994] Valenzuela R.A., "Ray tracing prediction of indoor radio propagation", PIMRC'94, Personal Indoor Mobile Radio Communications, September 1994. [VALENZUELA, 1997] Valenzuela R., Landron O., Jacobs, D.L., "Estimating Local Mean Signal Strength of Indoor Multipath Propagation", IEEE Transactions on Vehicular Technology, Vol. 46, N° 1, pp. 203-121, 1997.
[WALKER, 1983] Walker E.H., "Penetration of Radio Signals into Buildings in the Cellular Radio Environment", The Bell System Technical Journal, Vol. 62, N° 9, pp. 2719-2730, November 1983. [WIART, 1993] Wiart J., Marquis A., Juy M., "Analytical microcell path loss model at 2.2 GHz", PIMRC'93, Yokohama, September 8-11, 1993. [XIA, 1993] Xia H.H., Bertoni H.L., "Radio propagation characteristics for line-of-sight microcellular and personal communications", IEEE Antennas and Propagation, Vol. 41, N° 10, October 1993. [YANG, 2000] YANG, H., LU C.; "Infrared wireless LAN using multiple optical sources", IEEE Proc. OptoElectron., Vol. 147, N° 4, 2000.
Acknowledgements
My thanks to Mr. Jean Claude BIC for his assistance in drafting the "Radioelectric Propagation" chapter of "Les reseaux radiomobiles" (Reseaux et Telecoms collection, HERMES, edited by Xavier LAGRANGE), a principal source of this paper.
Chapter 15
Ad-Hoc Networks
Patrick Tortelier
France Telecom R&D, France
1. Introduction
Just as the Greeks felt the need to encapsulate their notion of a human being as 'an animal which lives in a polis' (zoon politikon), one could define present-day mankind as the only animal which communicates with its like at distances much greater than those its senses can reach: hearing, sight, smell (soon to be possible) and so on. Thus, considering only wireless communication technologies, the first systems transmitted speech only, whereas present-day and especially future systems will make it possible to see the other party (transmission of moving images).
There is also a scheme of communication where source and recipient are the same person at two different moments: for example when we note the time and place of an appointment, a telephone number to call back, or a list of tasks to be carried out, i.e. all kinds of personal data which we need to have available for future use. Personal digital assistants help us in this task.
A last point is that, even if multimedia traffic is today rather asymmetrical (the flow towards the user being greater than that emitted by the user), the multiplication of the means of creating digital content (images, sound, videos) or of replicating it (piracy), together with the development of peer-to-peer networks, is changing this situation.
A consequence is that we are surrounded by a growing number of increasingly sophisticated objects, which are supposed to make our personal life simpler, more entertaining or more creative, so that technologies which yesterday were reserved for laboratories can be seen today in our living rooms. One finds a PC in every home, often with a printer, personal assistants, CD burners, DVD players and satellite reception kits.
All these objects require a number of cables, to connect them to the mains, to one another or to the network, and one quickly finds oneself in the midst of a forest of rather unsightly cables.
2. Communicating objects
Naturally one would like to remove all these wires and connect these devices by radio, just as one has been freed from the wire connection to the telephone network, in cities by cellular networks such as GSM, or at home by, for example, DECT handsets. Of course there are certain requirements for this kind of radio interface:
1. The interface must be standardised: objects from different manufacturers must be able to communicate; the number of potential candidates (IEEE 802.11b, HomeRF, Bluetooth) indicates that this stage has not yet been reached.
2. It should not be expensive: Bluetooth would have a certain advantage over 802.11b in this respect.
3. It must allow a high data rate; in this respect 802.11b outperforms Bluetooth.
4. Being designed for short ranges (a few tens of metres), it must consume little energy: if each object thus modified has to be plugged into an outlet in order to work, not much cabling will have been saved; the objects must therefore run on batteries if possible, and must not be power-hungry (batteries end up being costly!).
5. Finally, the management of all these objects and the routing of information in this network should not require the competence of a network engineer! It would be nice if the network reconfigured itself every time an object is introduced into the environment.
This last aspect of the problem is the subject of this contribution.
3. The Ad-Hoc networks
Battlefield communications are an area of research which, at the very beginning of the 1980s, addressed a rather similar problem: how to deploy a radio communication network between "mobiles" (i.e. soldiers, vehicles, etc.) scattered over a territory which, by its nature, has no communications infrastructure (and should not have one, for it would be one of the first targets of the enemy).
In such a context, the traditional scheme of the cellular network (where the mobiles of the same zone, the cell, are connected to a base station) is impossible. In the solution that emerged, each mobile uses a radio interface to transmit packet data and plays the role of a router (it forwards the packets of other users) within a meshed network that reconfigures itself automatically (it takes into account the mobility of the nodes, their traffic, and their appearance or possible disappearance). More precisely, it is a network whose infrastructure is formed by the terminals themselves. The following figure illustrates the concept: the randomly distributed points are nodes, and for some of them the range of the radio interface is shown (large dotted circles). To go from A to B (which is out of reach of A), the packets pass through the intermediate nodes C1, C2 and C3.
Figure 15.1 The idea for Ad-Hoc networks
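The relaying idea of Figure 15.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node coordinates, the common radio range of 10 units and the function names `build_links` and `relay_path` are hypothetical, and a plain breadth-first search stands in for whatever routing protocol a real network would run.

```python
import math
from collections import deque

def build_links(positions, radio_range):
    """Link two nodes when they lie within radio range of each other."""
    links = {name: set() for name in positions}
    names = list(positions)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(positions[a], positions[b]) <= radio_range:
                links[a].add(b)
                links[b].add(a)
    return links

def relay_path(links, source, target):
    """Breadth-first search: a path with the fewest hops, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(links[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None

# Hypothetical layout echoing Figure 15.1: B is out of reach of A.
positions = {"A": (0, 0), "C1": (8, 3), "C2": (16, 0), "C3": (24, 3), "B": (32, 0)}
links = build_links(positions, radio_range=10)
print(relay_path(links, "A", "B"))  # ['A', 'C1', 'C2', 'C3', 'B']
```

If one of the relays disappears (for instance C2 in this layout), no path remains and the function returns `None`, which is the situation a self-reconfiguring network has to detect and repair.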
In spite of its military origin, the idea was quickly considered to be interesting for civil applications:
• The ease of deployment is an advantage in circumstances where a communications infrastructure is no longer available: one need only think of areas struck by a catastrophe, whether natural (an earthquake, for example) or not. A good example was given after the attacks of 11 September 2001, when the rescuers could use as a communication network the meshed Ricochet system, some of whose relays (on pole tops) remained in the devastated zone; this network is an Ad-Hoc one.
• Fixed wireless access can benefit from this technique to relax the constraint of line of sight between base station and user: in a suburban setting, each user having an antenna on the roof of his house can be used as a relay for other users, the link between a user and the access point to the fixed network being completed in several hops (multihop networks). Deployment is thereby simplified.
• The ability of the nodes to act as routers comes from the nature of the traffic they generate: this type of network is interesting in packet mode, for bursty traffic alternating active and idle periods, during which the radio interface can be used for routing the traffic of the other nodes.
Figure 15.2 Model of traffic by bursts
There is another advantage, of a more physical origin, which can be presented in the following simplified form. Suppose all nodes are of the same kind and let $P_R$ denote the minimum average received power that each receiver needs to work correctly. The power which must be emitted for a hop of length $d$ is then of the form
$$P_T = C \, P_R \, d^{\alpha}$$
where $C$ is a constant and $\alpha$ is the path loss exponent, equal to 2 for propagation in free space and lying between 3 and 4 for more realistic environments. Supposing a distance $L$ divided into $n$ hops of the same length $d = L/n$, the total power used for the $n$ transmissions is
$$P_{tot} = n \, C \, P_R \left(\frac{L}{n}\right)^{\alpha} = \frac{C \, P_R \, L^{\alpha}}{n^{\alpha-1}}$$
which shows a reduction by a factor $n^{\alpha-1}$ compared with a single-hop transmission of length $L$. The first idea that comes to mind is therefore to decrease the average distance between nodes of the network; this results in an increase in the density of users (which is interesting) and, for each of them, a decrease in transmit power. But routing is a major difficulty.
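As a quick numerical check of this argument, the sketch below evaluates the per-hop power law (emitted power proportional to the hop length raised to the path loss exponent) for a single hop and for several equal hops. The function names and the choice of unit constants are assumptions made for illustration only.

```python
def total_transmit_power(total_distance, hops, alpha, p_r=1.0, c=1.0):
    """Total power for `hops` equal hops covering `total_distance`.

    Each hop of length d = L / n needs c * p_r * d**alpha.
    """
    hop_length = total_distance / hops
    return hops * c * p_r * hop_length ** alpha

def reduction_factor(hops, alpha):
    """Single-hop power divided by multihop power: n ** (alpha - 1)."""
    return hops ** (alpha - 1)

L = 100.0
single = total_transmit_power(L, 1, alpha=3)
multi = total_transmit_power(L, 4, alpha=3)
print(single / multi)          # 16.0
print(reduction_factor(4, 3))  # 16
```

With a path loss exponent of 3, splitting the distance into four hops divides the total emitted power by 16; in free space (exponent 2) the same split would only divide it by 4.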
4. The problem of routing
When one joins by an edge the nodes which are mutually within range of each other's radio interface, one obtains a (random) graph which looks like the one depicted in Figure 15.3.
Given two arbitrary nodes, the routing problem consists of calculating 'the best' path joining them. This is a shortest-path problem in a graph, for which several powerful algorithms exist (Bellman-Ford and Dijkstra are the best known), but the true difficulty lies elsewhere. Each node must have its own up-to-date routing table, and these tables must be updated continually (the nodes are mobile, they can appear or disappear, and their activity varies with time). This constraint causes a considerable data overhead in the network, since the mobiles have to transmit update information for these tables. The overhead increases with the size of the network and the mobility of the nodes. The various routing algorithms which have been proposed (for example at the IETF) aim to solve this problem.
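To make the notion of a per-node routing table concrete, here is a minimal sketch that runs Dijkstra's algorithm from one node and records, for every destination, the next hop and the path cost. The graph, its edge lengths and the function name `routing_table` are hypothetical; the sketch assumes the node already knows the whole topology, which is precisely the information that is costly to keep up to date in a mobile network.

```python
import heapq

def routing_table(graph, source):
    """Dijkstra from `source`; returns {destination: (next_hop, cost)}.

    `graph` maps each node to {neighbour: edge_length}.
    """
    best = {source: 0.0}
    table = {}
    done = set()
    heap = [(0.0, source, None)]
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if hop is not None:
            table[node] = (hop, cost)
        for neighbour, length in graph[node].items():
            new_cost = cost + length
            if neighbour not in done and new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                # Leaving the source, the next hop is the neighbour itself;
                # further away, it is inherited from the path so far.
                heapq.heappush(heap, (new_cost, neighbour, neighbour if hop is None else hop))
    return table

graph = {
    "A": {"C1": 1, "C2": 4},
    "C1": {"A": 1, "C2": 1, "B": 5},
    "C2": {"A": 4, "C1": 1, "B": 1},
    "B": {"C1": 5, "C2": 1},
}
print(routing_table(graph, "A"))
# {'C1': ('C1', 1.0), 'C2': ('C1', 2.0), 'B': ('C1', 3.0)}
```

If node C1 disappears, the table of A has to be recomputed (C2 and B are then reached through C2 at costs 4 and 5), and the other nodes must be told about the change: this is the update traffic described above.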
Figure 15.3 Random grid
There is a difficulty in the choice of the criterion which states that one path is 'better' than another. It is a question of assigning a metric ('length') to each edge of the graph, so that the length of a path is the sum of the lengths of the branches (edges) that constitute it. Several metrics are possible, in particular:
• The number of branches of the path.
• The quality of each branch, in order to favour hops of good quality, because they allow the transmission of a higher information rate. Moreover, if a retransmission mechanism is envisaged on each hop, choosing a good hop minimises the number of retransmissions and thus the end-to-end delay.
One can also mix several metrics.
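The effect of the metric on the choice of path can be illustrated with a small sketch. The quality term used here, the expected number of transmissions 1/p for a hop whose packets get through with probability p, is an assumption introduced for the example (it supposes independent retries until success), as are the weights and the function names.

```python
def edge_length(delivery_probability, hop_weight=1.0, quality_weight=0.0):
    """Mixed metric for one edge (the weights are illustrative assumptions).

    The quality term is the expected number of transmissions, 1/p, assuming
    independent retries until the packet gets through.
    """
    if not 0.0 < delivery_probability <= 1.0:
        raise ValueError("delivery probability must be in (0, 1]")
    return hop_weight + quality_weight / delivery_probability

def path_length(delivery_probabilities, **weights):
    """Length of a path = sum of the lengths of its edges."""
    return sum(edge_length(p, **weights) for p in delivery_probabilities)

short_but_poor = [0.5, 0.5]      # two weak hops
long_but_good = [1.0, 1.0, 1.0]  # three reliable hops

# A pure hop count prefers the two-hop path...
print(path_length(short_but_poor), path_length(long_but_good))  # 2.0 3.0
# ...while a pure quality metric prefers the three reliable hops.
quality_only = {"hop_weight": 0.0, "quality_weight": 1.0}
print(path_length(short_but_poor, **quality_only),
      path_length(long_but_good, **quality_only))  # 4.0 3.0
```

Setting both weights to non-zero values gives one possible way of mixing the two metrics.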
5. A simplified solution
It is clear that the mobility of the nodes increases the complexity of routing. Hence the idea (at first sight contradictory with the initial one) of designing a mesh whose nodes are fixed! The difficulty is then to know what realistic usage corresponds to this scenario. Two commercial applications give us some indication:
1. The first comes from Nokia Rooftop, originally a start-up spun off from work financed by DARPA, which is now part of Nokia. It is an alternative solution for the wireless local loop, in which the line-of-sight constraint between the user and the access point to the fixed network is relaxed: each user has a modem/router on the roof of his home (hence the name of the product), and together they constitute a mesh network where each node acts as a router for its neighbours towards the access points to the fixed network.
Figure 15.4 The Rooftop concept (ref. I)
2. In the preceding example the nodes are no longer mobile; the second example reintroduces mobility of the users, but the users are no longer the nodes of the Ad-Hoc network. The Ricochet system (developed initially by the Metricom company, which has since gone bankrupt, although the network was bought back) is a micro-cellular system in which the base stations form a mesh network (they are boxes fixed on pole tops in streets); a user is connected to one of these nodes either through a radio interface different from the one used to interconnect the nodes, or through the same interface in a different frequency band, as shown in the following figure.
Figure 15.5 The Ricochet system
6. Conclusion
Contrary to the initial vision of a completely decentralised network, a less general and more pragmatic solution emerges by introducing some constraints which simplify the routing problem: fixed terminals or, if one really wishes to keep a certain mobility, a micro-cellular system in which the infrastructure consists of an Ad-Hoc mesh network, which should greatly simplify its deployment and thus preserve this advantage. The Ad-Hoc network is often presented as a way of connecting objects in an indoor environment, but a telecommunication operator (whether fixed or mobile) can also use it to deploy its network infrastructure.
7. References
The bibliography on Ad-Hoc networks is immense, so we restrict it to a few interesting papers and web links:
The site of Nokia Rooftop: http://www.nwr.nokia.com/
T.J. Shepard: Decentralized Channel Management in Scalable Multihop Spread-Spectrum Packet Radio Networks, Report MIT/LCS/TR-670, July 1995.
"Ad hoc mobile Networking (MANet)": http://www.ietf.org/html.charters/manet-charter.html
Chapter 16
INDEED: High Rate Infrared Communications in the "Indoor" Context
Jean-Christophe Prunnot (1, 5), Adrian Mihaescu (2), Christian Boisrobert (1), Pascal Besnard (2), Pierre Pellat-Finet (3), Philippe Guignard (5), Frederique De Fornel (4) and Fabrice Bourgart (5)
(1) LPIO Nantes, (2) ENSSAT Lannion, (3) LAUBS Lorient, (4) LPUB Dijon, (5) France Telecom R&D, Lannion, France
1. Introduction
The introduction of digital wireless links in local and private networks has been boosted by the development of portable PCs in business and leisure environments. The need for physical resources follows the constant increase of services in applications such as videoconferencing and multimedia. Consequently, technology will lead to more and more intelligent functions in all kinds of devices and equipment, which will be designed to communicate among themselves and with the rest of the world. Connection through cables and wires is now considered inconvenient, indeed unbearable in many situations, and a major handicap. Furthermore, we must not forget the tangle of wire connections caused by the multiplicity of links. The tremendous anticipated need for high-quality communications justifies the development of radio-wave or infrared solutions. Compared with radio waves, which are considered to be their competitor, infrared optical waves can support heavy traffic and high bit rates, and offer more privacy/confidentiality in high-density areas [Heatly, 1998]. Although Bluetooth and wireless personal area networks are growing commercially in 2002, as are the wireless local area networks Hiperlan/2 (ETSI standard) in Europe and IEEE 802.11a in the US, wireless infrared networks follow the IEEE 802.11 standards. IrDA [Williams, 2000] enjoys a more harmonious existence, with a universal standard which should allow full use of the large bandwidth open to infrared waves (§ 3). Full exploitation of infrared technology will justify the highest rates, far beyond Hiperlan/4, which defines 155 Mbit/s at the lowest level of physical connection, i.e. the physical layer of the OSI model.
The INDEED project (InDoor InfrarEd Eyesafe Dialogues) fits into this context. The object of this study is to consider the problems of high-rate indoor infrared communications, including one application which aims at creating a high-rate, eye-safe bidirectional link, in the 1550 nm optical region, between an overhead light source and a mobile receiver in an office room. One part of the project will consist of evaluating the realised link by integrating it into the ATM-based TIC (Terminal Installation Client) of FT R&D [Bourgart, 2000].
2. Wireless infrared transmissions
We first summarise the history of some of the applications and position the project in its context.
Figure 16.1 Different types of wireless transmissions [Kahn, 1997]
Several research teams have already studied indoor wireless transmission links (Figure 16.1) spectrally centred on 1550 nm. The two experiments closest to the INDEED project are summarised in Table 16.1. Direct light detection and On-Off Keying (OOK) modulation are used. The experimental results given in this table correspond to a bit error rate (BER) of $10^{-9}$.
Table 16.1 Characteristics of wireless systems at 1550 nm
Reference                   [Wisely, 1997]          [Jungnickel, 1998]
Link type (Figure 16.1)     [marker illegible]      [marker illegible]
Transmitter power / Tx      +8 dBm / 16 Tx          +7 dBm / 1 Tx
Receiver                    Bootstrapped            Unequalised
Receiver surface            1 mm²                   0.04 mm²
Photo-detector              PIN                     APD
Sensitivity Rx              -32 dBm                 -39 dBm
Optical filter              Yes                     No
Attenuation                 -1 dB                   -
Concentrator                Yes                     Lens
Gain                        +23 dB                  +6 dB
Surface / FOV               - / 10°                 25 cm² / -
Cell area                   1 m²                    20 m²
Dimensions                  4 x 4 x 2 m³            room, H = 2 m
MAC protocol                -                       CSMA/CD
Rate                        100 Mbit/s              140 Mbit/s
Propagation loss            -60 dB                  -46 dB
Margin                      +3 dB                   No
Wisely (British Telecom) considers an area divided into cells and illuminated by 16 transmitters with a total power of 6.3 mW (8 dBm), and receivers with bootstrapped large-area PIN photodetectors in a "line-of-sight" (LOS) scheme. In the non-LOS system proposed by Jungnickel (Heinrich-Hertz-Institut, Berlin), the transmitter output power is 5 mW (7 dBm). The incident light reaches the avalanche photodiode through a concentrator. There is no power margin in Jungnickel's system. Experimental improvements have been obtained at higher bit rates using a network access protocol, here CSMA/CD.
3. Eye safety
The effects of the optical energy of a collimated laser beam are very different from those of the same optical energy emitted by other light sources, owing to its concentration, coherence, collimation and divergence properties. Like any illuminated surface, biological tissues then receive a high power density. The main effect of laser light is due to the absorption of this high light density. Absorption takes place between atomic or molecular energy levels and is a wavelength-dependent process. Hence it is the wavelength that determines which tissue a laser can damage. Figure 16.2 gives the range and designation of the different wavelength bands of the electromagnetic spectrum.
Figure 16.2 The electromagnetic spectrum, from the UVC to the IRC
Some biological tissues, such as the skin, the lens of the eye and more particularly the retina, can show irreversible modifications caused by long exposure to moderate light levels. These changes are the result of photochemical reactions which occur after molecules are activated by photon capture. They can be severe if the exposure time is too long or if the exposures are repeated over a long period. The eye, shown in Figure 16.3a, is an optical instrument which receives light beams and focuses them on the retina. The extent of the injuries and damage depends on the transmission and absorption characteristics of the tissues, as well as on their capacity to recover and regenerate from lesions. We must emphasise that the fovea is the most important area of the retina, since it is where images are sensed with the sharpest vision.
Figure 16.3 a) Cross-section of the human eye and b) spectral absorption of the different ocular media, taken separately [Sliney, 1980]
According to the International Electrotechnical Commission (IEC), the 1550 nm wavelength belongs to the IRB band. The pathologies associated with this spectral band are:
• At the ocular level: inflammation of the aqueous humor, cataract and corneal burn.
• At the skin level: burns.
The safety classes of laser products are given by the standards [CEI 825-1, 2000] and [CEI 825-1/A11, 2000].
Figure 16.4 Characteristics of absorption and transmission of light in the human eye: a) in the visible and the IRA, b) in the IRB and IRC, as well as the UVB
For infrared beams in the B band, either the aqueous humor absorbs more than the cornea, or absorption by the cornea largely dominates (from 70% to 100%, depending on wavelength, as indicated in Figure 16.3b). In Figure 16.4a, the light is focused on the retina whereas, in Figure 16.4b, light is absorbed by the cornea and by the lens. A retinal burn can cause partial or total loss of vision, temporary or permanent. A lesion of the lens leads to cataract. Laser sources of UVC and far IRC radiation are dangerous for the cornea, while visible and near-infrared radiation is transmitted to the retina, the cornea protecting the lens against exposures above 4 W/cm² [Pitts, 1980]. Lasers which are to operate in free propagation without danger for users' eyes must hence emit in the spectral bands in which the absorption is very weak. The most appropriate spectral region lies around 1552 nm: it is covered by many commercially available optical products (fibre optics technology), which offers the possibility of a direct link with a local network, and light absorption by the cornea appears low enough in this region. At 1550 nm, the first ocular medium crossed by the light beam is the cornea; its absorption is not maximal (75%) and is spread over its entire thickness. Note that if the absorption were 100% (that is, for a wavelength of 2 µm or above 2.5 µm), the absorption length would be shorter and the dissipated energy per unit volume therefore greater, leading to more serious damage. The power transmitted to the aqueous humor is therefore weak, and its absorption of 25% has little influence. The cornea has a resistance to radiation equal to that of the skin and can regenerate very quickly. In an extreme case, a corneal graft remains possible.
The Maximum Permissible Exposure (MPE) levels at 1550 nm for a subject, taken from references [CEI 825-1, 2000] and [CEI 825-1/A11, 2000], are:
• 1000 W/m² at cornea level.
• From 100 W/m² to 1000 W/m² at skin level: the MPE varies inversely with the exposed surface, from 0.1 m² to 0.01 m² respectively. For a larger surface, greater than 0.1 m², the limit of 100 W/m² applies.
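The two skin limits quoted above are consistent with a simple inverse proportionality: 100 W/m² over 0.1 m² and 1000 W/m² over 0.01 m² both correspond to a product of 10 W. The sketch below encodes this reading; it is an interpretation of the figures given here, not a transcription of the standard, and holding the limit at 1000 W/m² below 0.01 m² is an assumption. The function name `skin_mpe` is hypothetical.

```python
CORNEA_MPE = 1000.0  # W/m^2 at 1550 nm, value quoted in the text

def skin_mpe(exposed_area_m2):
    """Skin MPE in W/m^2 as a function of the exposed area in m^2."""
    if exposed_area_m2 <= 0:
        raise ValueError("exposed area must be positive")
    if exposed_area_m2 >= 0.1:
        return 100.0   # limit quoted for large surfaces
    if exposed_area_m2 <= 0.01:
        return 1000.0  # assumption: limit held constant below 0.01 m^2
    # Between the two bounds, MPE * area is taken as constant (10 W).
    return 10.0 / exposed_area_m2

for area in (0.005, 0.01, 0.05, 0.1, 0.5):
    print(area, skin_mpe(area))
```

For an exposed area of 0.05 m², for example, this reading gives a limit of about 200 W/m².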
A transmitter can be considered to be accessible under maximum safety conditions when its optical output power does not exceed 10 mW. It is then considered a "class 1" light source. Its radiometric characteristics and measurement conditions are:
• a 50 mm aperture diameter,
• a radiation pattern inside a 0.01 steradian solid angle,
• at a distance of 100 mm.
The calculated values given above hold for specific conditions:
• time of exposure greater than 10 s,
• observation aligned with the source axis.
We can conclude this normative overview by giving a definition of the Nominal Ocular Hazard Distance (NOHD): the distance at which the irradiance or radiant exposure equals the MPE at cornea level. Beyond this distance, the risk is nil. Finally, NF EN 60825-7 of June 2000 (1st version), the transcription by the European Committee for Electrotechnical Standardization (CENELEC) of the IEC reference CEI 825-7, concerns sources of wavelengths greater than 780 nm (IR) used in free-space data transmission. This document refines the elements given by references [CEI 825-1, 2000; CEI 825-1/A11, 2000] and gives more details about the essential limit values from which we establish the INDEED system specification.
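To give an order of magnitude for the NOHD, the following sketch uses a deliberately crude model: a point source whose power is spread uniformly over a cone. The model, the function names and the idea of lumping the three 8 mW diodes of one INDEED cell transmitter into a single 24 mW point source with a 40° pattern are all assumptions made for illustration; the standard prescribes its own measurement conditions, which this sketch does not reproduce.

```python
import math

def irradiance(power_w, distance_m, full_divergence_deg):
    """On-axis irradiance (W/m^2) for a point source whose power is spread
    uniformly over a cone of the given full divergence angle."""
    spot_radius = distance_m * math.tan(math.radians(full_divergence_deg) / 2)
    return power_w / (math.pi * spot_radius ** 2)

def nohd(power_w, full_divergence_deg, mpe_w_m2):
    """Distance at which the irradiance of the model above equals the MPE."""
    half_angle = math.radians(full_divergence_deg) / 2
    return math.sqrt(power_w / (math.pi * mpe_w_m2)) / math.tan(half_angle)

# Power, divergence and cornea MPE are the values quoted in the chapter.
distance = nohd(3 * 8e-3, 40.0, 1000.0)
print(f"NOHD of the simplified model: {distance * 1000:.1f} mm")
```

Under these assumptions the hazard distance comes out at roughly 7.6 mm, far shorter than any realistic distance between a ceiling-mounted transmitter and a user; a more directive beam would give a much larger value.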
4. The INDEED solution
The INDEED solution is a cellular-type hybrid line-of-sight link (Figure 16.1).
4.1. Downstream transmitter
The downstream light sources are 1550 nm Fabry-Perot cavity semiconductor laser diodes (FP-LDs). Each cell transmitter contains three diodes, and the optical output power of each diode is 8 mW. Their essential characteristics and specifications are:
• Several longitudinal modes around 1550 nm and a low degree of coherence, to avoid speckle problems at the output of the divergent multimode optical fibre adaptors.
• A resonance frequency higher than 1 GHz, to avoid excess relative intensity noise and limits on the modulation frequency.
• An emitting area of approximately 1 to 2 µm² and a radiation pattern opening up to 40°.
4.2. Receiver
The photodetector is a semiconductor PIN heterojunction photodiode. The semiconductor is a III-V compound: InGaAs/InP. The active area is 0.2 mm² (Figure 16.5).
Figure 16.5 The InGaAs/InP photodiode
The OOK-modulated signal, coded in NRZ (Non-Return-to-Zero), is then amplified and processed to obtain a minimum detectable power (or sensitivity) of -40 dBm (100 nW) at 155 Mbit/s for a BER of $10^{-9}$. Typically, a concentrator [Kahn, 1997; Street, 1997] is used, with a 1550 nm selective filter. The presence of a lens (light energy concentrator) is justified especially in directive systems, where the concentration gain is inversely proportional to the numerical aperture of the receiver (FOV-Rx) [Street, 1997]. In diffuse systems, the power gain can nevertheless reach 3 dB with sources near 900 nm [Kahn, 1997].
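Powers in this chapter are quoted both in milliwatts and in dBm. As a reminder of the conversion, here is a minimal sketch (the function names are hypothetical) that reproduces the pairs given in the text: 6.3 mW and 8 dBm, 5 mW and 7 dBm, -40 dBm and 100 nW.

```python
import math

def mw_to_dbm(power_mw):
    """Power in dBm, i.e. decibels relative to 1 mW."""
    return 10 * math.log10(power_mw)

def dbm_to_mw(power_dbm):
    """Inverse conversion, from dBm back to milliwatts."""
    return 10 ** (power_dbm / 10)

print(round(mw_to_dbm(6.3), 1))    # 8.0
print(round(mw_to_dbm(5.0), 1))    # 7.0
print(round(mw_to_dbm(3 * 8), 1))  # 13.8, three 8 mW diodes taken together
print(dbm_to_mw(-40) * 1e6)        # about 100, in nW
```

Because the scale is logarithmic, gains and losses expressed in dB simply add to a power expressed in dBm.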
Compatibility of INDEED point-to-point system
                            ATM 25.6 Mbit/s         ATM 140 Mbit/s
Power / number of Tx        +15 dBm / 6             same
AN                          0.4                     same
Type of source              FPLD                    same
Number                      1/Tx                    same
FOV                         40°                     same
Receiver surface            10 mm²                  same
AN                          0.12                    same
Surface Rx / FOV            0.2 mm² / 60°           same
Sensitivity Rx, BER=10^-9   -44 dBm                 -41 dBm
Cell area                   3.5 m²                  same
Concentrator                Yes                     same
Gain                        +17 dB                  same
Active area                 10 mm²                  same
FOV                         13°                     same
Rate                        32 Mbit/s               155 Mbit/s
Propagation loss            -52 dB                  same
Penalty margin              +7 dB                   +4 dB
Figure 16.6 The transmitter-receiver LOS hybrid INDEED point to point link
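The values attached to Figure 16.6 are consistent with a simple link budget. The sketch below assumes that the penalty margin is the emitted power minus the propagation loss minus the receiver sensitivity. This reading is an inference from the numbers, not a formula stated in the chapter, and the function name `link_margin_db` is hypothetical.

```python
def link_margin_db(tx_power_dbm, propagation_loss_db, sensitivity_dbm):
    """Margin (dB) between the received power and the receiver sensitivity.

    Assumed budget: received power = emitted power - propagation loss.
    """
    received_dbm = tx_power_dbm - propagation_loss_db
    return received_dbm - sensitivity_dbm

# Values read from Figure 16.6: +15 dBm emitted, 52 dB propagation loss
for label, sensitivity_dbm in (("ATM 25.6 Mbit/s", -44.0),
                               ("ATM 140 Mbit/s", -41.0)):
    margin = link_margin_db(15.0, 52.0, sensitivity_dbm)
    print(f"{label}: margin = {margin:+.0f} dB")
```

Under this assumption the two sensitivities give margins of 7 dB and 4 dB, which match the penalty margins listed with the figure.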
The INDEED photo-receiver unit is very different from the devices custom-designed for optical fibre applications:
• The photodiode is a "large sensitive area" component, so as to collect the largest possible number of photons. The parasitic "junction + metallisation" capacitance is therefore much higher, and the structure of the low-signal, low-noise transimpedance amplifiers needs to be revised.
• The photodiode is exposed to ambient radiation and must be protected and shielded.
The noise sources are:
- Shot noise, due to the random nature of light emission and hence of detection, together with the noise associated with the dark current (generation and recombination processes),
- Scintillation noise, due to random movements of the mobile charges in the junction,
- Detector thermal noise (Background Limited Infrared Photodetector, BLIP: a detector very sensitive to IR, whose dark noise is essentially due to photons from ambient radiation),
- Surface-effect noise, resulting from the capture and release of mobile charges.
5. Perspectives and conclusion
Adaptation (Figure 16.7): a holographic diffuser at the receiver should produce a homogeneous diffusion spot on the receiver, larger than the detector and with the same shape. In addition, this diffuser, followed by a reflecting mirror placed before the detector, will allow the receiver a certain mobility, provided that the diffusion spot still covers the detector active area.
Figure 16.7 Adaptable optics of the receiver
This holographic technology was explored for some years [Eardley, 1996] before falling into relative obscurity. A system with a holographic diffuser and a reflecting mirror has also been proposed by Jivkova, of Pennsylvania State University (United States) [Jivkova, 2001], but with a source emitting at 850 nm.
Integration of the receiver is still to come: at this early stage of its design, the present system has not been conceived as a single unit. Many devices have been tested, but at wavelengths close to 900 nm [Kahn, 1997; Street, 1997]. It is still possible to reconsider the adaptability of some of these solutions, without neglecting research oriented more toward advanced technologies.
6. References
[Bourgart, 2000] F. Bourgart & G. Ramel, ATMOSFEERIC-2 Extensions du domaine d'emploi d'ATMOSFEERIC aux installations collectives et aux services domotiques, Tech. Report, FT R&D, 2000.
[CEI 825-1, 2000] NF EN 60825-1, Sécurité des appareils à laser: partie 1, classification des matériels, prescriptions et guide de l'utilisateur, 2000.
[CEI 825-1/A11, 2000] NF EN 60825-1/A11, Amendement à la norme NF EN 60825-1 sur la sécurité des appareils à laser, 2000.
[Eardley, 1996] P.L. Eardley, D.R. Wisely, D. Wood & P. McKee, Holograms for Optical Wireless LANs, IEE Proc.-Optoelectron., volume 143, number 6, 1996, pp. 365-369.
[Heatly, 1998] D.J.T. Heatly, D.R. Wisely, I. Neild & P. Cochrane, Optical Wireless: The Story So Far, IEEE Communications Magazine, 1998, pp. 72-82.
[Jivkova, 2001] S. Jivkova & M. Kavehrad, Receiver Designs and Channel Characterization for Multi-Spot High-Bit-Rate Wireless Infrared Communications, IEEE Transactions on Communications, volume 49, number 12, 2001, pp. 2145-2153.
[Jungnickel, 1998] V. Jungnickel, C. von Helmolt & U. Krüger, WireLan: A Broadband Wireless IR LAN Architecture Compatible with the Ethernet Protocol, ECOC'98 (Madrid, Spain), 1998, pp. 367-368.
[Kahn, 1997] J.M. Kahn & J.R. Barry, Wireless Infrared Communications, Proceedings of the IEEE, volume 85, number 2, 1997, pp. 265-298.
[Pitts, 1980] D.G. Pitts et al., Determination of ocular threshold levels for infrared radiation cataractogenesis, US Dept. Health and Human Services, National Institute for Occupational Safety and Health Publications, 1980, pp. 80-121.
[Sliney, 1980] D. Sliney & M. Wolbarsht, Safety with laser and other optical sources, Plenum Press, New York, 1980, p. 145.
[Street, 1997] A.M. Street, P.N. Stavrinou, D.C. O'Brien & D.J. Edwards, Indoor Optical Wireless Systems - a review, Optical and Quantum Electronics, volume 29, 1997, pp. 349-378.
[Williams, 2000] S. Williams, IrDA: Past, Present and Future, IEEE Personal Communications, volume 7, number 1, 2000, pp. 11-19.
[Wisely, 1997] D. Wisely & I. Neild, A 100 Mbit/s Tracked Optical Wireless Telepoint, Proceedings IEEE: 8th Symposium Personal Indoor And Radio Communications (Helsinki, Finland), 1997, pp. 964-968.
7. Glossary
ATM: Asynchronous Transfer Mode, a cell-switching technique used to accommodate many different types of traffic in data transmission.
Beam collimation: beams are collimated when they propagate parallel along the same axis toward the same 'target'.
BER: Bit Error Rate; for example, 10⁻⁹ corresponds to one wrong bit among 10⁹ bits.
Cornea: thin transparent membrane which covers the aqueous humor in front of the iris and the lens, and is sealed to the sclera.
CSMA/CD: Carrier Sense Multiple Access with Collision Detection.
Divergence (beam): increase in the cross section of an electromagnetic beam with the distance from the source.
ENSSAT: Ecole Nationale Superieure de Sciences Appliquees et de Technologie.
ETSI: European Telecommunications Standards Institute.
Hiperlan: HIgh PErformance Radio Local Area Network.
IEEE (Institute of Electrical and Electronics Engineers): international organisation in the field of electrical and electronic components and systems, one of whose main tasks is the definition and publication of standards.
IrDA (Infrared Data Association): created in 1993 to coordinate infrared optical transmission of data.
LAUBS: Association of research laboratories of Southern Brittany University.
Lens: transparent, biconvex, nearly spherical lens which focuses images through the pupil onto the retina.
LPIO: Laboratory of Insulator Physics and Optronics.
LPUB/OCP: Physics Laboratory of Burgundy University / Near Field Optics.
MAC: Media Access Control, a protocol which manages the stations connected to and sharing the same network or resources, and gives them the right to transmit.
Mode (electromagnetism): one solution of Maxwell's equations which represents an electromagnetic field in a given space and belongs to a set of independent solutions defined by given boundary conditions.
Multimode source: source of electromagnetic waves propagating in several modes.
NRZ: Non-Return to Zero, the simplest digital coding/modulation scheme that can be designed from the pure binary concept.
Numerical Aperture (N.A.) of an instrument: N.A. = $n_0 \sin\theta_0$, where $n_0$ and $\theta_0$ are respectively the refractive index of the medium of the incident beam and the acceptance angle of the instrument in the same medium.
OSI model (Open Systems Interconnect): the seven-layer model of an open system.
Photo-detector: any device which responds electrically to optical illumination.
Photodiodes: usually semiconductor p/n-junction devices used under reverse bias.
PIN photodiode: photodiode in which a thick intrinsic semiconductor region is inserted between the p- and n-regions of the junction to improve photon absorption and therefore quantum efficiency.
Retina: sensitive area of the internal ocular sphere which receives images; set of sensitive cells which convert the local light energy of the image into chemical energy transmitted along the optic nerves to the brain.
WLAN: Wireless Local Area Network.
WPAN: Wireless Personal Area Network.
Chapter 17
Artificial Materials for Protected Communications
Frederique de Fornel¹, Rabia Moussa¹, Laurent Salomon¹, Christian Boisrobert², Herve Sizun³ and Philippe Guignard⁴
¹ OCP/LPUB Dijon, ² LPIO Nantes, ³ FTR&D Belfort, ⁴ France Telecom R&D, Lannion, France
1. Introduction
Free-space infra-red technologies are an attractive solution for communicating objects, and in particular for indoor communications: optics offers an answer to the problem of high data rate transmission, and its use in open space contributes, through its flexibility, to the quality of the service offered [Prunnot 02]. As for other types of information transfer, it is essential to ensure the confidentiality of the exchanges, which is more difficult here because of the unguided nature of these communications. Securing the link has two distinct goals:
1. To guarantee the confidentiality of the transmitted information.
2. To minimise the disturbance caused to the environment outside the room in which the exchanges of information take place.
Before giving the physical details of the problem, the question of the confidentiality level must be clarified. The required level, lower or higher, is dictated by the kind of application; banking data transmission is a typical case in which confidentiality must be very high. A compromise will of course have to be found between the desired level of confidentiality and the cost of the corresponding technical solutions. We have not yet completely defined the confidentiality criteria; doing so is one of the objectives of this study, namely: for a given transmitter-receiver installation, which parameters must the materials constituting the walls and the glazing respect so that the intensity transmitted outside the room remains below a threshold level? This threshold level will correspond to our confidentiality criterion. Let us consider a simple case to illustrate these remarks.
2. Description of a standard establishment
In a standard room, infra-red radiation is stopped (reflected or absorbed) by the walls. On the other hand, part of this radiation can escape through the openings: glazing and open doors. Numerical simulations make it possible to study the propagation of light in the room [Sizun 02]. We wish to indicate some lines of research that could lead to solutions limiting or preventing the escape of the optical communication signal through the windows. In particular, our work concerns the choice of the materials which constitute the windows and the wall lining.
Figure 17.1 A diagram of a room showing the path of an incident wave which is reflected or transmitted by the obstacles it meets
The openings can be secured in various ways, most simply by cutting the communication whenever a door is opened: this is an effective and radical means, but one which we will not retain for this study. The other method consists in determining the optical properties of the walls and the glazing which minimise the stray reflections that would let light escape through the openings when the doors are open. The confidentiality level can be affected by the arrangement of the furniture in the room. In our simulation, we will suppose that confidentiality is assured by the materials constituting the walls. Securing the glazing must also be taken into account: the glazed area of a room can be larger than that of one of its walls. It is therefore necessary to find glazing which transmits all the light of the visible spectrum, to ensure visual comfort, while reflecting the signals at the wavelength chosen for the communications, here 1.55 µm.
3. Artificial materials
In our study, we call 'artificial materials' structures which have particular optical properties (absorption, reflection, transmission) that a natural material could not have. These properties result both from the properties of the constituent materials and from the geometrical parameters of the structure [Yablonovitch 87, John 87, Yeh 94, Joannopoulos 95]. Thus, by structuring matter, one can create specific optical properties which cannot be obtained with a homogeneous material. In particular, we will consider a new family of artificial materials: photonic crystals. Photonic crystals are structures that are periodic in one, two or three dimensions [Yablonovitch 87, John 87, Yeh 94, Joannopoulos 95].
Figure 17.2 Examples of artificial photonic crystal structures in one, two and three dimensions (panels: 1D, 2D, 3D)
Since their appearance, the number of applications of photonic crystals has grown very rapidly and in widely varied directions. Photonic crystals open new prospects, for example the inhibition of spontaneous emission and the control of electron-hole recombination in materials [Yablonovitch 91, Ozbay 95, Russell 95, Gerard 02]. Applications now exist in the microwave and millimeter-wave ranges as well as in the optical region. They include perfectly reflective mirrors, filters, polarisers, modulators, substrates for millimeter-wave antennas, switches, high-efficiency light emitting diodes (LEDs) and, finally, thresholdless lasers. These exciting applications open new prospects in many branches of physics, such as atomic physics, optoelectronics, telecommunications and optical interconnections. Moreover, the nonlinear effects in such photonic crystals contribute to the development of powerful machines such as the optical computer. Thus, electronic switches and transistors will be replaced in the more or less long term by their optical counterparts. Motivated by these significant applications, technologists for their part are making an enormous effort to create other structures which meet future needs: for example, the Yablonovite [Yablonovitch 91], fabricated to operate at millimeter wavelengths, the "layer-by-layer" structure [Ozbay 95], two-dimensional structures of parallel cylinders, etc. The manufacture of photonic crystals operating in the optical region is proving to be a real challenge, on which laboratories are currently competing at international level, such as MIT, Corning, the University of Southampton, the University of Glasgow and, in France, Alcatel, the LETI, the LEOM, the IEF, etc. [Russell 95, Gerard 02].
4. Application of artificial materials to the confidentiality of the indoor transmission
Photonic crystals make it possible to design structures that are perfect reflectors for certain wavelengths while remaining completely transparent at other wavelengths. To meet the confidentiality requirement, we have to choose a structure suited to the problem of indoor communications. The ideal structure would be three-dimensional. However, the technological realisation of such a structure applied to glazing seems premature to us. We have therefore proceeded in steps, the first being the study of one-dimensional photonic crystals. In the next step, higher dimensionality (i.e. two or three dimensions) has to be included, with the intrinsic and geometrical parameters fitted to the required confidentiality criteria. Figure 17.3 shows the first structure studied.
Figure 17.3 Schematic of the one-dimensional photonic crystal, consisting of alternating layers of index n1 and n2
The optical transmission curves of two structures of the same type, built from different materials, are shown in the following figures. In Figure 17.4, the beam is at normal incidence to the periodic structure, the refractive indices (n1/n2) of the two systems being (1.5/1) and (1.7/1.4) respectively. We note that the whole visible region is transmitted, while the signal at 1.55 µm is reflected: the transmission at this wavelength can fall to -90 dB.
Figure 17.4 Transmission curves versus the incident wavelength for the two different n1/n2 systems: (1.5/1) and (1.7/1.4)
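The stop band of such a one-dimensional structure can be explored with the standard transfer-matrix (characteristic matrix) method for thin-film stacks. The sketch below is only an illustration under stated assumptions: the chapter gives neither the layer thicknesses nor the number of periods, so quarter-wave layers centred on 1.55 µm, 20 periods and air on both sides are hypothetical choices, as is the function name `stack_transmission_db`. A plain quarter-wave design also has higher-order stop bands (near one third of the design wavelength), so it is not claimed to reproduce the curves of Figure 17.4.

```python
import math

def stack_transmission_db(wavelength_um, n1, n2, periods,
                          design_um=1.55, n_in=1.0, n_out=1.0):
    """Power transmission (dB) of a lossless (n1/n2) x periods stack, normal incidence.

    Assumption: each layer is a quarter-wave thick at design_um.
    """
    d1 = design_um / (4.0 * n1)  # physical thickness of the n1 layers (um)
    d2 = design_um / (4.0 * n2)  # physical thickness of the n2 layers (um)
    # Running 2x2 characteristic matrix of the stack, starting from identity
    m11, m12, m21, m22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for _ in range(periods):
        for n, d in ((n1, d1), (n2, d2)):
            delta = 2.0 * math.pi * n * d / wavelength_um  # phase thickness
            cos_d, sin_d = math.cos(delta), math.sin(delta)
            a11, a12, a21, a22 = cos_d, 1j * sin_d / n, 1j * n * sin_d, cos_d
            m11, m12, m21, m22 = (m11 * a11 + m12 * a21, m11 * a12 + m12 * a22,
                                  m21 * a11 + m22 * a21, m21 * a12 + m22 * a22)
    b = m11 + m12 * n_out
    c = m21 + m22 * n_out
    t_power = 4.0 * n_in * n_out / abs(n_in * b + c) ** 2
    return 10.0 * math.log10(t_power)

for n1, n2 in ((1.5, 1.0), (1.7, 1.4)):
    t_db = stack_transmission_db(1.55, n1, n2, periods=20)
    print(f"n1/n2 = {n1}/{n2}: T(1.55 um) = {t_db:.1f} dB")
```

With these assumed parameters, the rejection at the design wavelength deepens as the number of periods or the index contrast increases, which is the trade-off behind the choice between the two index pairs.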
Unfortunately, the response of this one-dimensional structure depends strongly on the angle of incidence. Indeed, when the angle exceeds 30°, the structure of Figure 17.4 no longer fulfils its principal goal, which is to reflect the incident wave at 1.55 µm (see Figure 17.5): the photonic band gap shifts to higher wavelengths and, at the same time, its magnitude decreases.
Figure 17.5 Transmission curve versus the incident wavelength for the second system, (n1/n2) = (1.7/1.4), for different angles
For the application targeted here, if one is content with an attenuation of the transmitted signal of -40 dB, the solution consists in using at least two sets of multilayer systems, which together cover a higher numerical aperture. If a rejection better than -40 dB is required, it is necessary either to optimise the parameters of the multilayer systems (i.e. the refractive index, the ratio r/a and the lattice constant a) or to consider a photonic crystal of two or three dimensions.
5. Discussion and prospects
To secure indoor communications, a certain level of confidentiality has to be guaranteed. This requires minimising as far as possible the transfer of information outside the room in which the communications take place. We have shown, starting from a simple example, that structured materials could guarantee this confidentiality. Artificial materials structured in one dimension already give very good transmission in the visible region and very high reflection at 1.55 µm: the transmission is lower than -70 dB for angles of incidence below 20°. However, if the angle of incidence exceeds 30°, the properties of the structure no longer meet our aim. To obtain a rejection filter at 1.55 µm whatever the angle of incidence, more complex structures can be used, initially a combination of multilayer systems with different optical and geometrical characteristics. Another alternative consists in using materials structured in two or three dimensions. Three-dimensional photonic crystals make it possible to build perfect reflectors whatever the angle of incidence of the light.
In parallel with the theoretical studies, the criteria for realising such structures must be considered. One of these criteria is the manufacturing cost of the structures defined, which will be taken into account in the final choice of solution. Clearly, one-dimensional structures are cheaper and easier to fabricate than their two-dimensional counterparts. The integration of these structured materials into buildings, existing or yet to be built, must also be taken into account. We therefore have to provide for structures that can either be built in when the glazing is manufactured or be added to existing glazing. Furthermore, the choice of materials will have to take into account the reliability of the structure (with respect to thermal variations, pollution, cleaning products, etc.).
Until now we have not taken into account the nature of the link (the source, the detector, direct or diffuse transmission, etc.). To define the confidentiality level of the system completely, we should integrate the radiation diagram of the source into our simulations. For a given structure, the effective confidentiality level will depend on the nature of the link: direct or diffuse.
6. References
[Gerard 02] D. Gerard, L. Berguiga, F. de Fornel, L. Salomon, C. Seassal, X. Letartre, P. Rojo-Romeo, P. Viktorovitch, "Near-field probing of active photonic-crystal structures", Optics Letters, 27, 3, 173 (2002).
[Joannopoulos 95] J. D. Joannopoulos, R. D. Meade, and J. N. Winn, Photonic Crystals: Molding the Flow of Light, Princeton University Press (1995).
[John 87] S. John, "Strong localization of photons in certain disordered dielectric superlattices", Phys. Rev. Lett. 58, 2486 (1987).
[Ozbay 95] E. Ozbay, G. Tuttle, J. S. McCalmont, M. M. Sigalas, R. Biswas, C. M. Soukoulis, and K. M. Ho, "Laser-micromachined millimeter-wave photonic band-gap cavity structures", Appl. Phys. Lett. 67, 1969 (1995).
[Prunnot 02] J.-C. Prunnot, G. Normand, C. Boisrobert, A. Mihaescu, P. Besnard, P. Pellat-Finet, F. de Fornel, F. Bourgart, P. Guignard, "Les Objets Communicants", Chapitre 16: INDEED Communication infrarouge haut débit dans le contexte 'indoor', Hermes (2002).
[Russell 95] P. St. J. Russell, T. A. Birks, F. D. L. Lucas, "Confined Electrons and Photons: New Physics and Applications", E. Burstein and C. Weisbuch, Eds., Plenum, New York (1995).
[Sizun 02] H. Sizun, "Les Objets Communicants", Chapitre 14: La propagation des ondes radioélectriques à l'intérieur et à l'extérieur des bâtiments, Hermes (2002).
[Yablonovitch 87] E. Yablonovitch, "Inhibited Spontaneous Emission in Solid-State Physics and Electronics", Phys. Rev. Lett. 58, 2059 (1987).
[Yablonovitch 91] E. Yablonovitch, T. J. Gmitter, and K. M. Leung, "Photonic band structure: The face-centered-cubic case employing nonspherical atoms", Phys. Rev. Lett. 67, 2295 (1991).
[Yeh 94] C. Yeh, "Applied Photonics", Academic Press, San Diego, CA (1994).
7. Acknowledgements
This work was undertaken within the framework of the France Telecom R&D contract "INDEED", reference 01 IB 397. One of the authors (R.M.) was funded by the Région de Bourgogne through a post-doctoral grant.
Chapter 18
Free-space Optical Communication Links
Olivier Bouchet and Herve Sizun
France Telecom R&D, Cesson-Sevigne and France Telecom R&D, Belfort, France
1. Introduction
Today the world market for wireless digital data transmission is based primarily on radio (Hertzian) technologies. Forecasts in all the large industrialised countries point to the limits of these technologies: they will not be able to absorb the growing demand for new transmission channels for cordless telephony, computer networks, high definition television, communicating objects, etc. Several techniques are available: microwaves, links at frequencies higher than 100 GHz and free-space optical communication links. This last technique modulates an infrared laser beam in free space to exchange binary data in full duplex between a transmitter/receiver pair. Several factors explain the revival of this technique: absence of regulation, no licence fees, easy, fast and inexpensive deployment, and high data rates. In this contribution we treat system and propagation aspects, including limiting effects. The system aspects cover the laser safety of the equipment, the results of a survey of manufacturers and an analysis of the uses of their products. Evaluating the performance of such links requires knowledge of the atmospheric effects on propagation in the frequency spectrum used. Fog, rain and snow in particular are very limiting factors.
2. System aspects
2.1. Laser safety of the equipment
Any laser can present a danger to humans, at ocular and at cutaneous level, the human eye being very sensitive to infrared radiation. The most important factors to take into account when evaluating the risks are the signal wavelength, the power and the beam shape. Eye safety is governed by International Electrotechnical Commission (IEC) standards. Those in force in France are NF EN 60825-1 and NF EN 60825-2. Lasers are divided into classes according to the known risks they represent, on the basis of the accessible emission limit (AEL), the maximum permissible exposure (MPE) and the nominal ocular hazard distance (NOHD).
2.2. Manufacturers and products
A review of manufacturers was carried out, based on various sources of information (conferences, reviews, magazines, search engines, web sites, etc.), in order to obtain as complete a panorama as possible of the products and their capabilities. The principal parameters considered are the type of data transmitted, the range, the data rate and the type of application suggested, together with other parameters such as the wavelength, the type and number of optical transmitters (laser and possibly LED), emitted power, control, ease of deployment, maintenance, cost, etc. Table 18.1 gives an overall picture of most of the products listed, based on information obtained from the manufacturers or from their web sites. Figures 18.1 and 18.2 (at the end of the chapter) give the suggested data rate (Mbit/s) and range as a function of the wavelength for various equipment.
Table 18.1 Manufacturers of free-space optical products
Manufacturer | City | Country | Internet site
ACTIPOLE | Le Bouscat | France | www.laser-com.com
AirFiber Inc | San Diego (CA) | USA | www.airfiber.com
Airlinx | - | Canada | www.laseroptronics.com
Astroterra | San Diego (CA) | USA | www.astroterra.com
CABLEFREE | - | UK | www.cablefree.co.uk
CANON | Amsterdam | Holland | www.canon-europa.com
CBL | Munster | Germany | www.cbl.de
Eagle Optoelectronics | Boulder | USA | -
Firlan | Ottawa | Canada | www.firlan.com
FSONA | Richmond B.C. | Canada | www.fsona.com
GoC | Dreieich | Germany | www.goc.de
Infrared Com. Syst. | Edmonds | USA | www.laserinfraredwireless.com
JOLT | Jerusalem | Israel | www.jolt.co.il
LASERBIT | Budapest | Hungary | www.laserbitcommunications.com
Laser Com. Syst. | - | USA | www.powertechnology.com
Laser Diode Syst. | Hawthorne (CA) | USA | www.ldsc.net
LIGHTPOINTE | Boulder | USA | www.lightpointecom.com
LSA | Exton | USA | www.lsainc.com
Lucent Technologies | Plessis Robinson | France | www.lucent.com
MDS | Taluyers | France | www.mds.fr
OPTEL | Hamburg | Germany | www.optel.de
OPTICAL ACCESS | Gif/Yvette | France | www.opticalaccess.com
OrAccess | New Indus. Zone | Israel | www.oraccess.com
PAV Data Systems | Cumbria | England | www.pavdata.com
Plaintree Systems | Nepean | Canada | www.plaintree.com
SILCOM | Ontario | Canada | www.silcomtech.com
TeraBeam Networks | Seattle | USA | www.terabeam.com
2.3. Uses
This section lists the various uses suggested by the manufacturers. Somewhat arbitrarily, they have been classified according to three types of approach: geographical, application and technical.
2.3.1. Geographical approach
The first approach relates to geographical specificities, including:
• isolated sites (mountainous sites, places of interest) or sites at risk (lightning, IEM),
• electromagnetically congested environments (airports, factories, etc.),
• classified sites and private domains,
• obstacles that are difficult to cross (motorways, rivers, etc.),
• dense urban zones.
2.3.2. Applications approach
A second approach relates to the applications of a free-space optical communication link:
• temporary link:
- event link (sport, concert, etc.),
- intervention link (work, modification or repair of a network, etc.),
- emergency link (accident, fast commercial action, etc.),
• "last mile" connection,
• inter-site connection (independent, rented or private networks),
• closing an optical loop,
• inter-cell GSM or UMTS link (micro or picocell), etc.
2.3.3. Interface approach
This approach relates to the various types of interface available with a laser link. Most manufacturers offer products with interfaces of all types for the various potential fields of application:
• LAN networks (Ethernet, Fast Ethernet, Gigabit Ethernet, FDDI, Token Ring, etc.),
• Telecom (E1 ... E3, T1 ... T3, ATM, SONET, SDH, PDH, etc.),
• Video (SDI, SMPTE, etc.),
• Mixed LAN-Telecom (Ethernet 10BaseT and 2×E1).
3. Propagation aspects The various types of attenuation to be considered when deploying a free-space optical communication network are geometrical attenuation, atmospheric attenuation, scintillation and optical mispointing.
3.1. Geometrical attenuation
Because the beam emitted by the transmitter diverges (1-3 mrad), the receiving cell collects only part of the emitted energy. The following relation gives the geometrical attenuation:

$$\mathrm{Att}_{geo} = \frac{\frac{\pi}{4}\,(d\,\theta)^2}{S_{capture}}$$

where:
$S_{capture}$: receiver capture surface (0.005 m², 0.025 m² for example),
$\theta$: beam divergence,
$d$: transmitter-receiver distance.
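To give an order of magnitude, the geometrical attenuation can be evaluated numerically. The sketch below assumes the usual small-angle model, in which the beam footprint at distance d is a disc of diameter d·θ and the attenuation is the ratio of that footprint to the receiver capture surface. The function name and the example distance of 1 km are illustrative assumptions, not values from the chapter.

```python
import math

def geometrical_attenuation_db(distance_m, divergence_rad, capture_area_m2):
    """Geometrical attenuation (dB), assuming a disc footprint of diameter d * theta."""
    beam_area_m2 = math.pi / 4.0 * (distance_m * divergence_rad) ** 2
    # If the footprint is smaller than the receiver, all the energy is collected
    ratio = max(beam_area_m2 / capture_area_m2, 1.0)
    return 10.0 * math.log10(ratio)

# Hypothetical 1 km link with a 2 mrad divergence, for the two capture
# surfaces quoted as examples in the text
for capture_area in (0.005, 0.025):
    att = geometrical_attenuation_db(1000.0, 2e-3, capture_area)
    print(f"S_capture = {capture_area} m2: {att:.1f} dB")
```

In this model, doubling the distance or the divergence adds about 6 dB, while a capture surface five times larger recovers about 7 dB.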
3.2. Atmospheric attenuation
Atmospheric attenuation results from the combined effect of absorption and scattering of the infrared light by the aerosols and gas molecules present in the atmosphere. Transmittance as a function of distance is given by Beer's law:

$$T(d) = \frac{P(d)}{P(0)} = e^{-\sigma d}$$

where:
$T(d)$: transmittance at distance $d$ from the transmitter,
$P(d)$: power of the signal at distance $d$ from the transmitter,
$P(0)$: emitted power,
$\sigma$: specific attenuation or extinction coefficient per unit of length.
Attenuation is related to transmittance by the following expression:

$$\mathrm{Att}(d) = 10 \log_{10}\frac{1}{T(d)} \quad \text{(dB)}$$

The extinction coefficient $\sigma$ is the sum of four terms:

$$\sigma = \alpha_m + \alpha_n + \beta_m + \beta_n$$

where:
$\alpha_m$ is the molecular absorption coefficient (N2, O2, H2, H2O, CO2, O3, ...),
$\alpha_n$ is the coefficient of absorption by aerosols (small solid or liquid particles present in the atmosphere: ice, dust, smoke, etc.),
$\beta_m$ is the Rayleigh scattering coefficient, resulting from the interaction of the wave with particles smaller than the wavelength,
$\beta_n$ is the Mie scattering coefficient, which appears when the particles are of the same order of magnitude as the transmitted wavelength.
Absorption dominates in the infrared while scattering dominates in the visible and ultraviolet bands. Given the low values of the molecular and aerosol absorption coefficients and of the Rayleigh scattering coefficient, the extinction coefficient can be written:

$$\sigma \approx \beta_n = \frac{3.91}{V}\left(\frac{\lambda}{550}\right)^{-Q}$$

where:
$V$ is the visibility in km,
$\lambda$ is the wavelength (nm),
$Q$ is the size distribution of the scattering particles:
= 1.6 for strong visibility (V > 50 km),
= 1.3 for average visibility (6
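The relations of this section can be chained numerically. The sketch below assumes the widely used Kruse form of the visibility relation, σ = (3.91/V)(λ/550)^(-Q), combined with Beer's law. The value of Q below 6 km (0.585 V^(1/3)) is the commonly quoted Kruse value and is an assumption here, since the list above breaks off before that case. All function names are illustrative.

```python
import math

def kruse_q(visibility_km):
    """Exponent Q of the Kruse model (the V < 6 km branch is an assumption)."""
    if visibility_km > 50.0:
        return 1.6
    if visibility_km > 6.0:
        return 1.3
    return 0.585 * visibility_km ** (1.0 / 3.0)

def extinction_per_km(visibility_km, wavelength_nm):
    """Extinction coefficient sigma (1/km) from the visibility (km)."""
    q = kruse_q(visibility_km)
    return (3.91 / visibility_km) * (wavelength_nm / 550.0) ** (-q)

def transmittance(visibility_km, wavelength_nm, distance_km):
    """Beer's law: T(d) = exp(-sigma * d)."""
    return math.exp(-extinction_per_km(visibility_km, wavelength_nm) * distance_km)

def attenuation_db(visibility_km, wavelength_nm, distance_km):
    """10*log10(1/T), written as 10*sigma*d/ln(10) to avoid underflow of T."""
    sigma = extinction_per_km(visibility_km, wavelength_nm)
    return 10.0 * sigma * distance_km / math.log(10.0)

# Example: 1 km link at 1550 nm for a few visibilities (illustrative values)
for v_km in (0.5, 2.0, 10.0):
    print(f"V = {v_km} km: {attenuation_db(v_km, 1550.0, 1.0):.2f} dB")
```

At 550 nm the model gives a transmittance of about 2% over a distance equal to the visibility, which is the contrast threshold on which the constant 3.91 is based.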
3.3. Cloud attenuation
When air cools below its saturation point, the water vapor condenses to form water droplets, or ice crystals if the temperature is very low. The water particles thus formed are generally small (< 100 µm), but their concentration can be high (a few hundred per cm³). The existence of clouds is strongly related to the climate of the area considered. Cold temperate areas generally have minimal cloud cover in summer, whereas continuous rainfall is at its maximum. In the Mediterranean, the reverse is observed. The presence of clouds is characterised by a nebulosity index, which indicates the fraction of the sky covered and is expressed in tenths. Multiple scattering effects are important when a light beam crosses a cloud. Scattering also causes spreading in time and frequency, as well as depolarisation.
3.4. Fog and haze attenuation
Fog and haze consist of very small water droplets (< 100 µm) suspended in the air. They are formed by condensation close to the ground, following radiative cooling of the ground (especially at night) or the movement of air over cold ground (advection fog or haze). To distinguish haze from fog, it is generally accepted that visibility is higher than 1000 m in the presence of haze and lower than 1000 m in the presence of fog. An excellent correlation has been found between attenuation due to fog and liquid water concentration. The following relations for the attenuation coefficient were obtained at 0.63 µm and 10.6 µm [VASSEUR, 1997]:
where W is the liquid water concentration (g/m³). The liquid water concentration in fog is typically about 0.05 g/m³ for a moderate fog (visibility of about 300 m) and 0.5 g/m³ for a thick fog (visibility of about 50 m) [UIT-R P. 840-3]. Weather measurements taken in Belgium at 16 stations over about twenty years give indications of the maximum (worst case) and median (not exceeded in 50% of cases) annual frequency of fog. Table 18.2 gives the annual frequency of fog [BODEUX, 1977].
Table 18.2 Annual frequency of fog
Light, moderate or thick fog (V<1000 m) Moderate or thick fog (V<500 m) Thick fog (V<200 m)
Median frequency 0,055% 0,035% 0,020%
Maximum frequency 0,14% 0,11 % 0,08 %
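The visibility thresholds quoted above (haze above 1000 m, fog below 1000 m, and the 500 m and 200 m classes of Table 18.2) can be written as a small lookup. The following Python sketch is only an illustration: the function name, the class labels and the treatment of a visibility of exactly 1000 m are assumptions, not part of the cited measurements.

```python
def classify_visibility(visibility_m: float) -> str:
    """Return an illustrative class label for a horizontal visibility in metres."""
    if visibility_m < 0:
        raise ValueError("visibility must be non-negative")
    if visibility_m < 200:
        return "thick fog"
    if visibility_m < 500:
        return "moderate fog"
    if visibility_m < 1000:
        return "light fog"
    # Assumption: the text leaves exactly 1000 m undefined; it is grouped here
    # with haze (or clearer air).
    return "haze or clearer"


# Visibilities quoted in the text: about 50 m (thick fog), about 300 m (moderate fog).
for visibility in (50, 300, 800, 1500):
    print(visibility, "m ->", classify_visibility(visibility))
```

The classes are non-overlapping here, whereas Table 18.2 reports cumulative classes (V<1000 m includes the two denser classes).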
3.5. Attenuation due to precipitation
Rain forms from the water vapor contained in the atmosphere. It consists of water drops whose shape and number vary in time and space. Attenuation due to rain is caused primarily by scattering, as in the case of aerosols. In the infrared, the wavelength is very much smaller than the raindrop diameters, so the normalised scattering cross section, Qd, remains equal to 2 whatever the wavelength.
The expression of the rain scattering coefficient is:
where R and dN(r) are given in cm and in number/cm⁴ respectively. When the size of the irregularities due to precipitation becomes large compared with the wavelength, the wave is attenuated by reflection and refraction. Attenuation, which is independent of the wavelength, is a function of the precipitation intensity R (in mm/h) according to the following relation:
Attenuation (dB/km) = $0.365\,R^{0.63}$
Rain intensity is the fundamental parameter used to describe rain locally. It is measured either directly on the ground, using rain gauges or comparable instruments, or indirectly, using weather radar. The latter is particularly well suited to analysing the structure of rain. The relation given by Carbonneau et al. [CARBONNEAU, 1998] differs and gives much higher attenuation values:
Attenuation (dB/km) = $1.076\,R^{0.67}$
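To see how far apart the two rain relations are, they can be evaluated side by side. The Python sketch below uses only the two power laws quoted above; the function names and the sample rain rates (1, 10 and 50 mm/h) are illustrative assumptions.

```python
def rain_attenuation_db_per_km(rain_rate_mm_h: float) -> float:
    """First relation quoted in the text: 0.365 * R**0.63 (dB/km)."""
    if rain_rate_mm_h < 0:
        raise ValueError("rain rate must be non-negative")
    return 0.365 * rain_rate_mm_h ** 0.63


def rain_attenuation_carbonneau_db_per_km(rain_rate_mm_h: float) -> float:
    """Relation attributed to Carbonneau et al.: 1.076 * R**0.67 (dB/km)."""
    if rain_rate_mm_h < 0:
        raise ValueError("rain rate must be non-negative")
    return 1.076 * rain_rate_mm_h ** 0.67


# Sample rain rates in mm/h, chosen only for illustration.
for rate in (1.0, 10.0, 50.0):
    first = rain_attenuation_db_per_km(rate)
    second = rain_attenuation_carbonneau_db_per_km(rate)
    print(f"R = {rate:5.1f} mm/h: {first:6.2f} dB/km vs {second:6.2f} dB/km")
```

Since both the coefficient and the exponent of the second relation are larger, it gives the higher value at every rain rate of 1 mm/h or more, and the gap widens as the rain intensifies.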
3.6. Other hydrometeors (snow, hail, etc)
Many other hydrometeors are present in nature. They are mixtures of ice, air and/or liquid water. Among these mixtures, snow is one of the most complex. Snow generally falls in the form of flakes, which are aggregates of ice crystals. The flakes can reach diameters of 15 mm. Although they have no defined shape, they are treated as spheres and classified by their water content after melting. Attenuation due to snow is strongly related to its humidity or density. Analysis of the experimental data [VASSEUR, 1997] yields the following results:
• For low-humidity snowfall (dry snow), attenuation can reach 20 dB/km in the visible and 40 dB/km in the infrared.
• For wet snowfall, attenuation varies from 4 to 8 dB/km in both the visible and the infrared.
• Hail can be regarded as ice containing air bubbles. Hailstones are the largest observable hydrometeors (up to 8 cm in diameter).
3.7. Refraction and scintillation
Under the influence of thermal turbulence within the propagation medium, random cells of variable size (10 cm to 1 km) and different temperatures are formed. These cells have different refractive indices and thus cause scattering, multipath propagation and variations in the angle of arrival: the received signal fluctuates rapidly, at frequencies ranging between 0.01 and 200 Hz. The wave front varies in a similar way, causing focusing and defocusing of the beam. Such signal fluctuations are called scintillations. Scintillation amplitude and frequency depend on the cell size compared with the beam diameter. When the heterogeneities are large compared with the beam cross section, the beam is deviated; when they are small, the beam is widened. Tropospheric scintillation is generally studied using the logarithm of the amplitude $\chi$ [dB] of the observed signal ("log-amplitude"), defined as the ratio in decibels of its instantaneous amplitude to its average value. The intensity and speed of the fluctuations (scintillation frequency) increase with the wave frequency. For a plane wave, weak turbulence and a specific receiver, the scintillation variance $\sigma_\chi^2$ [dB$^2$] can be expressed by the following relation:
where: $k$ [m$^{-1}$] is the wave number ($2\pi/\lambda$), $L$ [m] is the link length, and $C_n^2$ [m$^{-2/3}$] is the refractive index structure parameter, which represents the turbulence intensity. The scintillation peak-to-peak amplitude is equal to $4\sigma_\chi$ and the attenuation related to scintillation is equal to $2\sigma_\chi$. For strong turbulence, the variance given by the relation above saturates [BATAILLE, 1992]. Note that the $C_n^2$ parameter does not have the same value at millimeter and optical wavelengths [VASSEUR, 1997]. Millimeter waves are especially sensitive to humidity fluctuations, while at optical wavelengths the refractive index is primarily a function of temperature (the water vapor contribution is negligible). At millimeter wavelengths one obtains a $C_n^2$ value of about $10^{-13}$ m$^{-2/3}$, which is an average turbulence (in general, in the millimeter range, $10^{-14} < C_n^2 < 10^{-12}$), and at optical wavelengths a $C_n^2$ value of about $2 \times 10^{-15}$ m$^{-2/3}$, which is a light turbulence (in general, in optics, $10^{-16} < C_n^2 < 10^{-13}$) [BATAILLE, 1992].
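As an illustration of the orders of magnitude involved, the sketch below evaluates the scintillation variance and the two derived quantities. The relation itself is not reproduced in this text, so the code assumes the commonly quoted weak-turbulence, plane-wave form $\sigma_\chi^2 = 23.17\,k^{7/6}\,C_n^2\,L^{11/6}$ (as found, for instance, in ITU-R Recommendation P.1814). The function names, the 0.83 µm wavelength and the 1 km link length are likewise illustrative assumptions; only the optical $C_n^2$ value of $2 \times 10^{-15}$ m$^{-2/3}$ comes from the text.

```python
import math


def scintillation_variance_db2(wavelength_m: float, link_length_m: float, cn2: float) -> float:
    """Scintillation variance in dB^2.

    Assumption: sigma^2 = 23.17 * k**(7/6) * Cn2 * L**(11/6), a weak-turbulence,
    plane-wave form that is not taken from this chapter.
    """
    k = 2.0 * math.pi / wavelength_m  # wave number in m^-1
    return 23.17 * k ** (7.0 / 6.0) * cn2 * link_length_m ** (11.0 / 6.0)


def scintillation_margins_db(variance_db2: float) -> tuple[float, float]:
    """Return (peak-to-peak amplitude, attenuation) as 4*sigma and 2*sigma, per the text."""
    sigma = math.sqrt(variance_db2)
    return 4.0 * sigma, 2.0 * sigma


# Illustrative link: 0.83 um wavelength, 1 km path, light optical turbulence.
variance = scintillation_variance_db2(0.83e-6, 1000.0, 2e-15)
peak_to_peak, attenuation = scintillation_margins_db(variance)
print(f"variance = {variance:.2f} dB^2")
print(f"peak-to-peak = {peak_to_peak:.2f} dB, attenuation = {attenuation:.2f} dB")
```

Because the text notes that the variance saturates under strong turbulence, such a calculation is only meaningful for small values of the result.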
3.8. Optical misalignment
Given the low divergence of the laser beam, very precise alignment is necessary. The alignment of the transmitter and the receiver determines the coupling of the optical link. It can be disturbed by mechanical vibrations.
4. Conclusion
The advantages and disadvantages of using a free-space optical link have been summarised. The advantages include: high data rates (from 2 Mbps to 2500 Mbps and more), protocol transparency, a variety of interfaces, no licence or regulation, simple, fast and reusable deployment, relatively secure transmission, etc. The disadvantages include: no guarantees from manufacturers in terms of availability and QoS, limited distances, the need for a line of sight, a new technology, hence little hindsight and a lack of information, and eye-safety parameters related to the laser class of the equipment. When selecting equipment, one should choose the lowest laser class in order to limit the number of precautions and restrictions during deployment and maintenance. The optimal configuration for such equipment appears to be a short link, a high data rate and a point-to-point service.
We have also presented the various aspects of propagation to be considered when deploying a free-space optical link (geometrical attenuation, atmospheric effects, scintillation, beam misalignment). The weather phenomena that cause the greatest attenuation are, in order: fog, dry snow, rainstorms (generally short) and light rain. Fog is the most important phenomenon. It is frequent in certain years and slow to dissipate.
The installation of experimental links should make it possible to show that free-space optical links can be a reliable broadband alternative to installing optical fiber, and should lead to better acceptance of this technology in the industry of high data rate telecommunications networks, which are necessary, among other things, for the future deployment of many communicating objects.
5. Bibliography
[BATAILLE, 1992] Bataille P.; "Analyse du comportement d'un systeme de telecommunications optique fonctionnant a 0,83 micron dans la basse atmosphere", These de doctorat, Universite de Rennes 1, 1992.
[BODEUX, 1977] Bodeux P.; "La frequence du brouillard en Belgique", Bruxelles: Institut Royal Meteorologique de Belgique, 1977.
[CARBONNEAU, 1998] Carbonneau H.T., Wiseley D.R.; "Opportunities and challenges for optical wireless; the competitive advantage of free space telecommunications links in today's crowded marketplace", SPIE conference on optical wireless communications, Boston, Massachusetts, Vol. 3232, 1998.
[KIM, 1998] Kim I.I., Kootz J., Harkakha H., Adhikari P., Stieger R., Moursund C., Barclay M., Stanford A., Ruigrok R., Shuster J., Korevaar E.; "Measurement of scintillation and link margin for the TerraLink communication system", SPIE vol. 3266, 1998.
[LAVERGNAT, 1997] Lavergnat J., Sylvain M.; "Propagation des ondes radioelectriques, Introduction", Collection Pedagogique de Telecommunication, Masson, 1997.
[UIT-R P.840-3] "Attenuation due to clouds and fog", Rec. UIT-R P.840-3, 1999.
[VASSEUR, 1997] Vasseur H., Oestges C., Vander Vorst A.; "Influence de la troposphere sur les liaisons sans fil aux ondes millimetriques et optiques", Propagation electromagnetique du decametrique a l'angstrom, 3iemes journees d'etude, Rennes, 7, 8, 9 October 1997.
Figure 18.1 Data rate (in Mbps) as a function of wavelength for various equipment
Figure 18.2 Range as a function of wavelength for various equipment
Part 4
Evolution of Smart Devices
Claude Kintzig
France Telecom R&D, France
Before considering progress in the field of smart devices and their possible characteristics, it is necessary to look back at the paths that have led to their emergence and current development.
1. Areas of development
Several areas contribute to the development of devices with processing and communication capacity.
1. In the industrial world (leaving computers to be treated later), two types of devices can be identified. The first are relatively passive devices which, when queried, originally returned 'yes or no' information and have since been given more highly evolved processing and communication capacities: they can receive an order and return a more or less sophisticated, processed reply. The second are devices built with strong processing capacities (microprocessors) and communication capacities, which enable them to offer high levels of functionality. "Electronic tags" are devices of the first type, and contactless "electronic cards" correspond to devices of the second type. These are physical entities of relatively small dimensions. But one should be careful not to restrict smart devices to small objects. Indeed, billboards, which can be very large and are updated by short messages via an SMS function, are fully fledged smart devices. The same is true, for example, of washing machines that detect a breakdown and call the after-sales service via a GSM terminal. Our analysis should not neglect objects that, because of their size or functionality, might not be covered in seminars devoted to communicating devices and might thereby be excluded from research. By extension, any object that has a minimum of processing and communication capacity qualifies as a smart device.
2. In the computer world, the miniaturisation of components on the one hand, and progress in communication protocols on the other, are at the root of significant progress in research and creativity. In general, the object, which in a first phase is the slave of a master object, detaches itself from it by integrating more sophisticated communication tools. By claiming its autonomy in a second phase, it passes from the status of an extension to that of an autonomous tool. Along the way, it gains some functionality, which passes from the old master to the new one. This is the path followed by objects such as the "Palm" and other Pocket PCs, and by electronic pens and paper (which are still under development). Conversely, the world of telecommunications is going to endow mobile phones with increasingly rich functions to meet the needs of the office. A strong trend in recent years is that these objects are acquiring more and more "intelligence" so as to handle communication with their own resources. Protocols such as Jini or Bluetooth fall into this category. This tendency is confirmed by higher-level protocols such as P2P, which provide symmetrical communication between devices without going through servers, one of whose roles is to ensure this symmetry by coupling two asymmetries.
3. If smart devices are not restricted to physical entities, software agents (which do, of course, run on physical machines) constitute a very fertile area of research. These are software entities which have processing and communication capacity. Given their flexibility of implementation, studies mainly address architectural aspects and dialogue with humans or other software agents. This does not mean that they cover the whole of data processing, especially its classic, generic and centralised forms; rather, they are an available resource that is deployed according to need.
2. The future
So what evolutionary trends are becoming apparent? What can we draw from the areas discussed above? If we accept the postulate that there can be no dialogue without reciprocal knowledge of the context (reference and intention), we can suppose that integrating the context (or at least a formalisable part of it) into the environment of smart devices will be an important research area in years to come. This tendency is already perceptible in software entities that shape their dialogue according to a context that is continually evaluated and re-evaluated. This dialogue will have to become simpler (bearing in mind that simplification itself adds complexity), both between communicating devices (as the progress made in the protocols quoted above shows) and between them and users. Current work on multi-modality is part of this effort.
For the moment, all of these small devices need a source of energy. The range of uses of these objects will widen as progress is made in this area (fuel cells, for example). The miniaturisation of smart devices, which affects the physical connection between the different functionalities (access, interface, use, etc.), should, through the constraints of each area, bring both answers and integration. This will help to clarify the risks. For example, the dialogue interface will have to integrate the various modalities according to the location. Through all of this progress, smart devices will be better and better integrated with the arts (music, cinema, dance, photography, etc.) and used in health and entertainment, among other domains. One question remains open: how autonomous will smart devices be? Will they stay under the user's control, or will these objects, through their autonomy, propose unsolicited actions to users? Might we see some users suppress or destroy objects or devices that become too intrusive? Might we see objects or devices at war? We manufacture smart devices, experiment with them, sell them and throw them away all the time. They are becoming more and more ubiquitous and more varied, so they may strike us. Let us give ourselves a chance to domesticate them and improve their usefulness, instead of leaving them in the wild state under the pretext of innovation.
Chapter 19
Mobile and Collaborative Augmented Reality
Laurence Nigay¹, Philippe Renevier², Laurence Pasqualetti³, Pierre Salembier⁴ and Tony Marchand⁴
¹Department of Computer Science, University of Glasgow, UK, ²CLIPS-IMAG, University of Grenoble, France, ³France Telecom R&D, Issy-les-Moulineaux, France and ⁴GRIC-IRIT, University Paul Sabatier, Toulouse, France
1. Introduction
One of the recent design goals in human computer interaction (HCI) has been to extend the sensory-motor capabilities of computer systems to combine the real and the virtual in order to assist the user in his environment. Such systems are called augmented reality (AR). This is also the objective of other innovative interaction paradigms such as ubiquitous computing, tangible bits, pervasive computing and traversable interfaces. Augmented reality (AR) has been the subject of growing interest. In [Dubois 99], we emphasised the diversity of AR systems and presented one important classification characteristic:
• Systems that enhance interaction between the user and her/his real environment by providing additional capabilities and/or information. We call such systems augmented reality (AR) ones.
• Systems that make use of real objects to enhance the interaction between a user and a computer. We call such systems augmented virtuality (AV) ones.
There are many application domains of augmented reality (AR) and augmented virtuality (AV), including construction, architecture and surgery [Dubois 99]. Examples of AR systems are the computer assisted medical intervention (CAMI) systems, also called augmented surgery. AR plays a central role in the medical domain because the key point of CAMI systems is to "augment" the physical world of the surgeon (the operating theatre, the patient, the tools etc.), by providing preoperative information including the pre-planned strategy. Information is transmitted between the real world and the computer world using different means: computer
screens, mouse, pedals, tracking mechanisms, robots, etc. Examples of AV systems in human-computer interaction involve input modalities based on real objects (cubes), such as the phicons (physical icons) [Ullmer 98]. Ishii has described this interaction paradigm as the tangible user interface. These systems are all based on the manipulation of objects in the physical environment. AR and AV systems are numerous in many different application domains, but most attention has been paid to the technical issues related to image processing and data fusion. Very little effort has been applied to modelling the interaction between the user and the system. The design approach so far has been technology-driven. In our project, we adopt a complementary user-centred approach, providing a user-centred design method based on scenarios. Our application domain is archaeological prospecting. In this chapter, we first present the main characteristics of archaeological prospecting, the specific mobile fieldwork on which we base our study. We then explain our design approach for mobile AR systems, based on field studies and on the design of scenarios of actual and expected activities. We then describe the interaction techniques designed and developed for the MAGIC (mobile augmented reality group in context) platform.
2. Archaeological prospecting
Archaeological prospecting influences whether excavation at a site will take place. It provides a global overview of the environment, including a systematic census of the archaeological clues. The goal is to check the state of the archaeological archives and the potential of the sites [Dabas 99]. The archaeological evaluation must fulfil the following requirements [Nigay 2002]: (1) establish the location of the deposit, (2) find the boundaries of the site, (3) define its nature (habitat, necropolis, etc.), (4) evaluate the density of the structures, and (5) date the site [Blouet 94].
Prospecting is done by a group of archaeologists and consists initially of a ground analysis based on a systematic division of the zone. If necessary, the archaeologists consult a specialist whose opinion will determine whether prospecting will continue. Currently, this consultation with the expert is inefficient, because it requires repeated trips to and from the site by the archaeologists. The prospecting requires long journeys between sites, where the topographic characteristics are often poorly known. The long distances of the journeys cause problems since they result in asynchronous interaction with distant specialists who possess specialised knowledge whose nature cannot be anticipated a priori. As explained in [Nigay 2002], the characteristics of the archaeological prospecting activities (e.g., the co-operative process, the nature of shared information, the type of interaction), appear to be representative of the co-operative activities found in mobile situations.
3. Design approach
In the context of archaeological prospecting activities, our work focuses on the harmonious fusion of the real world (the physical environment) with the digital world (data processing) in mobile and collaborative situations. The objective is to understand the use of mobile devices and the services expected in a collaborative situation for a user task in the real world, which justifies the fusion of the two worlds, physical and digital. Two properties form the basis of our study: transparency of the interaction and ubiquity.
• The transparency sought in the interaction allows users to focus their attention on the task to be carried out rather than on the use of the computer. It is therefore necessary not to separate the user from her/his physical environment while using the computer. The aim is to reduce the contrast between the user's two work contexts: the real world on the one hand, and computer equipment and digital information on the other.
• Ubiquity arises from the use of mobile devices. The user wishes to reach services and to collaborate with her/his colleagues at any time and from anywhere. The objective is thus to design groupware on mobile devices: the user is then no longer tied to the workstation on her/his desk in order to collaborate and communicate.
Figure 19.1 Design steps
These two properties are in line with the current approach in human computer interaction (HCI), which advocates the universality of interfaces. The data-processing tool is integrated into the physical environment, accessible from everywhere and by all. To satisfy these two properties, we adopted a design approach based on scenarios [Carroll 00], presented in Figure 19.1. In addition to the recognised benefits of a scenario-based design approach [Jacobson 95], the fact that the scenarios are centred on both the users and the various tasks makes it possible for us to determine the users' needs and their relationships to the physical environment. As shown in Figure 19.1, from activity scenarios we project what the users' activity could be after the introduction of new tools. These future activities are described in projected scenarios. The formalism of the scenarios is described in [Salembier 97]. Based on an empirical analysis of the real task and on an analysis of the activity, we elaborate real task and activity scenarios. They constitute the basic material for requirements elicitation. Based on the requirements, specifications are then elaborated. From these specifications, we define projected scenarios. Initially, the specifications are drawn up without any limits, taking no constraints into account. Then, in order to be able to develop a concrete platform, we adapt the specifications to the technological limitations. The projected scenarios, initially based on these new specifications (11), are used to adjust the specifications (first iteration loop). After a phase of software design and development, we carry out experimental pre-tests with the end-users. The results may lead to modification of the initial specifications (second iteration loop). In [Nigay 2002], we fully describe the design steps and illustrate them. In particular, examples of real task and projected scenarios are provided.
Having presented the design steps, we now present their outcome: the designed and developed MAGIC platform.
4. MAGIC platform
MAGIC (Mobile, Augmented reality, Group Interaction, in Context) is a hardware and software platform dedicated to archaeological prospecting activities. It enables archaeologists to perform ground analysis of the site and to communicate with other mobile archaeologists working on the site as well as with distant archaeologists. We first describe the hardware and then the software responsible for the fusion of the two worlds, physical and digital. The complete software and its architecture are detailed in [Renevier 01].
Figure 19.2 A MAGIC user
Figure 19.3 User interface of the MAGIC pen computer
The hardware platform is an assembly of commercial pieces of hardware. We use a Fujitsu Stylistic 3400 pen computer. This pen computer is a PC (Pentium III processor at 450 MHz, 196 MB of RAM) with a touch screen the size of an A4 sheet of paper. It weighs 1.5 kg. Moreover, it has a video output allowing dual display, to which we connect a semi-transparent Head-Mounted Display (HMD), a SONY LDI D100 BE. A camera is fixed between the two screens of the HMD (between the two eyes). The hardware platform also contains a magnetometer (Honeywell HMR 5000), which determines the orientation of the camera, as well as a GPS
which locates the mobile user. For data sharing and communication between users, a Lucent WaveLAN wireless network (11 Mb/s, PCMCIA card) was added. Figure 19.2 shows a fully equipped MAGIC user. Based on the hardware platform described above, we designed and developed interaction techniques that enable the users to perform the functions associated with the scenarios. Figure 19.3 presents the graphical user interface displayed on the touch screen of the pen computer. The software offers several functions for communication (electronic forum and messages), for coordination (location of the archaeologists on the map of the archaeological site) and for production (editing tools, database of found objects). In order to combine the digital and the real smoothly, we create a gateway between the two worlds. This gateway is represented both in the digital world, on the screen of the pen computer (bottom right window in Figure 19.3), and in the real environment, displayed on the HMD.
4.1. From the physical world to the digital world
Information from the real environment is transferred to the digital world by the camera carried by the user. The camera is positioned so that it corresponds to what the user sees through the HMD. The real environment captured by the camera can be displayed as a background in the gateway window on the pen computer screen. Through the gateway window, the user can select or click on the real environment. This interaction technique is called "Clickable Reality". Before a picture is taken, the camera must be calibrated according to the user's visual field. Using the stylus on the screen, the user then specifies a rectangular zone by means of a magic lens [Bier 93]. The specified rectangular zone corresponds to a part of the real environment. By selecting the "take" button (to take a picture), the user captures the part of the real world contained within the frame of the lens. The real world becomes clickable, like digital objects. All the images thus captured are recorded in a database together with their location, obtained from the GPS.
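The capture step of "Clickable Reality" (a rectangular zone cut out of the camera image and stored with its GPS location) can be sketched as follows. This is a minimal illustration and not the MAGIC implementation: the `Capture` record, the `capture_zone` function, the list used as a database and the coordinates are all hypothetical, and the camera frame is modelled as a plain list of pixel rows.

```python
from dataclasses import dataclass


@dataclass
class Capture:
    """Hypothetical database record: a cropped image and where it was taken."""
    pixels: list      # rows of pixel values
    latitude: float   # degrees, as reported by the GPS
    longitude: float  # degrees, as reported by the GPS


def capture_zone(frame, left, top, width, height, latitude, longitude):
    """Crop the rectangle selected with the lens and tag it with the GPS fix."""
    if not frame or width <= 0 or height <= 0:
        raise ValueError("empty frame or empty selection")
    if left < 0 or top < 0 or top + height > len(frame) or left + width > len(frame[0]):
        raise ValueError("selection falls outside the camera frame")
    cropped = [row[left:left + width] for row in frame[top:top + height]]
    return Capture(cropped, latitude, longitude)


# A tiny 4x6 "camera frame" whose pixel values encode row and column.
frame = [[10 * r + c for c in range(6)] for r in range(4)]
database = []  # stands in for the database of captured images
database.append(capture_zone(frame, left=1, top=1, width=3, height=2,
                             latitude=31.2, longitude=29.9))  # arbitrary coordinates
print(database[0])
```

Storing the position with every capture is what later allows a picture to be put back in its original place.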
4.2. From the digital world to the real world
Information from the digital world is transferred to the real environment, via the gateway window, by means of the HMD. For example, the archaeologist can drag a drawing or a picture stored in the database to the gateway window. The picture is then automatically displayed on the HMD on top of the real environment. Moving the picture with the stylus on the screen moves it on top of the real environment. Archaeologists use this, for example, to compare two objects, one from the database and one just discovered in the real environment. In addition, when an archaeologist walks around the site, s/he can see discovered objects that colleagues have removed from the site and recorded in the database: we call this
interaction technique "Augmented Stroll". The technique consists in superimposing the image of an object in its original real context (in the real world), using the semi-transparent HMD. Because a picture is stored along with the location of the object, we can restore the picture in its original real context (2D location). "Augmented Stroll" is an example of a mobile and collaborative augmented reality (AR) technique. On the one hand, it is a mobile AR technique because the augmentation of the real world is based on the current position (GPS) and orientation (magnetometer) of a user. On the other hand, it is an asynchronous collaborative technique: a user initially captures an object in its physical context before removing it from the site; another user can later perceive the object in its original real context. While walking around the archaeological site, the user can observe green spots superimposed on the real world, indicating that objects are available. By selecting a green spot, the user can see the digital object in its original real context. The reuse of digital objects that come from the real world but are no longer physically present completes the interaction cycle between the user, her/his environment and the computer. This mode makes it possible to follow the evolution of the archaeological site through space (the movements of the user within the site) and time (the information recorded by several users). Our approach consists of augmenting the user: a MAGIC user wears and holds the MAGIC platform. Nevertheless, from the user's point of view, it is physical objects or places that are augmented. Indeed, the approach adopted is to assist the user by providing extra information about the physical field via a device carried by the user. Another approach [Mackay 98] would be to augment the physical environment itself. Each time the user removed a physical object, s/he would have to place an input/output device at the position of the object.
The links between the physical and digital worlds would then have been dynamic but explicit: the user would have to perform a specific action to define the link. Our design solution enables links between the physical and digital worlds that are dynamic but implicit, because the localisation of an object is automatically computed by the system.
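The implicit link described above, in which the system itself decides which stored objects to show from the user's GPS position and magnetometer heading, can be sketched as a simple visibility test. This is not the MAGIC algorithm: the 40° field of view, the 50 m range, the `StoredObject` record and the function names are assumptions made for illustration. The distance and bearing use the standard haversine and initial-bearing formulas.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius used by the distance formula


@dataclass
class StoredObject:
    """Hypothetical database entry for an object removed from the site."""
    name: str
    latitude: float   # degrees
    longitude: float  # degrees


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    return math.degrees(math.atan2(y, x)) % 360.0


def spots_in_view(user_lat, user_lon, heading_deg, objects, fov_deg=40.0, max_range_m=50.0):
    """Return the stored objects that would appear as spots for this position and heading."""
    visible = []
    for obj in objects:
        if distance_m(user_lat, user_lon, obj.latitude, obj.longitude) > max_range_m:
            continue
        bearing = bearing_deg(user_lat, user_lon, obj.latitude, obj.longitude)
        # Signed angle between the viewing direction and the object, in [-180, 180).
        offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
        if abs(offset) <= fov_deg / 2.0:
            visible.append(obj)
    return visible


objects = [StoredObject("north", 0.0001, 0.0), StoredObject("east", 0.0, 0.0001)]
print([o.name for o in spots_in_view(0.0, 0.0, 0.0, objects)])   # user facing north
print([o.name for o in spots_in_view(0.0, 0.0, 90.0, objects)])  # user facing east
```

No action is required from the user beyond walking and looking around, which is what makes the link implicit rather than explicit.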
5. Conclusion
In this chapter we have focused on the fusion between the physical and digital worlds. We have explained our design method based on scenarios: real task and activity scenarios as well as projected scenarios that describe the future activity of the user with the tools to be developed. Projected scenarios are therefore used as a support for predictive evaluation during the design. We applied our design method to the design and implementation of a mobile collaborative augmented reality (AR) system dedicated to a group of archaeologists prospecting fields: the hardware and software MAGIC platform. In the MAGIC platform, the technological solutions for the fusion of the two worlds include the gateway window as well as the "clickable reality" and "augmented stroll" techniques. Although our solutions are designed and developed for a given application domain, archaeological prospecting activities, our interaction techniques are generic. One of our research avenues is to prove this generic aspect
of the designed techniques by developing a mobile collaborative game based on the MAGIC platform.
6. References
[Dabas 99] Dabas, M., Deletang, H., Ferdiere, A., Jung, C., Haio Zimmermann, W., 1999, La Prospection. Paris: Errance.
[Dubois 99] Dubois, E., Nigay, L., Troccaz, J., Chavanon, O., Carrat, L., 1999, Design Space for Augmented Surgery, an Augmented Reality Case Study, Proceedings of INTERACT'99, Sasse A. & Johnson C. Eds, IFIP IOS Press Publ., p. 353-359.
[Bier 93] Bier, E. et al., 1993, Toolglass and Magic Lenses: The See-Through Interface. Anaheim, Proceedings of SIGGRAPH'93, ACM Press: New York Publ., p. 73-80.
[Blouet 94] Blouet, V., 1994, Essais de comparaison de differentes methodes d'etude archeologique prealable, Les nouvelles de l'archeologie, 58, p. 17-19.
[Carroll 00] Carroll, J.M., 2000, Making use. Scenario-based design of computer interactions, MIT Press, Cambridge, Massachusetts.
[Jacobson 95] Jacobson, I., 1995, The Use Case Construct in Object-Oriented Software Engineering, In John M. Carroll, editor, Scenario-Based Design: Envisioning Work and Technology in System Development, John Wiley and Sons, p. 309-336.
[Mackay 98] Mackay, W., Fayard, A.-L., Frobert, L., Medini, L., 1998, Reinventing the Familiar: an Augmented Reality Design Space for Air Traffic Control. Proceedings of CHI'98, ACM Press: New York Publ., p. 558-565.
[Nigay 2002] Nigay, L., Salembier, P., Marchand, T., Renevier, P., Pasqualetti, L., 2002, Mobile and Collaborative Augmented Reality: A Scenario based design approach, Proceedings of Mobile HCI 02, Springer-Verlag, to appear.
[Renevier 01] Renevier, P., Nigay, L., 2001, Mobile Collaborative Augmented Reality, the Augmented Stroll, Proceedings of EHCI'2001, LNCS 2254, Springer-Verlag, p. 315-334.
[Salembier 97] Salembier, P., Kahn, J., Zorola, R. & Zouinar, M., 1997, Cognitive modeling, Deliverable WP6, RHEA Project, DG VII, EC.
[Ullmer 98] Ullmer, B., Ishii, H., Glas, D., 1998, MediaBlocks: Physical containers, transports, and controls for online media, Proceedings of SIGGRAPH'98, ACM Press: New York Publ., p. 379-386.
7. Acknowledgements We wish to thank our partners from the CEA (Centre d'Etude d'Alexandrie in Egypt) for welcoming us.
Chapter 20
Towards a Description of Information-seeking Tasks Contributing to the Design of Communications Objects and Services
Andre Tricot and Caroline Golanski
CNRS and Universite de Toulouse 2, France
1. Introduction
useit.com, the site of Jakob Nielsen, one of the leading specialists in Web ergonomics, contains many interesting considerations concerning the Web, Wap and the development of communications objects. In brief, the success of the Internet and the Web from the mid-1990s onwards led observers to believe that it was possible to design almost anything in the field of communications objects, in particular in terms of information and document access. Fairly soon, however, it became clear that more than half of the information searches performed on the Web resulted in failure. Furthermore, Web and Wap access via communications objects such as mobile telephones or PDAs, which was initially considered to be a development or extension of the Web, has not proved particularly usable and is employed only very infrequently. A number of hypotheses have been considered on the basis of this observation. It was thought that technical advances, in particular in terms of data rates, would solve some or all of the problems relating to the development of any given tool. Other analysts suggested that users' skills would develop, as they always do, and thus solve the utilisation problem. In brief, the development of usability was expected to boost utilisation. In this chapter we intend to defend a different point of view. We consider that, given among other things its usability characteristics, any given communications object is specifically useful for particular information-seeking tasks but not for others. A description and a categorisation of the information-seeking tasks, and the establishment of a relation between this categorisation and the communications objects which do or do not permit the implementation of each of these tasks, would allow designers to choose the most appropriate tools in the light of
the services that they want to develop. According to Tricot and Nanard [TRICOT, 1998] the description of an information task should take account of:
• The user's representation: the representation that users construct of the task and their level of expertise in the field in question together with their skill in using the tool or the information service;
• The implementation of the goal: the address and the number of targets in the system, the procedures to be used in order to access these targets, the general structure of the system and the interface;
• The context of the activity: the reason why a subject chooses to use a particular system in order to search for certain information (learning, document design, problem solving etc.).
We have defined four objective variables which are independent of the user or the topic in question, and which make it possible to characterise the implementation of the goal in an information-seeking task. These variables are the repetitiveness of the task, the level of explicitness of the targets, the location of the targets and the quantity of targets. We have started to analyse the hierarchical relations which may exist between these variables. This has enabled us to describe 12 information-seeking tasks and classify these tasks on the basis of the mean performances they result in during information searches on the Web.
2. Wap ergonomics
Nielsen [NIELSEN, 2000] reports a study into Wap usability that was conducted in late 2000. Since Nielsen is acknowledged as the leading specialist in the field of Wap ergonomics and given that the study in question was extremely thorough, this report is our exclusive reference point. This study reveals that, in its current state, Wap utilisation is not satisfactory. During their information-seeking tasks, users are regularly confronted with a range of problems that occur at various levels and relate, for example, to connection, navigation or information retrieval. These problems are due to a large extent to the fact that Wap is still a very new technology, and it is to be expected that the connection and download problems will be solved within a few years. However, the problems relating to information searches and retrieval will not be entirely resolved by technological advances. These are design problems which will have to be solved by designers and engineers. The constraints due to the size of the screen and the limitations in the way these tools can be handled (small keypad) would seem to make it clear that the data available via Wap and the navigation options that can be used to obtain this data should be specific to this type of tool. However, it would appear that this is not always the case.
2.1. Searching for information
Wap navigation is based on the same principles as Internet navigation. The idea is that users choose options from a series of menus until they find what they are looking for. Each of the Wap networks attempts to provide all the information that is likely to be of interest to its customers. To do this, it offers all the services that users might need in the form of lists and makes these available directly via the gateway. The limited screen size makes it impossible to display all the options clearly and precisely in a single screen. These are therefore grouped into simple categories in a portal (news, entertainment, etc.). Once the required option has been found, a sublist with a new set of options is displayed. Given the large number of available options, users have to pass through many menus and submenus before they access the desired information. However, they often do not get that far after being prematurely disoriented in their search. Users have also pointed out another problem: some of the options or sites that are proposed are actually links to non-existent addresses or sites that are currently being built. Wap therefore offers its users data which is not actually available. Apart from the data which is directly accessible via the options, users may also need to perform specific searches and they are able to use search engines to do this. However, these functions are not easy to find and few users access them. This is another consequence of the size of the screen since even if these search engines are available, they are not indicated clearly. To give an example, there are six good search engines available on Wap, yet during the study only five out of 20 users found them. After finding the search engines, the users had considerable difficulty formulating their requests. The small size of the keypad makes it difficult to enter data and increases the risk of error.
2.2. Retrieving information
After searching for information, users have to process it. In Wap, the data is distributed over several screens that users can browse through using a scroll bar. This confronts users with new difficulties. The first is the use of these scroll bars which they consider to be hard to manipulate. The second is reading the screen display. The ability to read small screens differs from user to user. Some are prepared to read large amounts of text while others are not. However, for 70% of them the small size of the screen is a consideration which dissuades them from using Wap. Finally, the users were generally fairly critical of the quality of the information they obtained, finding most of it to be unsatisfactory. It should be noted that these results were obtained in a context in which the range of available services was still relatively restricted (December 1999) and was offered via a mobile telephone. The
study conducted by Salembier and co-workers [SALEMBIER, this volume] seems to indicate that PDAs, on the one hand, and multimodal techniques (voice, tactile, gestural, embodied), on the other, may provide users with a more satisfactory solution.
2.3. Conclusion
Nielsen's study brings to light two positive points: the use of mobile telephones for Wap access is learnt easily and users can easily remember the various functions of the buttons. However, overall, Wap is affected by serious problems of usability, in particular when we consider the criteria of efficiency, the management and prevention of errors, and user satisfaction. Users are confronted with many difficulties, lose their connections and the few instructions that do exist do not enable them to take sufficient control of the system and make it truly effective. The navigation and option labelling often make the user/machine interface totally uncommunicative. Searching for information is a painstaking task which all too often yields unsatisfactory results. It might be imagined that tools that allow subjects to customise their portals and the various download sites would represent a considerable aid to users in allowing them to access the options they require much more swiftly. However, this is not enough. These tools must also be capable of helping users perform their information-seeking tasks in a much more effective way. We can well imagine that interfaces that adapt automatically to the user's operating habits and query types (adaptive interfaces) would be extremely useful here. Going further, intelligent agents could allow users to simply describe what they are looking for and then let the system perform the search for them. Such tools would probably make Wap simpler to use and lead to more satisfactory search results. However, Wap does not enable users to perform all the information searches that are available on the Internet and adaptive interfaces do not provide the necessary performance for all types of information search. In effect, the integration of adaptive interfaces for Internet-based information-seeking tasks makes it necessary to define the type of task for which they are to be adapted.
The problem concerning the Internet here is the same as the one we have already formulated for Wap: what are the types of information-seeking task for which these communications objects (mobile, PDA, laptop PC) or these protocols (Web, Wap) are suitable? Before we can answer these questions we must be able to provide a description of the information-seeking tasks. As a first stage, we intend to study (a) the implementation of the goal or "the objective characteristics of the information-seeking task", and (b) the effects of these characteristics on users' activities. We believe that some of these characteristics will correspond to cases in which an adaptive interface is useful and, indeed, to cases in which a particular communications object makes it possible to search for information effectively.
3. Objective characteristics of an information-seeking task
The study of the field of the description of tasks and adaptive interfaces has allowed us to identify four objective variables which are independent of the subject and make it possible to characterise an information-seeking task. These variables are "the repetitiveness of the task", "the level of explicitness of the targets", "the location of the targets" and "the quantity of targets".
3.1. The repetitiveness of the task
The study of adaptive interfaces has shown us that intelligent agents are useful for repetitive tasks. These are tasks that are performed on a regular basis and require the same operations in order to retrieve either the same information or a similar piece of information. They are distinguished from non-repetitive tasks, which are one-off tasks.
3.2. The level of explicitness of the targets
Ever since the first empirical studies of the use of hypertexts (for example [ROUET, 1990]), researchers have distinguished between explicit and implicit targets. An "explicit" target corresponds to an extract from a document, for example a text paragraph, which simply has to be understood by the subject. The subject does not need to search for any other information or produce any inferences to attain the goal. In contrast, an "implicit" target requires the subject to call on additional information or knowledge in order to achieve the goal since the target is not sufficient in itself. On the basis of an analysis of a variety of empirical data, Tricot [TRICOT, 1993] has defined two other objective variables which make it possible to specify a search task. These are the quantity of targets and the location of the targets.
3.3. The location of the targets
A target's location is the place at which it can be found on the information networks. Here we distinguish between local (precisely located) and distributed targets. If a target is local then it is fully present on a single page (in certain cases, it may also be redundant, i.e. it may be present "in the same way" on several different pages). If a target is distributed then it is present on several pages and the subject must view all these pages in order to obtain all the information. If the target is "local" then the search terminates as soon as the site containing the target is found. In contrast, if the target is "distributed" then the search has to be continued if all the information is to be retrieved. The search for "distributed" information may therefore take more time and imply a greater cost.
3.4. The quantity of targets
The "quantity of targets" variable defines the number of targets that exist for a search task. For this variable, the information is fully present on each of the pages on which it is present but there may be one or more pages or sites containing this information. If the information is "unique", it is only present on a single page and the difficulty for the subject is finding it. Otherwise, if the target is "multiple" the subject has a number of equivalent ways of finding the target.
We want to assess the effects of these variables on users' activities. We imagine that it will be more or less difficult for subjects to perform information-seeking tasks as a function of the values of these variables. The adaptive and customisable interfaces as well as the new media (mobile, PDA) correspond to specific utilisations. We imagine that the study of the effect of our objective variables on the search tasks conducted on the Web will enable us to arrive at a partial description of these utilisations. To do this, we have defined an experimental protocol by means of which we evaluate subjects' performances when performing information-seeking tasks defined on the basis of our variables. In the multimedia field, assessing subjects' performance is not an easy task since the behaviour of one and the same subject may be judged to be good by one author but not by another. Nevertheless, there are rational criteria which are acknowledged to permit a good assessment of subjects' performances. These are the recall and precision indexes. Recall evaluates the number of targets attained by the subject out of the total number of existing targets. When the recall index has the value one, the subject has found all the targets. Precision is the ratio of the number of targets found by the subject to the number of pages opened. If the subject has opened only relevant pages during the search then the precision value is one.
We evaluated subjects' performances on the basis of these two indices.
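The two indices defined above can be computed as in the following minimal sketch. The function names, the convention of counting each distinct page only once, and the page identifiers used in the example are assumptions made for illustration; they are not taken from the experimental material.

```python
def recall(pages_opened, target_pages):
    """Targets reached by the subject divided by the targets that exist."""
    targets = set(target_pages)
    if not targets:
        raise ValueError("a search task needs at least one target")
    return len(targets & set(pages_opened)) / len(targets)


def precision(pages_opened, target_pages):
    """Targets reached by the subject divided by the pages opened."""
    opened = set(pages_opened)  # assumption: a page opened twice counts once
    if not opened:
        return 0.0  # assumption: no page opened means nothing was found
    return len(set(target_pages) & opened) / len(opened)


# Hypothetical navigation path for a task with a single target page.
path = ["portal/home", "portal/travel", "rail/home", "rail/timetable"]
targets = ["rail/timetable"]

print(recall(path, targets))     # 1.0: the only target was reached
print(precision(path, targets))  # 0.25: one relevant page out of four opened
```

In this invented path the subject reaches the single target, so recall is one, but three of the four pages opened were not relevant, which lowers precision.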
4. Experiment
4.1. Subjects and method
Twenty-five subjects took part in this experiment. These subjects were interested in the Web and new technologies. They all regularly used the Internet. The experiment was performed using a laptop computer equipped with a mouse. Searches were performed using the Voila portal.
4.2. The task types
Using the values of our four variables, we developed twelve types of task resulting from an almost complete crossing of the modes of our variables: repetitive/non-repetitive; explicit/implicit; local/distributed; unique/multiple. We performed all the possible crossings. Below is an example task and the accompanying description.
Task 1. Find a way of getting from Toulouse to Montpellier by public transport this Friday (arriving between 4 and 5 pm).
• The number of targets variable here is "unique" since the only way of getting from Toulouse to Montpellier by public transport on this date and at this time is by train. The timetable is located on a single page in a single site, namely that of SNCF (the French railway operator).
• The target distribution variable here has the value "local" since the timetables are all on the same page in the SNCF site.
• The target explicitness variable here has the value "explicit" since the subject did not need to produce inferences or look for any other information in order to know whether or not there was a train matching the times requested.
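The crossing of the four binary variables can be sketched as a small data structure. This is an illustration only: the class and function names are hypothetical, and the sketch enumerates the complete crossing of sixteen combinations, whereas the chapter reports twelve task types and does not say which combinations were left out.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional

# The four binary variables and their two modes, as named in the chapter.
VARIABLES = {
    "repetitiveness": ("repetitive", "non-repetitive"),
    "explicitness": ("explicit", "implicit"),
    "location": ("local", "distributed"),
    "quantity": ("unique", "multiple"),
}


@dataclass(frozen=True)
class TaskType:
    """Hypothetical record for one task type; None marks an unstated mode."""
    repetitiveness: Optional[str]
    explicitness: str
    location: str
    quantity: str


def full_crossing():
    """Return every combination of modes (the complete crossing)."""
    return [TaskType(*modes) for modes in product(*VARIABLES.values())]


all_types = full_crossing()
print(len(all_types))  # 16 combinations in a complete crossing

# Task 1 as described above; its repetitiveness is not stated in the example.
task_1 = TaskType(None, "explicit", "local", "unique")
candidates = [
    t for t in all_types
    if (t.explicitness, t.location, t.quantity)
    == (task_1.explicitness, task_1.location, task_1.quantity)
]
print(len(candidates))  # 2: the repetitive and the non-repetitive variant
```

Keeping the repetitiveness of Task 1 as an unstated value, instead of guessing it, means the description matches two of the sixteen combinations.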
4.2.1. Protocol
The subjects completed two pre-tests, one concentrating on their Web browsing skills and the other relating to their knowledge in the various fields. The 25 subjects were randomly divided into two subgroups, one consisting of 15 subjects and the other of 10 subjects. The group of 15 subjects performed the non-repetitive tasks while the group of 10 subjects performed the repetitive tasks. In the group of subjects performing the non-repetitive tasks, each subject performed three tasks, namely one autonomous task and two prescribed tasks. In the other group, each subject performed one autonomous and one repetitive task followed, if time permitted, by a second repetitive task. The subjects started their searches from the Voila portal interface and performed the tasks on the basis of the instructions given to them. Each of the non-repetitive tasks was performed by five subjects and each of the repetitive tasks was performed by two subjects. While the task was being performed, the path taken by the subjects was stored in order to keep a record of their search. Their comments were recorded and the time was measured.
4.2.2. Instruction
"You will see the Voila portal interface and you must search for the information that you will be asked to find from this portal. After that, you will be asked to fill in a questionnaire concerning these searches. First of all, what is the last information search that you have performed on the Internet at your own initiative? Could you repeat it, describing aloud what you did, i.e. what you clicked on and what you entered?
Now you are going to perform the search tasks that I am going to ask of you. You will start each task from the Voila portal. For each defined task you will have to reformulate what is asked of you (and tell me how you intend to go about it)."
4.3. Results
We adopted an exploratory, qualitative approach in order to define our research hypotheses. For this reason, the results below are essentially qualitative and we have not performed any significance tests. Our intention is to determine the effect of our four variables ("level of explicitness of the target", "quantity of targets", "distribution of targets" and "task repetitiveness") on the subjects' behaviour in an information-seeking task. To do this, we used the recall and precision indices. In order to analyse the results, we attempted to develop an optimal search model for each of the search tasks. This model was to act as a reference when evaluating the searches performed by the subjects. We did not define information search models for the two search tasks "repetitive, multiple, distributed and implicit" and "non-repetitive, multiple, distributed and implicit", since the number of equivalent possibilities was too great. The results obtained from the subjects on these tasks were therefore not taken into account in the calculation of the recall and precision indices. Nevertheless, an observation of the subjects during their search operations together with a qualitative analysis of the results allowed us to note that recall and precision appear to be at their weakest for these tasks. That is why it should be pointed out that the results below for the implicit, distributed and multiple tasks are probably better than the actual results.
4.3.1. Recall
Table 20.1 Mean recall for each value of each variable and difference in mean recall between each mode of each variable

Variable                      Mode             Mean recall   Difference between the modes
Quantity of targets           unique           0.59          0.04
                              multiple         0.55
Distribution of targets       local            0.60          0.16
                              distributed      0.44
Explicitness of the target    explicit         0.66          0.23
                              implicit         0.43
Repetitiveness of the task    repetitive       0.72          0.22
                              non-repetitive   0.50

The recall index is the ratio of the number of targets accessed by the subject to the number of existing targets.
4.3.2. Precision
The precision index is the number of targets accessed by the subject divided by the number of pages opened by the subject. We observed a greater or lesser effect of the variables on users' activities. The variables which had the greatest effect on task success were also those that had the greatest effect on the precision of the information search, namely "repetitiveness of the task" and "level of explicitness of the targets". We observed that users performed repetitive and explicit tasks successfully and accurately. In contrast, the non-repetitive, distributed and implicit tasks were performed with difficulty and inaccurately.
Table 20.2 Mean precision for each value of each variable and difference in mean precision between each mode of each variable

Variable                               Mode             Mean precision   Difference between the modes
Quantity of targets                    unique           0.39             0.06
                                       multiple         0.45
Distribution of targets                local            0.44             0.05
                                       distributed      0.39
Level of explicitness of the target    explicit         0.49             0.15
                                       implicit         0.34
Repetitiveness of the task             repetitive       0.64             0.30
                                       non-repetitive   0.34

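The differences reported in Tables 20.1 and 20.2 can be recomputed from the mean values, and the variables can then be ordered by the size of their effect. The sketch below copies the means from the two tables; the dictionary layout and the function names are assumptions made for illustration, and the ordering is purely descriptive since no significance tests were performed.

```python
# Mean recall and mean precision per mode, copied from Tables 20.1 and 20.2.
MEAN_RECALL = {
    "quantity": {"unique": 0.59, "multiple": 0.55},
    "distribution": {"local": 0.60, "distributed": 0.44},
    "explicitness": {"explicit": 0.66, "implicit": 0.43},
    "repetitiveness": {"repetitive": 0.72, "non-repetitive": 0.50},
}
MEAN_PRECISION = {
    "quantity": {"unique": 0.39, "multiple": 0.45},
    "distribution": {"local": 0.44, "distributed": 0.39},
    "explicitness": {"explicit": 0.49, "implicit": 0.34},
    "repetitiveness": {"repetitive": 0.64, "non-repetitive": 0.34},
}


def mode_difference(modes):
    """Absolute difference between the two modes, rounded as in the tables."""
    first, second = modes.values()
    return round(abs(first - second), 2)


def rank_by_effect(table):
    """Variables ordered from the largest to the smallest mode difference."""
    return sorted(table, key=lambda name: mode_difference(table[name]),
                  reverse=True)


for label, table in (("recall", MEAN_RECALL), ("precision", MEAN_PRECISION)):
    for name in rank_by_effect(table):
        print(label, name, mode_difference(table[name]))
```

For both indices the two variables at the top of the ordering are "repetitiveness" and "explicitness", while "distribution" and "quantity" show the smallest differences.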
4.3.3. Effect of the different task types on users' activities
Our intention was to study the effect of crossing the modes of the variables on users' activities. However, it was difficult to study the results of the "repetitive" tasks given that only two subjects were tested for each of these tasks. The difference between the "repetitive" tasks and the "non-repetitive" tasks is due to the fact that one was performed just once whereas the other was performed a number of times. We can therefore hypothesise that the conclusions that we are able to draw from the "non-repetitive" tasks will also apply to the "repetitive" tasks if we consider that, generally speaking, these will be characterised by higher recall and precision indices. The figure below represents the effect of crossing the variables in the non-repetitive tasks on subjects' activity.
Figure 20.1 Mean recall and precision for each task type across all the subjects (the position of the "implicit, multiple, distributed" task is indicated in grey; our hypothesis is that this task should have lower recall and precision indices than the other tasks)
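As a rough numerical companion to Figure 20.1, the association between recall and precision can be checked on the per-mode means of Tables 20.1 and 20.2. This is only a sketch under strong assumptions: the figure plots task types, whose individual values are not given in the text, so the eight per-mode means are used as a stand-in, and they are not independent observations. The `pearson` helper is written out by hand for illustration.

```python
from math import sqrt

# (mean recall, mean precision) for each mode, from Tables 20.1 and 20.2.
MODE_MEANS = {
    "unique": (0.59, 0.39),
    "multiple": (0.55, 0.45),
    "local": (0.60, 0.44),
    "distributed": (0.44, 0.39),
    "explicit": (0.66, 0.49),
    "implicit": (0.43, 0.34),
    "repetitive": (0.72, 0.64),
    "non-repetitive": (0.50, 0.34),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


recalls = [r for r, _ in MODE_MEANS.values()]
precisions = [p for _, p in MODE_MEANS.values()]
r = pearson(recalls, precisions)
print(round(r, 2))  # about 0.86, i.e. a positive association
```

A positive coefficient on these means is consistent with recall and precision rising together across conditions, but it does not replace an analysis of the per-task data.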
We can observe that, overall, as recall increases from one type of task to another, precision also increases and vice versa. The tasks for which recall and precision are high are the tasks in which the subjects made few errors and found a large number of targets. In contrast, the tasks for which these two indices were low were the tasks in which the subjects made a large number of errors and found only a few targets. This relation between recall and precision is unusual. In fact, the usual relation between these two indices is an inverse one, of the type precision = 1 - recall [BUCKLAND, 1994]. In other words, generally speaking, the broader the search conducted by the subjects and the larger the number of pages opened, the greater the likelihood that a large number of targets will be accessed, but the smaller the proportion of opened pages that are relevant. [BUCKLAND, 1994] describes results such as those we have obtained as "perverse": precision increases with recall. Everything suggests that, depending on the nature of the task, all the subjects either manifested imprecise behaviour in opening a large number of pages irrespective of their content or precise behaviour consisting of opening only the relevant pages. One approach that would permit us to develop interesting hypotheses concerning subjects' activities would be to find a way of predicting the effect of each of the variables on each task type. To do this, we decided to arrange our variables hierarchically as a function of their effect on recall and precision.
4.3.4. Hierarchical organisation of the variables
We used a contrastive approach to perform this task classification. First of all, we calculated the variable that had the greatest effect (all other variable values confounded) and we then arranged the two modes of this variable in hierarchical order on the basis of their recall and precision scores. We repeated this operation for
each value of each variable, thus gradually reducing the number of variables for comparison. We then applied the results obtained for the non-repetitive tasks to the repetitive tasks. This results in a tree structure with the variables arranged from left to right as a function of their effect on recall and precision and the tasks arranged from top to bottom as a function of their recall and precision levels.
5. Discussion
The results that we obtained permit us to conclude that each of our variables had an effect on the information-seeking tasks. The greatest effect was exerted by the "repetitiveness of the task" and "explicitness of the targets" variables while the "target location" and "target quantity" variables had a lesser effect. We observed that when subjects successfully perform the first task in a series of repetitive tasks, they apply the same search strategy in the succeeding tasks, which they also perform successfully. Recall and precision for each of the "repetitive" tasks are therefore high. In contrast, we observed that subjects who failed in the first task performed the second and subsequent tasks on the basis of the results obtained in the preceding tasks and either changed their strategy or re-used the same strategy in an attempt to improve it. This results in an increase in recall and precision scores. We may well imagine that the fact of repeating the task allows subjects to improve their searches through a learning process. We may therefore hypothesise that repetitive tasks are tasks that are frequently performed successfully. This would seem to call into question the utility of adaptive interfaces for information-seeking tasks. Nevertheless, repeating the same actions or sequences of actions is time-consuming for users. We might therefore imagine that adaptive interfaces which automatically perform this job would prove to be very useful in that they could enhance ease of use by eliminating time-consuming repetitions.
We also observed that the "explicitness of the target" had a large impact. The results obtained for the explicit tasks were better than those observed in the implicit tasks. During the experiment, we noticed that this variable could be influenced by the subjects' knowledge. These were simple tasks and, if subjects did indeed possess any relevant knowledge, this took the form of the location at which the information might be found and the method required to access it. These subjects therefore performed more accurate searches than the others. The implicit tasks were more difficult to perform.
Figure 20.2 Classification of the types of information-seeking tasks as a function of the recall and precision indices and of the effect of each of the variables (variables arranged from the one with the greatest effect on recall and precision to the one with the least effect)
In the majority of cases, if subjects had any knowledge then this related to the content of the topic in question but not the location of the target. Furthermore, we observed that the subjects who had some knowledge concerning the content of the target topic also had a more accurate representation of the target than the others. Thus, from the moment they found the target they were in a better position to respond rapidly to the question since they simply had to check the information without needing to analyse it. In general, however, the subjects were largely unsuccessful in this type of search. This might have been due to the experimental conditions, which were not suitable for this type of task. The comments made by the subjects suggested that they needed more time either in order to perform a more extensive search or to summarise the results of the information they had found. In situations such as those used in this protocol, subjects are not working under optimum search conditions. They could not print the documents and their search was stopped if it took longer than 20 minutes. This suggests that the level of performance achieved in implicit searches is low when the search time is less than 20 minutes and the documents can only be consulted on screen. However, this does not exclude the possibility that such searches may be performed successfully under different conditions. We observed that the "target distribution" and "target quantity" variables had a lesser effect. We can well imagine that this could be due to the choice of the modes associated with our variables. As we stated in the description of our results, we would probably have observed a greater effect for the "target quantity" variable if we had chosen different modes. We assume that this same observation applies to the "target distribution" variable. We observed that when the targets were distributed the subjects had difficulty finding them.
When targets are distributed, subjects must continue their search until they have found all the information. The search for "distributed" information takes time and even though our subjects were not placed under any time constraints they expressed the need to find the information quickly. In many cases, the subjects abandoned their searches before finding all the relevant information. However, a "distributed" target may be spread over a greater or lesser number of pages. According to Tricot and co-workers [TRICOT, 1999], performance in a search for a "distributed" target can vary depending on the number of pages that exist within the search field. In other words, for a given number of pages that may contain the information, subjects will more easily find all the information if the targets are distributed through all the pages than if they are present on certain pages. We can therefore suppose that the size of the effect of the "target distribution" variable will vary as a function of the number of pages which may contain these targets or, in other words, the target distribution effect is undoubtedly very closely related to the effect of task "selectivity". Tricot and co-workers [TRICOT, 1999] have also hypothesised that one relevant characteristic of the task is the complexity of the procedure that has to be implemented, i.e. the number of different decisions that have to be made between the start and end of the activity. We should therefore also like to assess the effect of other objective variables such as
the number of different decisions that have to be made in order to attain the goal (complexity of the procedure), but also whether the target is defined a priori or discovered a posteriori, the type of sensory modality (auditory/visual) involved in processing the target, the amount of data to be transferred or stored, the volume of data to be displayed (e.g. number of words). The hierarchical organisation of the tasks allowed us to describe the types of information-seeking task more precisely. For each type of task, we have been able to evaluate the effect of each of the variables and the associated level of performance. Finally, we wish to test the hypothesis which holds that the more difficult a task is, the less suitable it is for a communications object which does not itself offer a high level of usability (for example: the more difficult an information-seeking task is, the less suitable it is for a mobile telephone). In a second stage, this should permit us to describe the set of tasks that can be performed using each type of communications object. It is important to fix the limits of validity of our (present and future) results as well as of our approach. Our hierarchical organisation was developed within a precise context (limited time, information retrieval restricted to reading on screen) and the tasks that the subjects were asked to perform had the sole aim of enabling them to answer a question. However, Internet applications are more varied than this. For example, Bernstein [BERNSTEIN, 1993] distinguishes between three types of hypermedia application, namely information "mining", "manufacturing" and "farming". Information "mining" searches involve an attempt to extract information. In this type of task, the relevant information is a valuable resource which has to be extracted effectively and refined. Information "manufacturing" searches are searches which make it possible to design or draft a document. 
This type of information search conceives of the acquisition, refinement, assembly and maintenance of information as an ongoing endeavour. Information "farming" conceives of the "tending" of information as a continuous, co-operative activity conducted by groups of individuals working together in order to accomplish changing individual and communal objectives. Bernstein notes that the appraisal criteria used in these three activities are radically different and that any attempt to perform an activity in a system which has not been designed for this purpose is doomed to failure. Given Bernstein's categorisation, we may imagine that our hierarchical organisation is relevant for one defined type of application, namely information "mining" and, consequently, is unsuited for the evaluation of "manufacturing" or "farming" type tasks.
6. Conclusion
We have provided a model of the objective characteristics of information-seeking tasks. This model permits a precise definition of the objective variables that may play a role during an information-seeking task, the hypothetical effect of each of these variables and a hypothetical hierarchical organisation of these effects. These
characteristics also permit us to assess the different information-seeking tasks that can be performed in terms of their information search performance. These results clearly deserve validation in a better-controlled experiment with larger groups of subjects who are more representative of the user population and which would permit the statistical processing of the information. We therefore intend to conduct a meta-analysis of the empirical results available in this field in order to verify whether they bear out our hypothesis. However, this model only takes account of one dimension of task description and ignores the user's model and the context within which the activity is performed, i.e. two considerations that play a role during an information-seeking task. Furthermore, when developing our protocol, we assigned binary values to our variables. It would be interesting to design an alternative hierarchical organisation with more than two values for each variable. Such an organisation would enable us to provide an even more precise description of information-seeking tasks and consequently formulate more precise hypotheses concerning the characteristics which might indicate the cases in which an adaptive interface would be useful or even cases in which a given communications object (mobile phone, PDA, laptop etc.) permits an effective information search using a particular protocol (Web, Wap). However, our model does seem to provide a relevant description of "information mining" tasks. It can therefore provide us with a framework enabling a comparison of search tasks performed using different communications objects. Our aim is to design a new experiment in which we intend to compare search tasks based on different media. These search tasks will be developed on the basis of our classification while bearing in mind that we will probably need to assign more than two values to our variables.
This experiment will permit us to identify the respective capabilities of the various communications objects and the usefulness of adaptive interfaces.
7. References
[BER 93] Bernstein M., "Enactment in information farming", Proceedings of Hypertext '93 Conference, ACM Press, p. 242-249, 1993.
[BUC 94] Buckland M., Gey F., "The relationship between recall and precision", Journal of the American Society for Information Science, vol. 45, n°1, 1994, p. 12-19.
[NIE 00] Nielsen J., "WAP Usability, Deja Vu: 1994 All Over Again", Report from a Field Study in London, Fall 2000, Nielsen Norman Group.
[TRI 93] Tricot A., "Ergonomie des systemes hypermedia", Actes du Colloque de Prospectives Recherches pour l'Ergonomie, Toulouse, 18-19 November 1993, p. 115-122.
[TRI 98] Tricot A., Nanard J., "Un point sur la modelisation des taches de recherche d'information dans le domaine des hypermedias", in A. Tricot, J-F. Rouet (Eds.), Les hypermedias, approches cognitives et ergonomiques, p. 35-56, Paris, Hermes, 1998.
[TRI 99] Tricot A., Puigserver E., Berdugo D., Diallo M., "The validity of rational criteria for interpretation of user-hypertext interaction", Interacting with Computers, vol. 12, 1999, p. 23-36.
8. Glossary
Adaptive interface: interface which adapts automatically to the user's habits.
Information farming: continuous, co-operative information collection activity conducted by groups of individuals working together in order to accomplish changing individual and communal objectives.
Information manufacturing: activity of exploiting information including the acquisition, processing, assembly and maintenance of information.
Information mining: information extraction activity in which the target is a valuable resource which has to be extracted effectively and refined.
Portal: Web site whose main function is to provide access to other Web sites.
Precision: number of targets accessed by the subject divided by the number of pages opened by the subject.
Recall: number of targets accessed by the subject divided by the number of existing targets.
Target: relevant document, reference or piece of information.
Task: goal to be achieved in a given environment by means of (physical) actions or (mental) operations with or without the use of tools.
Wap (Wireless Application Protocol): protocol for accessing information networks via mobile telephones or personal digital assistants (PDAs).
Web (World Wide Web): protocol for accessing the Internet information network which makes it possible to establish links between data stored on remote computers.
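The Precision and Recall entries of the glossary are simple ratios. As a minimal sketch (the function names and the sample counts are illustrative assumptions, not figures from the study), they could be computed from a subject's session as follows:

```python
def precision(targets_accessed: int, pages_opened: int) -> float:
    """Targets accessed by the subject divided by pages opened by the subject."""
    # Assumption: a session with no opened pages is scored 0.0 rather than failing.
    if pages_opened == 0:
        return 0.0
    return targets_accessed / pages_opened


def recall(targets_accessed: int, existing_targets: int) -> float:
    """Targets accessed by the subject divided by the number of existing targets."""
    if existing_targets == 0:
        return 0.0
    return targets_accessed / existing_targets


# Hypothetical session: 12 pages opened, 3 of the 4 existing targets reached.
session_precision = precision(3, 12)
session_recall = recall(3, 4)
```

A subject who abandons a "distributed" search early keeps a reasonable precision but loses recall, which is why both measures are needed to describe performance on such tasks.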
Chapter 21
Making Context Explicit in Communicating Objects
Patrick Brezillon
LIP6, University Paris 6, France
1. Introduction
One can speak about context only with reference to something (there is no definition of context out of context): the context of an object, the context of interaction, the context of problem solving, etc. Here, however, only the context of interaction between agents seems of interest, because it is in this context that other contexts are referenced or evolve. For example, if an object such as a telephone could tell you the context of the person called (free, in a meeting, or with the phone switched to voice recorder), you could weigh your wish to establish communication against the availability of the person called. Several domains have already elaborated their own working definition of context. In human-machine interaction, a context is a set of information that could be used to define and interpret a situation in which agents interact. In the context-aware applications community, the context is composed of a set of information for characterizing the situation in which humans, applications and the immediate environment interact [Dey, 1998]. In artificial intelligence, the context is what does not intervene directly in problem solving but constrains it [Brezillon, 1999a]. The community interested in context can be divided into two families, "tangible computing" and "social computing", to use the terms of [Dourish, 2001]. Social computing is interested in the development of context-based representations of knowledge and reasoning (e.g. see the example of the contextual graphs in [Brezillon, 2001b]). Such formalisms provide the system with natural capabilities of learning and explanation generation. This approach is essentially based on the user and includes the dynamic dimension of the context. Tangible computing is more concerned with the technical aspects of context that are immediately usable, particularly through an exploitation of data. The focus is mainly on mobile computing (context-aware applications, smart devices, ubiquitous computing, communicating objects, etc.) in domains as different as tourism and e-maintenance.
Context in tangible computing is often limited to location and time, and the information on the user is often ignored or very limited. This is a type of device-oriented approach, and context is supposed to evolve only through changes of state. Clearly, neither of these two approaches is entirely right, and it is necessary to find a generalisation that takes into account the advantages of both approaches while avoiding their respective weaknesses. Such a generalisation is supposed to:
• Manage data, information and knowledge that evolve as a function of time and/or come from heterogeneous sources.
• Take into account the user (from his interaction with the system).
• Develop distributed software with a central part and personalised extensions for users.
Such challenges can be faced by an explicit use of the context, mainly in the design and development of communicating objects. These objects could then be real intelligent extensions of a central repository of data, information and knowledge that is updated in real time. In this chapter, we first present the meaning that we give to the notion of context, and then present what seem to us some challenges that can be faced by combining the two approaches, "tangible computing" and "social computing".
2. Making context explicit
2.1. Context and knowledge
It is important to distinguish data, information and knowledge (see Figure 21.1). Data are the symbols perceived by an observer through sensors. Information emerges from these data: it is data with a strong semantic content, the result of an interpretation process. The observer's knowledge permits association of the semantic content with data. The following step is a process of appropriation and reasoning that leads to the integration of the information into the background knowledge of the observer.
Figure 21.1 Relationships between data, information, knowledge and context
Thus, knowledge has several roles: (1) the transformation of data into information, (2) the derivation of new information from existing information, and (3) the acquisition of new knowledge. Knowledge is thus simultaneously a result and a process. Two types of knowledge must be considered, i.e. explicit and tacit knowledge [Polanyi, 1962; Nonaka, 1994]. Explicit knowledge is easily formalised and communicated, while tacit knowledge is highly personal and difficult to express. (Note that recent discussions make this distinction clearer, e.g. see [Brezillon, 2001a].)
Figure 21.2 The context and the movements between tacit and explicit knowledge
Figure 21.2 presents four types of movements between these two types of knowledge:
• Socialisation permits adaptation of our tacit knowledge through our interaction with others (a kind of internal knowledge creation),
• Externalisation permits communication, at least partially, of our tacit knowledge,
• Combination permits generation of new explicit knowledge after our interaction with others, and
• Internalisation permits assimilation of new knowledge from external sources that enriches our tacit knowledge.
(See [Nonaka, 1995] for an initial presentation and [Pomerol, 1999; Brezillon, 1999b] for a discussion on this subject.)
Here, the process of externalisation interests us particularly with respect to context because it anticipates the process of proceduralisation leading to the proceduralised context, which will be discussed in the next section. Note that, more generally, the relationships between knowledge and context (see Figures 21.1 and 21.2) can be summed up as [Pomerol, 2001]:
• Context and knowledge can be explicit or implicit, but both must be made explicit to be communicated (i.e. exploited),
• The context may contain deep and shallow knowledge,
• Contextual knowledge is task-oriented but is not limited to know-how,
• Contextual knowledge permits description of the state of nature in which a decision is made or an action accomplished, and can be combined in different ways for that purpose,
• Proceduralisation of contextual knowledge is a decisive step in the execution of an action and in the triggering of the know-how,
• The proceduralised context is task-oriented and highly subjective, like know-how and situated knowledge, and
• The link between the proceduralised context and the corresponding action can be made explicit or implicit (as a result of the proceduralisation that gives the proceduralised context).
Making context explicit in terms of knowledge makes it easier to identify what is necessary for a system (or an object) with respect to a user at a given time, according to the available data, information and knowledge.
2.2. Identification of the context
Some results indicate that the dimension of the context is infinite [McCarthy, 1993]. As a consequence, a context is always relative to another, more general context, and thus cannot be totally described. For our part, we first distinguish the part of the context that concerns the focus of attention (the contextual knowledge) from the knowledge that is not relevant at that phase of the focus of attention (the external knowledge). At a given step of the focus of attention, a sub-set of the contextual knowledge is mobilised, situated, organised and structured to be used at that step. We call the result of this compilation of the contextual knowledge the proceduralised context. Figure 21.3 gives a synthesis of the three types of context. For a given focus of attention (e.g. a step of problem solving), a static definition of context is the set of contextual elements that give a meaning to the focus of attention without intervening in it explicitly. Thus, we consider that we do not have to distinguish context from the other objects concerned by the reasoning, objects entering or leaving the context according to events. Moreover, the movement of the focus of attention implies a dynamic dimension of context. In the framework of communicating objects, we think that objects must share a reference, a shared source for data, information, knowledge and context. This referential will be updated regularly and shared by objects. Thus, different communicating objects (for example, inside a house) can coordinate their activities according to the knowledge that can be useful for them (their respective proceduralised context) and knowledge of other objects (a part of their contextual knowledge).
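The distinction between external knowledge, contextual knowledge and proceduralised context can be illustrated with a small sketch. It is only one possible reading of the section: the class name SharedReference, its methods and the household items are hypothetical, and relevance to a focus of attention is reduced here to a simple set membership.

```python
from typing import Dict, List, Set


class SharedReference:
    """Hypothetical shared store of knowledge items for communicating objects."""

    def __init__(self) -> None:
        # Maps each knowledge item to the foci of attention it is relevant to.
        self._relevance: Dict[str, Set[str]] = {}

    def add(self, item: str, foci: Set[str]) -> None:
        self._relevance[item] = set(foci)

    def contextual_knowledge(self, focus: str) -> Set[str]:
        """Items that concern the given focus of attention."""
        return {item for item, foci in self._relevance.items() if focus in foci}

    def external_knowledge(self, focus: str) -> Set[str]:
        """Items that are not relevant at this phase of the focus."""
        return set(self._relevance) - self.contextual_knowledge(focus)

    def proceduralised_context(self, focus: str, step_needs: List[str]) -> List[str]:
        """Sub-set of the contextual knowledge, ordered for use at one step."""
        contextual = self.contextual_knowledge(focus)
        return [item for item in step_needs if item in contextual]


store = SharedReference()
store.add("room temperature", {"heating"})
store.add("occupant at home", {"heating", "lighting"})
store.add("tv programme guide", {"entertainment"})
```

When the focus moves from "heating" to "lighting", the same store yields a different contextual sub-set, which is the dynamic dimension described above.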
Figure 21.3 The three types of context
2.3. Lessons learned in this section
Context is knowledge, and knowledge is context. At a given moment, there are external knowledge and contextual knowledge, and a part of the contextual knowledge is compiled in the focus of attention (through the proceduralised context). Context is relative to a focus of attention (e.g. the context of knowledge use) with an organisation of the contextual knowledge around this focus (e.g. see the onion metaphor in [Brezillon, 1997]). The contextual knowledge, which is organised around the focus of attention, also has a granularity that depends on the distance to the focus. Moreover, the movement of the focus of attention implies a change in the proceduralised context, and explains the dynamics of context. As a consequence, the "reasoning" of a communicating object must account for these dynamics.
3. Challenges
3.1. Dynamics of the environment
The roles played by the context differ according to whether the object is mobile or not. The reason is that the environment itself has dynamics that are particularly important for a mobile object. For example, changes in the environment can transform an optimal
solution into an inadequate solution in another context [Brezillon, 1999]. Moreover, two concepts can be close in one context and distant in another. A mobile object must be able to revise all its beliefs at a given moment, even during the course of a plan execution. This supposes that the object can follow users' actions and watch for any drift of the observed behaviour with respect to a predicted behaviour (e.g. see [Brezillon, 2000] on case-based intelligent assistant systems). The objective is to ensure that the local user's needs respect global constraints at any time. To date, the dynamics of the environment have been taken into account through the evolution of physical factors such as the user's location and the time of the request. However, these dynamics should also take into account knowledge, not only data, on the environment and the user.
3.2. Individual contexts
A system using contextual knowledge can develop an increasingly elaborate model of the user in the course of user-system interaction. Note that we do not speak of a model drawn from a library but of an online modelling of the user as the system can view him, i.e. through their interaction. Thus, the system can provide relevant answers to users' questions, and even first help the user to formulate his questions. The experience acquired by the system with a user accomplishing a task can then be reused for helping the same user in other tasks.
3.3. Shared context and individual contexts
A system can also reuse the experience acquired with a user for helping other users with the same task. This is realised either directly with a stand-alone system or by interaction among agents. In the latter situation, each agent helps a user (e.g. see the work of Maes at MIT on this approach), the agents exchanging the experience gained with their own user to support other agents. A system can support collaborative work between humans by intervening in all the phases of the collaboration (cooperation, negotiation, etc.). This situation is increasingly important as manufactured objects become more and more complex and require the collaboration of different specialists (think of the design of a spacecraft). The system can then take charge of adjusting the individual contexts of the users in order to make their interpretations of a given event compatible [Karsenty, 1995]. For example, a TV, as a communicating object, would have to reconcile the (possibly diverging) interests of the father, the mother and the children (say, a boy and a girl).
3.4. Focus of attention and granularity of the context
The granularity of the context can be compared to a distance measure from a contextual element to the focus of attention. The closer the contextual element is to the focus, the more detailed the context. For example, for sending a letter, you need to know the way from where you are to the nearest letter box, whereas you only need to know that Scotland lies to the north (of Paris). In practice, context granularity is at present restricted to the distinction between a local context and a global context. Van Dijk (1998) gives a good example in the analysis of political discourses. In computer-aware applications, the Fisheye system does a similar operation [Pook, 2000]. However, this approach is not new: conceptual graphs already propose mechanisms of aggregation and expansion [Sowa, 2000]. Nevertheless, context must be represented in a machine in a way that is efficient for modelling knowledge and reasoning, from the programming point of view as well as from the viewpoint of its effective use.
3.5. The object of the context induces different contexts
An explicit use of the context may bring some insights on known problems in information technology, such as the management of information presentations in response to a query, support in the formulation of a query, or information exchange between heterogeneous databases. Figure 21.4 shows where different contexts intervene in the interaction between a user and a system (as a communicating object).
Figure 21.4 The different types of context in human-machine interaction (diagram labels: User, System, Repository, Mechanism of reasoning, Different contexts)
Moreover, Goh proposes a context manager to make the context of the emitter ontology compatible with the context of the receiver ontology [Goh, 1995]. Indeed, the main observation here is that context would permit a dynamic organisation of the data, the information and the knowledge in memory, for the extraction of the elements of an answer as well as the acquisition of new items.
4. Conclusion
To account for context in communicating objects, one must have: (1) access to a base of data/information/knowledge updated practically in real time, (2) a formalism for a context-based representation of the data, the information, the knowledge, and a user's model, (3) an efficient combination of "tangible computing" and "social computing", and (4) modelling of the user in real time from his actions during his interaction with the system. Indeed, as was said in artificial intelligence several years ago, a communicating object must be something between a reactive object (i.e. reacting only to the user's interventions) and a strongly active object that guides the user among the possible alternatives. In this chapter, we have presented two approaches: a "social computing" approach, to which our work on context-based intelligent assistant systems (CIAS) belongs, and a "tangible computing" approach, within which communicating objects and context-aware systems (CAS) are supposed to evolve. The two approaches have particular viewpoints on context, mainly related to the fact that CASs are concerned with data, whereas CIASs are concerned with knowledge.
Figure 21.5 Respective positions of CAS and CIAS with respect to data, information and knowledge
As shown in Figure 21.5, which presents a unified view of the two approaches, the main challenge is a joint effort of both communities to design and develop communicating objects. This position is close to Dourish's position (2001) with his concept of embodiment to merge "tangible computing" and "social computing".
5. References
[Brezillon, 1997] Brezillon P., Gentile C., Saker I., Secron M., "SART: A system for supporting operators with contextual knowledge", First International and Interdisciplinary Conference on Modeling and Using Context (CONTEXT-97), Rio de Janeiro, Brazil, Federal University of Rio de Janeiro (Ed.), 1997, pp. 209-222.
[Brezillon, 1999a] Brezillon P., "Context in problem solving: A survey", The Knowledge Engineering Review, 1999, 14(1), pp. 1-34.
[Brezillon, 1999b] Brezillon P., Pomerol J.-Ch., "Contextual knowledge sharing and cooperation in intelligent assistant systems", Le Travail Humain, PUF, Paris, 1999, 62(3), pp. 223-246.
[Brezillon, 2000] Brezillon P., Cavalcanti M., Naveiro R., Pomerol J.-Ch., "SART: An intelligent assistant for subway control", Pesquisa Operacional, Brazilian Operations Research Society, 2000, 20(2), pp. 247-268.
[Brezillon, 2001a] Brezillon P., Pomerol J.-Ch., "Some comments about knowledge and context", Research Report 2001-022, LIP6, Universite Paris VI, Paris, France, 2001. http://www.lip6.fr/reports/lip6.2001.022.html
[Brezillon, 2001b] Brezillon P., Pasquier L., Pomerol J.-Ch., "Reasoning with contextual graphs", European Journal of Operational Research, 2001, 136(2), pp. 290-298.
[Dey, 1998] Dey A.K., Abowd G.D., "A conceptual framework and a toolkit for supporting the rapid prototyping of context-aware applications", 1998. http://www.cc.gatech.edu/fce/contexttoolkit
[Dourish, 2001] Dourish P., "Seeking a Foundation for context-aware computing", Human-Computer Interaction, 2001, 16(2-4). http://hci-journal.com/editorial/vol16.html
[Goh, 1995] Goh G.H., Madnick S.E., Siegel M.D., "Ontologies, contexts, and mediation: Representing and reasoning about semantic conflicts in heterogeneous and autonomous systems", 1995. http://context.mit.edu
[Karsenty, 1995] Karsenty L., Brezillon P., "Cooperative problem solving and explanation", Expert Systems With Applications, 1995, 8(4), pp. 445-462.
[McCarthy, 1993] McCarthy J., "Notes on formalizing context", Proceedings of the 13th IJCAI, 1993, Vol. 1, pp. 555-560.
[Nonaka, 1994] Nonaka I., "A dynamic theory of organizational knowledge creation", Organization Science, 1994, 5(1), pp. 14-37.
[Nonaka, 1995] Nonaka I., Takeuchi H., "The Knowledge-Creating Company", Oxford University Press, New York, NY, 1995.
[Norman, 1988] Norman D.A., "The psychology of everyday things", New York: Basic Books, 1988.
[Polanyi, 1962] Polanyi M., "Personal Knowledge: Toward a Post-Critical Philosophy", Harper Torchbooks, New York, NY, 1962.
[Pomerol, 1999] Pomerol J.-Ch., Brezillon P., "Dynamics between contextual knowledge and proceduralized context", Modeling and Using Context (CONTEXT-99). In Lecture Notes in Artificial Intelligence, n°1688, Springer Verlag, 1999, pp. 284-295.
[Pomerol, 2001] Pomerol J.-Ch., Brezillon P., "About some relationships between knowledge and context", Modeling and Using Context (CONTEXT-01). Lecture Notes in Computer Science, Springer Verlag, 2001, pp. 461-464. (Full paper at http://www-poleia.lip6.fr/~brezil/Pages2/Publications/CXT01/index.html)
[Pook, 2000] Pook S., Lecolinet E., Vaysseix G., Barillot E., "Context and interaction in Zoomable User Interfaces", AVI 2000 Conference Proceedings (ACM Press), pages 227-231 & 317, 23-26. http://www.infobiogen.fr/services/zomit/avi2000/
[Sowa, 2000] Sowa J.F., "Knowledge Representation: Logical, Philosophical, and Computational Foundations", Brooks Cole Publishing Co., Pacific Grove, CA, 2000.
[Van Dijk, 1998] Van Dijk T.A., "Cognitive Context Models and Discourse", in M. Stamenov (Ed.), Language Structure, Discourse and the Access to Consciousness, Amsterdam: Benjamins, 1998, pp. 189-226.
Chapter 22
Dynamic Links for Change-sensitive Interaction
Philip Gray and Meurig Sage
Glasgow Interactive Systems Centre, Computing Science Department, University of Glasgow, UK
1. Introduction
The Paraglide Project has developed a mobile, context-sensitive wireless-enabled computer system for pre- and post-operative data capture and assessment for use by anaesthetists and anaesthetic nurses. To support these activities, we have developed a novel approach to the updating of the system from changes in remote and local data sources, based on a new dynamic link software technology. Our dynamic link architecture provides a set of features not found in similar systems, including the ability to:
• relate data to either remote or local sources using the same approach;
• specify that changes in the source cause either data updates or a change in the likelihood of a data item's value;
• specify that user confirmation, or more complicated interaction, is necessary before the updates or predictions are performed;
• specify the dynamic links in a simple XML-based language and to create the links dynamically at run-time from descriptions in this language.
2. The context of the work
2.1. The Paraglide Project
The Paraglide system consists of a set of computer-based "clinical assistants" that hold information about a clinician's current cases requiring assessment, along with associated data [1], and that communicate with a set of information services that supply relevant data or that accept the results of clinical assessments. Clinical assistants can be used on a variety of platforms, from full-sized workstations through
laptops to hand-held devices. A clinician uses a clinical assistant to collect data (e.g. from a patient), to examine data (e.g. blood tests returned from a laboratory), to record their assessments, and to develop plans of drugs and techniques that will be used during an operation. Figure 22.1 illustrates the overall architecture.
Figure 22.1 The Paraglide system
The situation in which the clinician carries out his or her work is highly dynamic. There are several forms of context- and change-sensitivity, therefore, that the Paraglide system must be able to handle:
• Changes to local data on the clinical assistant. Mutual dependencies exist between a great deal of the data within the system. These frequently can only be expressed as probabilistic predictions rather than changes to case data. For instance, if a patient has a condition, like asthma, we can predict that they are going to be on a limited set of related drugs.
• Changes to the relevance and accessibility of data from Paraglide services. A clinician's case load changes from day to day; typically, only data for immediately upcoming cases will be relevant. Also, certain data, such as operating theatre schedules, staff rotas and lab test results, may become available at any time.
• Changes in the context of use. Paraglide clinical assistants operate in a wireless environment with intermittent connections. Additionally, they can operate outside the normal clinical environment (e.g., in the anaesthetist's home) where the nature of the connection may be very different, without access to sensitive data that must remain inside the hospital's LAN. Thus the system must also be sensitive to changes in the connection status and user location.
2.2. Dimensions of change
There are two dimensions of change that we consider in handling these types of change: predictions vs value updates and implicit vs explicit updates.
2.2.1. Predictions vs value updates
In cases like the relationship between a patient's condition and their current drugs, it is often inappropriate to change actual data in a case record; what is wanted is an update to the likelihood of its value. Our approach is to model both a value and a predicted value in all case object attributes. The predicted value can range from something as simple as a default, or a likely subset for menu selection, to a complex probability distribution over the value set. Updates can be specified as affecting the value or the prediction (see section 3.2).
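The idea of an attribute that carries both a value and a predicted value can be sketched as follows. This is an illustration under stated assumptions, not the Paraglide implementation: the class Attribute, its method names and the placeholder drug names are hypothetical, and the prediction is reduced to a normalised set of likelihoods.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Attribute:
    """Hypothetical case-object attribute holding a value and a prediction."""
    value: Optional[str] = None
    prediction: Dict[str, float] = field(default_factory=dict)

    def update_value(self, new_value: str) -> None:
        # A value update changes the recorded data itself.
        self.value = new_value

    def update_prediction(self, weights: Dict[str, float]) -> None:
        # A prediction update changes only the likelihoods, never the value.
        total = sum(weights.values())
        if total <= 0:
            raise ValueError("weights must sum to a positive number")
        self.prediction = {name: w / total for name, w in weights.items()}

    def suggestion(self) -> Optional[str]:
        # The recorded value wins; otherwise offer the most likely candidate.
        if self.value is not None:
            return self.value
        if not self.prediction:
            return None
        return max(self.prediction, key=self.prediction.get)


current_drug = Attribute()
# Illustrative weights only: a recorded condition makes "drug_a" more likely.
current_drug.update_prediction({"drug_a": 3.0, "drug_b": 1.0})
```

Keeping the two fields separate means that a prediction can pre-select a menu entry without ever being mistaken for data that the clinician has entered.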
2.2.2. Explicit updates
There are often updates that can be performed automatically. For instance, when the anaesthetist enters the height and weight of the patient, the system can generate the "body mass index", used to determine if a patient suffers from obesity. However, many such changes demand some user involvement, if only for clinical and legal reasons. For example, when a new set of blood results becomes available from a hospital lab, it would be inadvisable to perform the update of values in the case record without making the clinician aware of the update. We use our smart pasting mechanism, described in section 3.2 below, to support such explicit updates.
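The contrast between an automatic update and an explicit one can be sketched as below. The body mass index is the standard formula (weight in kilograms divided by the square of the height in metres); everything else, including the function apply_update and its confirm callback, is a hypothetical stand-in and does not describe the smart pasting mechanism itself.

```python
from typing import Callable, Dict, Optional

# Hypothetical callback: receives (field name, old value, new value).
Confirm = Callable[[str, object, object], bool]


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Standard BMI: weight in kg divided by height in metres squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def apply_update(record: Dict[str, object], key: str, new_value: object,
                 confirm: Optional[Confirm] = None) -> bool:
    """Apply an update; when a confirm callback is given, ask the user first."""
    if confirm is not None and not confirm(key, record.get(key), new_value):
        return False  # explicit update refused: the record is left untouched
    record[key] = new_value
    return True


case = {"weight_kg": 70.0, "height_m": 1.75}
# Implicit update: derived data is written without user involvement.
apply_update(case, "bmi", body_mass_index(case["weight_kg"], case["height_m"]))
```

In a real interface the confirm callback would open a dialogue showing the old and new values; here it is any function returning True or False.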
2.3. Clinical Assistant architecture
A Paraglide Clinical Assistant consists of the following components: a set of interactors, a set of resource managers, a set of dynamic links, a library of documents and a communications subsystem. The relationship between the components is shown in Figure 22.2.
Figure 22.2 The Paraglide Clinical Assistant architecture
Users interact via a set of configurable interactors, specially designed for deployment on differently sized displays [3]. The interactors are used primarily to provide access to the service store, holding objects that represent clinical case data. However, a user can also interact with dynamic links when performing explicit updates (see section 3 below) and resource managers when viewing or interacting with general clinical domain data. Resource managers are responsible for supplying domain-related information to the system as a whole and also for creating links associated with their domain. Currently there are four types:
• the scene manager, which maintains the collection of context-related interactors and links, and handles and records navigation;
• the case manager, which produces new cases, held in the service store, and provides access to them;
• the technical environment manager, which maintains data about the system such as battery power and network connectivity; and
• domain knowledge managers, which maintain data about drugs, procedures, staff etc.
The librarian mediates queries to external information sources and incoming documents from these sources, providing a local cache and access to the broker that handles the transmission to, and reception of messages from, other Paraglide services. Finally, dynamic links are components that provide appropriate responses to the various kinds of change described above [2].
3. Dynamic links
Paraglide's dynamic link structure handles two sorts of links: those that maintain consistency between different elements of a case object, and those that import document data. Our fundamental approach is to try to unify the link framework, so that we can cope with change-sensitivity in a principled way. Therefore all links are viewed as associations that relate a source object to a destination object with respect to an aspect, or operation, via a link function:
link = <source, destination, operation, link_function>
A link goes from a given source, or set of sources, to a given destination, or set of destinations. It applies a link function to transform the data from the source, and then performs some operation on the destination. For instance, a link could go from a lab system serving blood test documents to a local blood test results object, extracting all blood results from the document and transforming them into Java objects, before adding them to the blood test results collection.
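The four-part link definition can be sketched as a small data structure. This is a minimal illustration in Python rather than the Java used by the system described; the `Link` class, the `fire` method and the `name=value; name=value` document format are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Link:
    """link = <source, destination, operation, link_function> (illustrative)."""

    source: Any
    destination: Any
    operation: Callable[[Any, Any], None]
    link_function: Callable[[Any], Any]

    def fire(self) -> None:
        # Transform the source data, then apply the operation to the destination.
        transformed = self.link_function(self.source)
        self.operation(self.destination, transformed)


def extract_blood_results(document: str) -> List[Tuple[str, float]]:
    """Hypothetical link function: parse 'name=value' pairs separated by ';'."""
    results = []
    for entry in document.split(";"):
        name, value = entry.split("=")
        results.append((name.strip(), float(value)))
    return results


def add_all(collection: list, items: list) -> None:
    """Hypothetical operation: add the extracted items to a collection."""
    collection.extend(items)


blood_results: List[Tuple[str, float]] = []
link = Link(
    source="Hb=13.5; WBC=6.1",
    destination=blood_results,
    operation=add_all,
    link_function=extract_blood_results,
)
link.fire()
```

Keeping the link function separate from the operation means the same extraction could feed a different operation (for example resetting the collection instead of adding to it) without being rewritten.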
3.1. Local and remote sources
There are two sorts of link source: document sources and value sources. Note that these sources are not simple documents or values but (remote) services or (local) objects that will provide new documents or values over the duration of the link. A link's local source is an active value and, indeed, all potential local sources, that is, the contents of the service store, have attributes that are active values. Links employ listeners to be notified about changes in a local source.
In the Paraglide system, information is transmitted between services as documents, i.e., structured text. A link's remote source requires a specification of which services to query for relevant document(s) and what to say to the service to get such documents. For instance, it may specify an SQL query to extract data from a database service. These documents may well contain information relevant to a number of local links. This information must be extracted from incoming documents in order to resolve the links dependent on that document.
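The idea of an active value with listeners can be sketched as follows. This is an assumption-laden illustration of the general observer pattern, not the Paraglide implementation: the `ActiveValue` class and its `add_listener`, `get` and `set` methods are hypothetical names.

```python
from typing import Any, Callable, List

Listener = Callable[[Any, Any], None]  # called with (old_value, new_value)


class ActiveValue:
    """A value that notifies registered listeners when it changes (illustrative)."""

    def __init__(self, value: Any = None) -> None:
        self._value = value
        self._listeners: List[Listener] = []

    def add_listener(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def get(self) -> Any:
        return self._value

    def set(self, new_value: Any) -> None:
        old_value = self._value
        self._value = new_value
        # Only genuine changes are reported, so links are not fired needlessly.
        if new_value != old_value:
            for listener in list(self._listeners):
                listener(old_value, new_value)


changes = []
weight = ActiveValue()
weight.add_listener(lambda old, new: changes.append((old, new)))
weight.set(72.5)
weight.set(72.5)  # same value again: no notification
weight.set(74.0)
```

A link with a local source would register itself as such a listener and apply its link function each time it is notified.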
3.2. Updates and smart pasting
Implicit value updates are straightforward: simply set the value, although with a collection destination we can reset the collection, or add or delete one or more items. In the case of explicit updates, the link performs a notification operation based on the results of a summary function that specifies how to summarise the source document for the user. It also contains a set of sub-operations that specify what to do if the notification is accepted. Each of these sub-operations may itself extract some data from the source document.
The user interface implements an unusual paradigm for handling such notifications. Visual cues in the interactors signal the availability of unread information in the form of read-only documents. The user has the option to inspect the data by reading a document (for example, a Full Blood Count report), and may elect either to close the document without copying any data into the anaesthetic assessment or to "paste" the information into the assessment with a single button-press, causing the execution of the link's acceptance operations. Although it may appear to place a small but unnecessary burden on the user, the paradigm gives the user control of data entry in a natural manner, since the task sequence, "browse documents; select and inspect a document; copy data", mirrors current practice in paper-based working. It also ensures that the user has always inspected and endorsed any information written into the assessment, a desirable situation with respect to clinical records, since the clinical user will have to take full responsibility for the content of the assessment.
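The explicit-update mechanism described above (a summary function for the notification, plus sub-operations that run only on acceptance) might be sketched like this. All names (`ExplicitUpdateLink`, `notification`, `paste`, `copy_field`) and the dictionary-based document are assumptions for illustration, not the Paraglide API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Document = Dict[str, float]
Assessment = Dict[str, float]
SubOperation = Callable[[Document, Assessment], None]


@dataclass
class ExplicitUpdateLink:
    """Illustrative explicit-update link: notify first, write only on acceptance."""

    document: Document
    summarise: Callable[[Document], str]
    on_accept: List[SubOperation] = field(default_factory=list)

    def notification(self) -> str:
        # What the user sees before deciding; nothing is written at this point.
        return self.summarise(self.document)

    def paste(self, assessment: Assessment) -> None:
        # Runs the acceptance sub-operations (the single button-press).
        for sub_operation in self.on_accept:
            sub_operation(self.document, assessment)


def copy_field(name: str) -> SubOperation:
    """Hypothetical sub-operation: extract one field and copy it across."""
    def sub_operation(document: Document, assessment: Assessment) -> None:
        assessment[name] = document[name]
    return sub_operation


report = {"Hb": 13.5, "WBC": 6.1, "Platelets": 250.0}
link = ExplicitUpdateLink(
    document=report,
    summarise=lambda doc: f"Full Blood Count: {len(doc)} new values",
    on_accept=[copy_field("Hb"), copy_field("WBC")],
)

assessment: Assessment = {}
message = link.notification()  # the user reads the summary; assessment unchanged
link.paste(assessment)         # the user accepts; sub-operations run
```

If the user closes the document instead of pasting, `paste` is simply never called and the assessment stays as it was.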
3.3. Link specification
Because links depend on potentially variable relationships and because they must be created at run-time, we also include link specification as an explicit element in the architecture. A link specification object holds the information necessary to create a link of a specified type, defined in terms of the types of its arguments:
linkspec = <source-type, destination-type, operation-type, function-type>
Link specifications are held by resource managers and used to instantiate links when called upon by objects in the service store or, indeed, as the outcome of a link operation. To allow configurability, link specifications are written in XML. They can be stored in documents and transferred around a network. New link types can be added without the need for recoding. They can, in fact, even be added while the system is running, thus enabling dynamic reconfigurability. For instance, if a new document type were to be added in a hospital, an update could be broadcast to allow all Clinical Assistants to interpret it, without any need to disrupt the users of the system. We are currently using this approach to implement a dynamically reconfigurable audit system in which link specifications are downloaded that, when instantiated, create a pathway for the deployment of an audit questionnaire and the transmission of its data back to the audit service.
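The text does not give the XML schema used for link specifications, so the following sketch invents one purely for illustration: the element names mirror the four parts of the linkspec tuple, and the registries `FUNCTIONS` and `OPERATIONS` are hypothetical. It shows how a specification received as text could be turned into a working link at run-time without recoding.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict

# Hypothetical XML layout; the real Paraglide schema is not given in the text.
SPEC_XML = """
<linkspec name="lab-lines">
  <source-type>document</source-type>
  <destination-type>collection</destination-type>
  <operation-type>add</operation-type>
  <function-type>split-lines</function-type>
</linkspec>
"""

# Registries of behaviours that a specification may refer to by name (assumed).
FUNCTIONS: Dict[str, Callable] = {"split-lines": lambda doc: doc.splitlines()}
OPERATIONS: Dict[str, Callable] = {"add": lambda dest, items: dest.extend(items)}


def parse_linkspec(xml_text: str) -> Dict[str, str]:
    """Read the four type fields of a link specification into a dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}


def instantiate(spec: Dict[str, str], source, destination) -> Callable[[], None]:
    """Create a runnable link from a parsed specification."""
    link_function = FUNCTIONS[spec["function-type"]]
    operation = OPERATIONS[spec["operation-type"]]

    def fire() -> None:
        operation(destination, link_function(source))

    return fire


spec = parse_linkspec(SPEC_XML)
results: list = []
instantiate(spec, "Hb 13.5\nWBC 6.1", results)()
```

Because the specification only names behaviours, supporting a new link type amounts to delivering a new XML document, provided the named function and operation are already registered.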
4. Conclusions and future work
We are currently working on enhancing the link mechanism, including support for bidirectional links and link collections. We are also investigating the application of Paraglide technology in other related application domains, including new medical settings and police work.
5. References
[1] M Gardner, M Sage, P Gray, CW Johnson, Data Capture for Clinical Anaesthesia on a Pen-Based PDA: Is It a Viable Alternative to Paper?, in A Blandford, J Vanderdonckt and P Gray (eds), People and Computers XV Interaction without Frontiers, Joint Proceedings of HCI 2001 and IHM 2001, p 439-456, BCS Conference Series, 2001.
[2] P Gray and M Sage, Dynamic Links for Mobile Connected Context-Aware Systems, in M Little, L Nigay (eds), Engineering for Human-Computer Interaction, Lecture Notes in Computer Science 2254, p 281-299, Springer, 2001.
[3] M Sage, M Gardner, P Gray, A Multi-Scaled Display Technique for PDAs, in Proceedings of CHI 2001, p 123-124, ACM Press, April 2001.
Acknowledgements
This work was carried out with the support of EPSRC grant GR/M53059. We also wish to thank our colleagues on the Paraglide Project, Professor Chris Johnson, Professor Gavin Kenny and Dr Martin Gardner, who contributed to the work reported here.
Chapter 23
Communicating Devices, Multimode Interfaces and Artistic Creation
Guillaume Hutzler, Bernard Gortais and Gerard Poulain
LaMI - UMR 8042 CNRS/Evry University, France and Laboratory for Data Processing, University of Paris VI, France
1. Introduction
In both new technology and artistic creation, information is now communicated over networks that have their own specific codes. Almost all professional fields now invest heavily in multimedia systems and networks, and the field of art is no exception. In our time, the transition from an industrial mode of production to a predominance of information services is marked by the coexistence of creations drawn from varied artistic traditions. In the art produced in France one can simultaneously find artistic developments that range from prehistoric times to the present, using a varied range of technologies. Among these one also finds the new information and communication technologies; naturally so, because artistic development consists in answering contemporary questions.
Today, information processing systems are more and more autonomous. They are becoming, and will increasingly become, expressive and intelligent. In passing from "servomechanism" to "brain-mechanism", communicating entities unload, express, analyse, learn and collaborate with us. For the moment, however, this collaboration comes at the price of the user having to learn these modes of expression and communication.
Certain approaches to artistic creation are confronted with this evolution. They integrate information processing systems and communicating objects. They associate scientific reasoning with the process of artistic creation without setting one against the other. Each provides the other with material for thought and creation, without giving up its primary function of knowledge and communication. It is within this framework that "Mises en Scenes" is presented.
2. To establish a creative dialogue between a dancer-actor and a system
The objective of the "Mises en Scenes" project is to develop an experimental device connecting a dancer-actor with an intelligent graphic and sound information processing system via the most natural and intuitive methods of interaction possible. From the creative point of view, the aim is to set up a dialogue between the dancer and the information processing system: the gestures and movements of the former are a proposal that guides, or simply directs, the musical and other reactions of the latter; symmetrically, the dancer reacts in turn to the response suggested by the information processing system.
From the data-processing point of view, the whole question of the interface between human and machine is thus posed on new bases, and is a potentially rich source of research directions. The machine must, on the one hand, perceive and interpret what the human user proposes to it in a manner that is only partly formalised and, on the other hand, generate in return a multimode response in accordance with the expressed request. A platform made up of intelligent agents and networked communicating devices, dedicated to the activity of a dancer, can be regarded as a possible actor assisting with the design, production and execution of a show. One is then led to:
• new interfaces, with control of data feedback on the effect produced and on feeling;
• relational interactions between actors, between actors and directors, and between actors and the public;
• semantic interpretations taking account of prepositional differentiations.
2.1. The experimental device
The human "dancer-actor" who carries out the performance interacts with an autonomous information processing system, a kind of digital partner with which the dancer will have to establish a creative dialogue. For this to be possible, the system itself must perceive and analyse the performance of the human dancer, imagine a "suitable" response to this performance and finally carry it out. This corresponds to the three traditional phases that any autonomous agent follows: perception, decision, action (see Figure 23.1).
Figure 23.1 The autonomous information processing system perceives the movements of the dancer via suitable sensors, decides on a suitable response and carries it out
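The perception, decision, action cycle can be sketched as three functions chained together. Everything concrete here is an assumption made for illustration: the sensor reading (a speed in metres per second), the threshold, the two qualities and the responses are invented and do not come from the Mises en Scenes project.

```python
from typing import Dict

# Hypothetical repertoire of responses, indexed by perceived quality.
RESPONSES: Dict[str, Dict[str, object]] = {
    "agitated": {"tempo_bpm": 140, "colour": "red"},
    "calm": {"tempo_bpm": 60, "colour": "blue"},
}


def perceive(speed_m_per_s: float) -> str:
    """Perception: abstract a raw sensor reading into a quality."""
    return "agitated" if speed_m_per_s > 1.0 else "calm"


def decide(percept: str) -> Dict[str, object]:
    """Decision: choose a response suited to what was perceived."""
    return RESPONSES[percept]


def act(decision: Dict[str, object]) -> str:
    """Action: carry out the response (here, just describe it)."""
    return f"play at {decision['tempo_bpm']} bpm with {decision['colour']} light"


def agent_step(speed_m_per_s: float) -> str:
    """One full cycle of the autonomous agent."""
    return act(decide(perceive(speed_m_per_s)))


fast = agent_step(2.0)
slow = agent_step(0.2)
```

The value passed between the stages is the point of interest: the perception module must hand the decision module a representation it can reason about, and the decision module must hand the action module something it can execute.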
In other words, the questions to be considered are:
• How will the perception module represent the performance of the dancer so that the decision module can choose the response to be given?
• How will the decision module represent the response that the system is to produce through the action module?
In addition, to establish a dialogue between the dancer and the system, it is necessary to lay the foundations of a common language between the two. The same applies whether the dialogue is between two modules of the system or between the system and the dancer. In each case it is necessary to define:
• a common semiology, so that each understands the other (the "what?");
• a common technical platform, so that they can actually communicate (the "how?").
From the semiological point of view, the approach adopted consists in analysing in a similar way the various fields of expression involved in the planned show (dance, music, painting and staging), in the form of pairs of qualities (for example "hot-cold"). Each pair of qualities can be represented by a quantitative value, and the staging of the show could be based on the writing of scores indicating the evolution of each of these qualities over time. These scores, which make it possible to analyse the dynamics of the interaction between dancer and system, must allow the three modules of the autonomous system to communicate with one another.
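One possible encoding of a pair of qualities and of its evolution over time is sketched below. The choice of a value in the range -1 to 1, the "neutral" midpoint, and the names `QualityPair` and `read_score` are assumptions; the text only states that each pair can be represented by a quantitative value that evolves over time.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class QualityPair:
    """A pair of opposed qualities mapped onto one value in [-1, 1] (assumed range)."""

    negative: str
    positive: str

    def label(self, value: float) -> str:
        if not -1.0 <= value <= 1.0:
            raise ValueError("value must lie in [-1, 1]")
        if value > 0:
            return self.positive
        if value < 0:
            return self.negative
        return "neutral"


# A score: the value of one quality pair sampled over time (seconds, value).
Score = List[Tuple[float, float]]


def read_score(pair: QualityPair, score: Score) -> List[Tuple[float, str]]:
    """Translate a quantitative score into the qualities it expresses."""
    return [(time, pair.label(value)) for time, value in score]


hot_cold = QualityPair(negative="cold", positive="hot")
temperature_score: Score = [(0.0, -0.8), (10.0, 0.0), (20.0, 0.6)]
labels = read_score(hot_cold, temperature_score)
```

A single number per pair keeps the representation identical across dance, music and image, which is what would let the three modules exchange it.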
2.2. The dancer and the communicating physical device
The experimental device as initially defined involves one dancer and a multimedia information processing system. From the point of view of staging, however, it is interesting to introduce devices on the stage that the dancer can use. These communicating physical devices, whose usual function is to offer services to users, would here play the role of props or of fully fledged elements of the staging. These communicating physical devices are equipped with:
• an interface with the network;
• information-handling capacities;
• sensors and/or actuators.
These characteristics enable them to interact with one another and with the user. In physical interaction with their environment in the broad sense, these communicating physical devices can:
• modify their state;
• act on other objects;
• interact with the user;
• interact with their environment.
The term "communicating devices" in fact covers a whole range of very different digital devices with varying capacities for interaction. Such devices can perceive the action of a user (direction, distance, acceleration, temperature, contact sensors, etc), present information to the user or carry out actions (screens, stepper motors), or allow the user to control other devices (remote controls, PDAs, etc). These devices can become actors in their own right, taking part directly in the dynamics of the dancer's performance and of the show in general. Through the use of tags (active or passive badges), these objects can allow simple localisation of the dancer on the stage. Finally, more than a dialogue between a dancer and a system of software agents, the show will be composed of a dialogue between the human actor, communicating physical devices and software agents for multimedia composition.
2.3. Operational assistance
This tool constitutes a personal assistant in the form of an intelligent prosthesis, a capture instrument and a shared interpretation model. Three examples illustrate this operational assistance:
1. A capture instrument: sensor-type interfaces, whether remote or worn on the body, and video cameras or spatial positioning devices provide many sources for tracking the position, body language and physical and emotional state of an artist during a performance.
2. A shared interpretation model: the feedback of feeling from another actor, the director or the witnesses guides the artist on stage in knowing at which point, in which phase or movement, the effects work best, and which environments increase or decrease emotional resonance.
3. An interpreter: grammars and grids for interpreting gesture, rhythm, posture, colour and sound make it possible to explore new forms of representation and staging.
3. Analysis of interfaces, behaviours and gestures of a dancer
To analyse, apply and illustrate the directions of exploration concerning, respectively, individual, common and collaborative representations and the transfer and sharing of roles between humans and systems, it seemed to us that the activity of a dancer, staged during an event, could serve as a basis for exploration and as an illustrative example. This staging is accompanied and reinforced by physical devices that observe and capture information on the one hand (communicating objects), and that produce proposals and mediations on the other (intelligent agents).
3.1. The perception of the show by the system
The perception of the show consists of the analysis by the system of everything external to it, in particular what occurs on the stage and in the audience.
This covers, in particular, all movements of the dancer, his bodily and gestural attitudes, his physiology (heart rate, breathing, surface tension of the skin, etc), his words and, more generally, all his interactions with the environment (devices, other dancers, the public, etc). This analysis is a stage of abstraction that represents the raw perceptions of the system in terms of qualities as defined in the preceding section, then as scores of qualities. It will be carried out by considering each medium of expression as the carrier of a communication, for which a preliminary phase of analysis will make it possible to define the words, syntax and semantics (Figure 23.2).
Figure 23.2 General principle of communication of a message (gestural, language, physiological, etc.)
Figure 23.3 Client-server model of the platform
3.2. Displacement and movements of the dancer
3.2.1. Constraints
There are various motion-capture techniques, but their performance is not equivalent with respect to our criteria:
• perception: this must be as precise as possible, although it is possible to work and do interesting things even with low precision;
• response time: the interaction must take place in real time; perception must therefore not only be carried out in real time but also leave computing time for the other necessary operations (decision, realisation of the performance);
• the device: this must be as discreet as possible, so as not to centre the scenography on technology, and as unobtrusive as possible, so as not to hinder the dancer in his choreography.
3.2.2. Techniques considered
In view of this summary of the various existing solutions, we chose, as a first approach, the technique that is least constraining for the dancer and most economical: the analysis of video images. If the precision obtained subsequently proves insufficient, other, heavier techniques could be considered (Figure 23.4).
Figure 23.4 Perception of the dancer's performance by the system
In the device considered, a camera (webcam type or more sophisticated) films the stage continuously. The exact position of the camera (facing the stage, at 45 degrees as in the figure above, or directly above the stage) will be specified according to the results of the first experiments. A 3D model of the stage will have to be developed to facilitate the recognition of the position and postures of the dancer. Recognition of the movements of the dancer is carried out in several stages:
• first of all, primitive descriptors are extracted from the video image, including the bounding box, the direction of displacement (high/low, forward/back, left/right) and the dynamics (acceleration/deceleration). These descriptors can be supplemented by reliability information. Commercially available webcams are sold with software that already does part of this work;
• these descriptors can then be used, together with structural information on the human body, to determine the position of the dancer on the stage and to recognise postures.
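The first stage above (primitive descriptors) can be sketched in a few lines of plain Python. The sketch assumes that each video frame has already been reduced to a binary mask in which 1 marks a pixel belonging to the dancer's silhouette; how that segmentation is obtained is outside its scope, and the function names are hypothetical. Only left/right and high/low are derived, since forward/back cannot be read from a single 2D mask.

```python
import math
from typing import List, Optional, Tuple

Mask = List[List[int]]           # rows of 0/1 pixels; 1 = dancer silhouette (assumed input)
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixel coordinates
Point = Tuple[float, float]


def bounding_box(mask: Mask) -> Optional[Box]:
    """Smallest box enclosing all silhouette pixels, or None if there are none."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def centre(box: Box) -> Point:
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def direction(previous: Point, current: Point) -> Tuple[str, str]:
    """Horizontal and vertical direction of displacement between two frames."""
    dx = current[0] - previous[0]
    dy = current[1] - previous[1]
    horizontal = "right" if dx > 0 else "left" if dx < 0 else "none"
    # Image rows grow downwards, so a positive dy means moving lower.
    vertical = "low" if dy > 0 else "high" if dy < 0 else "none"
    return horizontal, vertical


def dynamics(p0: Point, p1: Point, p2: Point) -> str:
    """Compare the distance covered in two successive frame intervals."""
    before, after = math.dist(p0, p1), math.dist(p1, p2)
    if after > before:
        return "acceleration"
    if after < before:
        return "deceleration"
    return "constant"


frames: List[Mask] = [
    [[1, 0, 0, 0, 0], [1, 0, 0, 0, 0]],
    [[0, 1, 0, 0, 0], [0, 1, 0, 0, 0]],
    [[0, 0, 0, 1, 0], [0, 0, 0, 1, 0]],
]
centres = [centre(bounding_box(frame)) for frame in frames]
```

With equal time between frames, comparing successive displacements is enough to distinguish acceleration from deceleration; a real system would also need to smooth out noise in the masks.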
4. Analyses of relational interactions
In parallel and jointly, the analysis of relational interactions is a matter of observing, clarifying and controlling the transition from processes of relational organisation based on the traditional functional distribution of roles to those based on the new interpolations that arise during the creation of an event (new distributions of roles between stage, systems, actors and public, in continuous interaction). One models the processes that govern the passage from one model to another:
• from a traditional system to a designed, dynamic multimedia system in which all the actors are present simultaneously and in their own right;
• from a stage layout (public, stage, scenery, machinery) to a layout of shared and evolving common space.
The staging of an interactive event connecting a dancer/actor and the system emerges from semantic and software research. There is today artistic research in the world of the arts that goes in the same direction as ours.
4.1. State of the art
Our work belongs to a whole range of creations that use sensors as a means of obtaining information on the environment. We use the information processing system as a partner in creation. Within this framework, there are several approaches, ranging from the scripted approach to the behavioural approach.
The scripted approach consists of triggering a precise script that will always produce the same effect in the same circumstances. This type of system, whose typical example is the CD-ROM, still constitutes the basis of the majority of interactions in technological artistic creations. This principle of interaction is rigid and its effect often predictable, and it is consequently detrimental to the acting and to the quality of the show. It leads the actor to behave like an explorer seeking all the possible solutions in the system, to the detriment of the expression of emotion.
We use a multi-agent system based on artificial intelligence and endowed with behaviour, that is, a group of real or virtual entities interacting in an environment, able to perceive it and to communicate. The agents have autonomous behaviour and knowledge, and can make decisions according to a goal. In Mises en Scenes the agents constitute a show cluster made up of musician agents, dancer agents, visual-artist agents, actors, multimedia, etc. They operate jointly, but each is directed towards fulfilling the wishes of the director or the actor. As learning cycles are performed, the system acquires a personality, that is, a way of being on stage, comparable with that of its human partner. The interaction between the system and the actor never occurs twice in exactly the same manner, since the goal of the rehearsals is to define a coherent space of improvisation for the actor, the director and the system.
The system can be partly or completely transformed, if necessary, into a scripted system. The more scripted the system, the closer one comes to traditional methods of staging. The more autonomous the system, the less repetitive it is. The work becomes a living organism, which implies risk taking and permanent vigilance.
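The contrast between the two approaches can be made concrete with a toy example. This is not the multi-agent system of Mises en Scenes: the gesture names, the cues and the very simple "memory" of the behavioural agent are invented solely to show the difference between a response that depends only on the input and one that also depends on the history of the interaction.

```python
from collections import Counter
from typing import Dict

# Hypothetical script: one fixed cue per recognised gesture.
SCRIPT: Dict[str, str] = {"jump": "drum roll", "spin": "violin glissando"}


def scripted_response(gesture: str) -> str:
    """Scripted approach: the same gesture always triggers the same effect."""
    return SCRIPT.get(gesture, "silence")


class BehaviouralAgent:
    """Behavioural approach (toy): the response depends on internal state."""

    def __init__(self) -> None:
        self.seen: Counter = Counter()  # how often each gesture has occurred

    def respond(self, gesture: str) -> str:
        self.seen[gesture] += 1
        base = SCRIPT.get(gesture, "silence")
        repeats = self.seen[gesture] - 1
        # A repeated gesture is answered with a variation, not a replay.
        return base if repeats == 0 else f"{base}, variation {repeats}"


agent = BehaviouralAgent()
first = agent.respond("jump")
second = agent.respond("jump")
```

Resetting or freezing the agent's state would make it behave like the script again, which mirrors the remark that an autonomous system can be turned, partly or completely, into a scripted one.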
4.2. Design of the interactive event
4.2.1. General objective
It is too early to describe a staging, which will start to emerge only after a period of reflection and creation, as much on the scientific level as on the artistic level. The partnerships and the content of the interactive event itself will result naturally from the meetings and relationships that the research makes it possible to form. This is a process of creation and, as such, it must remain open for as long as possible. We can, however, underline the constraints of the Mises en Scenes experimental device.
4.2.2. Director, actors and dancers augmented
The purpose of the Mises en Scenes project is to propose an interactive platform based on behaviour, together with creation tools usable by other artists. The tool should make it possible for the actor to extend his traditional means of expression by amplifying them and by translating them into multimedia sequences. The communicating platform transforms both the design and the execution of the work. In the design phase, the responses that the system gives to the proposals of the actor or the director can exceed his imagination and surprise him, while still suiting him in terms of staging. During the performance, the public, whose activity can be perceived by the system, can interact with the system and the dancer, for the purposes of the staging.
4.2.3. Arts of time
The various media have in common that they are all arts of time. Choreography, music, sound environments, theatre, visual arts, video and puppetry, often separated from one another for technical and cultural reasons, become for the system particular ways of treating sound (voice, music, sound environments, song), light (light, colours, forms) and movement (displacements, dance, acrobatics, gestures). See Figure 23.5.
Figure 23.5 Reasons for a connection in settings in scenes
4.3. Ergonomics
4.3.1. The metaphor of the stage
The Mises en Scenes project is a metaphor for the multimodal relationship that human beings have, or will have in the immediate future, with the machine. We are already surrounded, in public places, at work and at home, by artefacts and communicating devices. We are surrounded by an average of 40 microprocessors, including 5 able to communicate via networks (cable, electrical, wireless, telephone, etc). It is generally estimated that, within ten years, we will be in permanent contact in everyday life with several hundred microprocessors, the majority of which will be connected to these same networks (Internet or others). Mises en Scenes is a platform on which this new world of intelligent man-machine interfaces, which allow intuitive semantic relations through body gesture, sound or vision, can be tested.
4.3.2. Semiologies
Figure 23.6 Relations between the functions of the experimental device: classical system (left); intentional multimedia system (right)
In the field of research as in the field of creation, it would be in our interest, here too, to exchange knowledge between technicians, artists and ergonomists. The metaphorical relations that exist between the stage of the experimental device and everyday life, as between the actor and the user of new information and communication technologies (NTIC), should be explored with the ergonomists of France Telecom. We will base ourselves not only on the semiology of the fields of multimedia expression but also on modelling of the functions and the space of the interactive device. Analysing the specificity of the system requires analysing the relations between the elements that make it up.
For the design of the man-machine interface, it is necessary to identify and model the various functions that come into play in a traditional show (author, actor, dancer, set designer, musician, manager, etc) and to observe how these functions are transferred during a Mises en Scenes creation. We will also obtain valuable information by modelling the stage space, traditionally composed of technical spaces, elements of scenery, the stage, the space reserved for the public and the theatre itself.
The semiology of the various media of the stage is not very different from the semiology of the media used in everyday life. The multisensory interface of Mises en Scenes will initially make it possible for a group of actors to create their own image. This can subsequently be used by the system to propose a response to a situation while behaving like a multimedia virtual partner. The Mises en Scenes project should make it possible to analyse the characteristic cognitive and physical behaviour of users and, beyond physical behaviour, the mental aspects of the interaction (perception, memorising, reasoning). This involves the semiological study of the various fields of expression and the experimental use of Mises en Scenes.
5. Analyses of semantic interpretations
In parallel and as a complement, the aim is to analyse the informational and emotional meanings of interactions between communicating objects, humans, activities and systems, in order to validate models for generating new forms enriched by communication, models that are sufficiently explicit and explanatory to be effective, controllable and able to evolve in real time, in the direction of a better humanisation of interfaces. The work is based on modelling artistic activities, which are by nature complex and strongly charged with meaning and emotion, using a semantic analysis of meaningful elements such as positioning, gestures, words and moods.
Figure 23.7 Variations of the expression for a point and a line in a square
5.1. Analytical fields of artistic expression
There is an analytical field in any artistic form of expression, in which the creator tries to define the role of his art in order to acquire a freedom determined by his gestures and decisions. The visual artist knows the laws of colour, the architecture that an image can present, the symbolic system of forms and formats, etc; the composer knows the laws of harmony, the instruments, the musical structures, etc; the choreographer knows how to establish a score by systematically breaking down a sequence of movements, which he characterises by posture, intensity, trajectory, etc; the actor knows a range of gestures, types of movement and modulations of the voice suited to a given expression. Thus there is a semiology of visual signals, a semiology of sound signals, a semiology of the gestural field of expression and a semiology of scenography.
The process of creation connects, dialectically, the analytical field of expression based on semiology with an undetermined field of expression, in order to express a new relation. In other words, at a given time, a creator knows the tools at his disposal and how to use them. To organise and understand the data that it receives, a creative system must also have functions of semiological evaluation.
A second stage of analysis consists in working out a correspondence between the elementary physical qualities identified in each field of expression and the emotional qualities that result from a combination of physical qualities. One might consider, for example, the "tragic" emotional quality.
5.2. Towards a common multimedia writing in the form of grids
By transforming each quantitative value of the means of expression, obtained from the sensors or from the information processing system (quantity of energy, colour values, speed of movement, loudness, etc.), into comparable qualities (tragic, peaceful, hot, cold, noisy, calm), we establish a virtual common platform for the various means of expression: a grid of reference. The role of the staging will then be to establish, with reference to these comparable qualities, the rules of behaviour for the actors of the virtual platform and the human actor (Figure 23.8).
The performance of the dancer as perceived by the system, and the response that the system gives to it, are represented by the integral, that is, the complete set, of the quality grids of the various superimposed fields of expression. If one considers this integral over time, each field of expression can be regarded as the score of one instrument, and all of the scores together as the score of the whole orchestra. It will be interesting to try applying the notational formalisms used in contemporary music to the scores of the grids, in order to imagine a temporal multimedia writing based on emotional scores.
Figure 23.8 The integral of the multimedia grids constituting the performance score
The table below shows an example of a grid of reference, in which the rows correspond to the four forms of expression (body, musical, plastic, scenographic) and the columns correspond to a succession of moments or phases of varying duration.

Phase | 1 | 2 | 3 | 4 | 5
body expression | peaceful | peaceful | tragedy | tragedy | peaceful
musical expression | silent, rhythmic | piano, rhythmic | mezzo-strong, asynchronous | strong, asynchronous | silent, rhythmic
plastic expression | cold, static | cold, dynamic, slow | hot, dynamic, slow | hot, dynamic, rapid | cold, static
scenographic expression | calm, closed | calm | a little agitated | agitated | calm, closed
6. Results
One thus obtains a framework, a model, a method and an instrument for producing unique events, unique because they are non-deterministic, non-repetitive and enriched by the perception, representation and emotion of each interactor. For that purpose, it is necessary to start with a simple relation between an actor and a system, and to find the means for them to communicate not only in terms of state or information but also in terms of reciprocal, real-time regulation. To take the analogy of the mirror, this amounts to moving from a passive instrument that reflects an image, returning part of it that one can enlarge without deforming, to a mirror that gives another reading of your image, like a double that perceives the impression you want to convey to it, as to the public, rather than yourself. In terms of repercussions for the interfaces of communicating devices, nothing then prevents one from imagining a collaborative dialogue in which the exchanges reflect part of yourself (personalisation) and part of the impression that you wish to give of yourself to the outside (adaptation).
6.1. Applications - general context
The Mises en Scenes project opens up a large number of potential applications in art, cognitive sciences, man-machine interfaces and industrial control. It allows the representation of new types of interactive, truly multimedia devices. The relationship between the partners of the performance is modified: the content of the performance appears at the same time as its form. The new relations with space and with the public make it possible to consider scenography in a wide variety of multimedia spaces: interactive events, teleconferences, exhibitions, museums, theatres. In the field of cognitive sciences, Mises en Scenes is a platform for studying communication in everyday life, of which the multimedia performance is a particular case. The system should improve the means for the dynamic study of creation and improvisation, for the analysis and understanding of human emotional aspects from a variety of sensors, and for the development of non-verbal interaction.
6.2. Demonstration of innovative technologies
Artistic expression and everyday expression reflect and nourish one another. Some examples: Moliere, Brecht and Beckett in the theatre; Stravinsky, Dvorak, Armstrong and folk music; Martha Graham, Cunningham and the introduction of industrial body gestures into contemporary choreography. It is no accident that the terms used are the same. We frequently hear of the scenario of everyday life when typical everyday situations are described. We often use the word theatre to indicate the place where some topical event occurs. The means of expression are often shared. To quote only one example: industrial descriptions, codes of conduct, advertising codes and advertising images use the same laws of colour and the same symbolic system of lines and forms as visual artists.
It is the use of the same semantic fields that makes the relation between the work and the spectator possible. The metaphor of the stage is based on the stylisation of everyday means of expression. It is a symbolic notation of everyday life. From the stage of life to the stage of the performance, the distance is often not very great.
7. Conclusion
From the capture of information to its interpretation in context and the generation of alternative solutions, whether to solve a professional or everyday problem or to stage a performance, the underlying mechanisms and ingredients are the same. Only the goals and the situations experienced differ. The models used to analyse artistic activities generate rules or regularities that make it possible to master a grammar of communicable and skeletal expressions, which can therefore be used for communication purposes. Here the technique does not replace man but helps him to understand, to choose and to act. In time, as an intelligent and emotional prosthesis, this multimode supplement brought about by the mastery and exploitation of art will further humanise current interfaces, allowing a move from the command or input/output mode to a conversational or cooperative mode, with a user-friendly mode of relation or interpretation.
Chapter 24
Powering Communicating Objects Didier Marquet France Telecom R&D, France
1. The powering issue
There is no single problem and no simple solution to the issue of powering communicating objects, because their perimeter is fuzzy and their functions are scattered over a wide variety of familiar objects (household appliances, clothes, vehicles etc). They have very different uses (data processing, interfaces, communication), and fixed objects are linked through nano networks (like Bluetooth) with portable objects such as PDAs, mobile phones and PCs. Their power consumption therefore spreads over a very wide range, from nanowatts to some tens of watts, depending on the object, its operating mode (standby or active) and its interactivity. To determine suitable energy sources and charging methods, we have tried, with our present knowledge, to identify and classify communicating objects with respect to their power consumption and use constraints. The most difficult objects to power are, obviously, those that are mobile and have higher energy consumption. Because they are not always connected to an electric grid, they must have fairly long autonomy, while their weight and volume are limited. Unfortunately, the energy density of electric power sources and storage is rather small. Energy management must also remain simple for these objects, which are becoming more and more ubiquitous in our environment. The power supply of these objects consists of a complex chain: a part internal to the object, comprising the source (battery + safety circuit) and internal power management (activity control, energy level metering, fault detection), and a second, external part comprising an intelligent charger, a power distribution network (electricity mains, fuel pellet resellers) and external management (charge level, ageing level of power sources) etc. This approach to the powering of communicating objects may help to indicate directions for future development.
2. Classification of objects by use and practice
To define the essential qualities that the powering of communicating objects should possess, it seemed useful to describe the use constraints that depend on the degree of mobility. They are inspired by wider studies on known objects:
1. Very mobile objects, permanently or often carried about (eg clock, e-pen, PDA, phone, MP3, DVD, CD walkman, e-book etc.). The power source must be light and compact. It may offer very long autonomy, for example several weeks for the Palm PDA, which avoids the need to carry the charger; another example is the clock battery, which may last several years. Except in these rare cases, energy recovery must be easy, either with a powerful, light and compact charger, by primary battery replacement, or by accepting several energy sources [3 MARQUET, 2001]. A level meter is also necessary.
2. Mobile objects that we carry occasionally or regularly (digital camera, movie recorder, handbook etc). Extension or restoration of energy must be easy (battery replacement, primary/secondary cell compatibility, light and compact charger). These objects may be stored for a long time, and energy must always be available when needed (with low self-discharge, taken into account by the energy meter). A small object may be charged by a larger one (a PC may power an external device via the USB plug).
3. Objects of limited mobility that we use and move about in the house or garden (remote control, wireless phone, radio etc). Energy restoration must be transparent (docking to charge the object, infrequent primary battery replacement). At least a discharge signal is needed.
4. Slightly movable objects (keyboard, mouse, wireless joystick, temperature or light sensors, heating or cooling control, RF ID label, webcam). Solutions are autonomous powering, a single cable that carries both signal and energy, or inductive charging. The aim is to reduce wiring. RF ID labels, i-cards and e-buttons must be in contact with, or centimetres away from, the reader to receive energy.
5. Heavy and rather static objects (play station, computer, HIFI, TV-movie recorder, home cinema, household devices etc). These devices are plugged into the mains. Here also the aim is to reduce the wiring, using appropriate energy distribution (low voltage bus, remote powering). Moreover, the standardisation of "green electronics" includes a standby mode or an on-off switch, and consequently management of consumption and back-up of parameters in memory.
6. Particular objects that move autonomously, like robots. Such robots have their own energy source. Autonomy of movement is associated with self-powering [3 SEMPE 2001]. An ultra-fast charge is useful for these objects when they are heavily used or when several objects share the same charging dock.
This classification of objects is neither absolute nor complete, but it allows one to identify the problems of powering.
3. Energy needs
In order to find powering solutions, it is necessary to consider the instantaneous power need P(t) and the energy need E, ie the energy consumed, given by $E = \int P(t)\,dt$. The power drawn by communicating objects is very variable: they frequently switch between active and standby modes to optimise energy consumption. Standby consumption covers the minimum number of functions needed for fast restart, for example when radio reception begins.
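The relation between P(t), E and autonomy can be sketched numerically. The Python fragment below is only an illustration under simple assumptions: the power profile is sampled at a fixed time step, the integral is approximated by a rectangle rule, and the function names and example figures are hypothetical rather than taken from the chapter.

```python
def energy_wh(power_samples_w, dt_s):
    """Approximate E = integral of P(t) dt with a rectangle rule; result in Wh."""
    return sum(power_samples_w) * dt_s / 3600.0

def average_power_w(p_active_w, p_standby_w, active_fraction):
    """Mean power of an object that alternates between active and standby modes."""
    return active_fraction * p_active_w + (1.0 - active_fraction) * p_standby_w

def autonomy_hours(capacity_wh, p_avg_w):
    """Autonomy obtained from a source of given usable energy at a given mean power."""
    return capacity_wh / p_avg_w

if __name__ == "__main__":
    # Assumed figures: 1 W when active, 10 mW in standby, active 10 % of the time.
    p_avg = average_power_w(1.0, 0.010, 0.10)
    print(f"mean power: {p_avg:.3f} W")
    # Assumed source: 600 mAh at 3.6 V, ie 2.16 Wh.
    print(f"autonomy: {autonomy_hours(0.6 * 3.6, p_avg):.1f} h")
```

The duty-cycle form makes visible why standby consumption matters: when the active fraction is small, the standby term dominates the mean power and therefore the autonomy.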
Figure 24.1 Standby and active autonomy of some objects
From the energy consumption one derives the autonomy, which is an essential feature of wireless communicating objects; it must match the use, or the ergonomics of the object will be very poor. Unfortunately, autonomy tends to decrease with user-friendly functions and on-board computing power (colour graphical interface, data compression, packet radio standard) as well as with miniaturisation. Moreover, objects offering more functions (data download, streaming) will be used for a longer time on average.
As an illustration, Figure 24.1 presents the autonomy of some objects in standby mode (left scale) and in active mode (right scale). Some objects are only active when interacting with the user, and their autonomy is, above all, limited by the self-discharge of the energy source.
4. Selection of energy sources for objects
Use and practice are, of course, of prime importance for the functional and ergonomic definition of the object, but, as previously seen, they translate into energy needs. In choosing one or more possible sources, the essential criterion is instantaneous electric power; the energy is then a combination of the average power and the autonomy for each user profile. We will therefore deal with both aspects.
4.1. Micropower objects: nW to µW
These communicating objects present no problems at the moment; an example is the RF ID label, self-powered by the EM field of the RF ID reader. In the field of nanotechnologies (eg medicine, DNA processors), MIT (Massachusetts Institute of Technology) has studied low speed processors that consume less than 1 µW, which offers several powering possibilities, such as solar cells adapted to artificial light.
nW to µW: Full autonomy is easy to obtain with an integrated source (battery cell, solar cell, inductive interface, etc).
4.2. Low power: µW to mW
Electricity consumption is generally some mW to some tenths of mW, but this excludes functions like a fast backlit colour screen (eg calculator, wireless mouse, sensors, radio). In this case, primary batteries are suitable. For some mW, it is possible to reduce the drain on the battery using a solar cell, but the object must be close to a window; artificial light is not sufficient at this level of power. The battery might be replaced by a small capacitor or an ultra-capacitor of several farads.
µW to mW: One can target full autonomy of several years with primary batteries with low self-discharge, or with a rechargeable battery or capacitor (ultra-capacitor) associated with photovoltaic cells, which avoids battery replacement.
4.3. Average consumption: mW to W
This range of power corresponds to objects equipped with a low speed processor (< 30 MHz), a short distance radio link and an LCD screen, which often offer an autonomy of some days or even weeks. These frequently used objects (PDA, DECT, remote controls) are generally equipped with rechargeable batteries (lower running cost than primary cells), with the highest possible capacity in a given space so as not to recharge too often. The need for functions like synchronisation has the huge advantage that we naturally dock the object for both communication and recharge. The recharge is rather slow (some hours), but the object usually remains on the dock for a long time. As these functions tend to be performed over a wireless link, the need to dock the object might decrease, which is a problem for recharging. Fortunately, one good reason to dock it anyway is simply to put it away (as with a DECT handset).
Good autonomy is possible with primary batteries, but only for objects used intermittently. Accumulators have a much lower running cost but need to be recharged. Docking some objects (PDA, DECT) to recharge them or put them away is a transparent operation that eases recharging.
4.4. High power: W to tens of W
This concerns equipment with powerful processors, large backlit screens, motors or lasers (eg audio players, motorised webcams, robots). Consumption is very variable, with considerable power changes. Energy management must be treated with great care because autonomy in active mode is very limited. Rechargeable batteries associated with a fast charger are one solution. Size compatibility between primary cells and rechargeable batteries is sometimes possible, as is the association of a rechargeable battery with other power sources (solar panel, fuel cell, manual dynamo, low voltage plug etc).
An autonomy of some hours calls for rechargeable cells and a simplified charge (dock, fast charge on mains, low voltage local network).
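The four power ranges of sections 4.1 to 4.4 can be summarised as a simple lookup. The Python sketch below is illustrative only: the sharp boundaries at 1 µW, 1 mW and 1 W, the function name `power_class` and the wording of the suggestions are assumptions that condense the text above, and a real choice would also weigh the use constraints of section 2.

```python
# Candidate sources per power class, condensed from sections 4.1 to 4.4.
SUGGESTED_SOURCES = {
    "micropower": ["integrated battery cell", "solar cell", "inductive interface"],
    "low power": ["primary battery with low self-discharge",
                  "rechargeable battery or ultra-capacitor with photovoltaic cell"],
    "average consumption": ["rechargeable battery with charging dock",
                            "primary battery for intermittent use"],
    "high power": ["rechargeable battery with fast charger",
                   "rechargeable battery combined with another source"],
}

def power_class(p_avg_w):
    """Classify an object by mean power; the sharp thresholds are an assumption."""
    if p_avg_w <= 0:
        raise ValueError("mean power must be positive")
    if p_avg_w < 1e-6:
        return "micropower"
    if p_avg_w < 1e-3:
        return "low power"
    if p_avg_w < 1.0:
        return "average consumption"
    return "high power"

if __name__ == "__main__":
    for watts in (5e-7, 5e-4, 0.05, 5.0):
        cls = power_class(watts)
        print(f"{watts:g} W -> {cls}: {', '.join(SUGGESTED_SOURCES[cls])}")
```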
5. Some energy sources 5.1. Primary and rechargeable batteries
Figure 24.2 Characteristics of primary and rechargeable batteries
Batteries, primary and rechargeable, are the most common sources and exist in a wide variety of technologies and types. To choose among them, it is essential to have a selection method based on criteria linked to technical characteristics and uses. Many parameters must be taken into account: energy density (Figure 24.2), instantaneous and mean power, operating temperature, self-discharge, lifetime, shape and flexibility, recycling and cost. Primary cells are of interest only for low energy needs, ie for objects that consume little or that are used occasionally, unless one prefers to recharge the battery. Sometimes the object (camera or walkman) must accept both options, with a choice of standard compatible formats. It should be emphasised, however, that low-end primary cells carry a risk of leakage and cannot be stored for a long time. Some examples of common current choices illustrate the prevailing selection criteria. Very small objects with high consumption and frequent use operate with economical, high energy density primary batteries; hearing aids, for example, use zinc-air cells. A clock uses a zinc-silver cell, whose voltage is ultra-stable. Alkaline cells are generally used for high currents but have a relatively short lifetime. Small lithium cells are soldered in for memory back-up (clock calendar, camera, TV channel, measurement parameters). High energy density over a very wide temperature range is required in cameras (lithium battery LiFeSO2 or Li/MnO2). An original solution is being developed by the company Powerpaper: printing flexible batteries on the surface of packaging cardboard. This would complement the range of printable electronics. For intensive use, rechargeable batteries, though expensive, are unbeatable in annual cost because they can perform many cycles. The following table (Table 24.1) sums up the main features of rechargeable batteries and shows how the choice may differ depending on use.
Table 24.1 Characteristics and use of rechargeable batteries
Technology | Energy density (Wh/kg) | Energy density (Wh/l) | Lifetime | Cycles (100%) | Complexity of charge | Maximum charge rate | Self-discharge | Cost (c euro/Wh/cycle) | Use
Lead 2 V | 20-40 | <80 | 5 years | 200 | Easy | 5 to 15 min | < 10%/month | 0,4 to 0,6 | Robot, low cost, high power
NiCd 1,2 V | 40-60 | <120 | 10 years | 1500 | Easy | 5 to 15 min | 10% in some days, 100% in 5 months | 0,1 to 0,3 | Wireless, tools, wide range of temperature, solid, high power
NiMH 1,2 V | 60-120 | <350 | 5 years | 500 | Mean | 1 h | 100% in 3 months | 0,5 | DECT, GSM, PC, picture, games
Li-ion 3,6 V | 110-160 | 250 | 2 years | 400 | Sophisticated | 1 h | < 10%/month | 1,5 to 5 | Image recorder, GSM, PenPC, PDA, camcorder
Li-ion polymer 3,6 V | 130-170 | 400 | 2 years | 500 | Sophisticated | 1 h | < 10%/month | ? | Replacement of Li-ion, flexible and safer
It must be emphasised that nickel batteries, because of their high self-discharge, need frequent recharging (weekly to monthly). For objects that use rechargeable batteries, the issue of the charger is crucial. The charger is often much heavier and bigger than the battery: it may weigh 10 times as much (200 g for a 20 g battery in a GSM phone, for a charge of 3 hours). Faster chargers exist (for example a 600 mAh lithium-ion battery charged in 1 h) that are a little lighter, at the price of greater sophistication. This type of analysis of energy in relation to use, choice of source and functional evolution will have to be made for every object. For example, the arrival of 3G mobiles (high bandwidth, remote download of sound and images), which consume more energy, is leading the main manufacturers of lithium batteries for mobiles (Sanyo, Panasonic etc) to look for battery packs with a capacity of 1000 mAh weighing less than 30 g. Projections are 400 Wh/l and 250 Wh/kg in 2005/7. As the autonomy of objects is an essential need that is difficult to meet because of increasing power consumption, much research is focused on external autonomous sources that complement or compete with the classical charger [2 MIT]. The solutions range from single sources (primary battery, solar cell, inductive coil, piezoelectric generator, dynamo etc) to hybrid sources that combine several sources and energy storage (demonstration of a pen-computer with solar panel: Figure 24.3).
Figure 24.3 Solar pen-computer (France Telecom)
Ubiquitous and opportunistic battery charging is also possible using ultrafast charge or connection to a hybrid telecom+energy network.
5.2. Photovoltaic cells
Small cells (Figure 24.4), especially those adapted to indoor light, can supply some µW under natural light. For example, a Panasonic cell (16 x 58 mm) gives 24 µA at 1,3 V, ie about 30 µW, under a 200 lux fluorescent light or under a 40 lux incandescent bulb. An outdoor cell of some mm2 placed close to a window will supply some mA.
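The figure quoted for the indoor cell can be checked with the relation P = V x I, and the same relation gives a rough daily energy budget. The Python sketch below uses the values of the example above; the assumed 8 hours of light per day and the function names are hypothetical and serve only to illustrate the order of magnitude.

```python
def electrical_power_w(voltage_v, current_a):
    """Electrical power delivered by a source: P = V x I."""
    return voltage_v * current_a

def daily_energy_j(power_w, hours_of_light):
    """Energy collected in one day at constant power during the lit hours."""
    return power_w * hours_of_light * 3600.0

if __name__ == "__main__":
    p_cell = electrical_power_w(1.3, 24e-6)  # 1,3 V and 24 µA from the example
    print(f"cell power: {p_cell * 1e6:.1f} µW")
    # Assumption: 8 hours of suitable light per day.
    print(f"daily energy: {daily_energy_j(p_cell, 8.0):.2f} J")
```

The product of 1,3 V and 24 µA is 31,2 µW, which the text rounds to 30 µW; such a cell can only sustain an object whose mean consumption stays in the micropower range.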
Figure 24.4 Thin film solar cells
The combination of sources is necessary to overcome the problem in micropower (memory backup), for example a photovoltaic cell connected to a rechargeable battery, a capacitor or an ultra-capacitor:
• the solar source charges the energy store that powers the object;
• in calculators, a solar cell saves the primary battery, which may therefore be installed for the lifetime of the object. When the battery is exhausted, the object operates only partially, ie only in light.
5.3. Ultra-capacitors
Ultra-capacitors are very large capacitors (100 times more capacity than ordinary capacitors) based on a molecular double layer effect. Their energy density approaches that of a rechargeable battery (up to 10 Wh/kg). An ultra-capacitor can operate with a solar cell in place of a rechargeable battery, or alongside a high capacity but low power primary or rechargeable battery, to provide peak power. It can serve as ultrafast charge storage (some seconds in a charging zone) for small objects that consume little. The advantages are a very long lifetime (> 10 years), several hundred thousand charge/discharge cycles, and a low impedance of some mΩ. Combined with a primary or rechargeable battery, it therefore helps to provide high power pulses.
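To give an order of magnitude for the energy held in an ultra-capacitor, one can use the standard relation E = 1/2 C V² for the energy stored in a capacitor. The Python sketch below is a hedged illustration: the 5 F capacitance, the 2,5 V and 1,0 V limits and the 1 mW load are assumed values chosen for the example, and the function names are hypothetical.

```python
def capacitor_energy_j(capacitance_f, v_max, v_min=0.0):
    """Usable energy when discharging from v_max to v_min: 1/2 C (v_max^2 - v_min^2)."""
    return 0.5 * capacitance_f * (v_max ** 2 - v_min ** 2)

def runtime_s(energy_j, load_w):
    """Time for which a constant load can be supplied from a given energy."""
    return energy_j / load_w

if __name__ == "__main__":
    # Assumed component: 5 F, usable between 2.5 V and 1.0 V.
    usable = capacitor_energy_j(5.0, 2.5, 1.0)
    print(f"usable energy: {usable:.2f} J ({usable / 3600.0 * 1000.0:.2f} mWh)")
    # Assumed load: an object drawing 1 mW on average.
    print(f"runtime at 1 mW: {runtime_s(usable, 1e-3) / 3600.0:.1f} h")
```

The lower voltage limit matters because the electronics usually stop working before the capacitor is empty, so only part of the stored energy is usable.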
5.4. Fuel cells
These generators operate with a gaseous, liquid or solid fuel and the oxygen in the air. The chemical reaction is a slow oxidation that frees electrons into an external circuit. Depending on the technology, the reaction products are water, carbon dioxide or ionic metal salts. The efficiency is 30 to 50 %. The operating temperature is around 60 to 80 °C.
Figure 24.5 Fuel cell
The fuel is contained in micro-storage: methanol-ethanol-water cartridges (Motorola, Sanyo, Medys), pressurised hydrogen or metal hydride (Los Alamos, CEA) or borohydride bulb (Manhattan scientific, CEA). The energy density target is 3000 Wh/kg. The stakes are very high, because such a source, weighing as much as a battery charger, should give 20 to 50 equivalent charges, with the advantages of removing the dependency on the electric plug and offering ultrafast energy recovery by simply replacing a cartridge. The first industrial products, at a price competitive with lithium batteries, will not be available before 2005/7. Reducing the use of platinum catalyst and the price of the proton exchange membrane are the main issues. There is also a promising approach in the metal-air battery (zinc-air AER or aluminium-air TRIMOLE, reaching 400 to 800 Wh/l), which is perhaps easier to put into practice. Still more surprising research is taking place on a biobattery that, with the help of conventional batteries, would digest organic material (sugar, alcohol), producing usable hydrogen in the battery core. This approach is very interesting because one could take care of communicating objects like plants or animals!
5.5. UPTI wiring (Universal Power & Telecom Interface)
A possible solution for powering the ever more numerous communicating objects of the coming years is a safety ultra low voltage two-wire cable that requires only partial insulation. It may carry 12 V, complying with the safety standard IEC 950 and compatible with car voltage. The circuit can be very easy to install, like a decorative piece, and allows small objects to be clipped on almost everywhere to power and charge them. A hybrid function of information transport on the energy cable, with a leaky antenna option to send the information over a very short distance, may be envisaged. This UPTI cable would be suitable in buildings but also in cars and public transport. A PC, a mobile or a PDA could almost be charged on UPTI. As well as being a communicating device, the holder may include an electrical charging area. It only remains to define a standard!
6. Object autonomy management
Communicating objects can calculate their own energy autonomy and send the results to a common station (PC, TV, PDA). This station can then display the energy levels of the objects graphically and locate them in the environment.
The service offered consists of drawing up the list of batteries to be bought, writing the list automatically into the PDA, estimating the budget, recording the observed replacement period and giving advice (for example, to use rechargeable batteries in place of primary cells). Of course, the object should have a visual indicator (a blinking LED, for example) warning of the end of discharge. For complex objects, the problem is very difficult. Autonomy is a function of several parameters: the remaining capacity of the energy source, the discharge rate (ratio of capacity to discharge current), temperature, ageing, self-discharge in storage, and the operating history of sources affected by the "memory effect". Self-discharge is an irreversible loss of capacity ranging from a few % per year to a few % per day depending on the technology of the source, temperature and ageing. Ageing not only increases self-discharge and reduces capacity but also degrades impedance. This last effect leads to an abnormal voltage drop during current pulses, which are a common feature of fast processors and mobile radio transmitters. Because the consumed current changes rapidly, the consumed capacity can only be determined by precise real-time measurements and calculations. The measurement problem is the high dynamic range between the active state and the standby state. The calculation problem is the need for perfect knowledge of the consumption of the different operating states. Generally the capacity gauge is completely wrong and the autonomy prediction still more so, not to mention the state of health indicator of the rechargeable battery. The TRUST study [1 GREWAL, 2001] has proposed the use of neural networks and fuzzy logic to improve this for 3G and 4G mobiles. It may also be said that some objects create an affective link with the user because they become indispensable. Simulated expressive reactions reinforce this, so that the user wants to take care of them (tamagochi, robots, PDA, MP3 players), and powering them is almost like feeding them. Energy management is essential, especially when communicating objects become numerous. An energy information frame should be defined in the radio link protocol, with at least the energy gauge and a description of the source, and even the state of health of the source. An "intelligent" assistant will reduce random interruptions and lighten the task of replacing primary and rechargeable batteries, but for complex objects autonomy prediction is very difficult.
Real use analysis will help to optimise the source (primary and rechargeable battery) and the "energy" budget.
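The energy information frame proposed in this section has not been standardised, so the following Python sketch is purely hypothetical. It assumes a 5-byte layout (gauge in %, source type code, state of health in %, nominal capacity in mAh) and invents the names `EnergyFrame` and `SOURCE_TYPES` to show how an object might report its energy state to a common station.

```python
import struct
from dataclasses import dataclass

# Hypothetical source type codes; no such list is defined in the chapter.
SOURCE_TYPES = {0: "primary", 1: "NiCd", 2: "NiMH", 3: "Li-ion", 4: "ultra-capacitor"}

# Assumed layout, big-endian: gauge %, source type, state of health %, capacity in mAh.
_LAYOUT = struct.Struct(">BBBH")

@dataclass(frozen=True)
class EnergyFrame:
    gauge_percent: int
    source_type: int
    health_percent: int
    capacity_mah: int

    def pack(self) -> bytes:
        """Serialise the frame for transmission over the radio link."""
        return _LAYOUT.pack(self.gauge_percent, self.source_type,
                            self.health_percent, self.capacity_mah)

    @classmethod
    def unpack(cls, data: bytes) -> "EnergyFrame":
        """Rebuild a frame from the bytes received by the station."""
        return cls(*_LAYOUT.unpack(data))

if __name__ == "__main__":
    frame = EnergyFrame(gauge_percent=80, source_type=3, health_percent=95, capacity_mah=600)
    received = EnergyFrame.unpack(frame.pack())
    print(SOURCE_TYPES[received.source_type], received)
```

Keeping the frame to a few bytes is a deliberate choice in this sketch, since the objects concerned have very small energy budgets and every transmitted byte has a cost.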
7. Conclusion
Wireless communicating objects will become more and more numerous in the home and, from now on, it is necessary to reduce energy management or even make it transparent. In this contribution we have dealt with the reduction of power consumption, the increase of autonomy, the optimal choice of source and charger, the prediction of autonomy and state of health, and the definition of the minimal information to be transmitted to an object energy management assistant.
New avenues are also to be explored, such as hybrid information/energy wiring and the ability of robots to feed energy automatically to smaller fixed objects (Figure 24.6) [3 MARQUET, 2001] [4 SEMPE, 2001].
Figure 24.6 Service robots to assist human beings
8. References
[1] Report of IST project 1999-2001 TRUST: Transparently Reconfigurable Ubiquitous Terminal, Specific power management techniques, S. Grewal, University of Bristol.
[2] Energy for wearable computer: MIT media labs internet site.
[3] France Telecom R&D patents pending: Automatic and ultrarapid robot charging, August 2001, F. Sempe, FT R&D, D. Marquet, FT R&D. Multi-source input charger with an internal storage accepting ultrafast charge, November 2001, D. Marquet, FT R&D.
[4] Autonomous robots sharing a charging station with no communication, a case study, November 2001, Francois Sempe, FT R&D, Universite LIP6, Paris, France.
9. Glossary
Autonomous computing: The new generation of plug & play computers (according to IBM), with self-learning and internet-seeking capability to improve their own operation. In the case of robots (France Telecom), they will, for example, be able to seek behaviour that increases their own energy autonomy (charging themselves, improved gauge) and improves global autonomy (charging other communicating objects).
CD: Compact Disk, optical disk to store digital data.
DECT: Digital Enhanced Cordless Telecommunications (radio standard).
DNA: Deoxyribonucleic acid, double-helix molecule bearing the genetic code inside the nucleus of living cells.
DVD: Video data compressing standard on a disk of the size of a CD. A DVD player can read a CD.
e-button: Button-shaped device that gives information when one touches it with a special tool. Generally, it contains a chip powered by a very long lifetime primary battery.
EM: Electro-Magnetic.
EMC: Electromagnetic Compatibility, class of standards that define limits for the emission and susceptibility levels of electric equipment. These levels apply to conducted and radiated (wireless) disturbances. This is included in EC (European Community) requirements.
e-pen: Digital electronic pen transmitting information to a computer that can translate what is sensed (drawing, text, etc), for example Anoto.
GSM: Global System for Mobile communications, one of the cellular radio standards.
LCD: Liquid Crystal Display, flat display technology with low energy consumption.
MP3: Music compression standard.
PC: Personal Computer.
PDA: Personal digital assistant equipped with an operating system, for example PALM OS, Windows CE, Psion OS.
RF ID: Radio Frequency Identifier Label, a tiny self-oscillating circuit with a printed antenna that may power a memory chip and transmit data to a special handheld receiver.
Tamagochi: Japanese word for a machine simulating affective behaviour.
TRUST: Transparently Reconfigurable Ubiquitous Terminal, mobile terminal that will automatically choose the best radio standard during a communication to improve radio transmission quality and make better use of radio resources.
TV: Television.
UPTI: Universal Power and Telecom Interface (name from France Telecom R&D), any electric link distributing energy and data together. It may be CPL (modulated current) or PLT (power line telecom modem on the energy cable) or, alternatively, Power on LAN (the remote powering of IP terminals on Ethernet pairs), and, of course, the traditional phone pair that may power some terminals during off-hook communication mode, and finally USB.
USB: Universal Serial Bus, a communication link that allows remote power supply of some watts to computer peripherals (ADSL modem, scanner, wireless mouse).
Wearable computer: electronics integrated into communicating clothes (memory, display, radio transceiver, calculator, sensors).
Webcam: Camera connected to an Intranet or the Internet.
Conclusion
From "Things That Connect" to "Ambient Communication" Gilles Privat France Telecom R&D, Human Interfaces Division, Smart Devices Laboratory, France
"Things That Think", "connected devices", "smartifacts", "ubiquitous/pervasive computing" [1, 2], "context-aware environments", "ambient intelligence", "disappearing computer"... This paper attempts to extract the gist of converging technological evolutions underlying these seemingly ill-assorted catchphrases.
1. Introduction: convergence of technologies, divergence of devices
Be they called connected things, communicating objects or smart devices, their most seductive image springs from the dizzying, mind-boggling cornucopia of digital gadgetry that is permanently on offer and permanently renewed throughout the media. They are the most tangible aspect of an evolution that we may at first, if superficially, analyse as follows. The genealogy of emblematic devices, say a GPRS-mobile-phone-MP3-walkman combo, or an MMS-PDA, shows them to be typical of the convergence of telecom terminals, computers, and audio-visual consumer electronics. Devices characteristic of these three domains did, not so long ago, belong to radically different, uncrossbreedable species. The digital convergence of technologies threw down the genetic barriers between the three, opening up a slew of possibilities for the hybridisation of their respective devices. The seemingly logical end-point of this convergence could have been the integration of all these devices into a single device. As a matter of fact, the multimedia home PC could now, in theory at least, subsume telephone and videophone, stereo set, television and VCR, providing, in theory also, much better versatility than each of the legacy devices thus superseded. Whether it is under the guise of some avatar of the hideous "beige box", or under the more seductive
appearance of the "Swiss Army Knife PDA", the tendency to concentration of individual processing/communication functionalities will always exist, but will remain in fact marginal. Users tend to consistently favour the separate material embodiment, in distinctive devices, of functions which are unambiguously distinct to them, with, however, the exception of those already-mentioned hybrids corresponding to the more or less natural coupling of 2 or 3 separate functions in the same device. The dominant tendency is definitely towards the decentralisation of functions in separate devices, as a broader historical and technological perspective will explain.
2. Processing, communication, physical interaction, human interaction: four dimensions of smart devices
Drawing from a lore of technology punditry [3], the present evolution can be set in the perspective of three fundamental dimensions of information technology, namely processing, communication (including transmission and storage) and physical interaction (including interaction with users and with the environment). Each of these dimensions can be seen as representative of one of three technology waves which came more or less sequentially, with of course a large overlap. The first two waves have already produced their most significant revolutionary effects, and we are now entering the third wave, opened up by the availability of sensors and actuators in standard technologies. The effect of this third wave will be to enrich, quantitatively and qualitatively, interactions between the physical world and the information/communication sphere. It is first and foremost in this respect that smart devices have a truly revolutionary potential.
2.1. Processing
From this point of view, which is that of computing proper, smart devices are first characterised by being just that, smart, in a raw, zero-degree sense, not in any way implying autonomy or intelligence proper: they embed some (digital) processing power. As such they represent the end-point of a long-term evolution towards the decentralisation of computing capabilities. An enlightening analogy has been drawn by Donald Norman [4]: at the beginning of the twentieth century, the electric motor was a bulky and costly piece of hardware, and as such had to be used sparingly: one all-purpose domestic motor was available for the home and could be used to turn everything that required being turned. Centralising all computing/storage capabilities on the domestic PC, as some are still inclined to do today, would amount to more or less the same: considering that processing power is so scarce that it needs to be centralised in one single device. Of course, Moore's law has been disproving that for a long time. The fetish device of the post-PC era is thus the information appliance [4], embodying separately some specialised processing/storage function taken over from the PC.
2.2. Communication
These information appliances need to communicate, if only to maintain the consistency of their respective information stores. As a network of distributed appliances, they may jointly recreate the overall functionality of the domestic PC. As
such, they take for granted both the abundance of processing power, which is replicated, and the cheap abundance of transmission capacity, which is needed for this distributed storage and processing. They will also be used as terminals for information retrieval services and as such they are communication appliances as much as information appliances. Yet, from a broader telecom-oriented viewpoint, the grand idea of "connecting everything to everything else" [5] may reach much further, towards the networking of various physical devices (e.g. home appliances, industrial apparatuses), which are primarily neither information processing nor communication devices, yet are already equipped with embedded¹ processing/storage capabilities, and may benefit from a range of entirely new, as yet unexploited, network-based services. This corresponds to the transformation of these (so far stand-alone) smart devices into networked devices, leveraging their direct remote interaction with users, software agents or peer devices to provide new capabilities, beyond the reach of stand-alone devices. As such, these devices need not be endowed with a physical user interface proper. User interaction with these devices can take place entirely via the network (possibly by way of other mobile, smart user-side devices belonging to the previous two categories). Networked devices, when understood in this way, open up a brand new service domain for telecommunications. This can be envisioned to be the fastest growing domain of telecom service, as the total count of embedded microprocessors already surpasses, by a wide margin, that of the human population, with a faster growth rate.
The growth potential of regular human-centric telecom services is always limited by the capability of human users to communicate at either end of the communication chain, whereas these new services take human users "out of the loop": they need not be at one or the other end of the communication link, when the traffic is essentially "device to device" or "device to server".
2.3. Physical interaction
The new category of networked devices does incorporate processing, storage and transmission capabilities, yet their defining characteristic is the integration of physical interaction capabilities of extremely varied kinds, which account precisely for their specific function in the physical environment. These capabilities can be as specific as those of dedicated hardware, such as industrial machinery or domestic appliances, or as generic as those of a location-sensing device, for example. Physical transduction is the common relevant abstraction of these capabilities: as sensing devices, they can input physical data in whatever modality as numerical values, and as actuating devices they output numerical values as physical effects of whatever kind. These sensing-actuating capabilities, made possible by cheap integration with standard silicon-based technology [6] (e.g. MEMS), are the "defining abundance" [3] of the interaction era, and they will make physical interaction as widespread as processing and transmission have already become.
¹ Taking into account this evolution, embedded processing should be construed in the narrower and more precise sense of "embedded in a non-IT device", instead of "embedded in a non-general-purpose-processing device".
2.4. Human interaction
A special yet fundamental case of physical interaction is the interaction of devices with users. All three preceding evolutions can be re-envisioned from this point of view.
2.4.1. Reification of human interfaces
For information appliances, the very idea of using specialised devices rather than general-purpose ones is often justified from the viewpoint of a better and more intuitive human interface made possible by streamlining the functionalities of the device: a single-purpose device is also (or should be, at least) a simple-purpose device, and should as such be more easily mastered by the user. The material embodiment of a single function is also more appealing than its purely abstract existence as a single menu item. In this view, the WIMP/desktop metaphor is seen as the hopelessly constricting projection of a rich and multidimensional information space into the two-dimensional simulation of an environment thought of as giving access mostly to dumb files, spreadsheets and word processors. Getting the interfaces back into the real world could mean, for example, replacing those widgets which have become so commonplace in 2D graphical user interfaces (buttons, icons) by physical items acting as elementary pieces of tangible interfaces. As said before, the new category of networked devices may become human-interface-less, being operated entirely remotely through other devices and exchanging information exclusively with other devices and servers, rather than directly with the user. This is not contradictory with the previous statement, and does not imply a re-centralisation of human interfaces.
2.4.2. Implicit interaction
Letting devices communicate among themselves means mostly one thing: the user is relieved of the burden of having to interact with all the devices that surround him/her, as devices are left to "do their own thing" together, informing the user only when a mandatory decision is required on his/her part. Actually, most of the interaction that could have occurred before with these devices would have amounted
to just requiring the user to read information from one device and re-input it into another. Networked devices make this unnecessary. In a world of overabundant information, the only remaining scarcity is the time and attention of the user. For this reason, devices should become proactive and take actions on their own, finding the relevant information for this in their environment (including other devices).
2.4.3. Context awareness
Where classical AI has more or less failed to equip applications with some software equivalent of "common sense", retrieving and taking into account a set of very concrete information nuggets from the physical context of the application can provide the closest equivalent to an elusive lore of human intuition. Among such elementary physical information, location is the most obvious and the most universally exploitable. It can contribute to making interaction implicit: if the user gets close enough to an appliance or vending machine, for example, this usually means that he/she wants to use it, and a corresponding action may be attempted, such as providing him/her with a control interface for this device.
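The proximity rule described above can be sketched in a few lines. This is only an illustration under assumed names: the `Appliance` type, the `nearby_interfaces` function and the 1.5 m threshold are hypothetical and are not taken from any particular location system.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Appliance:
    """A fixed device with a known position (metres) and a control interface name."""
    name: str
    x: float
    y: float
    interface: str


def nearby_interfaces(user_x, user_y, appliances, radius_m=1.5):
    """Return the control interfaces of appliances within radius_m of the user.

    The threshold is an assumed value; a real system would tune it per device.
    """
    return [
        a.interface
        for a in appliances
        if hypot(a.x - user_x, a.y - user_y) <= radius_m
    ]


appliances = [
    Appliance("vending machine", 0.0, 0.0, "vending-menu"),
    Appliance("printer", 10.0, 0.0, "print-queue"),
]

# The user stands one metre from the vending machine, far from the printer.
offered = nearby_interfaces(1.0, 0.0, appliances)
print(offered)
```

Only the vending machine's interface is offered here; the user never has to select the device explicitly, which is the sense in which location makes the interaction implicit.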
2.4.4. Situated information
Physical location becoming integral to any interface means much more: the world becomes the interface. The grand idea behind using physical location to address information thus goes far beyond the cellular-network-location-based services already on offer, and even beyond the more general context-awareness idea. In this view, the physical world is the most compelling interface metaphor to cyberspace, and geo-location is used, not only as the user's own position, but as a unifying navigation anchor and an intuitive representational tool, to make sense of the overwhelming multidimensionality of the information space.
3. Defining and categorising smart devices
Smart devices cover a wide spectrum in hardware complexity, from passive electronic tags or minimal data-providing smart cards, through networked sensors and actuators, up to such highly sophisticated gizmos as wireless-enabled PDAs. The word "smart" does not in this context indicate autonomy of devices in any strong sense.
3.1. What is a "smart device"?
A minimal, lowest-common-denominator definition of a smart device, bridging all previous complementary views, could be given as follows. A smart device is a physical object equipped with:
• an embedded processor;
• memory;
• sensors and/or actuators;
• at least one network connection;
• human interface (input/output devices).
The latter is not mandatory as the user interface can be partially or entirely exported to a remote client or peer device, through the network connection. In this minimal, system-theoretical view, a "smart" device:
• interacts with its environment, possibly including a human user, through sensors, actuators and possibly classical I/O devices;
• updates its internal state and its outputs, with joint inputs from the network, its past internal state and its sensors.
This definition is probably too general, as it implicitly lumps together classical computers, telecommunication terminals, and the newer kind of devices on which we want to place the emphasis. A quantitative difference in the relative importance of the five components listed above can, however, make this distinction more concrete: smart networked devices are the ones for which the "sensor/actuator" and "network connection" components become critical; the user-interface may vanish altogether, whereas processor and memory do just correspond to a minimal requirement for embedded processing. In another view, this definition is too restrictive, because the low-end category of devices described below will include neither a network connection proper nor a processing unit, only a bilateral asymmetric wireless attachment such as RFID, memory and possibly sensors.
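The system-theoretical view of section 3.1 (outputs and next state computed from network input, sensor input and past state) can be illustrated with a toy example. The thermostat-like `SmartDevice` class below, its `step` method and the message format are all hypothetical, chosen only to make the three kinds of input visible.

```python
from dataclasses import dataclass, field


@dataclass
class SmartDevice:
    """Toy thermostat-like device: state is a setpoint, output is a heater switch."""
    setpoint: float = 20.0                    # internal state, degrees Celsius
    heater_on: bool = False                   # actuator output
    log: list = field(default_factory=list)   # memory of past sensor readings

    def step(self, sensor_temp, network_msg=None):
        """Update state and output from sensor input, network input and past state."""
        # Network input may change the internal state (remote user interface).
        if network_msg is not None and "setpoint" in network_msg:
            self.setpoint = float(network_msg["setpoint"])
        # Memory: keep the reading so that later steps could use the history.
        self.log.append(sensor_temp)
        # The output depends on the sensor reading and on the stored state.
        self.heater_on = sensor_temp < self.setpoint
        return self.heater_on


device = SmartDevice()
print(device.step(18.0))                        # below the default setpoint
print(device.step(21.0))                        # above the default setpoint
print(device.step(21.0, {"setpoint": 23.0}))    # setpoint raised over the network
```

Note that the device has no local user interface at all: the setpoint is changed only through the network message, which matches the remark that the human interface is not mandatory.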
3.2. Smart devices categories
The more descriptive categorisation proposed below is probably conducive to a more intuitive and concrete understanding of the smart device concept. It will help articulate the correspondence with end-user services, illustrated afterwards in a comprehensive scenario.
3.2.1. Handheld personal communicator
These high-end devices are a short-term evolution of present-day mobile phones or network-enabled PDAs, and are the most obvious candidates for the title of 'smart device'. Yet, as will be shown, they are only the tip of the iceberg of smart devices. Their main function is that of gateway devices, mediating between a general-purpose access network such as GPRS/UMTS, and the local-area or personal-area network of smart devices, belonging to the other three categories, with which the user may wish to interact in his/her own environment. The general-purpose access network will serve for both classical communication services (e.g. videophone, video on demand) and the remote control/administration of smart devices. These devices should not become Swiss army knives, as their own common-denominator capabilities may be augmented by those of category 2 or 3 devices for more specialised functionalities. Their minimal required features could be the following:
• A minimal I/O user interface, which may if needed be exported to more convenient category 2 devices, such as a smart pen or wireless headset.
• GPRS/UMTS interface.
• WLAN/WPAN interface: 802.11/Bluetooth/802.15.x (interfacing to category 2 and 3 devices).
• RFID tag/reader (for interfacing to category 4 devices).
3.2.2. Wearable, personal interface/interaction devices
These devices are directly complementary to first category devices: they serve to augment the personal communicator's general-purpose capabilities with more specialised ones, whether they are human interface² or physical interaction capabilities obtained through specific sensors/actuators in various physical modalities. They will be worn (rather than carried) by the user at any given time, depending on his/her activity and environment. They will enrich his/her interfacing/interaction capabilities in this environment. Examples of such devices are:
• Biomedical sensors, e.g. glucose-level monitoring device, heart-rate monitor.
• Physical sensors, e.g. outdoor location sensors such as GPS, or indoor location sensors such as ultrasound/infrared, etc.
• Distributed interfaces for first category devices, e.g. smart pen, projection-display eyeglasses, fibre-optics-based wearable display, wireless earpiece/microphone, wireless digital camera, etc.
The minimal required generic feature for these devices is a Bluetooth/802.15.x Wireless Personal Area Network interface (for interconnection with the personal communicator).
² It should be borne in mind that traditional user interfaces (inherited from the PC world) are poorly adapted to situations of mobility.
3.2.3. Ambient environment devices
These devices are present in the user's environment at any given time, and are characteristic of the nature of this environment, whether it is e.g. a home, office, public space, industrial, or transportation environment. These devices may be movable, but are usually not mobile. Their required features are WPAN/WLAN or Powerline network connectivity for direct or indirect interaction with the personal communicator. Examples of such devices are the following:
• Physical apparatuses:
  - Domestic (white) appliances;
  - Point of Sale Terminals;
  - Robots;
  - Industrial equipment (e.g. machine tools, etc.);
  - Computer peripherals;
  - Vending machines.
• Fixed sensor/actuator devices (e.g. environmental sensors, material-embedded sensors, etc.).
• High-end human-interface devices:
  - Plasma screen;
  - High-resolution camera;
  - High-end 5.1 sound system.
The spectrum of such potential environments is illustrated below.
3.2.4. Passive devices and physical icons
These lowly devices are not the most obvious candidates for the title of "smart" devices, yet they are the submerged part of the iceberg, quantitatively the most important category, potentially comprising up to trillions of manufactured objects annually. These devices are not network peers. They are typically managed and interfaced through devices from previous categories acting as their network proxy and possibly physical location/identification container. They comprise potentially all devices which may be individually locatable and identifiable. A potential technological solution for this is RFID inductive coupling (ISO/IEC 14443). The cost of readers and tags is expected to decrease further with wider-scale adoption of the technology for all kinds of devices [7].
4. A classification of smart-device-based services
4.1. Specific functionalities
It is envisioned that services of interest to end-users can be made up from various combinations of some of the specialised functionalities listed below. These functionalities are considered to be as elementary as possible with regard to the network, the devices acting as terminals and any intervening support hardware/software.
• Teledetection, telewarning;
• Remote diagnosis;
• Telemetry, remote information capture;
• Teleobservation, surveillance, remote sensing;
• Remote control;
• Remote management;
• Remote maintenance;
• Remote robot control;
• Location-based services;
• Tracking (electronic tags);
• Device-to-environment absolute location awareness;
• Device-to-device relative location awareness;
• Authentication.
4.2. Generic functionalities
Another category of more general functionalities can be combined with the previous ones and implemented in a distributed, cooperative fashion, by joint operation of several devices through the network, or joint use of other processing "nodes" in the network itself. This could correspond to network-based processing of downstream information (devices to user), such as joint event detection, distributed array processing (based on joint processing of sensor readings), distributed pattern detection/recognition, or distributed location-based processing.
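As one possible reading of "joint event detection", the sketch below declares an event only when several sensors agree. The voting rule, the function name `joint_event_detected` and the sensor identifiers are assumptions made for illustration; the text does not prescribe a particular fusion method.

```python
def joint_event_detected(readings, threshold, min_votes):
    """Declare an event when at least min_votes sensors exceed threshold.

    readings maps a (hypothetical) sensor id to its latest numerical value.
    """
    votes = sum(1 for value in readings.values() if value > threshold)
    return votes >= min_votes


readings = {"smoke-1": 0.9, "smoke-2": 0.8, "smoke-3": 0.1}
print(joint_event_detected(readings, threshold=0.5, min_votes=2))
```

Requiring agreement between devices trades some sensitivity for fewer false alarms from a single faulty sensor, which is one reason to process readings jointly in the network rather than in each device separately.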
4.3. Domains of potential use
These purely technical functionalities can correspond to very different services, as seen from the end-user, depending on the economic sector in which they are applied. These different economic sectors of potential use are listed below.
• Home automation;
• White products (appliances);
• Brown products (consumer electronics);
• E-commerce;
• Telecom network operation (including satellite networks);
• Transportation;
• Fleet management;
• Industrial;
• Manufacturing equipment;
• Inventory;
• Civil engineering;
• Health care.
5. A smart devices scenario
As soon as Mrs. Dupont enters the shopping mall, her sleek UMTS-enabled PDA provides her with a menu of services available within the precinct, discreetly introduced by a subdued welcome jingle. This service is a virtual kiosk, automatically detecting Mrs. Dupont's location and downloading to her PDA an interface that she may use to i-augment this familiar bricks-and-mortar environment. She enters one of the stores in the mall (a bookstore), and the same kiosk service operates here on a smaller scale, providing her with a menu of regularly listed items, as well as sales and special offerings for this particular store. She doesn't actually browse the menu (she is not so adroit at pointing on the tiny screen, and hates talking
to the thing in public, lest onlookers think she is getting senile): she prefers to fiddle with the books on display, letting herself be enticed by their covers, bindings and photo spreads. She removes one of them from the shelf, and, as the book is equipped with an e-tag that the shelf can locate and identify, the menu for this book is automatically downloaded to her PDA: she is advised that a DVD is available for a movie version of this book; she expresses her interest in this item by getting inside a cosy lounge adjacent to the store, where her presence and her holding of the book is again detected: a video trailer for the movie is downloaded to the huge plasma screen, while she can enjoy the crisp 96kHz 6-channel sound, eerily surrounding her. She could have downloaded the stuff to her PDA, which obviously is no match for this state-of-the-art media centre. She is convinced, and orders the DVD for download to her home: she does not care for the ungainly case in which these disks are usually packaged, and will dispense with the material support of the DVD altogether, as a copy-protected version will be available on the hard disk of her home server. The book is something else, and back home, lying on the couch, she will browse it leisurely. The beautiful, nicely bound and soft-smelling piece of glossy paper will also act as a physical icon for the movie, a material placeholder for the virtual DVD, which she will summon by waving the book close to the player (it is also fitted with a reader to detect the book's tag). Mrs. Dupont never got used to exclusively web-based shopping: she enjoys the rich displays and the atmosphere of stores, she needs to feel, touch, manipulate and retrieve immediately the items she buys, and with this service she gets the best of both worlds. 
Comparison-shopping is a cinch, just as with web-based shopping (being a thrifty homemaker, she usually requests a price quote from the competition by physically selecting an item, waving her PDA close to it, before buying it in a store).
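The "physical icon" mechanism of the scenario, where waving the tagged book near the player summons the associated movie, amounts to a lookup from tag identifier to content. The sketch below assumes a hypothetical catalogue held on the home server; the tag identifier, the `homeserver://` location and both function names are invented for illustration.

```python
# Hypothetical catalogue held by the home server: tag id -> media location.
CATALOGUE = {
    "tag:book-0042": "homeserver://movies/book-0042-film",
}


def resolve_physical_icon(tag_id, catalogue=CATALOGUE):
    """Return the media linked to a detected tag, or None for an unknown object."""
    return catalogue.get(tag_id)


def on_tag_detected(tag_id):
    """Called by the (assumed) reader in the player when a tag comes into range."""
    media = resolve_physical_icon(tag_id)
    return f"play {media}" if media else "ignore"


print(on_tag_detected("tag:book-0042"))
print(on_tag_detected("tag:unknown"))
```

The tag itself carries only an identifier; everything else lives on the network side, which is consistent with the description of passive devices as being managed through a proxy rather than acting as network peers.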
6. Conclusion
This presentation has been very much device-centric. As such, it has emphasised what could correspond to the intermediate stages of an evolution from smart devices to ambient computing. The end-point of this evolution, which began with the multiplication and diversification of devices, could be their disappearance, a subjective disappearance from the user's viewpoint, when devices vanish behind the service they offer, behind their informational abstraction, and cease to be the primary focus of the user's attention. As catchphrases go, attentive environment and ambient networking characterise best this elusive vision of interfaces to information/communication services getting widely distributed and de-localised, implemented in a fully dynamic and adaptive way by invisible and interchangeable devices. For all the hype, there may be too much of an archetypal technology-fiction dream in this vision of a sleek, information-saturated environment. It does probably give short shrift to the human
craving for unique material things that may become one's own and only things, not unlike collector etchings, tamagotchis, teddy bears or garden gnomes. Fetishism apart, a less technology-driven vision would probably refrain from delocalising interfaces altogether, and retain, stripped of any legacy technical appendage, the uniqueness of these personal things, be they books, diaries or old-fashioned "terminals", for the sake of themselves, for the pleasure of holding and touching them, if only for their intuitive iconicity and their snug, familiar thingness.
7. References
[1] Burkhardt, J. (Editor), Henn, H., Hepper, S., Rindtorff, K., Schaeck, T., "Pervasive Computing: Technology and Architecture of Mobile Internet Applications", Addison-Wesley, 2002.
[2] Hansmann, U., Merk, L., Nicklous, M. S., Stober, T., "Pervasive Computing Handbook", Springer Verlag, 2001.
[3] Gilder, G., "Telecosm: The World After Bandwidth Abundance", Touchstone, 2002.
[4] Norman, D. A., "The Invisible Computer: Why Good Products Can Fail, the Personal Computer Is So Complex, and Information Appliances Are the Solution", MIT Press, 1998.
[5] Lucky, R. W., "Connections", IEEE Spectrum, March 1999.
[6] Saffo, P., "Sensors: The Next Wave of Innovation", CACM, vol. 40, no. 2, February 1997.
[7] www.autoIDcenter.org