AN APPLICATION SCIENCE FOR MULTI-AGENT SYSTEMS
MULTIAGENT SYSTEMS, ARTIFICIAL SOCIETIES, AND SIMULATED ORGANIZATIONS
International Book Series

Series Editor: Gerhard Weiss, Technische Universität München

Editorial Board: Kathleen M. Carley, Carnegie Mellon University, PA, USA; Yves Demazeau, CNRS Laboratoire LEIBNIZ, France; Ed Durfee, University of Michigan, USA; Les Gasser, University of Illinois at Urbana-Champaign, IL, USA; Nigel Gilbert, University of Surrey, United Kingdom; Michael Huhns, University of South Carolina, SC, USA; Nick Jennings, University of Southampton, UK; Victor Lesser, University of Massachusetts, MA, USA; Katia Sycara, Carnegie Mellon University, PA, USA; Gerhard Weiss, Technical University of Munich, Germany (Series Editor); Michael Wooldridge, University of Liverpool, United Kingdom
Books in the Series:

CONFLICTING AGENTS: Conflict Management in Multi-Agent Systems, edited by Catherine Tessier, Laurent Chaudron and Heinz-Jürgen Müller, ISBN: 0-7923-7210-7

SOCIAL ORDER IN MULTIAGENT SYSTEMS, edited by Rosaria Conte and Chrysanthos Dellarocas, ISBN: 0-7923-7450-9

SOCIALLY INTELLIGENT AGENTS: Creating Relationships with Computers and Robots, edited by Kerstin Dautenhahn, Alan H. Bond, Lola Cañamero and Bruce Edmonds, ISBN: 1-4020-7057-8

CONCEPTUAL MODELLING OF MULTI-AGENT SYSTEMS: The CoMoMAS Engineering Environment, by Norbert Glaser, ISBN: 1-4020-7061-6

GAME THEORY AND DECISION THEORY IN AGENT-BASED SYSTEMS, edited by Simon Parsons, Piotr Gmytrasiewicz and Michael Wooldridge, ISBN: 1-4020-7115-9

REPUTATION IN ARTIFICIAL SOCIETIES: Social Beliefs for Social Order, by Rosaria Conte and Mario Paolucci, ISBN: 1-4020-7186-8

AGENT AUTONOMY, edited by Henry Hexmoor, Cristiano Castelfranchi and Rino Falcone, ISBN: 1-4020-7402-6

AGENT SUPPORTED COOPERATIVE WORK, edited by Yiming Ye and Elizabeth Churchill, ISBN: 1-4020-7404-2

DISTRIBUTED SENSOR NETWORKS, edited by Victor Lesser, Charles L. Ortiz, Jr. and Milind Tambe, ISBN: 1-4020-7499-9
AN APPLICATION SCIENCE FOR MULTI-AGENT SYSTEMS
edited by
Thomas A. Wagner, Honeywell Laboratories, U.S.A.
KLUWER ACADEMIC PUBLISHERS NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW
eBook ISBN: 1-4020-7868-4
Print ISBN: 1-4020-7867-6
©2004 Springer Science + Business Media, Inc.
Print ©2004 Kluwer Academic Publishers, Boston
All rights reserved. No part of this eBook may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise, without written consent from the Publisher.
Created in the United States of America
Visit Springer's eBookstore at: http://www.ebooks.kluweronline.com
and the Springer Global Website Online at: http://www.springeronline.com
Contents
An Application Science for Multi-Agent Systems
Tom Wagner ... 1

Coordination Challenges for Autonomous Spacecraft
Bradley J. Clement and Anthony C. Barrett ... 7

A Framework for Evaluation of Multi-Agent System Approaches to Logistics Network Management
Paul Davidsson and Fredrik Wernstedt ... 27

Centralized Versus Decentralized Coordination: Two Application Case Studies
Tom Wagner, John Phelps, and Valerie Guralnik ... 41

A Complex Systems Perspective on Collaborative Design
Mark Klein, Hiroki Sayama, Peyman Faratin, and Yaneer Bar-Yam ... 77

Multi-Agent System Interaction Protocols in a Dynamically Changing Environment
Martin Purvis, Stephen Cranefield, Mariusz Nowostawski, and Maryam Purvis ... 95

Challenges to Scaling-Up Agent Coordination Strategies
Edmund H. Durfee ... 113

Roles in MAS: Managing the Complexity of Tasks and Environments
Ioannis Partsakoulakis and George Vouros ... 133

An Evolutionary Framework for Large-Scale Experimentation in Multi-Agent Systems
Alex Babanov, Wolfgang Ketter, and Maria Gini ... 155

Application Characteristics Motivating Adaptive Organizational Capabilities within Multi-Agent Systems
K. Suzanne Barber and Matthew T. MacMahon ... 175

Applying Coordination Mechanisms for Dependency Relationships under Various Environments
Wei Chen and Keith Decker ... 199

Performance Models for Large Scale Multi-Agent Systems: A Distributed POMDP-Based Approach
Hyuckchul Jung and Milind Tambe ... 221

Index ... 245
AN APPLICATION SCIENCE FOR MULTIAGENT SYSTEMS

Tom Wagner
Honeywell Laboratories, 3660 Technology Drive, Minneapolis, MN 55418
[email protected]
1. Introduction
This edited collection might have been more appropriately titled Toward an Application Science for Multi-Agent Systems, as the community is still developing an understanding of when to use particular multi-agent system techniques to create an application for a particular problem. As of yet there is no hard-and-fast methodology that enables an agent researcher or practitioner to examine the features of the application domain and the application requirements, and then to know which techniques or algorithms to apply and their respective trade-offs. Currently, when faced with a new application domain, agent developers must rely on past experience and intuition to determine whether a multi-agent system is the right approach and, if so, how to structure the agents, how to decompose the problem, how to coordinate the activities of the agents, and so forth. The papers in this collection are all designed to help address this topic – to provide some illumination into the issues of understanding which technique to apply, and when, and the potential trade-offs or caveats involved. The work presented here ranges from mature technology to anecdotal evidence; this range is partly a testimony to the complexity of the issues and partly attributable to the relative youth of agent-based systems. Of course, many more mature subdisciplines of computer science struggle with the same issues – given a problem, how does one evaluate the problem space and determine the right solution technique? There are seldom simple answers, though the total sum of knowledge on the subject continues to grow. When this collection was originally conceived, the focus was to be only on coordination methodologies. Through dialogs with community members, it rapidly became clear that there were too many other issues intertwined with distributed activity coordination for that particular slant to tell the right story.
However, as a researcher in coordination I tend to see multi-agent systems (MAS) and application spaces along those lines: how do we solve distributed problems in such a way as to approximate the solutions obtainable if all computation could be centralized? Much research in MAS addresses exactly this problem, namely how to make it so agents can act locally but achieve coherent global behavior. This class of research takes many forms and includes multi-agent coordination, negotiation, distributed scheduling, distributed planning, agent organization, and problem decomposition, to name a few. One might generally refer to this as multi-agent system control problem solving, or MASCPS for short.

The papers in this collection focus on MASCPS issues with a specific emphasis on the relationships between said issues, problem spaces or application domains, and performance. The reason is that properties of different problem spaces impact the selection and performance of MASCPS approaches. For instance, with coordination, problem spaces affect the degree to which coordination is necessary and possible. In some applications little coordination is necessary because the activities of the agents are mostly independent; in other applications, coordination may be critical to obtaining the desired system-wide results and performance characteristics. Just as properties of the underlying problem space or application domain influence the degree to which coordination is necessary, they also influence the approach taken to coordinating the local decisions of individual agents. For instance, many successful strategies in robotic soccer do not require reasoning about deadlines or planning far downstream temporally; this is in contrast to strategies for supply chain problems or agent-driven logistics.

Domain and solution features that should be examined to determine how to approach a problem include, but are not limited to:

- The need for coordination between the agents.
- The presence of deadlines or temporal constraints.
- The presence of interactions between agents (e.g., task, resource, etc.).
- Required tempo/pace of decision making.
- Level of uncertainty in agent tasks or interactions.
- Sequential decision making aspects.
- The need for temporal synchronization.
- Availability or scarcity of required (shared) resources.
- Problem grain-size.
- Complexity and scope of individual decision making.
- The degree to which non-local information is visible or obtainable (e.g., via communication).
- The uncertainty of the environment.
- Dynamics.
- System openness.
- Predictability.
- And many others.

In essence, the character of the MASCPS and its attributes vary along many dimensions, and these differences are driven by applications/problem spaces, different assumptions about problem decomposition, and the types and frequency of interactions between agents, to name a few. Along the same lines, different MASCPS approaches are often designed to solve particular (generally implicit) classes of problems, have different characteristics, and exhibit different trade-offs. Knowing which technique or approach to use to obtain the desired MAS result is the objective of this collection – or rather a long-term goal toward which this collection is a small step. A small illustrative sketch of how such a feature profile might be encoded follows.
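The sketch below is a minimal, hypothetical Python illustration only: it encodes a handful of the dimensions above as a record and maps them to broad solution emphases. The class, fields, thresholds, and heuristic (ProblemProfile, suggest_emphasis) are invented for illustration and correspond to no method from this collection.

```python
# Hypothetical sketch only: encoding a few of the feature dimensions above.
from dataclasses import dataclass

@dataclass
class ProblemProfile:
    """Coarse characterization of a MASCPS problem space (toy model)."""
    needs_coordination: bool       # are agent activities interdependent?
    has_deadlines: bool            # deadlines or temporal constraints present
    decision_tempo_hz: float       # required pace of decision making
    task_uncertainty: float        # 0 (deterministic) .. 1 (highly uncertain)
    shared_resources_scarce: bool  # scarcity of required (shared) resources
    nonlocal_visibility: float     # 0 (opaque) .. 1 (fully observable)
    open_system: bool              # can agents join or leave at runtime?

def suggest_emphasis(p: ProblemProfile) -> list[str]:
    """Toy heuristic mapping profile features to MASCPS concerns."""
    emphasis = []
    if p.needs_coordination and p.shared_resources_scarce:
        emphasis.append("explicit coordination/negotiation over resources")
    if p.has_deadlines and p.decision_tempo_hz > 1.0:
        emphasis.append("fast reactive control; avoid deep deliberation")
    if p.task_uncertainty > 0.5:
        emphasis.append("contingent scheduling and execution monitoring")
    if p.nonlocal_visibility < 0.3:
        emphasis.append("protocols for sharing local state")
    if p.open_system:
        emphasis.append("organizational adaptation (roles, re-teaming)")
    return emphasis or ["loose coupling; little coordination needed"]

# A robotic-soccer-like profile versus a logistics-like profile.
soccer = ProblemProfile(True, False, 10.0, 0.4, False, 0.8, False)
logistics = ProblemProfile(True, True, 0.01, 0.7, True, 0.4, True)
print(suggest_emphasis(soccer))
print(suggest_emphasis(logistics))
```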
This collection is organized in a progression from work that focuses on coordination or other MASCPS issues driven by specific application domains, to work that focuses on more general MASCPS issues illustrated in a specific application domain, to work that focuses on more general coordination or control issues, to work that addresses predictability and analysis of different MASCPS techniques. Specifically:

Papers focusing on coordination and control issues driven by specific applications or application classes include:

- In "Coordination Challenges for Autonomous Spacecraft," Clement and Barrett focus on distributed spacecraft missions, the motivation for a distributed approach, domain issues, and coordination requirements.
- In "A Framework for Evaluation of Multi-Agent System Approaches to Logistics Network Management," Davidsson and Wernstedt concentrate on the applicability of MAS techniques to production and logistics network management.
- In "Centralized Versus Decentralized Coordination: Two Application Case Studies," Wagner, Phelps, and Guralnik examine two multi-agent systems (an automated caregiver system and a system for aircraft team coordination) they developed and the different coordination approaches used in each case.

Papers focusing on more general classes of issues with examples in, or motivations from, a given domain include:

- In "A Complex Systems Perspective on Collaborative Design," Klein, Sayama, Faratin, and Bar-Yam provide a different angle on multi-agent control issues using a complex systems view of the collaborative design process.
- In "Multi-Agent System Interaction Protocols in a Dynamically Changing Environment," Purvis, Cranefield, Nowostawski, and Purvis advocate the importance of adaptive interaction protocols for MAS in dynamic environments and provide details of their use, with crisis response as the underlying domain example.

Papers focusing on more general coordination or multi-agent control issues include:

- In "Challenges to Scaling-Up Agent Coordination Strategies," Durfee enumerates a wide range of challenges that must be addressed, or at least considered, when deploying multi-agent systems on real-world and realistic-scale problems. The intellectual discussion in this paper relates to coordination and broader MASCPS concerns and provides a good overview of some of the issues addressed in the application-specific papers that appear earlier in the collection.
- In "Roles in MAS: Managing the Complexity of Tasks and Environments," Partsakoulakis and Vouros examine roles in MAS, which are used to organize the distributed computation, from three views: agent-oriented system engineering, formal models, and implemented MAS deployed in complex domains.

Papers relating to prediction or evaluation of multi-agent system performance, where the analysis is specific to coordination algorithms, design decisions, or environmental characteristics, include:

- In "An Evolutionary Framework for Large-Scale Experimentation in Multi-Agent Systems," Babanov, Ketter, and Gini describe a framework for conducting large-scale experiments in electronic marketplace MAS to support systematic testing of agent strategies.
- In "Application Characteristics Motivating Adaptive Organizational Capabilities within Multi-Agent Systems," Barber and MacMahon focus on the organization of the distributed computation (decision making and action execution responsibilities) and empirically examine the relationships between MAS design decisions and performance.
- In "Applying Coordination Mechanisms for Dependency Relationships under Various Environments," Chen and Decker examine the importance of environmental characteristics and their impact on the choice and performance of different coordination protocols.
- In "Performance Models for Large Scale Multi-Agent Systems: A Distributed POMDP-Based Approach," Jung and Tambe assist in the prediction of large-scale MAS performance by providing a framework to model and evaluate coordination or conflict resolution strategies.

In the future we believe the community will develop a scientific approach for mapping out a particular application domain and selecting a class of agent control technologies for dealing with the control problems present in the domain. It is our hope that this collection is useful in moving both science and practice forward in that direction.
Coordination Challenges for Autonomous Spacecraft
Bradley J. Clement and Anthony C. Barrett Jet Propulsion Laboratory, California Institute of Technology
Abstract: Characterizing multiagent problem spaces requires understanding what constitutes a multiagent problem or where multiagent system technologies should be applied. Here, we explore the multiagent problems involved in distributed spacecraft missions. While past flight projects involved a single spacecraft in isolation, over forty proposed future missions involve multiple coordinated spacecraft. We present characteristics of such missions in terms of properties of the phenomena being measured as well as the rationale for using multiple spacecraft. Then we describe the coordination problems associated with operating different types of missions and identify needed technologies.

Key words: coordination, space, planning, execution
1. INTRODUCTION
A potential weakness of any research is its applicability to real problems, and it is often difficult to determine whether a particular application needs a particular technology. Distributed spacecraft missions offer a wide variety of multiagent problem domains, but the technology needs of these missions vary. We examine how different motivations for using multiple spacecraft translate into different multiagent problems and research challenges. The past few years have seen missions with growing numbers of probes. Pathfinder has a lander and rover (Sojourner), Cassini includes an orbiter and the Huygens lander, and Cluster II has 4 spacecraft for multi-point magnetosphere plasma measurements. This trend is expected to continue to progressively larger fleets. For example, one proposed interferometer mission [1] would have 18 spacecraft flying in formation in order to detect earth-sized planets orbiting other stars. Another proposed mission involves
44 to 104 spacecraft in Earth orbit to measure global phenomena within the magnetosphere. To date over 40 multiple platform (multi-spacecraft) missions have been proposed, and they can be grouped into three families depending on why multiple platforms were proposed:

- multi-point sensing for improved coverage when observing/exploring large areas (like the satellites with passive microwave radiometers for the Global Precipitation Mission and similar sensors on the Global Electrodynamics Mission, Leonardo-BRDF, and the Magnetospheric Constellation);
- building large synthetic aperture sensors with many small, spatially separated sensors for imaging very remote targets (like Constellation-X, Terrestrial Planet Finder, and TechSat-21); and
- specialized probes with explicitly separate science objectives (like coincident Mars Program missions or the PM train within the Earth Observing System).

While these reasons for having multiple platforms in a mission are not exclusive, they do have a major impact on how the resulting missions are formulated and managed. For instance, the Air Force's TechSat-21 mission concept [2] involves a distribution of clusters of platforms. Each cluster forms a synthetic aperture for radar sensing, and the number of clusters depends on the desired global coverage. While the operations of spacecraft in a cluster must be closely choreographed to make each joint observation, the operations between clusters are only loosely coordinated to determine how to allocate observations to clusters. There are currently large efforts focused on formation flying and communications between spacecraft. Here, we address operations issues related to managing future multiple platform missions. While automating operations for a distributed constellation of orbiters has been addressed for communications satellites, those results do not apply directly to science missions, for cost reasons. Communications satellites are designed with cost in mind, but they also need large resource margins in order to handle growing markets. Science missions are designed with tight resource margins to minimize cost while flying spacecraft that are just capable enough to answer the motivating scientific questions. In the next section, we describe the rationale for multiple platform missions in terms of the phenomena being measured. We then characterize open issues in managing different families of multiple platform missions and describe similar issues for autonomy technologies currently in development. We then characterize the coordination problems that must be addressed by these technologies. We aim to analyze coordination problems for a class of domains but do not give a survey of coordination techniques that address them.
2. MULTIPLE PLATFORM RATIONALE
Science missions measure phenomena in various locations by making remote/local observations with active/passive sensors in one of five classes of planet-centric orbits (shown in Figure 1). Taking a more formal view, we can characterize phenomena in terms of a spatially and temporally grouped set of signals, and a mission in terms of an information transfer system [3] that gets the information from signals into the scientists' hands in order to facilitate answering questions. For instance, the Constellation-X telescopes measure x-ray spectra of points on the celestial sphere. Here each signal is a time-varying x-ray spectrum.
Following this information transfer approach to characterize a mission, we can formally characterize phenomena along five metrics with respect to answering the motivating questions:

- Signal location involves which sphere contains the phenomena's signals (affecting orbit selection);
- Signal isolation involves separating spatially distinct signals within a target phenomenon;
- Information integrity involves the noise inherent to signals related to the phenomenon;
- Information rate involves how fast the signals change and have to be sampled; and
- Information predictability involves the probability of catching signals pertaining to a phenomenon during an observation.
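As a loose illustration (not from the paper), the sketch below records the five metrics as a data structure and applies an invented thresholding rule to suggest which multi-platform rationale a phenomenon's profile points toward; all names and cutoff values are assumptions made for the example.

```python
# Hedged sketch: the five-metric characterization as a record, plus a toy
# rule mapping a profile to mission rationales. Thresholds are invented.
from dataclasses import dataclass
from enum import Enum

class Rationale(Enum):
    SIGNAL_SEPARATION = "signal separation"
    SIGNAL_SPACE_COVERAGE = "signal space coverage"
    SIGNAL_COMBINATION = "signal combination"

@dataclass
class Phenomenon:
    signal_location: str          # which sphere contains the signals
    isolation_milliarcsec: float  # required angular separation of signals
    integrity_snr: float          # signal-to-noise available per sensor
    info_rate_hz: float           # how fast signals change / must be sampled
    predictability: float         # 0 (catch by luck) .. 1 (fully predictable)

def suggested_rationales(p: Phenomenon) -> list[Rationale]:
    out = []
    if p.isolation_milliarcsec < 1.0 or p.integrity_snr < 1.0:
        # fine isolation or faint sources -> large synthetic apertures
        out.append(Rationale.SIGNAL_SEPARATION)
    if p.info_rate_hz > 1e-4 or p.predictability < 0.5:
        # fast or unpredictable signals -> sensors spread over the region
        out.append(Rationale.SIGNAL_SPACE_COVERAGE)
    return out

# A TPF-like target: 0.75 milli-arcsec isolation of a faint, slow signal.
tpf_target = Phenomenon("celestial sphere", 0.75, 0.2, 1e-6, 0.9)
print([r.value for r in suggested_rationales(tpf_target)])
```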
We identify three rationales behind multi-platform missions: signal separation, signal space coverage, and signal combination. The following subsections describe these.
2.1 Signal Separation
This rationale arises from a desire to separate signals related to the target phenomena both from each other and from extraneous signals, to account for signal isolation and information integrity issues respectively. For instance, the proposed Terrestrial Planet Finder (TPF) [4] will search for earth-sized planets orbiting other stars and detect key spectral signatures to find signs of life. To do this the mission needs a 0.75 milli-arcsec angular resolution. Thus the instrument needs to isolate signals that are 0.75 milli-arcsec apart on the celestial sphere. This isolation requirement motivates a tightly controlled formation of five spacecraft that simulates a spacecraft with a kilometer-wide telescope (see Figure 2), which either orbits around the L2 Lagrange point or trails behind the Earth.
On the information integrity side, faint sources on the celestial sphere motivate either large detectors or long measurement integration times to capture enough of the signal to separate it from background noise. In the case of faint high-information-rate sources, the only solution is multiple spacecraft to implement a large enough detector. For instance, four satellites are proposed for Constellation-X to take simultaneous observations of X-ray sources on the celestial sphere while orbiting the L2 point. By providing a large enough detector, this mission will be able to measure short-lived X-ray phenomena like flares around other stars and events around black holes. In terms of mission design, signal separation issues motivate actively flying spacecraft in formations around a reference orbit. This facilitates
implementing both kilometer-sized interferometers for signal isolation and multiple simultaneous remote sensors for improving information integrity. In both cases, the phenomena are remote with respect to the collection of spacecraft.
2.2 Signal Space Coverage
This rationale arises from a desire to use a sensor web that measures whole regions of the signal space related to a phenomenon often enough to account for high information rates or low information availabilities. For instance, the proposed Magnetospheric Constellation mission (MC) [5] will study how the magnetotail stores, transports, and releases matter and energy. Here information availability is fairly low because the magnetotail is unstable and prone to catastrophic phenomena like magnetospheric substorms, which are not precisely predictable. The only way to measure particle and field signals within such a phenomenon involves having probes on site when the substorm occurs, which motivates multiple probes spread over multiple orbits to maximize the probability of observing the phenomenon (see Figure 3).
On the high information rate side, the objective of the Global Precipitation Mission (GPM) [6] is to measure time-varying global rainfall. The main reason for an evenly distributed constellation of orbiters looking at the atmosphere involves a need to sample every point on the globe every 3 hours. This information rate is driven by the speed with which thunderstorms can form and dissipate. In terms of mission design, signal space coverage motivates distributing spacecraft evenly over a region either along the mission's orbits or about a reference orbit. The distribution facilitates implementing a sensor web to measure phenomena that are in-situ with respect to the population of spacecraft. For GPM and MC "in-situ" means continuous observation of the
12 entire atmosphere and large regions of the magnetotail respectively. For other missions with spacecraft clustered around a reference orbit “in-situ” means intermittent observation of a select region about the reference orbit (e.g. global electrodynamics and magnetospheric multi-scale missions). In either case, the spacecraft in the sensor web require formation knowledge for combining measurements to observe the underlying phenomenon, but precise formation control is not necessary. While some formation geometries are preferable to others, each formation has a high spacecraft positioning tolerance.
2.3 Signal Combination
While the previous rationales focused on single missions with multiple coordinated platforms, this rationale derives from attempts to get multiple missions with separate platforms to coordinate. One example involves getting five separate missions within the Earth Observing System to coordinate their observations (see Figure 4). For instance, CloudSat has a millimeter-wave radar to observe clouds and precipitation, and Calipso has a polarization-sensitive lidar for observing vertical profiles of aerosols and
clouds. Each mission was designed around separate questions, but combining signals enables answering questions about relationships between aerosols and precipitation. As this example implies, this rationale motivates missions flying in a close string-of-pearls formation, where there is a strict ordering of the spacecraft. The first spacecraft ignores all the rest, and each other spacecraft ignores its successors while flying in formation with its immediate predecessor. For instance, CloudSat flies in formation with Calipso, which flies with Aqua. The international science community is planning sixteen missions to Mars over the next ten years, and these missions will cooperate in multiple ways. Earlier missions will provide precision approach navigation for later missions, and real-time tracking for critical events like descent and landing or orbit insertion. Orbiters will provide relay services to landed assets and positioning services to rovers and other mobile "scout" missions. All missions will cooperate on radiometric experiments and maintain a common time reference for relating data between missions. These features have been conceptualized as a "Mars Network" of orbiting satellites [9]. While all missions will improve the potential for collecting data on Mars by placing multiple sensors, actually realizing this potential requires treating the multiple missions as a single meta-mission with signal combination from platforms distributed about Mars. Given that the landers and rovers use positioning information from orbiters, these missions can be characterized as a "string of pearls" where rovers follow positioning information from orbiters. In general, there is a tremendous similarity between the signal combination and signal separation rationales. Signals are combined to separate out the different phenomenon components of each signal. The only real difference between these two rationales derives from the underlying evolution of a program's mission set. Signal separation issues motivate multiple platforms for a single mission, and signal combination opportunities motivate launching new spacecraft that take advantage of the observations made by older spacecraft. Thus signal combination leads to a string of pearls with a predecessor relationship between the spacecraft instead of clusters where each spacecraft is cognizant of all its neighbors.
2.4 Multiple Rationales
Often a mission has more than one motivating question, and each question can involve a different class of phenomena, raising a different rationale for multiple platforms. For instance, the mission concept motivating TechSat-21, a US Air Force mission [2], involves a set of clusters of spacecraft evenly distributed on a circular orbit (see Figure 5). Multiple spacecraft cluster together to improve signal separation for radar imaging, and clusters break up to improve signal space coverage for enabling point-to-point communications.
Leonardo-BRDF, a proposed NASA mission [7], extends this by having all three rationales for multiple platforms. This mission involves a number of spacecraft observing the Earth with various optical sensors from a number of angles to determine how light reflected from the earth varies with the angle – the "Bidirectional Reflectance Distribution Function" (BRDF). To improve signal isolation, a larger spacecraft cluster improves the measurement of a location's BRDF by increasing the number of angles sampled over a short interval. On the other hand, a larger number of smaller spacecraft clusters improves signal space coverage, and letting investigators insert spacecraft with different sensors enables signal combination for an evolving mission.
3. GROUND OPERATIONS ISSUES
At its most abstract level, operating a spacecraft involves five feedback loops (see Figure 6). The tightest loop involves the guidance, navigation, and control (GN&C) system, which articulates the spacecraft hardware to satisfy commands like measuring a phenomenon or despinning a reaction
wheel. This system is subsequently controlled by the command and data-handling (C&DH) system, which passes commands to the GN&C to collect data and transmit it to the ground. The mission operations center takes this data and controls the C&DH by analyzing telemetry in the data to determine spacecraft health and sending up the next batch of commands to execute. The desired measurements that motivate these commands are specified by the science operations center, which takes the science component of past-transmitted data and poses new measurement requests. Finally, the scientific community controls science operations by taking science data products produced by the science operations center and posing questions that motivate generating new science products.
Current practices that place multiple instruments on a spacecraft complicate this process by breaking science operations into multiple instrument-operations teams to service different scientific communities. These teams compete for spacecraft resources and submit prioritized lists of measurement requests to mission operations, which tries to satisfy as many requests as possible. Another complication comes from multiple missions having to negotiate over access to deep space antennas. Here multiple mission operations teams schedule time on antennas weeks to months in advance to communicate with the C&DH systems of their respective spacecraft. The movement to multiple platform missions further complicates this process by increasing the number of GN&C and C&DH systems that the mission operations center has to manage. The main issues here involve reducing the rate at which the required mission operations staff grows with the number of spacecraft and overcoming cross-platform instrument-calibration and data-validation complexities within science operations.
Missions typically face a cost-risk tradeoff when focusing on operations. One way to keep this tradeoff under control involves using spacecraft that are made robust by an expensive overabundance of onboard resources; another involves underutilizing cheaper spacecraft by enforcing very conservative resource margins. Both keep risk constant while reducing operations cost by simplifying operations complexity. Unfortunately, neither performs well on the ultimate metric of cost per bit of scientific information: the first dramatically increases the spacecraft's costs, while the second decreases the amount of science data collected. The approach focused on here involves using automation.
3.1 Signal Space Coverage
Current work on multiple platform control automation has been spearheaded by companies like ORBCOMM [8], which operate constellations of 37 communications satellites (see Figure 7). This work focuses on a signal space coverage mission and treats each spacecraft as an isolated entity in order to automate as much of its mission operations as possible. While the result was impressive in that ORBCOMM was able to automate all but investigating anomalies and developing operational workarounds, the underlying problem was easier than that of a science mission. A communications satellite is simpler than a probe with a sensor suite designed to answer a number of scientific questions. Also, a communications constellation has only one goal, to transfer data from one location to another;
it lacks a science operations team to manage/calibrate instruments and to change the daily measurement regime for a scientific community. Thus ORBCOMM provides a point solution for a signal space coverage mission that has a large number of ground stations distributed around the planet. This distribution further simplifies the satellites by turning them into simple repeaters between a local ground station and a mobile terminal. This simplification, together with the single objective, facilitates automating most ground operations activities. Extending this solution to science missions involves improving anomaly detection and diagnosis techniques to handle more complex spacecraft, and adding planning and scheduling automation to manage these spacecraft as well as respond to new science requests.
3.2 Signal Separation
While the ORBCOMM approach might be extendable to missions with spacecraft distributed for signal space coverage, it does not extend well to missions with cluster or string-of-pearl formations – for signal separation. Formation flying spacecraft require GN&C systems that communicate in order to determine and control relative spacecraft positions and orientations. For instance, StarLight [10] will involve two formation flying spacecraft in an earth-trailing orbit to implement a large interferometer (see Figure 8). Each spacecraft has a large disk-shaped sunshade to keep the optics dark, and a collector spacecraft reflects light from a star to a combiner spacecraft. The combiner then uses this light with light directly from the star to measure an interference pattern to downlink. Combining multiple interference patterns results in generating a single image with enough resolution to see a star’s planetary system.
From the perspective of the GN&C and C&DH, the main issue revolves around precision and robustness. The spacecraft have to attain and maintain a formation that is a kilometer across with centimeter positioning precision and even greater positioning knowledge. Also, there is no such thing as a truly safe operating mode for a formation flyer. The standard technique of just pointing solar panels sunward and listening for commands is problematic if it results in formation flyers drifting apart. For instance, StarLight is only required to have the spacecraft fly at most 2 km apart, but the spacecraft cross-link is designed to support communications up to a 200-km distance just in case of drift during an anomaly. For the same reason, both spacecraft will be able to communicate directly to Earth. The mission operations center for a signal separation mission has its own issues to surmount. These involve optimizing observation ordering, minimizing anomaly response time, and maintaining coordination across multiple spacecraft. Since each observation requires time and propellant to reconfigure a formation, optimally ordering observations results in being able to gather more data. This need to minimize time and propellant usage also motivates a rapid response to anomalies. The farther the spacecraft drift apart during an anomaly, the more time or propellant it will take to get them back together. Both lost time and lost propellant result in lost observations. Further, mission operations has to craft coordinated sequences for multiple C&DH systems, and these sequences must respond appropriately to anomalies both within and between spacecraft. While sequence coordination is not much of a problem for the two-spacecraft StarLight mission, the five-spacecraft TPF (see Figure 2) will have coordination issues. Finally, the science operations center will have to validate measurements collectively taken by multiple spacecraft. This validation will involve more than just determining the health and calibration of a single instrument. Since instruments will be distributed across the cluster, cross-calibration is needed between spacecraft, in combination with calibrated cluster position, orientation, and configuration measurements.
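As a toy illustration of the observation-ordering point above, the sketch below picks the cheapest ordering of four observation geometries given invented reconfiguration costs; a real mission would optimize time and propellant jointly with far more sophisticated search than exhaustive enumeration.

```python
# Toy example: choose an observation order minimizing total (invented)
# formation-reconfiguration cost. Exhaustive search is fine for 4 targets.
from itertools import permutations

# symmetric reconfiguration costs between observation geometries A..D
COST = {frozenset(p): c for p, c in
        [(("A", "B"), 2.0), (("A", "C"), 5.0), (("A", "D"), 1.0),
         (("B", "C"), 2.5), (("B", "D"), 4.0), (("C", "D"), 3.0)]}

def plan_cost(order):
    """Sum the cost of each consecutive reconfiguration in the plan."""
    return sum(COST[frozenset((a, b))] for a, b in zip(order, order[1:]))

best = min(permutations("ABCD"), key=plan_cost)
print("".join(best), plan_cost(best))   # cheapest ordering and its cost
```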
3.3 Signal Combination
Signal combination missions have easier formation requirements, but the complexity moves into coordinating multiple science and mission operations centers for the collaborating missions. Here each spacecraft can fly in isolation, but the operations centers have to coordinate their command generation processes in order to maximize science collection not only within each mission, but also across all collaborating missions. For instance, consider EO-1 following less than a minute behind Landsat-7, as depicted in
Figure 9. Here EO-1 flies relative to Landsat-7, but Landsat-7 is oblivious to EO-1. In the case of EO-1, the coordination was fairly painless. All the Landsat-7 operations staff had to do was determine Landsat-7 targets in isolation and then pass them to the EO-1 operations crew. Since EO-1's goal was to test its instrument technologies, there was no need for EO-1 to affect Landsat-7's operation. In general this will not be the case, and operations centers will have to coordinate their command generation in order to facilitate answering questions that motivate coincident observations from multiple sensors on different spacecraft.
4. AUTONOMOUS OPERATIONS ISSUES
The previous section pointed out where segments in the spacecraft control structure are made more complex when adapted to a multiple platform mission. While communications companies have automated much of a constellation’s operations, their results do not directly apply to the more complicated evolving demands of science missions. Fortunately research within the space autonomy community has been focusing on automating the operations of complex missions. The question is, “How well will this technology generalize to complex multiple platform missions?” The main thrusts of autonomy research involve reducing costs and enabling missions that focus on phenomena with high information rates and
low information predictabilities. This research can be grouped in terms of three technologies:

- Robust execution includes performing activities with automatic mode estimation and recovery using models of how spacecraft subsystems behave, to broadly cover anomalies within the modeled subsystems;
- Planning and scheduling involves determining when to perform which activities as a spacecraft's capabilities and science collection goals evolve; and
- Science analysis involves processing observation data onboard a spacecraft to determine both the value of observations and new science collection goals.

While the first two technologies focus on raising the level at which mission operations commands a spacecraft, the third raises the level of science operations' interaction. Instead of prioritized observation lists and timed command sequences, mission and science operations respectively produce situation-dependent activity determination strategies and data-dependent observation strategies. The goal of raising the spacecraft commanding level is to reduce latency both in responding to anomalies and in detecting observation opportunities, by closing as many control loops as possible onboard the spacecraft. The TechSat-21 mission (Figure 5) will demonstrate onboard science analysis, replanning, robust execution, and model-based estimation and control [11].
Multiple platform issues arise upon considering how the three systems motivated by these technologies are distributed across the collection of spacecraft. The options vary from putting all three systems
on a single spacecraft that treats the others as slaves to putting all three systems on each spacecraft and having them collaborate as peers. Assuming a peer-to-peer approach, the multiple platform issues can be characterized in terms of implementing the horizontal interactions between systems in Figure 10. Among these interactions, those between execution systems are used to facilitate executing coordinated activities like formation flying and multiple platform observations. Those between planning systems similarly facilitate determining when to perform which coordinated activities, and those between science analysis modules facilitate both cross-platform data fusion and letting one platform send new science goals to another. While signal space coverage missions will have little need for the horizontal interactions, the other two rationales will motivate cross-links. The earlier operations issues mentioned for signal combination missions map onto a need to provide horizontal interactions between planning and science analysis systems, and those for signal separation missions at least motivate execution systems on each spacecraft with cross-links. In the case of those signal separation missions where cross-links periodically break and reestablish, like those to measure the magnetosphere, this intermittent communications loss also motivates distributing onboard planning/scheduling and science analysis systems that interact.
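The sketch below illustrates, under invented names and drastically simplified behavior, the peer-to-peer decomposition just described: each spacecraft hosts science analysis, planning, and execution, and the science analysis of one platform can pose new goals to a peer over a cross-link (a "horizontal" interaction in the sense of Figure 10). It is an assumption-laden toy, not flight software.

```python
# Hypothetical peer-to-peer sketch: three onboard systems per spacecraft,
# with horizontal links connecting like systems across platforms.
from __future__ import annotations

class Spacecraft:
    def __init__(self, name: str):
        self.name = name
        self.peers: list[Spacecraft] = []   # cross-linked platforms
        self.plan: list[str] = []

    # --- Science analysis: value data, pose new goals (possibly to peers) ---
    def analyze(self, observation: str) -> None:
        if "interesting" in observation:    # placeholder interestingness test
            for peer in self.peers:         # horizontal: share new goals
                peer.receive_goal(f"follow-up on {observation}")

    # --- Planning/scheduling: fold goals into the local activity plan ---
    def receive_goal(self, goal: str) -> None:
        self.plan.append(goal)              # conflict resolution omitted

    # --- Robust execution: run the next planned activity ---
    def execute_next(self) -> None:
        if self.plan:
            activity = self.plan.pop(0)
            print(f"{self.name} executing: {activity}")

a, b = Spacecraft("orbiter-A"), Spacecraft("orbiter-B")
a.peers, b.peers = [b], [a]
a.analyze("interesting plume observation")
b.execute_next()   # B executes a goal posed by A's science analysis
```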
5. COORDINATION CHALLENGES
Now we describe the coordination challenges for each of the three autonomy thrusts. As mentioned previously, depending on the rationale behind the mission, coordination may not be needed among components at all levels.
5.1 Execution
Coordinated measurement. Spacecraft that perform coordinated measurements often require constant communication and processing for cross-calibration and fault diagnosis and correction in both measurement and motion control.

Local and shared resources. The execution system must ensure that the spacecraft does not oversubscribe local and shared resources. In the case of orbiters, shared resources could be communication bandwidth to downlink data, memory to store data, or the spacecraft themselves for investigating a shared target. Surface explorers may additionally share physical space.
Uncertainty, failure, and recovery. The timing of events and consumption of resources can only be estimated. Activities can fail, subcomponents can malfunction, and the state of the spacecraft may need to be estimated, diagnosed, and corrected. During coordinated measurements, the spacecraft must also monitor and perform mode estimation and diagnosis on each other. If one spacecraft is failing to operate adequately, the execution systems may decide to restart a measurement, abort the coordinated activity, or continue with sacrificed accuracy or precision.
5.2 Planning and scheduling
Local and shared activities. Over a fixed or varying duration, an activity for a spacecraft can consume depletable metric resources (such as fuel or energy), use non-depletable metric resources (such as power), replenish resources (solar power), or change states (position, operating modes). The start time, duration, and state and resource changes of an activity may be functions of other variables (e.g., energy = power × duration). The environment may also change states and resource levels (e.g., day/night). The planner/scheduler is responsible for ensuring safe resource levels and states by adding, deleting, or rescheduling activities as motivated by science goals dictated by the science analysis module. Coordinating the planners in this respect requires that they resolve conflicts over shared states and resources as well as those involving joint activities that can violate local constraints. The planners must reach consensus on when and how they perform these joint activities.

Communication constraints. Inter-spacecraft communication and communication with the ground are limited in bandwidth and latency. Spacecraft can only communicate in windows determined by orbits and ground antenna availability. The planner must model these constraints and track the local power and memory resources that communication affects. Coordinated planning strategies that ignore these communication constraints may fail to establish consensus among the joint activities of the spacecraft.

Computation constraints. Different spacecraft have different processors and storage devices that are shared by different components. The performance of the flight computer is usually limited because it is designed for harsh environments. This heterogeneity will affect the usefulness of different coordination strategies.
For example, a centralized approach may perform better than a peer-to-peer approach for spacecraft with widely varying computational resources.

Uncertainty, failure, and recovery. A planner can estimate timing and resource consumption, but in order to forecast the effects of future events, it needs feedback from the execution system about the state and the success of activities. This feedback can result in broken commitments to other spacecraft, requiring re-coordination at the planning level.

Metrics. Spacecraft performance is evaluated according to scientific gain. This corresponds to the amount of data transmitted and the value of that data. The planner/scheduler is responsible for coordinating its activities with others to maximize the summed value of the downlinked data.
Cooperation / negotiation. The multiple spacecraft participating in a single mission may cooperate to answer the same scientific questions. (In many cases, however, different scientists manage different instruments on a single platform and negotiate over local resources on the spacecraft.) For multiple missions, planners may negotiate over shared resources, such as bandwidth to transmit data to the ground.
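To make the resource vocabulary of this subsection concrete, here is a toy model (invented names and numbers; not a flight planner): depletable resources such as fuel and battery energy are summed over the plan, while the non-depletable power resource is checked against the peak concurrent draw, using the energy = power × duration dependency noted above.

```python
# Toy activity/resource model: depletable totals plus a non-depletable
# power-peak check. All names, units, and limits are invented.
from dataclasses import dataclass

@dataclass
class Activity:
    name: str
    start: float          # s from plan epoch
    duration: float       # s (fixed here; may vary in practice)
    power_w: float        # non-depletable: drawn only while active
    fuel_kg: float = 0.0  # depletable: consumed permanently

    @property
    def energy_j(self) -> float:
        return self.power_w * self.duration   # energy = power x duration

def feasible(plan: list[Activity], battery_j: float, fuel_kg: float,
             peak_power_w: float) -> bool:
    """Check depletable totals and the non-depletable power peak."""
    if sum(a.fuel_kg for a in plan) > fuel_kg:
        return False
    if sum(a.energy_j for a in plan) > battery_j:   # ignores recharge
        return False
    # peak concurrent power draw; at ties, releases sort before claims
    events = sorted((t, dp) for a in plan
                    for t, dp in ((a.start, a.power_w),
                                  (a.start + a.duration, -a.power_w)))
    load = peak = 0.0
    for _, dp in events:
        load += dp
        peak = max(peak, load)
    return peak <= peak_power_w

plan = [Activity("slew", 0, 60, 40.0, fuel_kg=0.1),
        Activity("observe", 60, 300, 120.0),
        Activity("downlink", 400, 120, 80.0)]
print(feasible(plan, battery_j=1e5, fuel_kg=0.5, peak_power_w=150.0))
```

The same bookkeeping becomes a negotiation substrate once several planners must agree on shared windows and joint activities, which is where the consensus problems described above arise.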
5.3 Science analysis
Communication constraints. Distributed science analysis can involve the transfer of large images and must be designed around the communication constraints described earlier.

Computation constraints. Onboard science analysis can potentially be expensive if processing large images. A coordination strategy must adapt to the different computational capabilities of the spacecraft.

Uncertainty, failure, and recovery. An autonomous science analysis module may predict the value of science targets for closer investigation and/or decide whether to retry a failed investigation. It may also detect new, unexpected opportunities and decide how to distribute them to the spacecraft planners.
Metrics. Mission performance is measured in terms of both scientific gain and cost. A good strategy must address the previous coordination issues to handle science goals and analysis in a way that increases scientific value while reducing operations costs. The distributed analysis modules may increase science throughput by reporting only data that they judge to be interesting and downlinking only the interesting part of the data (e.g., by cropping images). This also reduces the costs associated with manually processing large datasets and images on the ground.

Cooperation / negotiation. Spacecraft may cooperate or negotiate to perform measurements for each other to increase the scientific value of their data.
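A minimal sketch of the crop-and-downlink idea above, assuming a NumPy-style image array; the interestingness test (a simple sigma threshold) and all names are placeholders rather than an actual onboard science algorithm.

```python
# Hedged sketch: score pixels, downlink only an interesting crop (or nothing).
import numpy as np

def interesting_crop(image: np.ndarray, threshold: float = 6.0):
    """Return a crop around anomalously bright pixels, or None if nothing
    passes the (toy) interestingness test."""
    score = (image - image.mean()) / (image.std() + 1e-9)
    ys, xs = np.where(score > threshold)
    if ys.size == 0:
        return None                      # nothing worth downlinking
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    return image[y0:y1, x0:x1]           # cropping saves downlink bandwidth

rng = np.random.default_rng(0)
img = rng.normal(100.0, 5.0, (512, 512))
img[200:204, 300:304] += 60.0            # injected synthetic "event"
crop = interesting_crop(img)
print(None if crop is None else crop.shape)   # e.g., (4, 4)
```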
6. CONCLUSIONS
We have described multiple platform space missions in terms of properties of a mission's scientific objectives. Regardless of the observation location, the rationale determines how the spacecraft populate the orbit. There are three rationales: signal separation, signal combination, and signal space coverage. These rationales respectively motivate a single cluster of spacecraft flying in formation around the orbit, a string of spacecraft flying close together on the orbit, and a distribution of spacecraft evenly spread along the orbit. Regardless of whether a standard or autonomous approach to mission management is adopted, several issues need to be addressed before flying a multiple platform mission. For a signal space coverage mission, the main issue is to automate as much of operations as possible to minimize the people-per-spacecraft ratio; this requires no special coordination technology. The main needs are anomaly detection and response automation, to reduce the effort of fixing intermittent anomalies, and planning and scheduling automation, to reduce the daily effort of handling new science requests. To this pair of issues, signal combination between missions adds the need to facilitate collaboration either between operations staffs or between autonomous spacecraft. These issues include collaboration techniques to merge observation priorities both within and between missions and coordination techniques to optimize the planned data-gathering activities of the multiple spacecraft satisfying these merged priorities.
Finally, signal separation missions raise their own unique issues that derive from formation flying and from instruments distributed across multiple spacecraft in order to make a single measurement. The main issues include the added difficulty of anomaly detection and response both within and between formation fliers, planning and scheduling to minimize the fuel used to reconfigure a formation between observations and during anomaly response, and validating data collected by multiple spacecraft. We then characterized the coordination problems autonomous multi-platform missions face at the execution, planning, and science analysis levels. In addition to the challenges listed, different missions may warrant different levels of autonomy, and coordination strategies must address how the human operator is involved. The rationale-based approach to analyzing multiagent domains may help characterize the coordination needs of some other domains. Although many domains, such as robotic soccer, are not clearly related to multi-spacecraft missions, autonomous unmanned vehicles serve similar roles to spacecraft. They are typically used to identify targets and to neutralize them (by taking measurements or attacking).
7. ACKNOWLEDGEMENTS
This work was performed at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. This work has benefited from discussions with a number of people including Stephen Talabac and Raymond Bambery.
8. REFERENCES
1. E. Mettler and M. Milman, Space Interferometer Constellation: Formation Maneuvering and Control Architecture, SPIE Denver '96 Symposium, 1996.
2. M. Martin and M. J. Stallard, Distributed Satellite Missions and Technologies – The Techsat 21 Program, Space Technology Conference and Exposition, Albuquerque, New Mexico, 28-30 September 1999 (AIAA 99-4479).
3. H. Helvajian (ed.), Microengineering Aerospace Systems, The Aerospace Press, El Segundo, CA, 1999.
4. C. Beichman, NASA's Terrestrial Planet Finder: The Search for (Habitable) Planets, presentation at the 193rd Meeting of the American Astronomical Society, Austin, Texas, 5-9 January 1999.
5. H. Spence et al., The Magnetospheric Constellation Mission Dynamic Response and Coupling Observatory (DRACO): Understanding the Global Dynamics of the Structured Magnetotail, NASA/TM-2001-209985, May 2001.
6. J. Adams, Planning for Global Precipitation Measurement (GPM), IGARSS (International Geoscience & Remote Sensing Symposium), Sydney, Australia, 9-13 July 2001.
7. J. Esper, W. Wiscombe, S. Neeck, and M. Ryschkewitsch, Leonardo-BRDF: A New Generation Satellite Constellation, 51st International Astronautical Congress, Rio, Brazil, June 2000.
8. J. Tandler, Automating the Operations of the ORBCOMM Constellation, AIAA/Utah State University Conference on Small Satellites, September 1996.
9. R. Cesarone, R. Hastrup, D. Bell, D. Lyons, and K. Nelson, Architectural Design for a Mars Communications & Navigation Orbital Infrastructure, paper presented at the Annual AAS/AIAA Astrodynamics Specialist Conference, Girdwood, Alaska, 16-19 August 1999.
10. W. Deininger et al., Description of the StarLight Mission and Spacecraft Concept, 2001 IEEE Aerospace Conference, March 2001.
11. S. Chien, R. Sherwood, M. Burl, R. Knight, G. Rabideau, B. Engelhardt, A. Davies, P. Zetocha, R. Wainright, P. Klupar, P. Cappelaere, D. Surka, B. Williams, R. Greeley, V. Baker, and J. Doan, The Techsat-21 Autonomous Sciencecraft Constellation Demonstration, Proceedings of i-SAIRAS 2001, Montreal, Canada, June 2001.
A FRAMEWORK FOR EVALUATION OF MULTI-AGENT SYSTEM APPROACHES TO LOGISTICS NETWORK MANAGEMENT
Paul Davidsson and Fredrik Wernstedt Department of Software Engineering and Computer Science, Blekinge Institute of Technology, Soft Center, 372 25 Ronneby, Sweden
Abstract: We study the applicability of multi-agent systems (MAS) to production and logistics network management. The goal is to create and evaluate sets of intelligent agents that can cooperatively support production and logistics network decisions, as well as to compare their performance to other, more traditional methods. A short description of supply chains is given, as well as a formal characterization of the problem space under investigation. We outline a general simulator that allows for a systematic evaluation of different multi-agent approaches across the different parts of this problem space. This is illustrated by a case study on district heating systems. A major concern in this domain is how to cope with the uncertainty caused by the discrepancies between the estimated and the actual customer demand. Another concern is the temporal constraints imposed by the relatively long production and/or distribution times. In the case study we show how to lessen the impact of these problems by the usage of agent clusters and redistribution of resources.

1. INTRODUCTION
Despite the fact that more parties are becoming involved in the supply chain, the process is becoming more clearly separated. Each party in the chain has become more specialized and is only responsible for the performance of a small part of the total process. Furthermore, the chain is
getting more dynamic as a result of the constant changes and movements that are part of today's supply chains. These characteristics indicate the need for sophisticated software systems that connect the logistics flows of individual companies [1]. As the supply chain is getting more fragmented, the need for communication and coordination grows. However, the concept of coordination has not been incorporated in system design until recently [2]. The common usage of enterprise resource planning (ERP) systems offers the promise of integration of supply chain activities. However, the flexibility of ERP systems has been less than expected or desired [3]. Advanced planning and scheduling (APS) is an upcoming development for controlling the logistics flow through the individual elements of a supply chain. Although it is not perfectly clear how to accomplish integrated control of supply chains, some commonly mentioned requirements are the possibility to monitor the state of all involved parties, to perform collaborative planning, to perform advanced scheduling, and to measure performance and cost. In order to satisfy these and related requirements, a number of agent-based approaches have been suggested. Several authors propose agents for auction-oriented management of supply chains: e.g., Fan et al. [2] provide a theoretical design that could plan the operations in a supply chain, and Hinkkanen et al. [4] focus on optimisation of resource allocation within a manufacturing plant. A rule-based approach has been proposed by Fox et al. [5], which concentrates on coordination problems at the tactical and operational levels. Preliminary work towards collaborative inventory management has been performed by Fu et al. [6]. Furthermore, agent-based approaches to simulating the dynamics of supply chains have been considered by, e.g., Parunak et al. [7] and Swaminathan et al. [8]. We argue that the time has come to develop methods and tools for systematic evaluation of these (and other) approaches on different types of supply chain management problems. The objective of this paper is to provide guidelines and a starting point for this type of research. In the next section we give a short description of supply networks and management philosophy. This is followed by a formal characterization of the problem space under investigation. We then outline a general simulator for production and distribution and argue that this type of simulator allows for a systematic evaluation of different multi-agent approaches across the different parts of this problem space. Finally, a case study concerning district heating systems is described. A major concern in this domain (and many other supply chains) is how to cope with the uncertainty caused by the discrepancies between estimated and real demand. Another concern is the temporal constraints imposed by the relatively long production and/or distribution times. We show how to lessen the impact of these problems by the usage of agent clusters and redistribution of resources.
2. SUPPLY CHAIN NETWORKS
A supply chain is a network of autonomous or semiautonomous suppliers, factories, and distributors, through which raw materials are acquired, refined and delivered to customers. According to the simplified view that we will adopt here, and which is illustrated in Figure 1, supply networks can be outlined as having an hourglass shape [7].
Of course, a supply chain is sometimes more complex, with multiple end products that share components or facilities, and the flow of material is not always along a tree structure, e.g., there may be various modes of transportation. Also, the distribution, manufacturing, and purchasing organizations along the supply chain often operate independently and have their own objectives, which may be in conflict. It can therefore easily be argued that coordinating their activities is of the utmost importance in attempting to achieve the desired global performance.

The purpose of a supply chain is to add value to its products as they pass through the supply chain (the input part) and to transport them into geographically dispersed markets in the correct quantities, at the correct time, and at a competitive cost (the distribution part). Supply chain management is concerned with the integration of purchasing, manufacturing, transportation, and inventory activities. It also refers to the integration of these activities across geographically dispersed facilities and markets, and, finally, to the temporal integration of these activities. The temporal horizon is usually separated into three levels: the strategic, the tactical, and the operational. Long-term decisions are made on the strategic level, e.g., the location and capacity of facilities in the supply chain. The tactical level depends on the strategic level and copes with medium-term decisions, e.g., distribution planning and inventory levels (buffer sizes). Finally, on the operational level, which depends on the strategic as well as the tactical level, the short-term decisions are made, e.g., the scheduling of local transportation. Typically, the types of decisions made on the operational level are similar to those made on the tactical level but are taken with a shorter time horizon in mind.
The decisions at the different levels are made in order to achieve one or more of the following goals: minimize production costs (e.g., by having an even production), minimize distribution and inventory costs, and maximize customer satisfaction. The overall goal is often to maximize profit, which almost always leads to a trade-off between these goals. Since this trade-off is application dependent, a specific balance between the goals is desired for a particular application.

A typical supply chain faces uncertainty in terms of both supply and demand. Thus, one of the most common problems faced by managers is to anticipate the future requirements of customers. Large errors in forecasts lead to large discrepancies between production and actual demand. This results in higher inventory costs, i.e., larger buffers are needed and/or the worth, or quality, of the products in the buffers decreases over time (deterioration). To deal with this problem, Just-In-Time (JIT) strategies have been developed. Monden [9] gives a brief definition of JIT as "producing the necessary items, in the necessary quantities at the necessary time". The benefits of JIT have been widely discussed in the literature. However, most success stories concern large manufacturers with stable demand, such as those in the automotive and electronics industries. One of the long-term aims of our work is to develop successful JIT strategies and JIT software tools for more dynamic situations.
3. A FORMAL CHARACTERIZATION OF THE PROBLEM DOMAIN
In this section, we formally define the problem space under investigation. We will restrict our attention to the distribution part of the supply network, see Figure 2.
Thus, we will not model complete supply chain networks; e.g., the details of the manufacturing process within the producer and the interaction with subcontractors are not modeled. However, we believe that the simplifications made do not change the applicability of the general approach we are suggesting. We divide the description of the production and distribution network into three parts: production, consumption, and distribution.

Production. Let $P$ be the set of all producers and $G$ be the set of all commodities. Then, for each pair of producer $p \in P$ and commodity $g \in G$, we denote: the production time $t^{prod}_{p,g}$; the production cost $c^{prod}_{p,g}$; the production capacity $k^{prod}_{p,g}$; and the production at time $t$, $q^{prod}_{p,g}(t)$.

Consumption. Let $C$ be the set of all customers. Then, for each pair of customer $c \in C$ and commodity $g \in G$, we denote: the consumption at time $t$, $q^{cons}_{c,g}(t)$; and the demand at time $t$, $d_{c,g}(t)$.

Distribution. Let the distribution network $D = (N, E)$ be a directed graph, where $N = P \cup I \cup C$ is the set of all nodes, $I$ is the set of all internal distribution nodes, $C$ is the set of all customers, and $E$ is the set of all edges. Here an edge corresponds to a distribution channel between two nodes (there may be more than one edge between two nodes), and the nodes of $N$ are indexed $n_1, \ldots, n_{|N|}$. Then, for each pair of edge $e \in E$ and commodity $g \in G$, we denote: the distribution time $t^{dist}_{e,g}$; the distribution cost $c^{dist}_{e,g}$; the distribution capacity $k^{dist}_{e,g}$; and the distribution at time $t$, $q^{dist}_{e,g}(t)$. For each pair of node $n \in N$ and commodity $g \in G$, we denote: the buffer cost $c^{buf}_{n,g}$; the buffer capacity $k^{buf}_{n,g}$; and the buffer usage at time $t$, $b_{n,g}(t)$. For each commodity $g \in G$, we denote the deterioration rate $\rho_g$.

Although this model assumes that production and distribution costs, etc., are linear, we argue that it is possible to describe many interesting production and distribution problems using it. Furthermore, there might be constraints and dependencies between different commodities concerning production, distribution, and buffer capacities. Note that the production, distribution, and buffer dynamics are part of the solution rather than the problem, and that the possible amount of consumption is governed by these dynamics.
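To make the formal model concrete, the following sketch encodes it as plain data structures in Python; the quantities defined above map onto the fields. All type and field names are illustrative choices made here, not notation or code from the paper.

from dataclasses import dataclass, field
from typing import Dict, List

Commodity = str
NodeId = str

@dataclass
class ProducerParams:
    production_time: float      # production time for (p, g)
    production_cost: float      # production cost per unit
    production_capacity: float  # maximum production per time step

@dataclass
class Edge:                     # a distribution channel between two nodes
    source: NodeId
    target: NodeId
    distribution_time: Dict[Commodity, float]
    distribution_cost: Dict[Commodity, float]
    distribution_capacity: Dict[Commodity, float]

@dataclass
class Node:                     # internal distribution node or customer
    buffer_cost: Dict[Commodity, float]
    buffer_capacity: Dict[Commodity, float]
    buffer_usage: Dict[Commodity, float] = field(default_factory=dict)

@dataclass
class DistributionNetwork:      # D = (N, E), with N = P, I, and C combined
    producers: Dict[NodeId, Dict[Commodity, ProducerParams]]
    internal_nodes: Dict[NodeId, Node]
    customers: Dict[NodeId, Node]
    edges: List[Edge]           # possibly several edges between two nodes
    deterioration_rate: Dict[Commodity, float]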
4. A GENERAL SIMULATOR FOR PRODUCTION AND LOGISTICS NETWORKS
We are currently developing a general simulator able to simulate all the relevant production and distribution problems that can be described by the formal model presented above. This includes problems at the strategic, tactical, and operational levels. Each part of the model corresponds to a set of explicit simulation parameters; thus, by setting these parameters, it will be possible to simulate an arbitrary production and distribution problem. The long-term goal is to systematically evaluate different agent-based approaches in the different parts of the problem space defined by the formal model (and the simulator). This ambition is in line with the more general ideas presented by Davidsson and Johansson [10]. We will, of course, also use the simulator to compare the performance of the agent-based approaches to more traditional approaches. Figure 3 illustrates the interaction between the control strategies and the general simulator: the control strategy manages production and distribution, while the simulator controls consumption and simulates production, distribution, and consumption.
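The split shown in Figure 3 suggests a simple programmatic interface between the two components. The following sketch is one hypothetical way to express it; the names (ControlStrategy, run_simulation, etc.) are ours, not the actual API of the simulator under development.

from typing import Protocol

class ControlStrategy(Protocol):
    # The strategy under evaluation manages production and distribution.
    def decide_production(self, state: "DistributionNetwork", t: int) -> dict: ...
    def decide_distribution(self, state: "DistributionNetwork", t: int) -> dict: ...

def run_simulation(strategy: ControlStrategy, state: "DistributionNetwork",
                   demand_model, horizon: int) -> list:
    """Run one episode; return a log of (time, production, distribution, demand)."""
    log = []
    for t in range(horizon):
        production = strategy.decide_production(state, t)      # controlled
        distribution = strategy.decide_distribution(state, t)  # controlled
        demand = demand_model(t)                               # simulated
        # A full simulator would apply capacities, delays, buffer updates,
        # and deterioration to the state here before the next step.
        log.append((t, production, distribution, demand))
    return log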
However, we have thus far investigated only a small part of this problem space, namely the part corresponding to district heating production and distribution. We will here focus on JIT production and distribution problems, which we define as situations where there is limited storage capacity or a high deterioration rate, together with a considerable delay from order to receipt of commodities (due to long production and/or distribution times). We have compared two different agent-based approaches in this domain, one semi-distributed and one centralized.
4.1 Case Study: Description of the District Heating Systems Domain
This case is borrowed from ABSINTHE, a current collaboration project with Cetetherm AB, one of the world-leading producers of district heating substations [11]. The technological objective is to improve the monitoring and control of district heating networks through the use of agent technology. For more information on this project, see www.ipd.bth.se/absinthe. The basic idea behind district heating is to use cheap local heat production plants to produce hot water. This water is then distributed, using one or more pumps, at approximately 1-3 m/s through pipes to the customers, where substations are used to exchange heat from the primary flow of the distribution pipes to the secondary flows of the building; see Figure 4. The secondary flows are used for heating both tap water and the building itself. In large cities, district heating networks tend to be very complex, including tens of thousands of substations and hundreds of kilometers of distribution pipes, with distribution times up to 24 hours.
Let us describe a district heating system in terms of the general model. We here present a description corresponding to how the parameters are set in the current version of the simulator, in which some simplifications have been made compared to actual district heating systems. (For instance, we assume that there is only one production plant; we intend to remove such restrictions in future versions.)

Production. Since there is only one heat production plant and the only commodity is energy (hot water), we will here use a simplified notation that omits the producer and commodity indices. We assume that raw materials are sufficient to support the production (up to the production capacity). Moreover, we assume that:
- the production time is negligible, i.e., 0 seconds,
- the production cost is 1 cost unit for each kWh, and
- the production capacity is larger than the total demand (see below for the definition of demand).
Consumption. The set of customers, C, consists of 10 customers: five serving 40 households each and five serving 60 households each. Moreover, we assume that the demand of each customer is composed of two parts: hot tap water demand and heating demand. The tap water demand is simulated using a model [12] based on empirical data, in which flow size and tapping duration are determined by drawing a random number $Y$ from a given distribution; with cumulative distribution function $F$, this can be done using uniformly distributed numbers via the inverse transform $Y = F^{-1}(U)$, $U \sim U(0,1)$. The times between tappings form a non-homogeneous Poisson process with time-varying opening intensity $\lambda(t)$, where the opening intensities are derived from measurement data [13] and from the distribution function for open-valve time. The resistance/capacitance model described below (under the deterioration rate) is used also for simulating the energy needed for household heating (radiators). The variation over time of the outdoor temperature is simulated using the model of [14], in which $T_m$ is the lowest temperature to expect, $T_v$ is the maximum temperature to expect, and $S$ is the time interval expressed in hours.
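As an illustration of the two sampling steps, the sketch below draws a value by inverse transform sampling and generates tapping times from a non-homogeneous Poisson process by thinning. The intensity function at the end is a made-up stand-in for the empirically derived opening intensities, not the model of [12] or [13].

import math
import random

def inverse_transform_sample(inverse_cdf):
    # Draw Y with cumulative distribution F by evaluating F^-1 at U ~ U(0, 1).
    return inverse_cdf(random.random())

def next_tapping_time(t, intensity, intensity_max):
    # Thinning: generate candidate events at the constant rate intensity_max
    # and accept each with probability intensity(t) / intensity_max.
    while True:
        t += random.expovariate(intensity_max)
        if random.random() < intensity(t) / intensity_max:
            return t

# Example: a crude daily cycle for the opening intensity (events per hour).
day_intensity = lambda t: 0.5 + 0.4 * math.sin(2 * math.pi * (t % 24) / 24)
t_next = next_tapping_time(0.0, day_intensity, intensity_max=1.0)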
Distribution. The distribution network, D, is assumed to be a tree structure with the production plant as the root and the distribution pipes corresponding to the edges. A mitigating characteristic of district heating systems is that the distribution time and cost between customers physically close to each other (situated in the same branch of the distribution tree) are negligible. We model this by internal cluster nodes, I, one for each cluster of neighboring customers; see Figure 5.
Moreover, we assume that:
- the distribution time is 1 h for edges between the producer and internal nodes, and negligible, i.e., 0, for edges between internal nodes and customers,
- the distribution cost is 0 for all edges,
- the distribution capacity of an edge e is greater than the total demand of the cluster to which e points,
- the buffer capacity is 0 for customer nodes (a customer has no means of storing hot water), while for an internal node n it is greater than the total demand of the cluster to which n belongs,
- the buffer cost is negligible, i.e., 0, and
- the deterioration rate is computed from the common resistance/capacitance model,
$T_x(i) = T_s + (T_x(i-1) - T_s)\,e^{-1/(R_{th} C_{th})}$,
where $T_x(i)$ is the temperature of an object $x$ at time $i$ which had temperature $T_x(i-1)$ one time unit ago in a surrounding environment with temperature $T_s$, and where $R_{th}$ is the thermal resistance and $C_{th}$ the thermal capacity.
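A minimal sketch of one step of this model, assuming the exponential-decay form given above (the thermal resistance and capacity values are placeholders):

import math

def rc_step(temp, temp_env, r_th, c_th, dt=1.0):
    # Temperature of an object dt time units later, decaying exponentially
    # toward the temperature of the surrounding environment.
    return temp_env + (temp - temp_env) * math.exp(-dt / (r_th * c_th))

# Example: hot water in a buffer cooling toward a 20-degree surrounding.
temperature = 90.0
for _ in range(5):
    temperature = rc_step(temperature, temp_env=20.0, r_th=2.0, c_th=3.0)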
4.2 Multi-Agent System Approaches to JIT Production and Distribution in District Heating Systems
There are a number of different approaches to solving the JIT production and distribution problem outlined above. The most basic approach (and probably the most used) is strictly centralized: the producer, based on experience, predicts how much resource to produce in order to satisfy the total demand, and these resources are then distributed directly on request from the customers. A bit more sophisticated is the approach where each customer predicts its future consumption and informs the producer about these predictions. Since local predictions typically are more informed than global predictions, this approach should give better results. The multi-agent system (MAS) architecture we suggest below partly builds upon this insight but also introduces a means for the automatic redistribution of resources.

In order to solve the problem of producing the right amount of resources at the right time, each customer is equipped with an agent that predicts future needs and sends these predictions to a production agent. The other problem, distributing the produced resources to the right customer, is approached by forming clusters of customers within which it is possible to redistribute resources quickly and at a low cost. This usually means that the customers within a cluster are located close to each other. In this way it is possible to cope with discrepancies between the predicted demand and the actual consumption, for instance when the demand of a customer changes while the resources are being delivered. The customer would then be faced with either a lack or an excess of resources, leaving the system in an undesired state. Based on the above insights, we used the GAIA methodology [15] to design the MAS. This led us to the architecture outlined in Figure 6, which has the following three types of agents:
- Consumer agents (one for each customer), which continuously (i) make predictions of future consumption by the corresponding customer and (ii) monitor the actual consumption, and send information about this to their redistribution agent.
- Redistribution agents (one for each cluster of customers), which continuously (i) make predictions for the cluster and send these to the producer agent, and (ii) monitor the consumption of resources by the customers in the cluster. If some customer(s) use more resources than predicted, the agent redistributes resources within the cluster. If this is not possible, i.e., the total consumption in the cluster is larger than predicted, it will redistribute the resources available within the cluster according to some criterion, such as fairness or priority, or it may take some other action, depending on the application.
- Producer agents (one for each producer; however, we will here only consider systems with one producer), which receive predictions of consumption and monitor the actual consumption of customers through the information received from the redistribution agents. If necessary, e.g., if the producer cannot produce the amount of resources demanded by the customers, the producer agent may notify the customers about this (via the redistribution agents).

The suggested approach makes use of two types of time intervals: (i) prediction intervals and (ii) redistribution intervals. A prediction interval is larger than a redistribution interval, i.e., during each prediction interval there
is a number of redistribution intervals. Each consumer agent produces one prediction per prediction interval and sends it to its redistribution agent, which sums the predictions of all consumer agents belonging to the cluster and informs the producer agent of the total. The predictions made by the consumer agents must reach the producer at least the distribution time before the resources are actually consumed; typically, there is also a production planning time that should be taken into account (i.e., added to this lead time). The coordination technique that we use is organizational structuring, i.e., the responsibilities, capabilities, connectivity, and control flow are defined a priori [16]. Organizational structure can be seen as a long-term, strategic load-balancing technique [17]. Malone and Crowston define coordination as "managing the dependencies between activities" [18]. The basic coordination process to manage in district heating systems is the producer/consumer relationship, i.e., the main dependencies are prerequisite constraints (some activity must be finished before another can begin), transfer (something needs to be transported), usability (one part of the system needs information produced by another part), and simultaneity constraints (some activities need to occur at the same time). A sketch of the resulting interval bookkeeping is given below.
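This is a minimal sketch of that bookkeeping for one cluster, assuming proportional sharing as a stand-in for the fairness or priority criteria mentioned above (all names and constants are ours):

from typing import Callable, List

PREDICTION_INTERVAL = 6  # redistribution intervals per prediction interval
LEAD_TIME = 2            # distribution time plus production planning time

def redistribution_step(t: int,
                        predictions: List[float],
                        consumption: List[float],
                        send_to_producer: Callable[[float, int], None]) -> List[float]:
    """One redistribution interval for a cluster; returns adjusted allocations."""
    if t % PREDICTION_INTERVAL == 0:
        # Forward the summed cluster prediction, early enough to cover the
        # distribution and production planning times.
        send_to_producer(sum(predictions), t + LEAD_TIME)
    # Redistribute the predicted cluster total in proportion to observed
    # consumption; a real agent would apply fairness or priority criteria.
    total, used = sum(predictions), sum(consumption)
    if used == 0:
        return predictions
    return [total * c / used for c in consumption]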
4.3 Simulation Results
The focus of the initial simulation experiments [19] was to see how the quality of service (QoS), measured in terms of the number of restrictions issued, varied with the amount of excess production (relative to the predicted consumption). We found that the MAS performed well, coping with faulty predictions even though the discrepancy between the predicted and the actual consumption was sometimes quite large. We also found that, to achieve the same QoS, the centralized approach required an additional 3% of excess production. Figure 7 shows the total number of restrictions to tap water and the number of restrictions to water for household heating (radiators) during one day for different degrees of surplus production in a cluster of ten consumer agents. We see that there is a clear trade-off between the quality of service (number of restrictions) and the amount of surplus production, and that there are almost no restrictions of any kind when 4% more hot water than the predicted consumption is produced. Moreover, this approach tolerates large fluctuations in customer demand, something that has been argued not to suit JIT approaches [20].

We compared this approach to a more centralized approach without redistribution agents, in which all computation is carried out on the producer side; the only task of the agents on the customer side was to read sensor data and send them to the producer agent. Simulation results showed that more than 7% overproduction was needed in order to avoid any restrictions/shortages. For more information about this approach and the simulation results, see [19].
5. CONCLUSIONS AND FUTURE WORK
We have presented an outline of a general simulator based on a formal model of production and logistics networks. The goal is to implement such a simulator and use it for the systematic evaluation of both new and existing agent-based approaches to supply chain management. We also described the first step towards this goal in the form of a case study. In the near future we plan to generalize the current version of the simulator to cover more types of production and distribution domains. We would also like to be able to consider environmental aspects of agent-based systems in supply chains, e.g., dynamic support for alternative transportation routes. We also plan to implement other agent-based approaches as well as improve the one presented here. Furthermore, an important item on the research agenda is to identify and analyze alternative coordination processes, e.g., mechanisms that support JIT strategies.

Acknowledgement. This work has been financially supported by VINNOVA (The Swedish Agency for Innovation Systems).
REFERENCES
1. Knoors, F.: Establish the Door-To-Door Management System, European Commission, DG TREN FP, D2D, WP3, Sequoyah International Restructuring, 2002.
2. Fan, M., Stallaert, J., Whinston, A.B.: Decentralized Mechanism Design for Supply Chain Organizations Using an Auction Market, to appear in Information Systems Research.
3. Shapiro, J.F.: Modeling the Supply Chain, Duxbury, 2001.
4. Hinkkanen, A., Kalakota, R., Saengcharoenrat, P., Whinston, A.B.: Distributed Decision Support Systems for Real-Time Supply Chain Management Using Agent Technologies, Readings in Electronic Commerce, 275-291, 1997.
5. Fox, M.S., Barbuceanu, M., Teigen, R.: Agent-Oriented Supply-Chain Management, The International Journal of Flexible Manufacturing Systems, 12: 165-188, 2000.
6. Fu, Y., Souza, R., Wu, J.: Multi-agent Enabled Modeling and Simulation Towards Collaborative Inventory Management in Supply Chains, Winter Simulation Conference, 2000.
7. Parunak, H.V.D., Savit, R., Riolo, R.L.: Agent-Based Modeling vs. Equation-Based Modeling: A Case Study and Users' Guide, Multi-Agent Systems and Agent-Based Simulation, LNAI 1534, 10-25, Springer Verlag, 1998.
8. Swaminathan, J., Smith, S.F., Sadeh, N.M.: Modeling Supply Chain Dynamics: A Multiagent Approach, Decision Sciences, 29(3), 1998.
9. Monden, Y.: Toyota Production System: An Integrated Approach to Just-In-Time, Industrial Engineering and Management Press, Georgia, 1993.
10. Davidsson, P., and Johansson, S.: Evaluating Multi-Agent System Architectures: A Case Study Concerning Dynamic Resource Allocation, Third International Workshop on Engineering Societies in the Agents' World, Madrid, Spain, 2002.
11. Wernstedt, F., and Davidsson, P.: An Agent-Based Approach to Monitoring and Control of District Heating Systems, Developments in Applied Artificial Intelligence, LNAI 2358, 801-812, Springer Verlag, 2002.
12. Arvastson, L., and Wollerstrand, J.: On Sizing of Domestic Hot Water Heaters of Instantaneous Type, 6th International Symposium on Automation of District Heating Systems, 1997.
13. Holmberg, S.: Norrköpingsprojektet – en detaljerad pilotundersökning av hushållens vattenförbrukning, Technical Report M81:5, Department of Heating and Ventilation Technology, Royal Institute of Technology, Sweden, 1981 (in Swedish).
14. Ygge, F., and Akkermans, H.: Decentralized Markets versus Central Control: A Comparative Study, Journal of Artificial Intelligence Research, 11: 301-333, 1999.
15. Wooldridge, M., Jennings, N.R., Kinny, D.: The Gaia Methodology for Agent-Oriented Analysis and Design, Journal of Autonomous Agents and Multi-Agent Systems, 3(3): 285-312, 2000.
16. Nwana, H., Lee, L., Jennings, N.R.: Coordination in Software Agent Systems, The British Telecom Technical Journal, 14(4): 79-88, 1996.
17. Durfee, E.H., Lesser, V.R.: Planning Coordinated Actions in Dynamic Domains, Technical Report COINS-TR-87-130, Department of Computer and Information Science, University of Massachusetts, Amherst, 1987.
18. Malone, T.W., and Crowston, K.: The Interdisciplinary Study of Coordination, ACM Computing Surveys, 26(1), 1994.
19. Davidsson, P., and Wernstedt, F.: A Multi-Agent System Architecture for Monitoring and Control of District Heating Systems, to appear in The Knowledge Engineering Review.
20. Hall, R.W.: Zero Inventories, Dow Jones-Irwin, Illinois, 1983.
CENTRALIZED VS. DECENTRALIZED COORDINATION: TWO APPLICATION CASE STUDIES Tom Wagner, John Phelps, and Valerie Guralnik Honeywell Laboratories Minneapolis, MN 55418
[email protected]
Abstract
This paper examines two approaches to multi-agent coordination. One approach is primarily decentralized but has some centralized aspects; the other is primarily centralized but has some decentralized aspects. The approaches are described within the context of the applications that motivated them and are compared and contrasted in terms of application coordination requirements and other development constraints.
1. Introduction
In this paper we examine two different approaches to multi-agent coordination in the context of two different multi-agent applications. In one application, a centralized approach is used to coordinate the activities of the different agents. In the other, a distributed approach is used, but, interestingly, the distributed approach still contains elements of centralization or global synchronization. These two applications are not a complete set of application classes for multi-agent systems (MAS), but they illustrate possible points in a larger continuum. The interesting element of the applications is how different domain characteristics lead to different approaches to coordination.

Before we delve into the applications, it is worthwhile to ask the basic question of "what is coordination and when do we need it?" Typically, a multi-agent system (MAS) model of development is pursued when distributed processing and distributed control are required. As with other distributed processing models, one important problem in MAS research is how to obtain globally coherent behavior from the system when the agents operate autonomously and asynchronously. In general, when the agents share resources or the tasks being performed by the agents interact, the agents must explicitly work to coordinate their activities.1

Consider a simple physical example. Let two maintenance robots, R1 and R2, be assigned the joint task of moving a long table from one room to another. Let both robots also have an assortment of other independent activities that must be performed, e.g., sweeping the floor. Assume that neither robot can lift the table by itself. In order for the robots to move the table together they must coordinate their activities by 1) communicating to determine when each of the robots will be able to schedule the table moving activity, 2) possibly negotiating over the time at which they should move the table together, 3) agreeing on a time, 4) showing up at the table at the specified time, 5) lifting the table together, and so forth. This is an example of communication-based coordination that produces a temporal sequencing of activities, enabling the robots to interact and carry out the joint task (over a shared resource – the table). Without the coordination process, it is unlikely that the table would ever be moved as desired unless the robots randomly decided to move the table at the same moment in time. Note that if the robots are designed to "watch" each other and "guess" when the other is going to move the table, this is an instance of coordination by plan inference and still counts as a coordination episode. In general, achieving global coherence in a MAS where tasks interact requires coordination.

In the robot/table example, the coordination episode is peer-to-peer. Imagine now a room full of maintenance robots, each having multiple joint tasks with other agents and all sharing physical resources such as tools and floorspace or X/Y coordinates. Without coordination, said room full of robots would have much in common with a preschool "free play session": robots moving about, unable to perform tasks because obstacle avoidance systems keep diverting them from their desired directions or because a required tool is unavailable. There are two primary ways to coordinate this room full of robots – either in a distributed peer-to-peer (or group-to-group) fashion or in a centralized fashion. When coordination is distributed, each agent is responsible for determining when to interact with another agent and then having a dialog to determine how they should sequence their activities to achieve coherence. When coordination is centralized, generally one agent plans for the others or manages a shared resource. Note that in the example above coordination focuses on when to perform a given task. Coordination can also be about which tasks to perform, what resources to use, how to perform a task, and so forth.

While the robot domain is good for illustrating the coordination problem conceptually, the need for coordination is not limited to robots. Software agents, humans, and systems composed of mixes of agents, humans, and robots [20] all need some kind of coordination. When the tasks or activities of different parties interact, in a setting where control is distributed (the parties are autonomous), coordination is needed. A toy sketch of the kind of scheduling agreement at the heart of the table-moving episode is given below.
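In this sketch, each robot simply exposes the time slots at which it could schedule the joint task; it is purely illustrative and not a mechanism proposed in the paper.

def negotiate_joint_time(free_slots_r1, free_slots_r2):
    # Each robot offers its free slots; the pair agrees on the earliest
    # slot they have in common, or fails and renegotiates later.
    common = sorted(set(free_slots_r1) & set(free_slots_r2))
    return common[0] if common else None

slot = negotiate_joint_time({3, 5, 8}, {4, 5, 8})  # -> 5: both lift at time 5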
In this paper we examine two different MAS applications and the coordination techniques that are used to achieve global coherence in each. One application, I.L.S.A., is a multi-agent elder caregiver system. The other is a system for the dynamic coordination of distributed aircraft service teams. The coordination requirements in these two systems are similar – achieve global coherence, and do so in "real-time" (response time fast enough for the application). However, the coordination solutions implemented in the two systems are different, and these differences are driven by the different characteristics of the underlying problem spaces.
2. Coordination in I.L.S.A.

2.1 The Application
I.L.S.A. is a multi-agent system that monitors an individual in his/her home and automates aspects of the caregiving process. For instance, I.L.S.A. may notify a caregiver if the individual being monitored goes too long without eating. I.L.S.A. and systems like it have been of recent interest to both the research and commercial communities [1–3, 6, 8, 13, 14, 21, 22]. I.L.S.A. is currently deployed at several caregiver facilities and homes and is undergoing evaluation. In this section we explore the general coordination issue in the elder care problem space and discuss the response planning and coordination portions of I.L.S.A.

In I.L.S.A., different agents are responsible for sensing, reasoning about the sensor data, deciding on a course of action, and interacting with the caregivers and the individual for whom the system is providing care. Without coordination, it would be possible for I.L.S.A. to exhibit undesirable characteristics. For instance, the agents might overburden the individual being cared for by issuing a series of separate reminders all within a few minutes of one another, e.g., "it is time to take your medication," "it is time for lunch," "you haven't been moving as much today," "I believe the oven is on." Imagine the impact of having a telephone ring or a beeper go off every few minutes with a different reminder. Worse still, imagine if the client were to fall and the emergency notification to a caregiver were delayed so that the client could be reminded to eat his/her lunch, or if that same client, unable to get up, had to listen to the phone ring because of a reminder that it is time to take a nap.

I.L.S.A.'s MAS requires coordination. Through coordination we can organize the responses produced by the different agents and achieve globally coherent behavior for the system as a whole. For the purposes of this discussion we will use the moniker "Lois" to denote the individual of whom I.L.S.A. is taking care and the term "caregiver" to denote a health care professional who may assist I.L.S.A. in dealing with certain situations.
In what immediately follows, we provide more detail about the agent coordination problem in I.L.S.A., discuss the process of selecting between distributed and centralized coordination, define the coordination technique used in I.L.S.A. and identify future coordination issues.
2.2 Response Planning and Coordination in I.L.S.A.
In the current implementation of I.L.S.A. the coordination problem pertains to organizing activities over shared resources – namely the interfaces to the caregivers and Lois. There are other candidates for coordination that were not
addressed, including the coordination of multiple I.L.S.A. instances and the coordination of agents within I.L.S.A. over task relationships. I.L.S.A.'s overall architecture is documented in [9]. Here we focus on the subset of the architecture that deals with 1) deciding how to respond to a given situation that exists with Lois, and 2) coordinating the different responses that may be ongoing at any given moment in time. This functionality is loosely called response planning and coordination, and the subset of the architecture that performs these functions is shown in Figure 1.

Response planning in I.L.S.A. is carried out by multiple domain agents, each of which has a particular area of expertise. For instance, there is a domain agent that monitors Lois' medication, issues reminders to her if she forgets to take her medication on schedule, and issues notifications and alerts to caregivers if Lois does not correct her medication situation. Other domain agents specialize in toileting, eating, falls, mobility, and sleeping, to name a few. The domain agents receive multiple different types of input from other agents in the system. Specifically:
- The intent recognition agent [7] provides information on the plans Lois is likely to be performing. For instance, Lois might be cooking dinner with some specified probability.
- The sensor clustering agent provides filtered sensor data, e.g., toilet flushes or information from motion detectors. This agent also provides the clock pulse.
- The database agent provides the domain agents with information on the conditions for which they are to monitor (e.g., track the consumption of medication X) and how they should respond if a particular circumstance arises. For instance, if Lois fails to take her medication, first remind her, then notify caregiver X using device Y; then, if she remains inappropriately unmedicated, issue an alert to caregiver Z using device K.
- Domain agents also receive feedback from a response coordinator agent (described below) and results back from the device agents when appropriate (e.g., an indication that caregiver X has accepted an alert and will handle the condition).

From this data the domain agents monitor the environment, monitor Lois' condition, and respond when necessary. Domain agents respond to problems by interacting with Lois or one or more of the caregivers. Responses fall into four categories: reminders (to Lois), notifications (to caregivers, of some condition), alerts (to caregivers, of some serious condition), and alarms (to caregivers, of some life-threatening condition). The domain agent responses are the coordination focal point. The response coordination problem in I.L.S.A. has the following properties:
- The domain agents operate asynchronously and autonomously and generally do not interact to solve problems. The implication is that any domain agent may generate an action request at any time. For our purposes it is sufficient to view action requests as requests to interact with a caregiver or Lois via one of the UI devices.
- The agents share several resources, namely the UI devices and, indirectly, the caregivers and Lois. In addition to the resource interaction, action requests potentially intersect on a temporal basis. In this case, for there to be a resource coordination problem the requests must occur within a certain temporal proximity to one another or within a system state window (below).
- The response space for a particular individual caregiver or a particular client (Lois) is relatively sparse, i.e., the number of interactions that the system needs to have with any given individual is small over time. We can make this characterization because the endpoints of these action requests are human and bounded by human capacity. For instance, Lois cannot process 10,000 interactions a minute, nor can a given care provider.
- In contrast, the load on a given device may be relatively high compared to the load on an individual caregiver or Lois. Consider an I.L.S.A. installation where the only interaction medium for Lois and all caregivers is the telephone. The telephone itself may be a bottleneck unless usage is coordinated accordingly.
- In addition to interacting over shared resources, action requests interact with each other through priority. For instance, if the falls domain agent issues an alert because it has determined that Lois has fallen, the alert condition should take precedence over a previously issued, but not yet fulfilled, request to notify a caregiver that Lois is behind on her medication.
- Action requests also interact through system state. For instance, if the panic agent issues an alarm because Lois has pressed her panic button, reminders from the other domain agents (to Lois) must be suppressed.
- Different responses have different human interaction protocols that I.L.S.A. must follow. For instance, reminders go to Lois. Notifications are sent to caregivers, but the caregivers do not need to explicitly acknowledge them. Alerts are sent to caregivers and require explicit acceptance – plus, alerts cause I.L.S.A. to cycle over a list of caregivers until one is found who will accept responsibility for handling the alert.
- Alarms are sent to a list of selected caregivers all at once, in parallel. Each of these responses is further conditioned by device, e.g., caregiver Bob might always want to be notified via cell phone rather than beeper.
- One of the design goals of I.L.S.A. is to keep the architecture open so that third parties can add custom or enhanced functionality.
- The domain agents will execute in close proximity to one another or will have a reliable network connection between them. This is necessary for them to obtain information about the environment and to monitor Lois.

On the surface, the response coordination problem space in I.L.S.A. is akin to that found in the UMASS IHome [15] project. In IHome, agents manage different appliances, such as the hot water heater, dishwasher, washing machine, and coffee maker, and coordinate their activities to improve the quality of life of the occupants, e.g., making sure that there is sufficient hot water for morning showers while still getting the laundry done and the dishes washed. In IHome, coordination centers on shared resources like hot water, electricity, noise, and a shared (simulated) mobile robot that can perform selected tasks within the environment.

In both IHome and I.L.S.A., two primary approaches for solving the coordination problems exist – centralized and distributed. In IHome both centralized and distributed approaches are used, and the motivation for each selection is better understood today than it was at the time the decision was made. For resources that are characterized as 1) not centralized and 2) not heavily contested, a decentralized approach is used. In the decentralized approach, agents are responsible for coordinating on their own behalf, i.e., there is no single agent that serves as the moderator or controller for that resource. The noise resource is an example of a resource over which the agents coordinate in a distributed fashion. For resources that are centralized, such as hot water, which is produced locally by the hot water heater, an agent is assigned the task of controlling and coordinating usage of the centralized resource. In the case of hot water, a hot water heater agent handles the allocation of hot water to individual agents for their use.

What are the important differences between the hot water resource and the noise resource in IHome? How do these differences relate to I.L.S.A.? The hot water resource is inherently centralized. The noise resource is inherently distributed (spatially). This is an important distinction because not all agents that make noise actually need to coordinate in IHome. For instance, the television agent would not need to coordinate with the vacuum cleaning agent if they are located in different rooms of the home. In this case, a centralized coordination mechanism is unnecessary and undesirable (assuming that the process of determining whether or not the agents need to coordinate is low cost). In
contrast, the hot water used by the dishwasher always comes from the same source as the hot water used by the shower, which always comes from the same source as the hot water used by the washing machine, and so forth. In this case, these agents always need to coordinate their activities, and their coordination will often span more than an individual pair of agents. Additionally, in IHome, hot water proves to be a fairly contentious resource, so centralized coordination serves to reduce message traffic and coordination overhead. To summarize, the axes along which to evaluate a coordination approach suggested by IHome include 1) how many agents use a given resource, 2) whether the resource is inherently centralized or distributed, and 3) how much contention there is for the resource. Another issue that arises in IHome is when multiple resources are required by a given agent before any processing can be performed, e.g., the washing machine needs noise, electricity, and hot water resources to function. This class of resource issues does not occur in I.L.S.A., though the first three criteria are useful metrics for it.

Now consider I.L.S.A.'s coordination problem as specified above. If we distill that information and correlate related items, we can enumerate a subset of important issues to consider when evaluating I.L.S.A.'s coordination needs:
- All of the domain agent action requests interact over the set of UI devices (directly) and the set of caregivers and Lois (indirectly). Action requests also interact with each other through system state and priority. Note that system state is a global condition (e.g., the system is in an alarm state, so suppress all reminders to Lois).
- The number of action requests expected to be issued by the system is bounded on the upside by the sum of the processing capabilities of the caregivers plus the capability of Lois, i.e., the number of requests the system will generate and process will be small by computer standards.
- The action requests are lightweight activities that do not require large amounts of processing to evaluate. The agents do not need to engage in negotiation activities or perform complex cross-agent task sequencing activities.
- Because I.L.S.A. must adhere to a set of protocols for issuing notifications, alarms, alerts, and reminders, if domain agents are given direct access to the devices they must each implement portions of the protocols (some of the protocol requirements are implemented by the device agents).
- I.L.S.A. should support the addition of 3rd party / after-market agents to the system.
These properties can be distilled into a few key attributes:
- Nuisance factor – because the agents interact over a human shared resource, there is the added issue of minimizing the nuisance factor. Without coordination, for instance, it is possible for different domain agents to ring Lois' phone every two minutes to issue different reminders.
- Flexibility – individual tasks are not rigidly fixed temporally. They do not have hard deadlines but instead must be performed according to some overall policy, e.g., alerts must be issued as fast as possible.
- Global constraints exist and must be enforced. These include: 1) global system-wide policies for response management, e.g., when the system is in an alarm state reminders are not issued, and 2) global system-wide contact protocols for a given response type, e.g., for alerts, contact authorized contactees on their devices sequentially in order of priority, stopping when one contactee accepts the alert (a sketch of this protocol appears below).
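This sketch of the alert contact protocol uses hypothetical names of our own choosing (it is not the I.L.S.A. API): contact the authorized contactees sequentially, in priority order, on their preferred devices, stopping as soon as one explicitly accepts responsibility.

from typing import Callable, List, Optional, Tuple

Caregiver = Tuple[str, List[str]]  # (name, devices in preference order)

def issue_alert(alert: str,
                contactees: List[Caregiver],
                contact: Callable[[str, str, str], bool]) -> Optional[str]:
    """Return the name of the caregiver who accepted the alert, or None."""
    for name, devices in contactees:          # priority order
        for device in devices:                # preferred device first
            if contact(name, device, alert):  # True iff explicitly accepted
                return name
    return None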
- Computational predictability – the ratio of responses to computational processing power will in general be small, because all responses must eventually involve a human caregiver or client.
The "distribution to avoid a bottleneck" view doesn't apply here.

When evaluating a distributed coordination approach, other factors to consider include the higher cost of technology development, the application-specific potential for higher computational cost in agent control problem solving overhead and network bandwidth, algorithms that are generally much harder to debug, and the fact that in this application we can achieve a global view of the interaction space (the agents do not have the privacy concerns they have in e-commerce applications).

The end result? Centralization is appropriate for response coordination in I.L.S.A. This is due in part to the way in which the lightweight action requests may interact and to the ratio of requests to caregivers that the system will exhibit in order to be useful. Centralization provides an efficient path for response coordination that we believe will scale with the system in the case of I.L.S.A. Centralization is also particularly appropriate for I.L.S.A. because the action requests are lightweight and not strongly situated temporally, i.e., the domain agents do not need hard (fixed) real-time performance guarantees; thus the consequences of an action request being delayed by another are slight (if any). Centralized coordination is still required (versus no coordination) to ensure that the agents do not overwhelm the caregivers or Lois and to ensure that important messages are sent in a timely fashion. Centralization also enables us to encapsulate certain trusted behaviors in an agent of our own design. If coordination were performed in a peer-to-peer fashion, a poorly written or malicious 3rd party agent could wreak havoc with the system by not coordinating well (or at all). Centralization also encapsulates the implementation of the human interaction policies, i.e., only one agent needs to carry out the protocols for contacting Lois or a caregiver (e.g., first try the pager, then try the telephone; if there is still no response, contact the next listed caregiver using the telephone, etc.).

The centralized approach to coordination is implemented by the response coordinator agent, shown in Figure 1. This agent is responsible for accepting all action requests (reminders, notifications, alarms, alerts) from the domain agents, coordinating the requests, and interacting with the device agents as follows:
- If multiple reminders for Lois occur within some period of time, the reminders are multiplexed into a single message so that Lois only receives one phone call or one email message containing the set of reminders. Figure 2 illustrates the algorithm. The general idea is to create temporal buckets into which all requests that occur within their boundaries are tossed. When the edge of the temporal window arrives, the reminders contained in the bucket are multiplexed and sent. The implication is that some reminders will be delayed slightly; however, this is also the only way to ensure that Lois is bothered by a reminder at most once per time window (a sketch of this bucketing appears after this list).
- If multiple notifications for the same caregiver occur within the same interval of time, they are multiplexed using the same algorithm as for reminders, presented in Figure 2. However, the binning algorithm is endpoint dependent – notifications are inserted into different buckets, or bins, than reminders, and notifications to different caregivers are inserted into different buckets.
- When alarms and alerts are generated, they are immediately dispatched to the appropriate devices according to the appropriate protocol (which the response coordinator implements), e.g., contact all caregivers in parallel using their first contact device, then move the system into alarm mode until a caregiver resolves the situation or acknowledges the alarm.
- If the system is in a mode such that reminders to Lois are to be suppressed (e.g., an alarm has been issued), the action requests are returned to the domain agents instead of being issued to Lois. The domain agents are told why the requests were refused and are able to reissue them at some point in the future. If the condition has cleared in the meantime, the new requests are granted. Note that the task of tracking the system state and knowing when it has cleared is also centralized with the response coordinator.
- Note that device load is moderated by the binning algorithm shown in Figure 2. If device load becomes a performance issue, we can easily adjust the algorithm or implement other protocols, e.g., store low-priority messages for longer periods of time, by changing the response coordinator's control algorithm. (Currently, aspects of this class of concerns are also carried forward into the implementation by using priority queues at the device agent level so that alarms and alerts are always issued before reminders and notifications.)
- When action requests are issued, the device agents respond with an acknowledgment of the communication and possibly with information that the end user (client or caregiver) has generated. For instance, when an alert is sent to a caregiver, he/she can accept responsibility for the alert or acknowledge receipt without accepting responsibility. The data from the device agents is sent back to the response coordinator, who then forwards it on to the domain agent(s) if appropriate. The response coordinator needs to be in the device-back-to-domain-agent loop in order to manage system modes, e.g., alert mode (to determine that the alert mode has ended, the response coordinator needs the particulars of the caregiver's response).
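This is a sketch of the temporal-bucket idea of Figure 2, under the assumptions stated above: requests arriving for the same endpoint within a time window are collected in a bucket and sent as one multiplexed message when the window closes, and binning is endpoint dependent so that Lois' reminders stay separate from each caregiver's notifications. The names and the implementation details are ours.

import time
from collections import defaultdict

WINDOW = 5 * 60  # seconds; the "timeWindow" of the text

class Multiplexer:
    def __init__(self, send):
        self.send = send          # delivers one combined message to a device
        self.buckets = defaultdict(list)
        self.deadline = {}

    def request(self, endpoint, message):
        if endpoint not in self.deadline:     # first request opens the window
            self.deadline[endpoint] = time.time() + WINDOW
        self.buckets[endpoint].append(message)

    def tick(self):                           # called on periodic wakeups
        now = time.time()
        for endpoint in [e for e, d in self.deadline.items() if d <= now]:
            self.send(endpoint, self.buckets.pop(endpoint))
            del self.deadline[endpoint]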
Pseudocode for the response coordinator agent's control algorithm appears in Figure 3. To illustrate its use, consider an example. Assume that the current time is t0 and that the system is in a normal operations mode (i.e., not in an alert mode, and Lois is home). At t1 the medication agent issues a reminder to Lois to take medication Y because she is now half an hour past her scheduled dose time. Assume that no reminders have been issued for Lois within the last several hours. In response to this, the rc (response coordinator) will create a new bucket for reminders to Lois, put the reminder into the bucket, and set an alarm to wake up timeWindow time in the future – for this example let us assume that timeWindow has a value of five minutes. At t2 the eating agent issues a reminder to Lois to eat breakfast, as she has not done so and her usual breakfast time has passed. In response to this the rc checks for buckets, ascertains that one is currently active, inserts the eating reminder into the bucket, and goes back to waiting for the alarm it set when the first reminder arrived. At t3 the medication agent issues a new reminder to Lois to consume medication Z because she is now half an hour past that scheduled dose time also. The rc inserts the new reminder into the existing bucket and resumes waiting for its wake-up notice. At t1 + timeWindow the rc receives its wakeup notification from the time agent, wakes up, checks the state of its cached action requests, and issues the reminders to Lois. Assume in this scenario that Lois prefers reminders to come via telephone call and automated audio rather than via webpad because her vision is not very good and the webpad is difficult for her to focus on. The phone rings and I.L.S.A. reminds Lois to take medications Y and Z and to eat breakfast. Without the rc, I.L.S.A. would have called Lois three times during this scenario.

Now let the falls agent detect that Lois has fallen on the stairs. The falls domain agent sends the rc an alarm action request. In response to the request, the rc immediately shifts the system into an alarm state and passes the alarm to the appropriate caregivers. In the case of an alarm, all registered caregivers are notified in parallel on their first choice of device. Thus in this case I.L.S.A. sends the alarm to caregiver Betty via webpad, caregiver Bill via telephone, and caregiver Becky via pager. While I.L.S.A. is communicating the alarm to the caregivers, the mobility agent determines that Lois is not getting her usual amount of exercise and issues a reminder to her to move about and exercise. When the rc receives the reminder request, it checks the system state, determines that I.L.S.A. is currently dealing with an alarm situation, and rejects the reminder request. The mobility agent receives the refusal, does not clear its own trigger, and waits for some interval before trying again. In the interim, caregiver Bill has agreed to see to Lois and has indicated this to I.L.S.A. When the indication reaches the rc, it clears the system-wide alarm mode and sends notifications to the other caregivers that Bill is handling the situation.
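The scenario suggests the shape of the rc's control loop. The sketch below reconstructs its gist – mode switching, reminder suppression, and deferral to a bucket multiplexer like the one above – from the prose; it is our reading of the description, not the actual pseudocode of Figure 3 or the I.L.S.A. code.

from dataclasses import dataclass

@dataclass
class ActionRequest:
    kind: str      # "reminder", "notification", "alert", or "alarm"
    endpoint: str  # "lois" or a caregiver name
    text: str

class ResponseCoordinator:
    def __init__(self, multiplexer, dispatch):
        self.mode = "normal"
        self.mux = multiplexer    # bins reminders and notifications
        self.dispatch = dispatch  # immediate delivery via the device agents

    def handle(self, request: ActionRequest) -> bool:
        if request.kind in ("alarm", "alert"):
            self.mode = "alarm"              # suppress reminders system-wide
            self.dispatch(request)           # per protocol: parallel for alarms,
            return True                      # sequential cycling for alerts
        if request.kind == "reminder" and self.mode == "alarm":
            return False                     # refused; the agent may retry later
        self.mux.request(request.endpoint, request.text)
        return True

    def on_caregiver_response(self, accepted: bool) -> None:
        if accepted:
            self.mode = "normal"             # a caregiver is handling the situation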
In this example, the centralization of response coordination is what enables the system to multiplex notifications for Lois and to enforce system-wide policies about modes (e.g., when in an alarm state, don't issue reminders) and about contacting caregivers. Without a centralized rc, each domain agent would have to implement all the system-wide policies, and be trusted to do so, and each domain agent would have to dialog with the others to determine whether notifications or reminders could be multiplexed. This requires complex coordination control in each domain agent and a temporal control reasoning capability beyond that required with the centralized system. The bottom line is that centralization greatly simplifies the control in this application, and it ensures that trusted Honeywell agents implement the system-wide policies.

It is important to note that centralization may not always be desirable in the general case. Centralization can lead to a localized performance bottleneck and creates a system with a single point of failure. If the agents are spatially distributed, centralization also introduces network uncertainty, latency, and connectivity issues, so that even if the centralized coordinator is still online and functioning, agents who are unable to connect are unable to coordinate. Other issues with centralization include scale and computational limitations – if coordination requires deep temporal analysis, it is unlikely that any single agent could handle the coordination problem of even a fairly modest MAS. Other issues that are particularly important for agent-based e-commerce are localized control and information hiding or privacy. In general, we strongly advocate a distributed approach like that used in our supply chain research [26] and other research based on TÆMS [4], DTC agent scheduling [25], and GPGP agent coordination [4]. However, as enumerated above, I.L.S.A.'s requirements are different, and the characteristics of the current and near-term problem space led us to a centralized approach. It appears likely that other caregiver applications will have similar characteristics and motivate a similar centralized response coordination approach.
3. Aircraft Service Team Coordination

3.1 The Application
Aircraft returning from an engagement need refueling and potentially need new ordnance and repairs. The aircraft are serviced by multiple different service crews that have different capabilities, require different resources, and must coordinate to effectively prepare the aircraft for another mission. For instance, it may not be desirable to service the engines while new ordnance is being loaded. Similarly, it may not be possible to service the engines while refueling takes place. In contrast, it may be possible to overlap some activities, e.g., replacing the cockpit avionics while refueling the aircraft.
In this section, we explore the use of agent technologies to coordinate aircraft service teams and present a new TÆMS [5] coordination algorithm used to coordinate service team activity. The performance of the algorithm is compared to that of a centralized scheduling oracle that generates optimal schedules for the teams – though the centralized scheduling problem is exponential, the oracle provides a good basis for comparison on smaller problem instances. A screen image of the application is shown in Figure 4. Note the "busy" task structure in the center of the screen – it is part of the centralized coordination problem that the agents solve through distributed and local reasoning with partial views (the centralized view is that of the simulation environment). In the following sections, we specify the problem space, define a new key-based coordination algorithm, and compare the algorithm to the centralized oracle.
3.2 TÆMS and TÆMS Agents
We use the expression TÆMS agents to describe our agent technology because the cornerstone of our approach is a modeling language called TÆMS (Task Analysis Environment Modeling and Simulation) [5]. TÆMS is a way to represent the activities of a problem solving agent – it is notable in that it explicitly represents alternative ways to carry out tasks, it represents interactions between activities, it specifies resource use properties, and it quantifies all of these via discrete probability distributions in terms of quality, cost, and duration. The end result is a language for representing activities that is expressive and has proven useful in many different domains, including the BIG information gathering agent [18], the Intelligent Home project (IHome) [15], the DARPA ANTS real-time agent sensor network for vehicle tracking [11], distributed hospital patient scheduling [4], and others, such as distributed collaborative design, process control, agents for travel planning, and agent diagnosis.

Figure 6 shows portions of the TÆMS task structures for Mission Control and three of the service teams. Consider the Mission Control task structure. It is a hierarchical decomposition of a top-level goal, which is simply to Prepare and Launch Aircraft. The top-level goal, or task, has two subtasks, Prepare and Launch Wing1 and Prepare and Launch Wing2. Each of these tasks is decomposed into subtasks to service a particular aircraft in the given wing, e.g., Prepare F16.1 For Launch, and finally into primitive actions. Tasks are represented with oval boxes, primitive actions with rectangles. Note that most of the decompositions are omitted from the figure for clarity. The details are shown for the Prepare F16.1 For Launch task – it is decomposed into a single primitive action, Launch F16.1, which denotes the time required for Mission Control to launch the aircraft when the plane is ready.

The operative word here is ready. In order for a given aircraft to be launched on its next mission, it must be serviced, and the service activities are not carried out by Mission Control. In the figure, Mission Control's dependence on the activities of the service agents is denoted by the edges leading into Launch F16.1 from the actions of other agents. These edges, called enables in TÆMS, denote that the other agents must successfully perform their tasks before the Launch F16.1 activity can be carried out by Mission Control. These enables are non-local effects (NLEs) and identify points over which the agents must coordinate. The time at which Mission Control can execute Launch F16.1 depends on when the other agents perform their tasks. A different type of NLE exists between the Weapons Controls Repair agent and the Avionics Repair agent – the two F16.1 actions cannot be performed simultaneously, and that is another point over which the agents must coordinate. In this problem, this spatial/temporal interaction of the service
In this problem, this spatial/temporal interaction of the service teams is the coordination problem on which we focus. The former enabling-of-the-launch-task interaction only requires that the service agents notify Mission Control of when they plan to perform their activities, because in this application Mission Control sets and maintains deadlines and the other agents negotiate over the temporal/spatial mutual-exclusion (MUX) NLEs to satisfy the stated deadlines if possible. Note that within a task structure, deadlines and earliest-start-times are inherited (unless those lower in the tree are tighter), so the temporal constraints on Prepare and Launch Wing1 also apply to Launch F16.1. The same deadlines are propagated through the enables coordination to the service team agents – note that F16.1's engines must be serviced by 240 also.

Note that all of the primitive actions (leaf nodes) also have Q (quality), C (cost), and D (duration) discrete probability distributions associated with them. For simplicity, in this paper we do not use uncertainty, and all values will have a density of 100%. Repairing the engines of F16.1 thus takes 200 time units, while servicing the engines of F16.2, which are less damaged, requires 150 time units. The two activities produce qualities of 12 and 9, respectively. The sum() function under most of the parent tasks is called a quality-accumulation-function, or qaf. It describes how quality (akin to utility) generated at the leaf nodes relates to the performance of the parent node. In this case we sum the resultant qualities of the subtasks – other TÆMS qafs include min, max, sigmoid, etc. Quality is a deliberately abstract concept into which other attributes may be mapped. In this paper we will assume that quality is a function of the importance of the repair.2

In the sample task structure there is also an element of choice – this is a strong part of the TÆMS construct and important for any dynamic environment in which resources or time may be constrained. The Repair Aircraft Engines task, for example, has two subtasks joined under the sum() qaf. In this case the Engine Repair agent may perform either subtask, or it may perform both, depending on what activities it has time for and their respective values. The explicit representation of choice – a choice that is quantified by the discrete probability distributions attached to the leaf nodes – is how TÆMS agents make contextually dependent decisions.

By establishing a domain-independent language (TÆMS) for representing agent activity, we have been able to design and build a core set of agent construction components and reuse them on a variety of different applications (mentioned above). TÆMS agents are created by bundling our reusable technologies with a domain-specific component, generally called a domain problem solver, that is responsible for knowing and encapsulating the details of a particular application domain. It is sufficient to understand that TÆMS agents have components for scheduling and coordination that enable them to
1) reason about what they should be doing and when, 2) reason about the relative value of activities, 3) reason about temporal and resource constraints, and 4) reason about interactions between activities being carried out by different agents. A high-level view of a TÆMS agent is shown in Figure 5; everything except for the domain problem solver is reusable code. Note that each module is a research topic in its own right. The agent scheduler is the Design-to-Criteria [19, 25, 27] scheduler, and the coordination module is derived from GPGP [4]. Other modules, e.g., learning, can be added to this architecture in a similar (conceptual) plug-and-play fashion.
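To make the preceding description concrete, the sketch below encodes the engine-repair fragment of Figure 6 as a small object structure. This is a minimal rendering of our own, not the actual TÆMS implementation or its API; the launch action's quality and duration are placeholders, since the text does not give the figure's values for them.

# A minimal, illustrative encoding of a TAEMS-style task structure.
# Class and field names are ours, not the actual TAEMS library; Q/C/D
# are single-point distributions (density 100%), as in the paper.
from dataclasses import dataclass, field
from typing import Callable, List, Union

@dataclass
class Method:
    """Primitive action (leaf node) with quality, cost, and duration."""
    name: str
    quality: float
    duration: int
    cost: float = 0.0
    enabled_by: List["Method"] = field(default_factory=list)  # enables NLEs

@dataclass
class Task:
    """Internal node; a qaf maps subtask qualities to task quality."""
    name: str
    subtasks: List[Union["Task", Method]]
    qaf: Callable[[List[float]], float] = sum  # sum, min, max, ...

# Engine Repair agent's local view (values from the text):
repair_f16_1 = Method("Repair-Engines-F16.1", quality=12, duration=200)
repair_f16_2 = Method("Repair-Engines-F16.2", quality=9, duration=150)
repair_engines = Task("Repair-Aircraft-Engines", [repair_f16_1, repair_f16_2])

# Mission Control's launch action is enabled by the engine repair
# (placeholder quality/duration; not given in the text):
launch_f16_1 = Method("Launch-F16.1", quality=1, duration=10,
                      enabled_by=[repair_f16_1])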
3.3 Dynamic Aircraft Readiness
For the dynamic aircraft readiness (DAR) project, we simulated aircraft returning from an engagement and needing repairs and readiness operations to be performed. Three types of aircraft are modeled in the prototype: F16s, A10s, and C9 surveillance craft. When an aircraft returns it is potentially in need of (to varying degrees): 1) fuel, 2) missiles, 3) repairs to engines, 4) repairs to cockpit avionics, or 5) repairs to cockpit weapons controls.3 Each incoming aircraft is assigned a deadline by which it is to be ready for redeployment. Mission Control is responsible for assigning the deadline and for identifying the areas of the aircraft that need service.

There are five teams on the ground that ready the aircraft for their next mission. Each team is controlled by a coordination decision support agent that uses TÆMS agent technology to reason about what the team should be doing, when, and with which resources. In this scenario the following teams handle aircraft preparation: 1) refuel, 2) rearm (replaces depleted missiles), 3) avionics repair, 4) weapons controls repair, and 5) engines repair. As aircraft land, the Mission Control agent notifies the service teams of the aircrafts' service needs and readiness deadlines.
The agents then communicate with one another and reason, in a distributed fashion, about how their tasks may interact and how best to select and sequence operations so that the most aircraft can be ready by their respective launch times (if possible – not all problem instances contain fully satisfiable constraints). The agents perform this coordination using a new coordination-key algorithm presented in later sections. In this scenario the tasks required to repair an individual plane need not be performed in any specific sequence; however, there are sets of tasks that cannot be performed simultaneously because they involve the same geographic regions of the aircraft. For instance, the engines cannot be serviced while a plane is rearmed, as both of these activities take place on or near the wings. In contrast, avionics can be serviced while an aircraft is rearmed, because the avionics reside in the cockpit region and the rearming takes place on or about the wings. A full specification of task interactions is shown in Table 1.

Several characteristics of this problem instance make it a hard problem:

The situation is dynamic – it is unknown a priori in what state the planes will be when they return from their mission. Thus the agents must coordinate and decide which operations to perform in real time.

Agents must make quantified / value decisions – different tasks have different values and require different amounts of time and labor resources. For instance, it may not be necessary to refuel the aircraft before the next mission, but servicing avionics may be critical.

Coordination is dynamic – the operations being performed by the repair teams interact, and the occurrences of the interactions are also not known a priori. For instance, until an aircraft lands it is not known whether an engine will need servicing at the same time that a refueling crew is attempting to service the aircraft.

Deadlines are present – aircraft have a deadline by which repairs must be completed, and different aircraft may have different deadlines. Without
deadlines, an inefficient algorithm will generally still service all of the aircraft. Deadlines require the agents to reason about end-to-end processes and to coordinate with other agents to optimize their activities. (This type of agent coordination problem is conceptually dynamic distributed scheduling.)

Tasks are interdependent – tasks interact in two different ways: 1) over shared resources in a spatial/temporal fashion, and 2) multiple tasks must be performed to accomplish a goal (though in TÆMS this generally pertains to degrees of satisfaction rather than a boolean or binary value).

It is not always possible to meet all deadlines – not all problem instances are solvable, in the sense that in a given scenario the optimal solution may be to miss one aircraft deadline rather than miss many. When control is distributed, this characteristic can make it particularly difficult to converge on a solution, because it is difficult to know whether an optimal result has been achieved (without a complete, global view and a centralized scheduling technology). This characteristic means that it is fairly easy for a coordination algorithm to lead to many planes being partially serviced and none of them actually meeting their deadlines.

This problem instance requires three classes of simulation activities: 1) simulating the outcome of the last mission in terms of aircraft condition, 2) simulating the activities of Mission Control and the initial damage assessment team, and 3) simulating the activities of the repair crews. While a detailed description is beyond the scope of the paper, from a high level, the aerial battle is simulated using either a problem space generator or a human who selects aircraft from a palette and "breaks" them. The activities of Mission Control and the initial damage assessment team are captured in TÆMS task structures that are produced by the generation tools. In essence, the Mission Control agent "sees" an aircraft for the first time at its specified landing time, and at that same time a description of the aircraft's service needs is transmitted to Mission Control in TÆMS format. Mission Control then disseminates the information to the service teams.

The activities of the service teams are simulated using the TÆMS agent simulation environment [24]. In this environment the agents, which are distributed on different machines and execute as different processes, communicate and carry out simulated tasks. The simulated tasks, like real tasks, take a specified amount of time to execute and consume resources, e.g., replacing an avionics module of type 1 consumes one type 1 avionics module. Space precludes a detailed specification of tasks and attributes; however, it is important to note that different tasks require different resources,
different amounts of resources, and different amounts of time to perform. For instance, refueling an aircraft that is fully depleted requires more time and consumes more fuel (a resource). Other examples: repairing engines damaged to level 4 (heavily damaged) requires more time than repairing engines damaged to level 1 (lightly damaged), rearming four missiles requires more time than rearming two, etc. Similarly, different aircraft consume different resources, and not all aircraft need a particular class of service. For instance, the C9 surveillance aircraft does not carry missiles and does not contain a weapons controls module. In contrast, both the A10 and the F16 carry missiles and both have weapons controls modules, but the modules for the two aircraft are different and require different amounts of time to service. The teams themselves also maintain different resources, e.g., the refueling team is the only team that consumes the fuel resource. However, in the problem instance discussed in this paper the teams do not interact over consumable resources, so the coordination problem is one of spatial and temporal task interaction.

The characteristics of the solution to this particular application problem can be found in other problem domains. The underlying technical problem is to coordinate distributed processes that affect one another when the environment is dynamic and the coordination problem cannot be predicted offline / a priori but instead must be solved as it evolves.
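Table 1 itself is not reproduced here, but the prose above determines its content: the three wing-region tasks mutually exclude one another, as do the two cockpit-region tasks. A minimal sketch of that relation (task names are our own shorthand):

# Sketch of the spatial mutual-exclusion relation implied by Table 1
# (region assignments follow the prose; the table is not reproduced).
REGION = {
    "refuel":          "wing",
    "rearm":           "wing",
    "engine_repair":   "wing",
    "avionics_repair": "cockpit",
    "weapons_repair":  "cockpit",
}

def mutually_exclusive(task_a: str, task_b: str) -> bool:
    """Two service tasks on the same aircraft conflict iff they occupy
    the same spatial region of the plane."""
    return REGION[task_a] == REGION[task_b]

assert mutually_exclusive("refuel", "engine_repair")        # both wing
assert not mutually_exclusive("refuel", "avionics_repair")  # wing vs cockpit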
3.4 Coordination via Keys
The goals of coordination in the dynamic aircraft readiness application are: 1) to adapt to a dynamic situation, 2) to maximize the number of planes that are completely repaired by their respective deadlines, 3) to provide mutual access to shared physical resources, and 4) to achieve global optimization of individual service team (agent) schedules through local mechanisms and peer-to-peer coordination.
When examining the coordination problem, it became clear that this application domain has a property not generally found in TÆMS agent applications – for agents whose tasks interact, all of their tasks interact. By way of example, the engine repair, refueling, and rearming tasks all interact with one another. Similarly for the tasks that pertain to the cockpit: all avionics tasks interact with all weapons controls tasks. The implications of this property for coordination are that: 1) there is no reason for a service team that operates on the wing region to interact with a team that operates in the cockpit, and vice versa4; 2) agents that operate on the same spatial area (wing or cockpit) must always coordinate their activities. This translates into a discrete partitioning of the agents into coordination sets, i.e., CS1 = {Repair Engines, Refuel, Rearm} and CS2 = {Repair Avionics, Repair Weapons Controls}, where CS1 and CS2 are disjoint. Within each coordination set the tasks of the member agents form a fully connected graph via TÆMS non-local-effects. This means that for any agent of a given set, e.g., the engine repair agent of CS1, to schedule a repair task it must dialog with the other agents to ensure that mutual exclusion over the shared resource, e.g., the wing of plane F16.1, is maintained.

This coordination problem could be solved in typical GPGP [5, 4, 16] fashion. However, GPGP operates in a pairwise peer-to-peer fashion. For agents in CS1 this means that coordination could require a significant amount of time to propagate and resolve the interacting constraints, and it is unclear, given the dynamics of the environment and the speed with which coordination must occur, whether convergence on a reasonable, if suboptimal, solution would ever occur.5 Because of the strong interconnectedness of the tasks and the partitioning of agents into coordination sets, we developed a new algorithm for problem classes of this type.6

The algorithm uses a coordination key data structure and concepts from token-passing [23, 12] algorithms to coordinate the agents. The general operation of the algorithm is that there is one coordination key per coordination set, passed from agent to agent in a circular fashion. When an agent is holding the coordination key for its coordination set, it can 1) declare its intended course of action / schedules, 2) evaluate existing proposals from other agents, 3) confirm or negate proposals of other agents, 4) make its own proposals, or 5) read confirmations or negations of its own proposals by other agents. The coordination key itself is the vehicle by which this information is communicated. Each key contains intended courses of action, proposals, and proposal responses; this information is modified as the agents circulate the given key. The pseudo-code of the algorithm is shown in Figure 7.
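Figure 7's pseudo-code is not reproduced here; the following sketch conveys the shape of the circulation loop under our own naming. The agent-side methods are assumed hooks corresponding to the five capabilities just listed, not actual TÆMS API calls.

# Sketch of coordination-key circulation (our reconstruction, not the
# Figure 7 pseudo-code). One key exists per coordination set and is
# passed from agent to agent in a fixed circular order.
from dataclasses import dataclass, field

@dataclass
class CoordinationKey:
    schedules: dict = field(default_factory=dict)  # agent name -> intended schedule
    proposals: list = field(default_factory=list)  # pending slot-change requests
    responses: dict = field(default_factory=dict)  # proposal id -> "confirm"/"negate"

class ServiceTeamAgent:
    """No-op stub standing in for a TAEMS agent's coordination module."""
    def __init__(self, name):
        self.name = name
    def read_responses(self, key):      # capability 5
        pass
    def evaluate_proposals(self, key):  # capabilities 2 and 3
        pass
    def declare_schedule(self, key):    # capability 1
        key.schedules[self.name] = []   # would publish real intended actions
    def make_proposals(self, key):      # capability 4
        pass

def circulate(agents, key: CoordinationKey, circuits: int) -> CoordinationKey:
    """Each holder reads the outcomes of its own proposals, judges others'
    proposals, publishes its intended schedule, and posts new proposals."""
    for _ in range(circuits):
        for agent in agents:  # fixed circular order
            agent.read_responses(key)
            agent.evaluate_proposals(key)
            agent.declare_schedule(key)
            agent.make_proposals(key)
    return key

wing_set = [ServiceTeamAgent(n) for n in ("engines", "refuel", "rearm")]
circulate(wing_set, CoordinationKey(), circuits=2)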
The coordination key algorithm is effective but approximate and heuristic. The crux of the matter is that, in order for the agents to coordinate optimally over a single issue, e.g., when agent X should perform a given task, the key must circulate through the coordination set multiple times. The number of times that each agent must hold the key depends on the changes made during each iteration. In the worst case each agent will have to re-sequence each of
its activities once for every change that is made; however, because these changes propagate to the other agents as the key circulates, the circulation-to-convergence factor is a constant rather than proportional to the number of changes. The coordination key algorithm above multiplexes changes so that, in a given pass through a coordination set, multiple changes are considered by the agents at once. We hypothesized that in some problem instances the algorithm would fail to find an optimal solution but that in most problem instances it would perform well. To test this hypothesis we created a centralized global scheduler that creates schedules for all of the agent teams via exhaustive search.
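As a sketch of what such an oracle does (the actual algorithm is outlined in Figure 8, discussed below, and is not reproduced here), the skeleton that follows enumerates orderings of all known repairs and bins each feasible schedule by the number of deadlines missed. The schedule builder and objective are left as parameters because their details depend on the trial encoding; the names are ours.

# Skeleton of an exhaustive centralized oracle (our sketch). It
# enumerates orderings of all known repairs and bins each feasible
# schedule by the number of deadlines it misses.
from itertools import permutations
from collections import Counter

def solution_histogram(repairs, build_schedule, deadlines_missed):
    """Return {k: number of feasible schedules missing k deadlines}.
    Exponential in len(repairs); tractable only for small instances."""
    hist = Counter()
    for order in permutations(repairs):
        schedule = build_schedule(order)  # place repairs at their earliest
                                          # consistent times, or return None
        if schedule is not None:
            hist[deadlines_missed(schedule)] += 1
    return hist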
The centralized scheduling problem is exponential; however, for instances having fewer than 11 total repairs the exhaustive scheduler is responsive enough for experimentation.7 Because the problem instance presented here uses a subset of TÆMS features, the centralized scheduler is designed to solve a representation of exactly the subset needed, i.e., it does not perform detailed TÆMS reasoning but instead maintains the required constraints (e.g., deadlines, earliest start times, service teams can only service one aircraft at a time, and only one service team can work in a cockpit or the wing region at a given point in time). The centralized scheduler algorithm is outlined in Figure 8.

The function of the centralized scheduler is twofold. First, it determines the minimum number of aircraft deadlines that will be missed by an optimal solution. In some cases all deadlines can be met; in others, aircraft deadlines represent unsatisfiable constraints. The second role of the centralized scheduler is to determine the relative sizes of the different solution spaces. For instance, for a given problem there may be zero solutions that miss no deadlines, X (optimal) solutions that miss one aircraft deadline, Y solutions that miss two aircraft deadlines, Z solutions that miss three aircraft deadlines, etc. By tabulating this information we can determine a percentile ranking for the solutions produced by the distributed coordination key algorithm.

The centralized scheduler does not compete with the distributed coordination key algorithm on a completely level playing field. The centralized scheduler sees, at time 0, all the repairs that will be needed for all planes of a given problem instance. The agents in the distributed system only see repairs as the aircraft land. Thus, for the instance shown in Figure 8, the service team agents will not see aircraft A10.1 until time 25 (when it lands). At this time they may be committed to a suboptimal
course of action that the centralized omniscient scheduler will avoid, because it can see A10.1's repairs at time 0 along with all of the other repairs that will need to be scheduled. This difference is due to a need to keep the centralized scheduler development costs down and has its roots in design/implementation issues with the simulation environment. A related bias in favor of the centralized scheduler is that the distributed coordination mechanisms operate in the same simulated clock as the repairs themselves. This enables the simulation environment to control and measure coordination costs but causes a skew in terms of the apparent cost of coordination relative to domain tasks; e.g., in some cases the ten clicks (about 5 seconds in wall clock time) that the agents require to coordinate will take as much simulation time as it takes the service teams to rearm one missile on an aircraft. The skew is of primary relevance when comparing the distributed algorithm to the centralized scheduler and is less of an issue when comparing different distributed algorithms.

Table 2 presents the results of comparing the coordination key algorithm to the optimal, exhaustive centralized scheduler. Each row is the statistical aggregation of one set of trials, where each set of trials is drawn from one difficulty class. The rows lower in the table represent increasingly difficult problem instances – aircraft having more repairs and tighter deadlines relative to their landing times and the time required for their repairs.8 All rows except the last represent 32 random trials; row D contains 28 because the exhaustive scheduler occasionally threw an exception after running out of RAM. As the difficulty increases, note that the density of the solution space increases and shifts right. This is represented by the columns X=0, X=1, ..., which contain the mean number of solutions produced by the oracle that miss 0 deadlines, 1 deadline, etc., respectively. As the problem instances get harder, more aircraft are likely to miss deadlines.

Note that the coordination key algorithm generally performs well for all of the tested conditions. The Mean value denotes the average number of aircraft deadlines missed during a batch of trials. The more descriptive statistics are those about the percentile ranking of the solutions generated by coordination keys. This is because how well the keys algorithm performs is determined not by the absolute number of missed deadlines (the average of which is presented in the Mean column) but instead by the solutions possible for a given trial. For instance, in some trials the best solution possible may miss two deadlines. As the difficulty increases, the mean value for the keys algorithm increases because there are more instances where the optimal solution is to miss one deadline, or two deadlines, etc. Looking at the percentiles, in experiment class A the keys algorithm performed in the 100th percentile, in experiment class B the 98th percentile, in experiment class C the 97th percentile, and in class D (the most difficult class), the 98th percentile. The percentile rating is computed as follows:
The centralized scheduler generates all of the unique schedules that exist for a given individual trial. These schedules are binned according to the number of deadlines missed, e.g., in X of the schedules 0 aircraft miss a deadline, in Y of the schedules 1 aircraft misses a deadline, in Z of the schedules 2 aircraft miss a deadline, etc. Think of the centralized scheduler as producing a histogram of possible solutions, where solutions are binned by the number of deadlines missed.

Let m_i be the number of aircraft deadlines missed by the coordination key algorithm in trial i. Let b(m_i) denote the histogram bin in which m_i falls (the bin that pertains to m_i missed deadlines). Let Density-at-or-above(b(m_i)) be the sum of the densities of solutions that are in bins >= b(m_i); bins > b(m_i) represent solutions that are worse because they entail missing more deadlines. Let

P_i = Density-at-or-above(b(m_i)) / T_i

where T_i is the total number of solutions generated by the centralized scheduler for trial i. P_i is the percentile ranking for the coordination key algorithm for trial i of the set of 32. Let Overall_Percentile_Ranking = (1/32) * sum over i of P_i be the overall percentile ranking for one batch of 32 trials.

In all cases the median percentile is 100% and the standard deviation is low. Because there are generally multiple solutions that perform as well as the solutions actually generated by the coordination keys, its percentile is broken down in the last three columns of Table 2. The column marked %-tile Same indicates the mean percentage of possible solutions that miss exactly as many deadlines as the keys algorithm did. %-tile Better indicates the number that performed strictly better (missing fewer aircraft deadlines), and %-tile Worse indicates the number that performed strictly worse. Note that as the problem space gets harder, the number of possible solutions that are worse than those found by the keys algorithm increases. At the same time, the band of solutions as good as those generated by keys narrows, as does the band of solutions that are strictly better than those found by the keys algorithm.
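In code, the per-trial computation is a direct transcription of the definitions above, where hist[k] is the number of oracle schedules that miss exactly k deadlines (the example numbers are illustrative only):

def percentile_ranking(hist: dict, keys_missed: int) -> float:
    """Fraction of oracle solutions at-or-worse than the keys result:
    solutions in bins with >= keys_missed missed deadlines, over all
    solutions generated for the trial."""
    total = sum(hist.values())
    at_or_worse = sum(n for k, n in hist.items() if k >= keys_missed)
    return at_or_worse / total

# Illustrative trial: no solution meets every deadline, 40 solutions
# miss one deadline, 160 miss two. If keys misses one it is optimal:
print(percentile_ranking({0: 0, 1: 40, 2: 160}, 1))  # 1.0 (100th percentile)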
While the data suggest that the algorithm performs well on average, there are circumstances where it performs less well. We examined several such instances in detail, and while we have intuitions about when the algorithm will perform in a suboptimal fashion, the experiments in which performance is suboptimal pertain to a more basic issue. To illustrate, let us assume a three-aircraft problem instance with the following characteristics:

An F16 arrives at time 15 with a deadline, or take-off time, of 400 and requires repair of engines damaged to level 2 (the duration of this repair is 100).

An A10 arrives at time 18 with a deadline of 450 and requires complete refueling (the duration of this task is 100).

A C9 arrives at time 24 with a deadline of 240 and requires repair of engines damaged to level 2 (the duration of this repair is 100) and refueling of a quarter tank (the duration of this task is 25).

The F16 lands at time 15, and the engine service team obtains the coordination key and schedules the engine repair of the F16 to run from time 17 to 117. The A10 lands at time 18, and at time 19 the refuel team gets the coordination key and schedules refueling of the A10 to last from 19 to 119. When the C9 lands at time 24, the engine service team is thus occupied with the F16 until time 117 and the refueling team is occupied with the A10 until time 119. To respond to the C9's landing and repair needs, the engine service team obtains the coordination key at time 25 and schedules the C9's repair to run from time 117 to time 217. At a subsequent time step, the refueling team attempts to schedule the C9's refueling; however, because refueling and engine repair are mutually exclusive tasks, the earliest time the refueling team can schedule the C9 is time 217. This makes it impossible to service the C9 by its deadline (takeoff time) of 240. In response to this pending failure, the refuel service team attempts to negotiate with the engine service team via the coordination key to obtain a wing access slot between 119 and 217. However, the engine service team needs that time slot to complete its portion of the C9's engine repairs on time. The end result is that the C9's deadline cannot be met. For this same problem instance, however, the centralized scheduler was able to produce a solution in which all of the deadlines are met.

The underlying issue is that service activities are not interruptible in this problem instance – otherwise repair teams could run from aircraft to aircraft and the optimization problem would be much simpler. If activities were interruptible, when the C9 first landed either the engine service team or the refuel service team could disengage from its current activity (servicing the F16 or the A10) and attend to the C9, the aircraft with the tightest deadline. The reason the centralized scheduler is able to produce a better solution in this problem instance – a solution which eludes the distributed coordination approach – is that the centralized oracle sees all of the repair tasks a priori. It thus considers the possibility of not servicing the F16 or A10
immediately upon arrival, so that the C9 can be serviced by the engine or refueling teams immediately upon its arrival and all deadlines can be met. This particular performance issue derives from the somewhat imbalanced playing field (discussed earlier) between the distributed algorithm and the centralized oracle. Interestingly, we can hypothesize two instances where the distributed algorithm will fail to perform well even on a level playing field, though such instances occur infrequently in randomly generated problem instances – even those with tight deadline constraints and numerous repairs per aircraft.9

One instance where the coordination key algorithm will perform less well entails semi-independent coordination problems that occur simultaneously in a coordination set of more than two agents. Imagine a coordination set of the rearm, refuel, and engine repair agents. Let the key pass from agent to agent in the following order: rearm to refuel to engine (then the cycle repeats). Now, let us assume that at some time the rearm agent needs a time slot that is held by the engine agent, and that refuel needs a time slot that is held by the rearm agent. The implication is that multiple unrelated proposals must reside on one key for part of the coordination set traversal, i.e., the proposal from rearm to engine and the proposal from refuel to rearm both reside on the key during the refuel-to-engine-to-rearm circuit. The key algorithm is designed with the assumption that, in general, multiple proposals will pertain to a single (sometimes multi-step) coordination process. Therefore, when the engine agent receives the coordination key it either accepts or rejects the set of current proposals (from the rearm and refuel agents) en masse, even though it may only be affected by the rearm agent's proposal. In this case, when the set of proposals arrives and the engine agent determines that it cannot satisfy the rearm agent's request, it rejects the proposals en masse, and the proposal from refuel to rearm is never evaluated by the rearm agent. This may result in a missed opportunity for the refuel agent. The shortcoming described here can be fixed by making the agents more selective in proposal rejection.

Another instance where the coordination key algorithm may perform less well is when a long chain of multi-step interlocking resource releases is required. The factor at work is the algorithm's approximate limited-cycle-to-action model. However, as noted, neither class of problems occurs frequently in random instances. We are currently developing a generator and experiments to test performance under these circumstances.
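The suggested fix amounts to scoping each accept/reject decision to the proposals that actually involve the deciding agent. A sketch, continuing the earlier key sketch (field and method names are again illustrative, and can_grant is an assumed hook):

# Sketch of selective (per-proposal) rejection: the holder answers only
# proposals directed at it, leaving unrelated proposals on the key for
# their intended recipients instead of rejecting the set en masse.
def evaluate_proposals_selectively(agent, key):
    for proposal in key.proposals:
        if proposal.target != agent.name:
            continue  # not ours to judge; leave it on the key
        verdict = "confirm" if agent.can_grant(proposal) else "negate"
        key.responses[proposal.id] = verdict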
4. Pushing from Centralized to Distributed and Vice Versa
We have examined two different approaches to coordinating the interactions of multiple agents. In the case of the response coordinator in I.L.S.A., a many-to-one response collection, aggregation, and dispatch approach was used.
In the case of the aircraft service teams in the DAR application, a distributed approach with shared knowledge between the distributed agents was used. The motivation for each approach is stated in its respective section. An interesting issue to consider is what would happen if one were to push each approach toward one extreme, fully centralized, or the other, fully distributed.

With respect to the I.L.S.A. application – response coordination could be decentralized. However, the interesting property that appeared in the repair team coordination application, namely the degree of interconnectivity, also appears in I.L.S.A. response coordination. If response coordination in I.L.S.A. were distributed, there would need to be multiple different coordination sets, where each set related to coordinating the responses intended for a given recipient (caregivers or Lois). In the aircraft service application, coordination centers on shared spatial regions of a given plane – all tasks of a given set of service teams potentially always interact. For responses being routed to the same recipient this is also true, though the degree of interaction is not "potential" but fixed by the number of responses in question. Once a set of coordination sets is defined in I.L.S.A., algorithms such as the keys algorithm could be used. However, the formation of the coordination sets themselves requires online reasoning, because the recipient list can be changed dynamically in I.L.S.A. Thus whenever a change is made to the response contact list, the coordination modules of each of the agents would need to be notified and then engage in a protocol to form the necessary coordination sets (or another single agent could do the reasoning for them and distribute the results). Another interesting wrinkle of I.L.S.A. is that the response notification protocols are different for different types of events. For instance, an alert must be distributed to numerous contact personnel, whereas a notification must be sent to, and acknowledged by, a single caregiver. However, if a caregiver does not acknowledge a notification, then a different caregiver must be notified. This implies that the coordination sets in I.L.S.A. must either, by default, include all possible response candidates, or that when a fallback caregiver is being notified, the agents dynamically form a new coordination set (or the issuer of the response joins an existing one).

While the keys algorithm used in the aircraft service team application is distributed, the shared knowledge and synchronous coordination within a given coordination set have commonalities with a centralized approach. If one were to push the aircraft service team application to a centralized approach, there could still be different coordination sets, and a single agent within a coordination set could be assigned the task of doing the coordination reasoning for that set. This, of course, has the undesirable properties of centralization – including introducing a potential computational bottleneck, depending on the scheduling approach used (optimal versus approximate or heuristic). Pushing keys the other direction, into a more fully distributed mode where each agent
coordinates in a peer-to-peer fashion, could lead to system-wide thrashing over the interconnected issues that were being coordinated over individually. Note, however, that we have not experimented directly with this, and the observation is based on high-level analysis and experience.
5. Future Work and Limitations
With respect to I.L.S.A., coordination currently centers on shared resources, i.e., the interface to the caregivers and to Lois. In the future it may be necessary to coordinate the activities of the other agents in the system. For instance, if the computational tasks being performed by the agents become heavyweight enough that they overburden the processor, the performance of certain tasks may have to be coordinated to avoid unacceptable system slowdowns. Another example is if the sensor interpretation agents were to perform certain analysis tasks only if the output were needed by one or more of the agents in the system. In the current model, the agents within the system do not explicitly reason about the interactions in their activities and do not coordinate over said interactions. If agents were to reason about such interactions, it is highly probable that distributed temporal task-centered coordination like that used in our other work, based on TÆMS/DTC/GPGP [17, 5, 4, 25, 10, 26], would be appropriate.

Another area in which coordination will play a role is when the I.L.S.A. system is deployed in a multi-unit care facility. In this setting, one I.L.S.A. system will take care of one client, but the pool of human caregivers will be common across the set of I.L.S.A. systems (or at least subsets may be common). The individual I.L.S.A. systems will thus need to coordinate over non-emergency tasks to optimize caregiver time and to prevent thrashing behavior (caregivers being overbooked, not having sufficient time with each client, etc.). I.L.S.A.-to-I.L.S.A. coordination is another area that, on the surface, appears to lend itself to a decentralized and distributed coordination approach like that used in TÆMS research, e.g., [4]. In general, centralization is appropriate for response coordination within an individual I.L.S.A. instance; applications with different characteristics and requirements, e.g., distributed supply chain management, mandate a distributed approach. The centralized approach has proven effective in deployment and, we believe, has decreased the number of current and potential deployment issues. Centralization should be considered an appropriate option for domains like automated caregiver systems, though distributed coordination technologies are required to address different (classes of) applications.

With respect to dynamic aircraft readiness, in the future we would like to compare the coordination key algorithm to a distributed pairwise (classic GPGP-style) algorithm. We believe the key-based approach will perform
better, but this is only conjecture at this point. Because the character of the spatial interactions in this problem differs from that typically modeled in TÆMS, the standard GPGP coordination techniques could not be employed without modifications to TÆMS. Due to time and resource constraints, and a desire to compare distributed coordination to a centralized optimal oracle, resources were directed toward the latter.

Heretofore unmentioned is the possibility of creating an efficient optimal centralized scheduler for the service teams. As presented here, the task space appears sufficiently constrained to lend itself to centralized approaches that do not require exhaustive search. This is partly an artifact of the task space as framed for this application – we are using a small subset of TÆMS features, and the non-local-effects (NLEs, or task interactions) presented in this paper all involve mutual exclusion. In the general case, task interactions may affect each other's quality, cost, and duration (not just dictate mutual exclusion) and the element of choice is larger than presented here – these characteristics, combined with differing deadlines, often thwart typical non-exhaustive centralized scheduling methodologies. More important, however, is the motivation for distribution. Distribution enables incremental addition of repair teams, gives each team local autonomy (if the simulated teams were human, they could exercise their own judgment and the TÆMS technologies would coordinate with the humans' choices when the coordination recommendation is over-ridden), removes a central point of failure, and removes the computational and communication overhead that occurs with one centralized scheduling node. In the broader sense, centralization is often not possible or not desirable due to privacy concerns, the need to avoid a central point of failure, scalability issues, or the potential processing delay. In many instances it is also not required – consider the discrete spaces represented by the different coordination sets in this application, but imagine a network of 100,000 agents – not all of these would need to interact, coordinate, or even be aware of one another.
6. Acknowledgments
Both the I.L.S.A. project and the keys-based coordination work rely on prior art and the work of others. TÆMS and TÆMS agents have a long history, and we would like to acknowledge the many other researchers who have contributed to their growth and evolution – among them Victor Lesser, Keith Decker, Alan Garvey, Tom Wagner, Bryan Horling, Regis Vincent, Ping Xuan, Shelley XQ. Zhang, Anita Raja, Roger Mailler, and Norman Carver. We would also like to acknowledge the support of Mr. John Beane of Honeywell on this project. The I.L.S.A. project is a large effort at Honeywell Laboratories, and many individuals have contributed to the intellectual underpinnings of the project. Some of the contributors are: Chris Miller,
Karen Haigh, David Toms, Wendy Dewing, Steve Harp, Christopher Geib, John Phelps, Peggy Wu, Valerie Guralnik, Ryan Van Riper, Thomas Wagner, Steven Hickman, and others. The I.L.S.A. project was performed under the support of the U.S. Department of Commerce, National Institute of Standards and Technology, Advanced Technology Program, Cooperative Agreement Number 70NANBOH3020. The aircraft service work was sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Office of Naval Research under agreement number N00014-02-C-0262, and by Honeywell International under project number I10105BB4. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Defense Advanced Research Projects Agency (DARPA), the Office of Naval Research, the National Institute of Standards and Technology, the U.S. Government, or Honeywell International.
Notes

1. For application domains in which the activities of the agents are mostly independent, or the actions carried out by the agents are particularly lightweight (so the implications of an intersection are slight, e.g., resources are unlimited), explicit coordination may not be necessary.

2. Beyond the scope of this paper is the use of quality for commitment satisfaction in this application and its role in the control heuristics.

3. Note that this is a small subset of the possible items needing service or inspection – this simplification is along a dimension that does not greatly impact the utility of the coordination algorithm.

4. An indirect interaction occurs when the problem instance contains deadlines that cannot be met. In such cases both wing and cockpit agents should forgo work on selected planes in order to avoid having an entire fleet of aircraft that are partially complete, none of which are ready for their next mission. This interaction is dealt with using value for commitment satisfaction; algorithms and experiments pertaining to that topic must be presented separately due to space limitations.

5. Note that this would also apply to other agent sets if the problem were expanded.

6. Comparison between the new key algorithm and a pairwise technique is discussed in the conclusion.

7. The centralized scheduler requires on the order of 10 minutes to schedule 11 repairs on a dual-processor Xeon 2GHz Linux workstation. A problem instance of that size will generate approximately 241,920 schedules, some subset of which are unique.

8. The seven trial parameters are: (1) land time, (2) takeoff time deadline, (3) level of avionics damage, (4) level of weapons controls damage, (5) level of engines damage, (6) rearm level, and (7) refuel level.

9. If the repairs are spread over a large number of aircraft there is little spatial resource contention and the service teams can essentially function in parallel.
References

[1] B. G. Celler, W. Earnshaw, E. D. Ilsar, L. Betbeder-Matibet, M. F. Harris, R. Clark, T. Hesketh, and N. H. Lovell. Remote monitoring of health status of the elderly at home, a multidisciplinary project on aging at the University of New South Wales. International Journal of Bio-Medical Computing, 40:147–155, 1995.

[2] M. Chan, C. Hariton, P. Ringeard, and E. Campo. Smart house automation system for the elderly and the disabled. In IEEE International Conference on Systems, Man and Cybernetics, pages 1586–1589, 1995.

[3] Sandeep Chatterjee. SANI: A seamless and non-intrusive framework and agent for creating intelligent interactive homes. In Proceedings of the Second International Conference on Autonomous Agents, pages 436–440, 1998.

[4] Keith Decker and Jinjiang Li. Coordinated hospital patient scheduling. In Proceedings of the Third International Conference on Multi-Agent Systems (ICMAS98), pages 104–111, 1998.
[5] Keith S. Decker. Environment Centered Analysis and Design of Coordination Mechanisms. PhD thesis, University of Massachusetts, 1995.
[6] Anind K. Dey, Daniel Salber, and Gregory D. Abowd. A context-based infrastructure for smart environments. In Proceedings of the 1st International Workshop on Managing Interactions in Smart Environments (MANSE ’99), pages 114–128, 1999.
[7] Christopher W. Geib and Robert Goldman. Probabilistic Plan Recognition for Hostile Agents. In Proceedings of the FLAIRS 2001 Conference, 2001.
[8] Anthony P. Glascock and David M. Kutzik. Behavioral telemedicine: A new approach to the continuous nonintrusive monitoring of activities of daily living. Telemedicine Journal, 6(1):33–44, 2000.
[9] K.Z. Haigh, J. Phelps, and C. Geib. An open agent architecture for assisting elder independence. In Proceedings of the First Intl. Joint Conference on Autonomous Agents and Multi-Agent Systems, 2002. To appear.
[10] Bryan Horling, Roger Mailler, Jiaying Shen, Regis Vincent, and Victor Lesser. Using autonomy, organizational design and negotiation in a distributed sensor network. 2003. Book chapter. To appear.
[11] Bryan Horling, Regis Vincent, Roger Mailler, Jiaying Shen, Raphen Becker, Kyle Rawlins, and Victor Lesser. Distributed sensor network for real-time tracking. In Proceedings of Autonomous Agent 2001, 2001.
[12] IEEE. 802.5: Token Ring Access Method. IEEE, New York, NY, 1985.

[13] Henry Kautz, Larry Arnstein, Gaetano Borriello, Oren Etzioni, and Dieter Fox. An overview of the assisted cognition project. In Proceedings of the AAAI Workshop "Automation as Caregiver", 2002. To appear.
[14] Jaana Leikas, Juhani Salo, and Risto Poramo. Security alarm system supports independent living of demented persons. Proceedings of Gerontechnology Second International Conference, pages 402–405, 1998.
[15] Victor Lesser, Michael Atighetchi, Bryan Horling, Brett Benyo, Anita Raja, Regis Vincent, Thomas Wagner, Ping Xuan, and Shelley XQ. Zhang. A Multi-Agent System for Intelligent Environment Control. In Proceedings of the Third International Conference on Autonomous Agents (Agents99), 1999.
[16] Victor Lesser, Keith Decker, Thomas Wagner, Norman Carver, Alan Garvey, Daniel Neiman, and Nagendra Prasad. Evolution of the GPGP Domain-Independent Coordination Framework. Computer Science Technical Report TR-98-05, University of Massachusetts at Amherst, January 1998. A longer version of this paper will appear in the Journal of Autonomous Agents and Multi-Agent Systems in 2003.
[17] Victor Lesser, Bryan Horling, et al. The TÆMS whitepaper / evolving specification. http://mas.cs.umass.edu/research/taems/white.

[18] Victor Lesser, Bryan Horling, Frank Klassner, Anita Raja, Thomas Wagner, and Shelley XQ. Zhang. BIG: An agent for resource-bounded information gathering and decision making. Artificial Intelligence, 118(1–2):197–244, May 2000. Elsevier Science Publishing.

[19] Anita Raja, Victor Lesser, and Thomas Wagner. Toward Robust Agent Control in Open Environments. In Proceedings of the Fourth International Conference on Autonomous Agents (Agents2000), 2000.

[20] Paul Scerri, David Pynadath, Lewis Johnson, Paul Rosenbloom, Mei Si, Nathan Schurr, and Milind Tambe. A Prototype Infrastructure for Distributed Robot-Agent-Person Teams. In Proceedings of the 2nd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS2003), 2003.

[21] A. J. Sixsmith. An evaluation of an intelligent home monitoring system. Journal of Telemedicine and Telecare, 6:63–72, 2000.

[22] T. Tamura, T. Togawa, and M. Murata. A bed temperature monitoring system for assessing body movement during sleep. Clinical Physics and Physiological Measurement, 9:139–145, 1988.

[23] Andrew S. Tanenbaum. Computer Networks. Prentice Hall, New Jersey, 1996.

[24] Regis Vincent, Bryan Horling, and Victor Lesser. An agent infrastructure to evaluate multi-agent systems: The Java Agent Framework and Multi-Agent System Simulator. In Thomas Wagner and Omer Rana, editors, Infrastructure for Agents, Multi-Agent Systems, and Scalable Multi-Agent Systems, Lecture Notes in AI, pages 102–127. Springer, 2001.

[25] Thomas Wagner, Alan Garvey, and Victor Lesser. Criteria-Directed Heuristic Task Scheduling. International Journal of Approximate Reasoning, Special Issue on Scheduling, 19(1–2):91–118, 1998. A version is also available as UMASS CS TR-97-59.

[26] Thomas Wagner, Valerie Guralnik, and John Phelps. Software agents: Enabling dynamic supply chain management for a build-to-order product line. In Proceedings of Agents for Business Automation, 2002. A version also appears in the Proceedings of the AAAI02 Workshop on Agents for Business Automation.

[27] Thomas Wagner and Victor Lesser. Design-to-Criteria Scheduling: Real-Time Agent Control. In Wagner and Rana, editors, Infrastructure for Agents, Multi-Agent Systems, and Scalable Multi-Agent Systems, LNCS. Springer-Verlag, 2001. Also appears in the 2000 AAAI Spring Symposium on Real-Time Systems; a version is available as University of Massachusetts Computer Science Technical Report TR-99-58.
A COMPLEX SYSTEMS PERSPECTIVE ON COLLABORATIVE DESIGN
Mark Klein
Hiroki Sayama
Massachusetts Institute of Technology Cambridge MA USA
University of Electro-Communications Tokyo Japan
Peyman Faratin
Yaneer Bar-Yam
Massachusetts Institute of Technology Cambridge MA USA
New England Complex Systems Institute Cambridge MA USA
1. THE CHALLENGE: COLLABORATIVE DESIGN DYNAMICS

Almost all complex artifacts nowadays, including physical artifacts such as airplanes as well as informational artifacts such as software, organizational designs, plans and schedules, are created via the interaction of many, sometimes thousands, of participants working on different elements of the design. This collaborative design process is challenging because strong interdependencies between design decisions make it difficult to converge on a single design that satisfies these dependencies and is acceptable to all participants. Current collaborative design approaches are as a result typically characterized by heavy reliance on expensive and time-consuming processes, poor incorporation of some important design concerns (typically later life-cycle issues such as environmental impact), and reduced creativity due to the tendency to incrementally modify known successful designs rather than explore radically different and potentially superior ones.

Complex systems research is devoted to understanding, at a fundamental level, the dynamics of systems made up of interdependent components, and
has, we argue, much to offer to our understanding of the dynamics of collaborative design. Previous research on design dynamics has focused on routine design [1], where the design space is well understood (e.g., as in brake or transmission design) and the goal is to optimize a design via incremental changes for requirements similar to those that have been encountered many times before [2] [3]. Rapid technological and other changes have made it increasingly clear, however, that many of the most important collaborative design problems (e.g., concerning software, biotechnology, or electronic commerce) involve innovative design, radically new requirements, and unfamiliar design spaces. In this paper we will explore some of what complex systems research can contribute to this important challenge. We will begin by defining a simple model of collaborative design, review the strengths and weaknesses of current collaborative design approaches, and discuss some of the insights a complex systems perspective has to offer concerning why collaborative design is difficult and what we can do to help.
2. DEFINING COLLABORATIVE DESIGN

A design (of physical artifacts such as cars and planes as well as behavioral ones such as plans, schedules, production processes or software) can be represented as a set of issues (sometimes also known as parameters), each with a unique value. A complete design for an artifact includes issues that capture the requirements for the artifact, the specification of the artifact itself (e.g., the geometry and materials), the process for creating the artifact (e.g., the manufacturing process), and so on through the artifact's entire life cycle. If we imagine that the possible values for every issue are each laid along their own orthogonal axis, then the resulting multi-dimensional space can be called the design space, wherein every point represents a distinct (though not necessarily good or even physically possible) design. The choices for each design issue are typically highly interdependent. Typical sources of inter-dependency include shared resource (e.g., weight, cost) limits, geometric fit, spatial separation requirements, I/O interface conventions, timing constraints, etc.
Collaborative design is performed by multiple participants (representing individuals, teams or even entire organizations), each potentially capable of proposing values for design issues and/or evaluating these choices from their own particular perspective (e.g., manufacturability). Figure 1 illustrates this model: the small black circles represent design issues, the links between the issues represent design issue inter-dependencies, and the large ovals represent the design subspace (i.e., subset of design issues) associated with each design participant. In a large artifact like a commercial jet there may be millions of components and design issues, and hundreds to thousands of participants, working on hundreds of distinct design subspaces, all collaborating to produce a complete design. Some designs are better than others. We can in principle assign a utility value to each design and thereby define a utility function that represents the utility for every point in the design space (though in practice we may only be able to assess comparative as opposed to absolute utility values). A simple utility function might look like the following:
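The figure that followed at this point in the original is not reproduced here. As an illustrative stand-in only – its exact content is not recoverable from this text – here is one concrete utility over a single design issue, with multiple local optima of the kind discussed later; the functional form is arbitrary:

# Illustrative stand-in for the utility-function figure: a rugged
# utility over a single design issue x in [0, 1].
import math

def utility(x: float) -> float:
    return math.sin(12 * x) * math.exp(-x) + 0.5 * x

xs = [i / 1000 for i in range(1001)]
best_x = max(xs, key=utility)  # the global optimum along this one issue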
The goal of the design process can thus be viewed as trying to find the design with the optimal (maximal) utility value, though often optimality is abandoned in favor of 'good enough'. The key challenge raised by the collaborative design of complex artifacts is that the design spaces are typically huge, and concurrent search by the many participants through the different design subspaces can be expensive and time-consuming, because design issue interdependencies lead to conflicts (when the design solutions for different subspaces are not consistent with each other). Such conflicts severely impact design utility and lead to the need for expensive and time-consuming design rework.
3. STRENGTHS AND LIMITATIONS OF CURRENT APPROACHES

Traditionally, collaborative design has been carried out using a serialized process, wherein, for example, a complete requirement set would be generated, then given to design engineers who would completely specify the product geometry, which in turn would then be given to the manufacturing engineers to create a manufacturing plan, and so on. The problem with this approach is that, if an earlier decision turns out to be sub-optimal from the perspective of someone making dependent decisions later in the design process (e.g., if a requirement is impossible to achieve, or a particular design geometry is very expensive to manufacture), the process of revising the design is slow and expensive, and often only the highest-priority changes are made. The result is designs that tend to be poor from the standpoint of later life-cycle perspectives, including for example environmental concerns such as recyclability that are becoming increasingly important.

More recently, several strategies have emerged for better accounting for the interdependencies among collaborative design participants. These include concurrent engineering and least-commitment design.

Concurrent engineering involves the creation of multi-functional design teams, including representatives of all important design perspectives, for each distinct design subspace. Design decisions can be reviewed by all affected design perspectives when they are initially being considered, so bad decisions can be caught and revised relatively quickly and cheaply. While this approach has proven superior in some ways to traditional serial design, it often imposes an overwhelming burden on engineers, who may have to attend many hours of design meetings and review hundreds of proposed changes per week [4].
Least-commitment design is a complementary approach that attempts to address the same challenges by allowing engineers to specify a design incompletely, for example as a rough sketch or set of alternatives, and then gradually make the design more specific, for example by pruning some alternatives [5] [6]. This has the advantage that bad design decisions can be eliminated before a lot of effort has been invested in making them fully specific, and engineers are not forced to make arbitrary commitments that lead to needless conflicts.

While the adoption of these approaches has been helpful, major challenges remain. Consider, for example, the Boeing 767-F redesign program [4]. Some conflicts were not detected until long (days to months) after they had occurred, resulting in wasted design time, design rework, and often even scrapped tools and parts. It was estimated that roughly half of the labor budget was consumed dealing with changes and rework, and that roughly 25-30% of design decisions had to be changed. Since maintaining scheduled commitments was a priority, design rework often had to be done on a short flow-time basis that typically cost much more (estimates ranged as high as 50 times more) and sometimes resulted in reduced product quality. Conflict cascades that required as many as 15 iterations to finally produce a consistent design were not uncommon for some kinds of design changes. All this occurred in the context of Boeing's industry-leading concurrent engineering practices.

The dynamics of current collaborative design processes are thus daunting, and have led to reduced design creativity – a tendency to incrementally modify known successful designs rather than explore radically different and potentially superior ones. Improving the efficiency, quality and creativity of the collaborative innovative design process requires, we believe, a much better understanding of the dynamics of such processes and how they can be managed. In the next section we review some of the key insights that can be derived from complex systems research for this purpose.
4. INSIGHTS FROM COMPLEX SYSTEMS RESEARCH

A central focus of complex systems research is the dynamics of distributed networks, i.e. networks in which there is no centralized controller, so global behavior emerges solely as a result of concurrent local actions. Such networks are typically modeled as multiple nodes, each node representing a state variable with a given value. Each node in a network tries to select the value that optimizes its own utility while maximizing its consistency with the influences from the other nodes. The global utility of the network state is simply the sum of local utilities plus the degree to which all the influences are
satisfied. The dynamics of such networks emerge as follows: since all nodes update their local state based on their current context (at time T), the choices they make may no longer be the best ones in the new context of node states (at time T+1), leading to the need for further changes.

Is this a useful model for understanding the dynamics of collaborative design? We believe that it is. It is straightforward to map the model of collaborative design presented above onto a network. We can map design participants onto nodes, where each participant tries to maximize the utility of the design subspace (i.e. subsystem) it is responsible for, while ensuring its decisions satisfy its dependencies (represented as the links between nodes) with other subsystems. As we shall see, to understand network dynamics, the links between nodes need only capture quite abstract properties of the dependencies. As a first approximation, it is reasonable to model the utility of a design as the local utility achieved by each participant plus a measure of how well all the decisions fit together. Even though real-world collaborative design clearly has top-down elements early in the process, the sheer complexity of many design artifacts means that eventually no one person is capable of keeping the whole design in his/her head and assessing/refining its global utility. Centralized control of the design decisions becomes impractical, so the design process is dominated perforce by concurrent subsystem design activities (performed within the nodes) done in parallel with checks for consistency among the subsystem designs (assessed by seeing to what extent inter-node influences are satisfied). We will assume, for the purposes of this paper, that individual designers are reasonably effective at optimizing their subsystem utilities.

How do such distributed networks behave? Let us consider the following simple example: a network consisting of binary-valued nodes where each node is influenced to have the same value as the nodes it is linked to, and all influences are equally strong (Figure 3):
Node A, for example, is influenced to have the same value as Node C, while Node C is influenced to have the same value as Nodes A, B and D. We assume that node values do not affect the overall utility, only the degree to which the inter-node influences are satisfied. We can imagine using this network to model a real-world situation wherein there are six subsystems being designed, with two equally optimal design options for each, and we want them to use matching interfaces. This network has reached a stable state, i.e. no single node change will result in an increase in the number of satisfied influences. If we change the value of Node A from 0 to 1, it will violate its one influence, so this change will not be made. If we change the value of Node C to 1, it will now satisfy the influence with Node D but violate two influences (with Nodes A and B), resulting in a net loss in the number of satisfied influences, so this change will not be made either. The analogous argument applies to all the other nodes in the network. As a result, the system will not converge on a global optimum (i.e. an ideal design where all the influences are satisfied), even though one does exist (where all nodes have the same value).

Generally speaking, networks may not always converge upon the global optimum, and in some cases (as we shall see with dynamic attractors), a network may not converge at all. Insights into whether and how global optima can be found in networks represent the heart of what complex systems research offers to the understanding of collaborative design. We will discuss these insights in the remainder of this section. The key factor determining network dynamics is the nature of the influences between nodes. We will first consider how such influences can be defined. We will then consider two important distinctions: whether the influences are linear or not, and whether they are symmetric or not. We will finally discuss subdivided network topologies, and the role of learning. Unless indicated otherwise, the material on complex systems presented below is drawn from [8].
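As a concrete illustration of the stable state just analyzed, the following minimal Python sketch simulates such a network. Since Figure 3 is not reproduced here, the topology is an assumption consistent with the text (Node C linked to A, B and D, with D additionally linked to E and F); the code is illustrative rather than drawn from the chapter.

```python
# Binary influence network: every link wants its two endpoints to be equal.
# Topology and starting state are assumptions consistent with the text.
EDGES = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("D", "F")]
state = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}  # the stable state

def satisfied(s):
    """Number of influences satisfied, i.e. links whose endpoints match."""
    return sum(s[a] == s[b] for a, b in EDGES)

def hill_climb_step(s):
    """Flip any single node whose flip increases the satisfied count."""
    for node in s:
        flipped = dict(s, **{node: 1 - s[node]})
        if satisfied(flipped) > satisfied(s):
            return flipped
    return s  # no improving single-node change exists

print(satisfied(state))                 # 4 of 5 influences satisfied
print(hill_climb_step(state) == state)  # True: the network is stuck
# The global optimum (all six nodes equal) satisfies all 5 influences,
# but no single-node change can reach it from this local optimum.
```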
4.1. How Are Influences Defined?

It is, in principle, straightforward to compute what the inter-node influences should be in order to create a network that implements a given global utility function. In design practice, however, we almost invariably do not know the global utility function up front; rather, it is revealed incrementally by the process of defining and evaluating different candidate designs. Utility evaluations are, in any case, apt to be approximate at best because of, among other things, uncertainties about the context in which the artifact will exist. Imagine for example that our goal is to design the most profitable airplane possible: so many imponderable factors heavily influence this (e.g. future oil
prices, wars, government subsidies for competitors) that the only way to really know the utility of a design is to build it and see what happens! It is usually much easier, as a result, to define the influences directly based on our knowledge of design decision dependencies. We know, for example, that parts need to have non-overlapping physical geometries, that electrical interfaces for connected systems must be compatible, that weight limits must be met, and so on.

Care must be taken in defining these influences, however. We face the risk of neglecting to give sufficient prominence to important concerns. Traditionally, influences from later stages of the life cycle (e.g. the manufacturing or recycling of the product) tend to be the ones most neglected, and the consequences are encountered only when that life-cycle stage has been reached, by which time it is typically much more difficult, time-consuming and expensive to do anything about them. Another concern is that, while there is always a direct mapping from a utility function to a set of influences, the opposite is not true. Asymmetric influences, in particular, do not have a corresponding utility function, and the network they define does not converge to any final result. This will be discussed further in the section on asymmetric networks below.
4.2. Linear vs. Non-Linear Networks

If each node's value is a linear function of the influences from the nodes linked to it, then the system is linear; otherwise it is non-linear. Linear networks have a single attractor, i.e. a single configuration of node states that the network converges towards no matter what the starting point, corresponding to the global optimum. Their utility function thus looks like that shown in Figure 2 above. This means we can use a 'hill-climbing' approach (where each node always moves directly towards increased local utility) because local utility increases always move the network towards the global optimum. Non-linear networks, by contrast, are characterized by having utility functions with multiple peaks (i.e. local optima) and multiple attractors, as in Figure 4:
A key property of non-linear networks is that the search for global optima cannot be performed successfully by pure hill-climbing algorithms, because they can get stuck in local optima that are globally sub-optimal. Consider, for example, what would happen if the system started searching anywhere in region A in Figure 4 above. Hill-climbing would take it to the top of one of the local optima in this region, all of which are substantially lower than some optima outside of region A.

One consequence of this reality is a tendency to stick near well-known designs. When a utility function has widely separated optima, once a satisfactory optimum is found the temptation is to stick to it. This design conservatism is exacerbated by the fact that it is often difficult to compare the utilities of radically different designs. We can expect this effect to be especially prevalent in industries, such as commercial airlines and power plants, which are capital-intensive and risk-averse, since in such contexts the cost of exploring new designs, and the risk of getting it wrong, can be prohibitive.

A range of techniques have emerged that are appropriate for finding optima in multi-optima utility functions, all relying on the ability to search past valleys in the utility function. Stochastic approaches such as simulated annealing have proven quite effective. Simulated annealing endows the search procedure with a tolerance for moving in the direction of lower utility that
varies as a function of a virtual 'temperature'. At first the temperature is high, so the system is as apt to move towards lower utilities as higher ones. This allows it to range widely over the utility function and possibly find new higher peaks. Since higher peaks are also wider ones, the system will tend to spend most of its time in the region of high peaks. Over time the temperature decreases, so the algorithm increasingly tends towards pure hill-climbing. While this technique is not provably optimal, it has been shown to get close to optimal results in most cases.

Annealing, however, runs into a dilemma when applied to systems with multiple actors. Let us assume that at least some actors are self-interested 'hill-climbers', concerned only with directly maximizing their local utilities, while others are 'annealers', willing to accept, at least temporarily, lower local utilities in order to increase the utility in other nodes. Simulation reveals that while the presence of annealers always increases global utility, annealers always fare individually worse than hill-climbers when both are present [9]. The result is that globally beneficial behavior is not individually incentivized.

How do these insights apply to collaborative design? Linear networks represent a special case, so we would expect most collaborative design contexts to be non-linear. There is a particular class of collaborative design, however, that has been successfully modeled as linear networks: routine design [1]. Routine design involves highly familiar requirements and design options, as for example in automobile brake or transmission design. Designers can usually start the design process near enough to the final optimum, as a result, to be able to model the design space as having a single attractor. Linear network models of collaborative design have generated many useful results, including approaches for identifying design process bottlenecks [2] and for fine-tuning the lead times for design subtasks [3] in routine design domains.

As we argued above, however, today's most challenging and important collaborative design problems are not instances of routine design. The requirements and design options for such innovative design challenges are typically relatively unfamiliar, and it is unclear as a result where to start in order to achieve a given set of requirements. There may be multiple very different good solutions, and the best solution may be radically different from any that have been tried before. For such cases non-linear networks seem to represent a more accurate model of the collaborative design process. This has important consequences. Simply instructing each design participant to optimize its own design subspace as much as possible (i.e. 'hill-climbing') can lead to the design process getting stuck in local optima that may be significantly worse than radically different alternatives. Design participants must be willing to explore alternatives that, at least initially, may appear much worse from their individual perspective than alternatives currently on the table. Designers often show greater loyalty to producing a
good design for the subsystem they are responsible for than to making concessions that ease someone else's job, so we need to find solutions for the dilemma identified above concerning the lack of individual incentives for such globally helpful behavior. We will discuss possible solutions in the section below on "How We Can Help".
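Continuing the toy network from Section 4, the sketch below illustrates the annealing strategy discussed above. The Metropolis-style acceptance rule and geometric cooling schedule are standard textbook choices, assumed here for illustration; the chapter itself does not prescribe a particular schedule.

```python
import math
import random

EDGES = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("D", "F")]

def satisfied(s):
    return sum(s[a] == s[b] for a, b in EDGES)

def anneal(s, steps=5000, temp=2.0, cooling=0.999):
    """Single-node flips with a temperature-dependent tolerance for
    utility-lowering moves; cools towards pure hill-climbing."""
    for _ in range(steps):
        node = random.choice(list(s))
        flipped = dict(s, **{node: 1 - s[node]})
        delta = satisfied(flipped) - satisfied(s)  # change in utility
        # Accept improvements always; accept worsening moves with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or random.random() < math.exp(delta / temp):
            s = flipped
        temp *= cooling
    return s

start = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}  # local optimum
final = anneal(dict(start))
print(satisfied(final))  # typically 5: annealing escapes the local optimum
```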
4.3. Symmetric vs. Asymmetric Networks

Symmetric networks are ones in which influences between nodes are mutual (i.e. if node A influences node B by amount X then the reverse is also true), while asymmetric networks do not have this property. Asymmetric networks (with an exception to be discussed below) add the complication of dynamic attractors, which means that the network does not converge on a single configuration of node states but rather cycles indefinitely around a relatively small set of configurations. Let us consider the simplest possible asymmetric network: the 'odd loop' (Figure 5):
This network has two links: one which influences the nodes to have the same value, the other which influences them to have opposite values. Imagine we start with node A having the value 1. This will influence node B to have the value –1, which will in turn influence node A towards the value –1, which will in turn cause node B to flip values again, and so on ad infinitum. If we plot the state space that results we get the following simple dynamic attractor (Figure 6):
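The cycling is easy to reproduce in code. In this sketch (an illustration, not the chapter's code), the A-to-B link pushes B to the opposite of A's value, the B-to-A link pushes A to B's value, and the two nodes are updated in turn:

```python
# Two-node 'odd loop': asymmetric influences with no corresponding utility
# function, so the state cycles forever instead of converging.
a, b = 1, 1
history = []
for _ in range(4):      # two full trips around the attractor
    b = -a              # B obeys the 'opposite' influence from A
    history.append((a, b))
    a = b               # A obeys the 'same' influence from B
    history.append((a, b))

print(history)
# [(1, -1), (-1, -1), (-1, 1), (1, 1), (1, -1), (-1, -1), (-1, 1), (1, 1)]
# -- a dynamic attractor: the only 'solution' is to arbitrarily pick a point.
```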
More complicated asymmetric networks will produce dynamic attractors with more complicated shapes, but the upshot is the same: the only way to get a definite solution (i.e. configuration of node states) with a dynamic attractor is to arbitrarily pick one point along its length. There is one important special case, however: feed-forward networks. The influences in feed-forward networks are acyclic, which means that a node is never able to directly or indirectly influence its own value (there are, in other words, no loops). Feed-forward networks do not have dynamic attractors.

How does this apply in collaborative design settings? Traditional serialized collaborative design is an example of an asymmetric feed-forward network, since the influences all flow uni-directionally from the earlier product life cycle stages (e.g. design) to later ones (e.g. manufacturing) with only weak feedback loops, if any. In such settings we may not expect particularly good designs, but the attractors should be static and convergence should always occur, given sufficient time. 'Pure' concurrent engineering, where all design disciplines are represented on multi-functional design teams, encourages roughly symmetric influences between the participants and thus can also be expected to have convergent dynamics with static attractors. Current collaborative design practice, however, is a hybrid of these two approaches, and thus is likely to have the combination of asymmetric influences and influence loops that produces dynamic attractors and therefore non-convergent dynamics. Dynamic attractors were found not to have a significant effect on the dynamics of at least some routine (linear) collaborative design contexts [3], but may prove more significant in innovative (non-linear) collaborative design. It may help explain, for example, why it sometimes takes so many iterations to account for all changes in complex designs [4].
4.4. Subdivided Networks

Another important property of networks is whether or not they are subdivided, i.e. whether they consist of sparsely interconnected 'clumps' of highly interconnected nodes, as for example in Figure 7:
When a network is subdivided, node state changes can occur within a given clump with only minor effects on the other clumps. This has the effect of allowing the network to explore more states more rapidly. Rather than having to wait for an entire large network to converge, we can rely instead on the much quicker convergence of a number of smaller networks, each one exploring possibilities that can be placed in differing combinations with the possibilities explored by the other sub-networks.

This effect is in fact widely exploited in design communities, where it is often known as modularization. This involves intentionally creating subdivided networks by dividing the design into subsystems with pre-defined standardized interfaces, so subsystem changes can be made with few if any consequences for the design of the other subsystems. The key to using this approach successfully is defining the design decomposition such that the impact of the subsystem interdependencies on the global utility is relatively low, because the standardized interfaces rarely represent an optimal way of satisfying these dependencies. In most commercial airplanes, for example, the engine and wing subsystems are designed separately, taking advantage of standardized engine mounts to allow the airplanes to use a range of different engines. This is not the optimal way of relating engines and wings, but it is good enough and simplifies the design process considerably. If the engine-wing interdependencies were crucial, for example if standard engine mounts had a drastically negative effect on the airplane's aerodynamics, then the design of these two subsystems would have to be coupled much more closely in order to produce a satisfactory design.
4.5. Imprinting

One common technique used to speed network convergence is imprinting, wherein the network influences are modified when a successful solution is
found in order to facilitate quickly finding (similar) good solutions next time. A common imprinting technique is reinforcement learning, wherein the links representing influences that are satisfied in a successful final configuration of the network are strengthened, and those representing violated influences weakened. The effect of this is to create fewer but higher optima in the utility function, thereby increasing the likelihood of hitting such optima next time.

Imprinting is a crucial part of collaborative design. The configuration of influences between design participants represents a kind of 'social' knowledge that is generally maintained in an implicit and distributed way within design organizations, in the form of individual designers' heuristics about who should talk to whom, when, about what. When this knowledge is lost, for example due to high personnel turnover in an engineering organization, the ability of that organization to do complex design projects is compromised. It should be noted, however, that imprinting reinforces the tendency we have already noted for organizations in non-linear design regimes to stick to tried-and-true designs, by virtue of making the previously found optima more prominent in the design utility function.
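A minimal sketch of this reinforcement-style imprinting follows; the learning rate and weight representation are illustrative assumptions, not values from the chapter.

```python
def imprint(weights, final_state, rate=0.1):
    """After a successful design, strengthen satisfied influences and
    weaken violated ones, making similar optima easier to find again."""
    for (a, b), w in weights.items():
        if final_state[a] == final_state[b]:
            weights[(a, b)] = w * (1 + rate)  # satisfied: strengthen
        else:
            weights[(a, b)] = w * (1 - rate)  # violated: weaken
    return weights

weights = {("A", "C"): 1.0, ("B", "C"): 1.0, ("C", "D"): 1.0}
good_design = {"A": 0, "B": 0, "C": 0, "D": 1}
print(imprint(weights, good_design))
# {('A', 'C'): 1.1, ('B', 'C'): 1.1, ('C', 'D'): 0.9}
```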
5. HOW WE CAN HELP?

What can we do to improve our ability to do innovative collaborative design? We will briefly consider several possibilities suggested by the discussion above.

Information systems are increasingly becoming the medium by which design participants interact, and this fact can be exploited to help monitor the influence relationships between them. One could track the volume of design-related exchanges or (a more direct measure of actual influence) the frequency with which design changes proposed by one participant are accepted as-is by other participants. This can be helpful in many ways. Highly asymmetric influences could represent an early warning sign of non-convergent dynamics. Detecting a low degree of influence by an important design concern, especially one such as environmental impact that has traditionally been less valued, can help avoid utility problems down the road. A record of the influence relationships in a successful design project can be used to help design future projects. Influence statistics can also be used to help avoid repetitions of a failed project. If a late high-impact problem occurred in a subsystem that had a low influence in the design process, this would suggest that the influence relationships should be modified in the future. Note that this has the effect of making a critical class of normally implicit and distributed knowledge more explicit, and therefore more amenable to being preserved
over time (e.g. despite changes in personnel) and transferred between projects and even organizations.

Information systems can also potentially be used to help assess the degree to which the design participants are engaged in routine vs. innovative design strategies. We could use such systems to estimate, for example, the number and variance of design alternatives being considered by a given design participant. This is important because, as we have seen, a premature commitment to a routine design strategy that optimizes a given design alternative can cause the design process to miss other alternatives with higher global optima. Tracking the degree of innovative exploration can be used to fine-tune the use of innovation-enhancing interventions such as incentives, competing design teams, introducing new design participants, and so on.
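As one illustration of the monitoring idea above, the sketch below estimates pairwise influence as the fraction of proposed changes accepted as-is, and flags strongly asymmetric pairs. The specific metric, threshold, and counts are our assumptions, not measures prescribed by the chapter.

```python
# (proposer, reviewer) -> (changes accepted as-is, changes proposed);
# the counts here are made-up illustrative data.
proposals = {
    ("design", "manufacturing"): (18, 20),
    ("manufacturing", "design"): (2, 20),
}

def influence(pair):
    accepted, total = proposals[pair]
    return accepted / total

for i, j in proposals:
    forward, backward = influence((i, j)), influence((j, i))
    if forward > 3 * backward:  # assumed asymmetry threshold
        print(f"warning: influence of {i} on {j} ({forward:.2f}) dwarfs "
              f"the reverse ({backward:.2f}); risk of non-convergence")
```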
6. CONCLUSIONS

Existing collaborative design approaches have yielded solid but incremental design improvements, which has been acceptable because of the relatively slow pace of change in requirements and technologies. Consider for example the last 30 years of development in Boeing's commercial aircraft. While many important advances have certainly been made in such areas as engines, materials and avionics, the basic design concept has changed relatively little (Figure 8):
Future radically innovative design challenges, such as high-performance commercial transport, will probably require, however, substantial changes in design processes.
This paper has begun to identify what a complex systems perspective can offer in this regard. The key insight is that the dynamics of collaborative design can be understood as reflecting the fundamental properties of a very simple abstraction of that process: distributed networks. This is powerful because it means that our growing understanding of such networks can be applied to help us better understand, and eventually better manage, collaborative design regardless of the domain (e.g. physical vs. informational artifacts) and type of participants (e.g. human vs. software-based).

This insight leads to several others. Most prominent is the suggestion that we need to embrace a change in thinking about how to manage complex collaborative design processes. It is certainly possible for design managers to have a very direct effect on the content of design decisions during preliminary design, when a relatively small number of high-level, global-utility-driven decisions are made top-down by a small number of players. But once the design of a complex artifact has been distributed to many players, the design decisions are too complex to be made top-down, and the dominant drivers become local utility maximization plus the fit between these local design decisions. In this regime, encouraging the proper influence relationships and local search strategies becomes the primary tool available to design managers. If these are defined inappropriately, we can end up with designs that take too long to create, do not meet important requirements, and/or miss opportunities for significant utility gains through more creative (far-ranging) exploration of the design space.
7. REFERENCES

[1] Brown, D.C., Making design routine, in Proceedings of IFIP TC/WG on Intelligent CAD, 1989.
[2] Smith, R.P. and S.D. Eppinger, Identifying controlling features of engineering design iteration. Management Science, 1997. 43(3): p. 276-293.
[3] Eppinger, S.D., M.V. Nukala, and D.E. Whitney, Generalized Models of Design Iteration Using Signal Flow Graphs. Research in Engineering Design, 1997. 9(2): p. 112-123.
[4] Klein, M., Computer-Supported Conflict Management in Concurrent Engineering: Introduction to Special Issue. Concurrent Engineering Research and Applications, 1994. 2(3).
[5] Sobek, D.K., A.C. Ward, and J.K. Liker, Toyota's Principles of Set-Based Concurrent Engineering. Sloan Management Review, 1999. 40(2): p. 67-83.
[6] Mitchell, T.M., L.I. Steinberg, and J.S. Shulman, A Knowledge-Based Approach to Design. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1985. PAMI-7: p. 502-510.
[7] Mezard, M., G. Parisi, and M.A. Virasoro, Spin Glass Theory and Beyond. 1987, Singapore; New Jersey: World Scientific. xiii, 461 p.
[8] Bar-Yam, Y., Dynamics of Complex Systems. 1997, Reading, Mass.: Addison-Wesley. xvi, 848 p.
[9] Klein, M., et al., Negotiating Complex Contracts. Working paper ROMA-2001-01, Massachusetts Institute of Technology, Cambridge, MA, USA, 2001.
MULTI-AGENT SYSTEM INTERACTION PROTOCOLS IN A DYNAMICALLY CHANGING ENVIRONMENT
Martin Purvis, Stephen Cranefield, Mariusz Nowostawski, and Maryam Purvis Information Science Department, University of Otago, Dunedin, New Zealand
Abstract:
An area where multi-agent systems can be put to effective use is for the case of an open collection of autonomous problem solvers in a dynamically changing environment. One example of such a situation is that of environmental management and emergency response, which can require the joint cooperation of a distributed set of components, each one of which may be specialised for a specific task or problem domain. The various stakeholders in the process can all be represented and interfaced by software agents which collaborate with each other toward achieving a particular goal. For such situations new agents that arrive on the scene must be apprised of the group interaction protocols so that they can cooperate effectively with the existing agents. In this paper we show how this can be done by using coloured Petri net representations for each role in an interaction protocol and passing these nets dynamically to new agents that wish to participate in a group interaction. We argue that multi-agent systems are particularly suited for such dynamically changing environments, but their effectiveness depends on their ability to use adaptive interaction protocols.
Key words:
Multi-agent systems, agent conversations, adaptive systems.
1. INTRODUCTION
There is a widely-held view that multi-agent systems provide a robust and scalable approach for the solution of complex problems [6]. Each
individual agent is presumed to be a specialist for a particular task, and the expectation is that, just as in the sphere of human engineering, complex projects can be undertaken by a collection of agents, no one of which has the capability of performing all the required tasks for the project. In addition, if the system has an open agent architecture, then individual agents can be replaced by improved models, thereby enabling the system to improve gradually, grow in scope, and generally adapt to changing circumstances.

For agent systems to operate effectively, they must exchange information in the form of messages, and the agents must have a common understanding of the possible message types and the terms (and possible relationships among the terms) that are used in the messages. This shared information can be represented by an ontology, and a considerable amount of research has been devoted to the development of techniques for representing ontologies and for reasoning about messages that have been expressed in terms of them [5]. However, understanding messages that refer to ontologies can require a considerable amount of reasoning, and this may place a computational burden on agents that could limit their overall responsiveness. Thus although the agents are individual specialists concerning their own problem domains, their need to cooperate with other agents in a larger contextual (i.e. ontological) environment can present performance problems in areas where agents are operating in real-time, real-world environments, such as environmental response systems or those of electronic business.

Of course, a straightforward way of reducing some of the search space of possible responses to agent messages is by using conversation policies, or interaction protocols [4]. An interaction protocol specifies a limited range of responses that are appropriate to specific message types when a particular protocol is in operation. For example, when a person enters a restaurant, he (or she) doesn't have to worry about all the possible statements that might be made about food. Instead, he expects to be given a menu and to place an order. Later the food will be brought, and only afterwards (for this particular restaurant, anyway) will he be expected to pay the bill. This may be called a "restaurant interaction protocol", and the existence of such a protocol greatly reduces the search space of possible responses required, which is limited to the responses appropriate to the particular point that one has reached in the protocol. The customer, the waiter, the cook, and the cashier all know this protocol and keep track of where they are in terms of it. (Note that the waiter and the cashier may be holding many simultaneous conversations with various customers, all using the same protocol.)

Everything should work reasonably smoothly if all the agents already know the interaction protocol prior to engaging in an interaction. But what happens in a new or changing environment, where the protocol is either unknown or may need to be changed to meet changing conditions? For
example, when a traveller goes to a foreign country and enters an eating establishment, he may not be familiar with the specifics of the restaurant protocol for that locale. In that case he may need to revert to his traveller’s dictionary (his ontological reference) and attempt to engage in some sort of complex negotiation in order to carry out even a simple transaction. This will not result in the kind of adaptive and responsive agent system that is needed for changing environments. Thus for agent systems to operate effectively in highly dynamic environments, they need to have a mechanism for exchanging new or altered interaction protocols on the fly. In this paper we describe an approach to achieve this and discuss how this approach can make agent systems suited to certain types of problem areas.
2. INTERACTION PROTOCOLS USING COLOURED PETRI NETS
When an agent is involved in a conversation that uses an interaction protocol, it maintains a representation of the protocol that keeps track of the current state of the conversation. After a message is received or sent, it updates the state of the conversation in this representation. The Foundation for Intelligent Physical Agents (FIPA) [3] has developed some standard and general interaction protocols that can be adopted by agents, and these have been expressed as state machines [4]. Other representations that have been used for interaction protocols include enhanced Dooley graphs [12] and extended UML [10]. We use coloured Petri nets (CPNs) [7,1], because their formal properties facilitate the modelling of concurrent conversations in an integrated fashion. We believe coloured Petri nets are a more compact and intuitive representation for modelling concurrent processes than that offered by traditional finite-state machine techniques.

Coloured Petri nets are similar to ordinary Petri nets in that they comprise a structure of places, transitions, and arcs connecting those two types of elements; but, in addition, CPNs also have structured tokens and a set of net inscriptions (arc expressions, guards, and place initialisations) which can be evaluated to yield new net markings when transitions are fired.¹ The availability of net analysis tools means that it is possible to check the designed protocols and role interactions for undesired loops and deadlock conditions, and this can then help eliminate human errors introduced in the design process. Moreover, coloured Petri nets facilitate the modelling of individual agent conversations within a larger behavioural modelling context associated with a particular problem domain.

¹ Note that since CPN output arc inscriptions can include expressions that generate tokens on associated output places, there can be occasions when a transition firing may result in no token being placed in an output place.
3. THE FIPA REQUEST INTERACTION PROTOCOL
In order to illustrate the Petri net modelling of agent conversations, we first consider one of the fundamental FIPA message types, the request message. When an agent sends a request to another agent, it expects a response message a little later, and so we can consider this message sequence a request interaction protocol involving a relatively brief conversation. We model all interaction protocols in terms of the roles in the interaction: for each role there is a separate Petri net (which differs somewhat from our earlier approach [9]). The collection of individual Petri nets associated with all the interaction protocol roles represents the entire interaction protocol. For every conversation, there are always at least two roles: that of the initiator of the conversation and the roles of the other participants in the conversation. In most cases, though, there are only two roles: the initiator and the single participant that receives the first message. In Figure 1, we show the Petri net representation of the initiator of the FIPA request interaction. Circles represent Petri net places, and boxes represent transitions. For diagrammatic simplicity, we omit the inscriptions from the diagram, but we will describe some of them below.
The Start place will have a token placed there by the agent at the outset of the interaction, and it is highlighted with a thicker line than the other places. A token in this place initiates the interaction. The In place (in this and the following Petri net diagrams) will have tokens placed there when the agent receives messages from other agents. The In place here is a fusion node (a place common to two or more nets): the very same In place may exist on other Petri nets that also represent conversations in which the agent may be engaged. Every time the agent receives a message from another agent, a token with information associated with the message is placed in the In place that is shared by several Petri nets. The transitions connected to the In place have guards on them such that the transitions are only enabled by a token on the In place with the appropriate qualification. Figure 2 depicts the Petri net scheme for the agent that plays the Participant (FIPA uses this term) role, that receives the initial request message.
The Initiator of the request interaction will have a token placed in the Start place (Figure 1), and this will trigger the Send request transition to place a token in the Out place. Thus we are assuming that there is some communication transport machinery that causes tokens to disappear from a Petri net's Out place and (usually) a corresponding token to appear on the In place of another agent. We do not assume, however, that the transfer is instantaneous, or even guaranteed to occur. It is possible for agent messages to be lost in transit, and thus it is possible for a token to disappear from one role's Out place without a corresponding token appearing at another agent's In place.
The agent playing the Participant role will get incoming messages and place them in the In place. The Receive request transition will, if it doesn't understand the message, place a token in the Not-understood place. If it does understand the message, it will place a token in the Request place. Note that the FIPA specifications often include the possibility of a "not-understood" response, but we regard such messages as similar to software exceptions and place them on another, parallel coloured Petri net (not shown) that deals with exceptional conditions and is connected to the primary net by a 'fusion' place. For this reason we show the Not-understood places in Figures 1 and 2, and the associated transition for sending a "not-understood" response, by dashed lines, which indicate that these nodes are actually located on parallel nets that deal with exceptions. (We discuss parallel Petri nets in connection with policies in Section 6, below.)

If the original request message is understood by the Participant, it either agrees to or refuses the request and sends the appropriate response back to the Initiator (Figure 2). If the Initiator gets an agree message back from the Participant and subsequently the enabled Process answer transition is fired, a token with the appropriate information is put in the Agreed place. Meanwhile the Participant attempts to fulfill the original request. In terms of the coloured Petri net, this activity is carried out as part of the transition's action code (executable code that can be activated when a transition is fired) associated with the Fulfill request transition. Thus the Fulfill request transition is connected with the agent's actual carrying out of the request. Upon completion of the action, a token is placed in the Done, Result, or Failed place, depending upon the circumstances of the requested action taking place at the Participant agent. The appropriate transition will then be enabled and ultimately fired, and the response will be sent back to the Initiator.

When the Initiator gets the response back from the Participant, the Receive request result transition will be enabled if the incoming message contains information that matches the information on the token that was already stored in the Agreed place. Note that the Initiator could be involved in several concurrent request interaction conversations, and the placement of specific tokens in the Agreed place enables this agent to keep track of which responses correspond to which conversations. This is how the coloured Petri net representation assists the agent in managing multiple, concurrent interactions involving the same protocol; it can be compared to the manner in which a restaurant waiter keeps track of several orders sent to the kitchen so that the resulting food preparations can be associated with the right customers.

Thus an interaction protocol has a specific Petri net associated with each role in the conversation, and the participating agents can use these Petri nets
to keep track of what stage of the conversation they have reached. The Petri nets representing all the roles of an interaction protocol can be encoded in XML and sent as the message content of a FIPA inform or propose message. This means that a new agent that appears on the scene can be sent the interaction protocol, "load" the appropriate Petri net role into its conversation module, and engage on-the-fly in a conversation using the new interaction protocol.
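To suggest how a role net drives an agent's conversation bookkeeping, here is a deliberately simplified Python stand-in for the Initiator role of Figures 1 and 2. It is not the authors' implementation (which uses JFern; see Section 6): places are token lists, transitions are methods, and the guard on Receive request result matches replies to their conversation, which is what permits many concurrent conversations over one net.

```python
class RequestInitiatorRole:
    """Sketch of the Initiator role net for the FIPA request protocol."""

    def __init__(self):
        self.places = {"start": [], "out": [], "in": [],
                       "agreed": [], "done": []}

    def start_conversation(self, conv_id, content):
        self.places["start"].append({"conv": conv_id, "content": content})
        self._send_request()

    def receive(self, msg):          # the transport drops messages here
        self.places["in"].append(msg)
        self._process_answer()

    def _send_request(self):         # the Send request transition
        while self.places["start"]:
            t = self.places["start"].pop()
            self.places["out"].append({"conv": t["conv"],
                                       "performative": "request",
                                       "content": t["content"]})

    def _process_answer(self):
        for msg in list(self.places["in"]):
            if msg["performative"] == "agree":
                self.places["in"].remove(msg)
                self.places["agreed"].append(msg)   # remember the promise
            elif msg["performative"] in ("inform-done", "failure"):
                # Guard: fires only if a token for the same conversation
                # is already waiting in the Agreed place.
                match = [t for t in self.places["agreed"]
                         if t["conv"] == msg["conv"]]
                if match:
                    self.places["in"].remove(msg)
                    self.places["agreed"].remove(match[0])
                    self.places["done"].append(msg)

role = RequestInitiatorRole()
role.start_conversation("c1", "evacuate hut 7")
role.receive({"conv": "c1", "performative": "agree"})
role.receive({"conv": "c1", "performative": "inform-done"})
print([t["conv"] for t in role.places["done"]])  # ['c1']: request completed
```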
4. INTERACTION PROTOCOLS FOR ENVIRONMENTAL EMERGENCY SYSTEMS
Let us now consider a somewhat more involved situation: the management of extended environmental areas when unforeseen events take place. This can require rapid responses on the part of many people or services with specialised skills. In such circumstances the resources and skills required to respond to the emergency may go well beyond the capabilities of the permanent staff who normally maintain the area. For example, when a massive forest fire breaks out in a national forest or when a blizzard threatens the lives of several scattered groups of trampers in a national park, the environmental managers may need to call on the services of a number of specialists who can provide crucial assistance in connection with specialised rescue operations and medical assistance. In today's economic and political climate, it is more likely that these specialist service providers are private operators who can be contracted by the government to respond to emergencies in critical situations, rather than people under the permanent employ of the government.

Environmental management systems attempt to manage these dynamic situations. With the increasing use of wireless communications, system components may come in and out of range as environmental professionals move around in the field. We examine a scenario in which a national park is managed by a collection of park rangers. The park has many tracks available for trampers (hikers), and tramping groups have guides equipped with personal digital assistants which can be used to contact park officers when necessary, such as in an emergency situation. Although park rangers can handle routine events themselves, there can be cases when they need outside assistance. If trampers are trapped in a remote location, they may need to be rescued quickly. This can require specialist medical personnel, firefighting professionals and equipment, professional mountain climbers and speleologists, and special types of transport (four-wheel-drive trucks, airplanes, helicopters, boats, etc.).
These specialist groups can be authorised to provide their services in an emergency and be compensated accordingly. As new people with special skills move into the national park region, they can be added to the network of potentially participating service agents that can be called upon to provide assistance in emergency situations. For open agent-based environmental management systems, new rescue service providers can plug into the system, as long as they are provided with the appropriate interaction protocols.

In this scenario we assume that a skilled tramping guide or park officer has assessed the situation and has an idea of the kind of assistance that is needed. This person has a personal software agent that we call the "Initiator". The Initiator communicates with the "Ranger" agent, which is a service recruiter, in order to find out what resources are available and see what can be organised. A simplified, top-level view of this activity is shown, for illustrative purposes, as the Petri net in Figure 3. Note that this Petri net does not represent an interaction protocol role but, instead, a simplified view of the overall interaction. Here the Initiator sends a "proxied" message to the Ranger: within the content of the message to the Ranger is another message that is to be sent to other service agents that will attend to the problem. The Ranger, here, is a recruiter of specialised agents that can deal with the particular emergency at hand. Thus the Initiator can send a message that (a) contains a request that the Ranger recruit service agents and (b) contains the interaction protocol to be used to interact with the target service agents.
Figure 3 over-simplifies the situation, because it shows a synchronous interaction between the Initiator and the Ranger, which, as we have mentioned above, is not a true representation of asynchronous messaging. As a result of this interaction, the Ranger has received the proxied information from the Initiator that is to be transmitted to the Service agents. The proxied information from the Initiator can actually involve a complex and asynchronous exchange of messages between the Ranger and the Service agents. That is, the Initiator asks the Ranger to carry out some sub-protocol interaction with the Service agents and then have the results from this sub-protocol sent back to the Initiator and possibly other parties, which, following FIPA, are designated as Destinators. This interaction corresponds to the FIPA Recruiting Interaction Protocol Specification [3]. There are four basic roles associated with this more complicated interaction protocol: Initiator, Ranger (the recruiter), Service-agent (the target agents of the original communication from the Initiator), and Destinator. It is here, in the context of such more complicated conversations, that dynamic interaction protocol modelling and exchange can help demonstrate how agent-based systems can respond effectively to changing conditions. Figure 4 shows the coloured Petri net model for the Initiator role in the interaction protocol. Again, the conversation is begun when a token is placed in the Start place. The initial message sent contains the proxy message that specifies the additional interaction that is to take place between the Ranger and some service agents.
Figure 5 shows the interaction protocol role model for the Ranger (recruiter), and Figure 6 shows the interaction protocol role model for the
target agents that execute the sub-protocol. The places identified with thick borders in Figures 5 and 6 (e.g. Start sub and Sub-protocol result) represent fusion nodes that are connected to other, parallel Petri nets (not shown), which carry out the sub-protocol interaction. One possible example of a sub-protocol interaction is the Request protocol shown in Figures 1 and 2.
When the recruiter receives the proxy message from the Initiator (Figure 5), it either agrees or refuses to carry it out and sends an answer back to the Initiator. If it agrees to the action, it checks whether it knows of any target agents that can carry out the requested proxy action. If there are none ('No match'), it sends a failure message back to the Initiator. If, however, it does find a match, it sends the requested proxy action to the target agent(s). A target agent may agree or refuse to carry out the proxy action, and it reports this response to the Recruiter (Figure 6). If the target agent agrees, then a sub-protocol interaction is started between the Recruiter and the target agent. The steps associated with this sub-protocol interaction take place on another, parallel Petri net that is not shown here. When this proxy interaction is completed (it could result in failure), the results are reported back to the Initiator and/or possibly additional Destinators (Figure 7).
5. E-BUSINESS APPLICATIONS
We show another example of agent interaction in the area of electronic commerce. For illustrative purposes, we look at the Pit Game [11], which is a card game that simulates commodity trading and exhibits some of its
essential characteristics. In this game a dealer distributes nine cards to each player (three to seven players may play). There are exactly nine cards of each of the commodity types: corn, barley, wheat, rice, etc. The card deck is prepared so that the number of commodity types matches the number of players for the given game. When play begins, the players independently and asynchronously exchange cards with each other, attempting to "corner" the market by getting all nine cards of one type. They can only exchange cards that belong to the same commodity type. Thus if a player has six barley cards, two wheat cards and one rice card, he will typically initially attempt to trade away his two wheat cards, hoping to acquire one or two barley cards. Trading is carried out by a player announcing, for example, that he has, say, two cards to trade. If another player also wishes to trade two cards, the two players may make an exchange. Whenever a player manages to get a "corner", he announces that fact to the dealer and the "hand" is finished (the implementation shown here is for a single "hand"). In our implementation of the game, the trading bids are sent to the dealer, which, in turn, broadcasts the bids to all the players. Figure 8 shows the interaction protocol for the Dealer role. The Dealer deals out the cards and then sends the "start" signal to all the players (a broadcast message).
Whenever a player (Figure 9) has a hand of cards, he always checks to see if he has a “corner”. If so, he announces this to the Dealer by sending the Dealer his cards, and the Dealer, in turn, announces it to the rest of the players, signaling the end of the hand. After players have received the start
signal and assuming that they don’t have a “corner”, they may choose to make a bid. They do this by sending their bid to the Dealer, which in turn broadcasts the bid to all other players. At this point, the player’s cards are also separated into the “cards offered” and the “cards remaining” places. When a bid is received, the player may choose to accept the bid. He does this by checking the bid against his own cards. If the bid is accepted, a message is sent to the player (not the Dealer) and a token is stored in a place (“bid accepted + cards to send”) for future reference. (If the player doesn’t get a corresponding acceptance of his own bid before a given timeout period, then he gives up on this potential deal and restores his offered cards back to the “cards offered” place.)
A trade of cards can take place if two players have made bids, and they have both accepted each other’s bid. Thus if player A has made a bid, has accepted player B’s bid, and has received an acceptance message from B that his own bid has been accepted, then A will send his cards to B (and expect to receive a corresponding number of cards from B). When a player receives a message from another player that his bid has been accepted, it is stored in the “Accpt.” place. The “Send cards to player” transition checks
(by means of a guard) to make sure that the accepted bid matches information in the token located in the "bid accepted + cards to send" place. If so, it sends the cards to the other player and keeps a copy of the acceptance information in the "Cards sent" place. If the player receives a bid acceptance that is not applicable (such as a second acceptance that has come in after he has already decided to trade cards with someone who has sent in an earlier acceptance), then the bid acceptance is discarded. When traded cards are received, the "Process rec'd cards" transition checks to see that the received cards are associated with the bid acceptance information stored in the "Cards sent" place. If the cards do not match the bid acceptance, they are discarded. If nothing is received after some time, the "Send timeout" transition guard is enabled, and the cards are returned to the "Card" place. (Though the cards have been sent, the player still has a copy of what has been sent.) The "Bid timeout" transition is enabled if there have been no takers of a bid before a certain timeout period has elapsed. When this transition is fired, the cards are returned to the hand, and the player may choose to make another bid. In real e-commerce trading situations, new agent "players" could be sent the appropriate interaction protocols, similar to but more elaborate than those shown in this example, and immediately begin participating in the trading arena.
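The guard on the "Send cards to player" transition can be sketched as follows; the token fields and names here are our illustrative assumptions, not the chapter's actual code. A trade fires only when an incoming acceptance matches a bid this player has itself already accepted.

```python
def send_cards_guard(acceptance, bid_accepted_place):
    """Return the matching token from the 'bid accepted + cards to send'
    place if the guard is satisfied, otherwise None (acceptance discarded)."""
    for token in bid_accepted_place:
        if (token["other_player"] == acceptance["from"]
                and token["n_cards"] == acceptance["n_cards"]):
            return token
    return None

bid_accepted = [{"other_player": "B", "n_cards": 2,
                 "cards": ["wheat", "wheat"]}]

token = send_cards_guard({"from": "B", "n_cards": 2}, bid_accepted)
print(token["cards"] if token else "discarded")  # ['wheat', 'wheat']

late = send_cards_guard({"from": "C", "n_cards": 2}, bid_accepted)
print(late or "acceptance discarded")  # a late, unmatched acceptance is dropped
```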
6. IMPLEMENTATION
The agent-based system that we have developed is based on our Opal agent platform [13] and uses JFern [8], a Java-based coloured Petri net simulator. When new agents appear and are to be incorporated into the network of available agents, they are sent a FIPA Propose message by the group manager, with a message content containing an XML-encoding of the interaction protocol that is used. The target agent parses the XML-encoded interaction protocol and, if it accepts the proposed mechanism for interaction, sends the Accept message back to the group manager.

We implement the agent conversation module as a layered Petri net. For each protocol, there can be an additional layer that we have not shown, called a policy layer. This is the layer which would, with other approaches, be left to the agent application to coordinate and not be included explicitly in the conversation modelling process. However, we feel that it is more appropriate to treat it as closely related to the conversation layer. Policies may be implemented simply by a set of rules, or, in more complex cases, they may have their own complex protocols that exist and change state in parallel with the immediate context of an ongoing
conversation. Under these more complex circumstances, there might be a "policy-level interaction protocol" (another protocol, but at the policy level). It is under these conditions that we can benefit from having another modelling layer at the policy level, above that of the ordinary conversational modelling layer. The two layers can be joined together by representing them both as a coloured Petri net.
In the Pit Game, for example, the possible rules for legal bids and legal card exchanges by the players are described by the basic interaction protocol. But existing above that level of abstraction is another level of discourse that can take place during the game. Suppose one of the players has a question concerning the official rules of the game and wants to have a ruling made by a referee. Or perhaps one of the players at some point wants to halt play so that he or she can attend to some urgent matter. These kinds of ‘interrupts’ or ‘exceptions’ are common to many kinds of interactions and can take place at almost any time. We already mentioned an example of this type of action in connection with the “not-understood” message in the FIPA interaction protocol specifications. The discourse involved in these interrupts is usually “off-topic” from the context of the immediate conversation, and in fact they are often about the conversation that is taking place (such as an accusation of breaking the protocol rules associated with playing the game). Since they are likely to be “off-topic” and can occur at any moment, it can be tedious to include these kinds of conversational
strands in the given (domain-specific) conversation protocol. To do so would clutter the visual simplicity of the original conversation protocol and would lessen the value of providing an easy-to-comprehend visual modelling representation of the interaction. On the other hand, to leave out the possibility of representing such events is to ignore the possibility of their occurrence, and consequently to fail to model the world adequately enough that its essentially contingent nature is recognised. Our solution is to model these kinds of interactions that can guide, interrupt, or redirect existing conversations by representing them as another, parallel modelling layer above that of the existing conversation layer. This idea was suggested previously [2] for specific types of conversation, but we have generalised the notion and incorporated it into a Petri net representation. Thus a conversation is a combination of protocols being instantiated and manipulated by a particular policy.
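The protocol-distribution handshake described in this section might look roughly as follows. The XML element names and message fields here are invented for illustration; the chapter does not specify the actual encoding used by Opal and JFern.

```python
# Group manager side: a FIPA Propose carrying an XML-encoded protocol.
PROPOSE = {
    "performative": "propose",
    "content": """<interaction-protocol name="request">
                    <role name="initiator">...places, transitions...</role>
                    <role name="participant">...</role>
                  </interaction-protocol>""",
}

# New-agent side: parse the net, load the relevant role, and reply.
def on_propose(message, load_protocol):
    try:
        load_protocol(message["content"])  # build role nets from the XML
        return {"performative": "accept-proposal"}
    except ValueError:                     # malformed or unsupported protocol
        return {"performative": "reject-proposal"}

reply = on_propose(PROPOSE, load_protocol=lambda xml: None)  # stub loader
print(reply["performative"])  # 'accept-proposal': the agent can now join in
```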
7. CONCLUSIONS
We have developed an approach for managing non-trivial interactions in a multi-agent system and have demonstrated how it can be used to make systems more responsive to changing protocols in a dynamic environment. In particular, we believe that this facilitates the operation of multi-agent systems in connection with applications where new service-providing agents are likely to appear and need to be given updated interaction protocols so that the group of agents can adapt to a changing environment. Our approach is based on a network of interacting agents that follow the FIPA specifications, but we have introduced and implemented an interaction protocol mechanism based on coloured Petri nets that can accommodate more complex interaction protocols than those FIPA has so far specified. When more complicated, multi-layered and concurrent conversations take place among groups of agents, the coloured Petri net approach that we use appears to offer advantages over the state-machine techniques that have been commonly used up until now.
REFERENCES

[1] Cost, S., Chen, Y., Finin, T., Labrou, Y., and Peng, Y., "Using colored Petri nets for conversation modeling", in Issues in Agent Communication, Dignum, F. and Greaves, M. (eds.), Lecture Notes in AI, volume 1916, Springer-Verlag, Berlin (2000), pages 178-192.
[2] Elio, R. and Haddadi, A., "On abstract task models and conversation policies", in Working Notes of the Workshop on Specifying and Implementing Conversation Policies, Autonomous Agents '99, Seattle (1999), pages 89-98.
[3] FIPA. Foundation for Intelligent Physical Agents (FIPA), FIPA 2001 specifications, http://www.fipa.org/specifications/ (2001).
[4] Greaves, M. and Bradshaw, J. (eds.), Specifying and Implementing Conversation Policies, Autonomous Agents '99 Workshop, Seattle, WA (1999).
[5] Gruber, T. R., "A Translation Approach to Portable Ontologies", Knowledge Acquisition, 5(2) (1993), pages 199-220.
[6] Jennings, N. R., "Agent-oriented software engineering", Proceedings of the 12th International Conference on Industrial and Engineering Applications of AI (IJCAI-99), Stockholm, Sweden (1999), pages 1429-1436.
[7] Jensen, K., Coloured Petri Nets – Basic Concepts, Analysis Methods and Practical Use, Springer-Verlag, Berlin (1992).
[8] Nowostawski, M., JFern, version 1.2.1, http://sourceforge.net/project/showfiles.php?group_id=16338 (2002).
[9] Nowostawski, M., Purvis, M., and Cranefield, S., "A Layered Approach for Modelling Agent Conversations", Proceedings of the 2nd International Workshop on Infrastructure for Agents, MAS, and Scalable MAS, International Conference on Autonomous Agents, ACM Press (2001), pages 163-170.
[10] Odell, J., Parunak, H. V. D., and Bauer, B., "Extending UML for agents", Proceedings of the Agent-Oriented Information Systems Workshop at the 17th National Conference on Artificial Intelligence (2000), pages 3-17.
[11] Parker Brothers, Inc., Salem, Mass. (1919), see http://www.centralconnector.com/GAMES/pit.html.
[12] Parunak, H. V. D., "Visualizing agent conversations: Using Enhanced Dooley graphs for agent design and analysis", Proceedings of the Second International Conference on Multi-Agent Systems (ICMAS '96), The AAAI Press, Menlo Park, CA (1996).
[13] Purvis, M., Cranefield, S., Nowostawski, M., and Carter, D., "Opal: A Multi-Level Infrastructure for Agent-Oriented Software Development", Information Science Discussion Paper Series, No. 2002/01, ISSN 1172-6024, University of Otago, Dunedin, New Zealand, http://www.otago.ac.nz/informationscience/publctns/complete/papers/dp2002-01.pdf.gz (2002).
CHALLENGES TO SCALING-UP AGENT COORDINATION STRATEGIES*
Edmund H. Durfee
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, [email protected]
* A condensed version of this chapter appeared as "Scaling-Up Agent Coordination Strategies" in IEEE Computer 34(7):39-46, July 2001.
Abstract: There is more to scaling up agent-based systems than simply increasing the number of agents involved. Many of the challenges to making agent-based systems work in more realistic settings arise from the characteristics of the agents' tasks and environment, and from the expectations of the systems' users. In this chapter, my goal is thus to emphasize this broader array of challenges to coordinating agent-based systems, as a step towards both extending our understanding of scale-up issues and developing richer metrics for evaluating the degree to which coordination strategies for agent-based systems can apply to more demanding applications.
Key words: Multi-agent systems; coordination
1. INTRODUCTION
The notion of deploying "intelligent agents" to do people's bidding in environments ranging from marketplaces on the internet to robotic exploration of Mars has recently received much attention and speculation. Meanwhile, exactly what an "agent" is, and in what senses a computational agent can behave "intelligently," are still subjects of much debate. Rather than confront such thorny issues head on, this article skirts around most of them to focus more squarely on just one of the central concerns of intelligent agency: coordination.
With few exceptions, if an agent is dispatched to an environment, the odds are that it will share the environment with other agents. Even some proposed strategies for robotic exploration of the planets typically involve sending a team of robots! Thus, a fundamental capability needed by an agent is the ability to decide on its own actions in the context of the activities of other agents around it. This is what we will mean when we refer to coordination. Note that this does not mean that coordination must imply cooperation: an effective competitor will coordinate its decisions to work to its advantage against an opponent, such as a producer of goods timing a product promotion to undercut a competitor. It does not even imply reciprocation: an agent may be coordinating with another who is unaware of it, such as one automobile driver trying to pass a second whose mind is entirely elsewhere.
Without coordination, agents can unintentionally conflict, can waste their efforts and squander resources, and can fail to accomplish objectives that require collective effort. It is therefore no wonder that a variety of strategies for coordination among computational agents have been developed over the years, in an effort to get "intelligent agents" to interact at least somewhat "intelligently."
It does not seem possible to devise a coordination strategy that always works well under all circumstances; if such a strategy existed, our human societies could adopt it and replace the myriad coordination constructs we employ, like corporations, governments, markets, teams, committees, professional societies, mailing groups, etc. It seems that whatever strategy we adopt, we can find situations that stress it to the breaking point. Whenever a coordination strategy is proposed, therefore, a natural question that arises is "How does it scale to more stressful situations?"
In an effort to map the space of coordination strategies, we need to define at least some of these dimensions in which they might be asked to "scale," and then figure out how well they respond to being stressed along those dimensions. For example, clearly one of the most measurable scaling dimensions is simply the number of agents in the system. Yet, sheer numbers cannot be all there is to it: the coordination strategies employed in insect colonies seem to scale to large numbers of insects, yet they do not seem to satisfy all the needs of large human societies (New York City traffic notwithstanding). One of my goals in writing this article, therefore, is to provoke a dialogue about what it means for a coordination strategy to "scale up."
Moreover, as Jennings has suggested, agent-oriented software engineering shows promise for developing complex, distributed systems, but requires the component agents to act and interact flexibly (Jennings, 2001). A second goal of this article is therefore to provide some potentially useful starting points for characterizing portions of the space of coordination problems, so as to better understand the capabilities and limitations of
strategies developed to support flexible interaction. Toward this end, I’ll begin by forming a characterization of the coordination problem space by looking at properties of the agent population, of the task-environment the agents inhabit, and of the expectations about their collective behaviors. I’ll then turn to giving a very brief survey of a few (of many) coordination strategies and how they fit into this space. I’ll conclude by pointing out gaps in our understanding, and suggest opportunities for progress in the field.
2. SOME DIMENSIONS OF COORDINATION STRESS
There are more factors that influence how difficult it is to bring about coordination than can be covered here. Therefore, this article tries to project the richness of this space while also simplifying enough to allow a reader to grasp portions of the space. To that end, I'll limit discussion to three dimensions (so as to allow depiction on a 2-dimensional page) for each of the major properties: of the agents, of the task-environment, and of the solution. It should be noted up front that these dimensions are not necessarily orthogonal; in some cases relationships between them are indicated. Nonetheless, treating them as orthogonal can be useful in characterizing the space of coordination challenges.
2.1 Agent Population Properties
We begin with the most obvious properties that will impact coordination: those of the set of agents that need to coordinate. Certainly, one of the challenges in scaling any coordination strategy, as previously mentioned, is handling larger and larger numbers of agents. Coordination strategies that rely, for example, on a centralized "coordinator" to direct the interactions of the other agents can quickly degrade as the coordinator becomes incapable of processing all of the interactions given increasing numbers of potentially interacting agents. If each agent can potentially interact with every other agent, then the number of pairwise interactions to analyze grows quadratically with the number of agents. More problematically, since interactions often must be viewed in terms of larger groups of agents (not just pairs), the problem can devolve into a problem of exponential size: if each agent could choose among b actions, each potentially having a different impact on other agents, then the space of all possible action combinations will be b^n for n agents. Even if each of the n agents participated in the coordination search, rather than depending on a centralized coordinator, an n-fold speedup of a problem that is exponential in n doesn't help much.
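A back-of-the-envelope calculation (with hypothetical numbers, not drawn from the chapter) makes the point concrete:

# Size of the joint action space for n agents with b actions each, and the
# share each agent would search even under a perfect n-fold speedup.
b = 4
for n in (5, 10, 20):
    joint = b ** n
    print(f"n={n:2d}: b^n = {joint:,}   per-agent share = {joint // n:,}")

With only four actions per agent, twenty agents already face roughly a trillion joint combinations; dividing the search twenty ways barely dents it.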
A second dimension that often poses challenges in coordination is what is broadly labeled as "heterogeneity." Agents within a population can be different from each other in many possible ways. For example, due to occupying different places in the environment, they might know different things about the current state of the world. If they know different things about the way the world works, then we might say they have heterogeneous expertise. They could have differing abilities to sense the world or change the world. Especially in the case of competitors, they could have different preferences for how the world should be. They could even have different communication languages, ontologies, or internal architectures. Whether a coordination strategy scales to increasingly heterogeneous populations depends on the degree to which it expects agents to be able, in principle, to communicate with, share their abilities with, and basically agree with each other.
Finally, the third dimension of agent properties we will consider here is what I term "complexity." While this could mean many things, I'll focus on it as referring to how hard it is to predict what an agent will do because of inherent versatility on the part of the agent. One of the features that arguably makes something an "intelligent" agent is that it is capable of flexibly deciding for itself which goals to pursue at a given time and how to pursue them. Agents that are not complex, under this characterization, are those that can be seen as single-mindedly doing a specialized task. In general, coordinating with such agents is easier (they are much more predictable) than coordinating with agents that could be doing any of a number of things. Couple this with the possibility of overlaps among agents' spheres of interest and ability, and this can put enormous stress on any coordination strategy
that wants to assume unambiguous matches between tasks or roles in the system and the agents to do them. Obviously, scaling along combinations of these dimensions can pose even greater challenges. Handling complex agents is much harder, for example, if they are complex in different (heterogeneous) ways, but easier if there aren't very many of them. Coordination strategies will therefore tend to make assumptions about which dimensions are likely to be stressed for the application domain of interest.
2.2 Task-Environment Properties
The environment in which agents operate, and the tasks they are expected to accomplish within the environment, are another major consideration in developing or choosing a coordination strategy. Real task-environments often introduce complications that blur the understanding of a coordination strategy: for example, in task-environments that require substantial domain expertise, it can be difficult to compare alternative coordination strategies because differences in performance might be due to the quality of the knowledge given to the individuals rather than to the efficacy of the coordination strategy. For this reason, researchers often employ abstract, idealized versions of task-environments such as pursuit problems, transport problems, the Prisoners' Dilemma, and distributed sensor networks (e.g., see (Weiss, 1999) and some of the sidebars associated with this article). Even with abstract task-environments, the possible dimensions for scaling the difficulty of coordination are numerous; again, only three of the many possibilities are given here.
The first dimension we will consider is the degree to which the environment, or the task, leads to interactions among agents that materially impact the agents. Since coordination is all about exerting some control over interactions, a greater degree of interaction implies more need to coordinate. Or, viewed the other way, agents that do not interact need not coordinate. Thinking slightly more concretely, suppose that an interaction involves some "issue" that involves more than one agent. The issue could be about who gets to use a resource, or about what the status of some feature of the world is, or about who is supposed to do what task, etc. The degree of agent interaction increases as more agents are concerned with the same issues, and as more issues are of concern to each agent, so that settling some issues commits agents to interactions that in turn impact how they should settle other issues. As the web of dependencies grows, some coordination strategies can have difficulty scaling.
A second dimension that complicates coordination is the dynamics of the task-environment. Coping with changing environments is always difficult; in
a multiagent setting, where different agents might be capable of monitoring only portions of the environment, and where each might change its mind about what goals to pursue or what means to use to pursue goals, the difficulties are compounded. In more static task-environments, the agents have some hope of converging on coordinated activities and then carrying them out. But in more dynamic task-environments, convergence might be impossible: the task-environment might change faster than the coordination strategy can keep up. For this reason, coordination strategies that scale to highly dynamic task-environments are relatively uncommon.
A third dimension, which is related to the first two, as well as to agent heterogeneity, is what here will be called "distributivity." In some task-environments, agents are highly distributed in the (conceptual) environment and tasks are inherently distributed among the agents. In other task-environments, the agents are (conceptually) collected together – such as occupying a common "yellow pages" – and tasks originate at one point. Distributivity stresses a coordination strategy because it increases agents' uncertainty about which agents are currently sharing the task-environment and what (if anything) each is, or should be, doing.
Again, scaling along combinations of these dimensions is possible, placing even more substantial demands on a coordination strategy. For example, distributivity compounds the difficulties in a dynamic task-environment, because of the inherent delays in propagating the implications of changes in a highly distributed setting, but lowering the degree of interaction can simplify this by localizing the need to propagate to fewer interested parties.
2.3 Solution Properties
To evaluate how well a coordination strategy deals with the scaling issues that we throw its way, we need to define criteria that we expect of a solution. One of the dimensions for solution properties, for example, is the "quality" of the solution, in terms of how well the interaction is coordinated. The quality might be measured in terms of efficiency – that is, whether the issues have been settled in a manner that permits the efficient use of agent resources and abilities. Higher quality can correspond to closer-to-optimal coordination. A less demanding level of quality might correspond to achieving a satisficing level of coordination. In some cases, simply avoiding disagreement (conflict) might be good enough. As an illustration, if we were to design a coordination strategy for an automobile intersection, we might be satisfied if it prevents crashes, or we might further require that it achieve some measures such as ensuring that no car needs to wait longer than some upper-bound time, or we could insist that it minimize the expected wait time for all cars. As we demand more, we put greater stress on the coordination strategy.
A second dimension considers how robust we expect a solution to be in the face of uncertainty or dynamics in the task-environment and the agent population. For example, as was pointed out before, a coordination strategy might have trouble keeping up with a particularly dynamic task-environment. The coordination solution might therefore be somewhat out of date. If we demand that a solution nonetheless be robust, then the coordination strategy should anticipate, either implicitly or explicitly, the range of conditions under which the solution it provides will be followed, and not simply the single expected situation. Given that some task-environments might be such
that a minor deviation from expectations can lead to severe consequences, finding assured robust solutions can, in some cases, be imperative.
Finally, a third dimension concentrates on the cost of the coordination strategy. A solution to the problem of how to coordinate should account for the costs of doing the coordination. These costs could include the amount of computation required, communication overhead, time spent, and so on. For example, if communication is costly and time-consuming, a coordination strategy might have to reduce its demands for information exchange among agents; beyond some point, it will have to make high-quality coordination decisions lacking information it would otherwise have expected to have. Therefore, questions can arise about whether a coordination strategy can scale well to environments that impose more stringent limits on the costs that the strategy incurs.
As for the previous properties, these three dimensions can combine in various ways. For example, one way of improving the robustness of a coordination solution without sacrificing quality is to continually monitor and update the solution in response to changes, but this in turn would require that minimizing costs and delays is not a significant objective.
3. CHARACTERIZING COORDINATION STRATEGIES
At this point, I've identified three major types of properties (agent population, task-environment, and solution), and for each I've described three (out of many possible) dimensions in which the property could be scaled to make coordination harder. If we were to qualitatively consider "low" and "high" values along each of the dimensions, we'd have eight possible points to consider for each property, leading to 8^3 = 512 combinations across the three properties. It would be tempting to now look at each of these 512 combinations in turn, and consider which coordination strategies make sense for each. The truth is, however, that even if this book had enough room, and you the reader had enough patience, there isn't sufficient understanding of the entire space of coordination strategies that have (or could have) computational embodiments to fill all of these in. Instead, what follows summarizes just a handful of coordination strategies, highlighting where they fall within this space and the kinds of scaling for which they are particularly well suited. The selection of these strategies should not be viewed as an endorsement that the strategies given are superior to others not given, but rather is based on giving some representative examples across the space.
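The arithmetic behind the 512 figure is easy to reproduce; the sketch below (with purely illustrative dimension labels) enumerates low/high settings along the nine dimensions:

from itertools import product

dimensions = {
    "agents": ["number", "heterogeneity", "complexity"],
    "task-environment": ["interaction", "dynamics", "distributivity"],
    "solution": ["quality", "robustness", "cost"],
}

# Low/high on a property's three dimensions gives 2^3 = 8 points per property;
# combining the three properties gives 8^3 = 512 cells in the space.
all_dims = [d for dims in dimensions.values() for d in dims]
cells = list(product(("low", "high"), repeat=len(all_dims)))
print(len(cells))   # 512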
3.1 Agents
To some people, "scaling up" is equated to being able to handle more agents, and (almost always) handling more agents is harder than handling fewer. Trying to get a large population of complicated, self-interested, and interacting agents to somehow behave efficiently and robustly in a dynamic environment is a tall order. In fact, typically something has to give: usually, coordination strategies that scale well to large numbers of agents do not deal with many of these other confounding dimensions. For example, cellular automata (Wolfram, 2002) often deal with large numbers of entities that typically use rules to react in simple ways to their very local environments, such as "deactivating" when too few neighbors are active, or "activating" when enough neighbors are active. Patterns of activity can emerge in the population through these very simple local changes. Physics-based models of large computational ecosystems of agents can even lead to designs of metamorphic robots made up of many small pieces that shift and flow to adapt to the environment (Bojinov, 2000). Similarly, systems based on insect metaphors assume that each agent is a relatively simple automaton, and that emergent properties of interest arise due to their local interactions (Ferber, 1999). These strategies assume little complexity and, often, little heterogeneity in the agent population, focus on very limited (local) kinds of interactions, and are satisfied with emergent, statistical system performance, rather than worrying about each agent being efficiently used or making optimal choices. More generally, successfully scaling up to large numbers of agents generally requires that each agent only needs to interact with a constant (or slowly growing) number of other agents, and that who needs to interact with whom is preordained based on agents' features such as their physical locations or their tasks/roles. Thus, large numbers of mobile agents can be dispersed for information gathering tasks that can be pursued independently, interacting only indirectly due to contention for bandwidth or server cycles (Gray, 2001). Similarly, large-scale coalition/congregation formation can be viewed as an emergent process involving growing groups incrementally as agents (and agent groups) encounter each other and discover advantages of banding together (Lerman, 2000; Brooks, 2000).
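The threshold rules mentioned above can be stated in a few lines; the following sketch (parameters invented for illustration, and far simpler than the cited models) performs one synchronous update of a ring of cells:

# Each cell activates when enough neighbors are active and
# deactivates when too few are.
def step(cells, activate_at=2, deactivate_below=1):
    n = len(cells)
    nxt = []
    for i, active in enumerate(cells):
        active_neighbors = cells[(i - 1) % n] + cells[(i + 1) % n]
        if not active and active_neighbors >= activate_at:
            nxt.append(1)
        elif active and active_neighbors < deactivate_below:
            nxt.append(0)
        else:
            nxt.append(active)
    return nxt

cells = [0, 1, 0, 1, 1, 0, 0, 1]
for _ in range(3):
    cells = step(cells)
print(cells)

Patterns emerge from purely local updates: no cell ever inspects more than its two neighbors, which is what lets such strategies scale to very large populations.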
3.2 More Heterogeneity
In the case of scaling up to large agent populations, agent heterogeneity can sometimes help, if agents that are different from each other need not interact. This serves to once again restrict the number of others about which an agent must be aware. More typically, however, heterogeneity is
welcomed into a system because it increases the system-wide capabilities, whereby agents with complementary attributes combine their efforts toward objectives beyond what they can individually achieve. Once the agent population is no longer homogeneous, therefore, it becomes important for agents to be able to understand and often describe what they can do, and to find others with whom to work. Coordination strategies that do not support the ability of agents to describe themselves and to find each other, such as by having implicit acquaintanceships among agents "hardwired," have difficulty scaling along the heterogeneity dimension.
A mainstay coordination strategy for handling heterogeneity has been the Contract Net protocol (Smith, 1980) and its descendants, whereby agents dynamically assign tasks to others who are available and capable of doing the tasks. In its simplest form, the protocol allows an agent with a task that it needs done to broadcast an announcement of the task, along with criteria by which each of the other agents can decide whether it is eligible to take on the task and, if so, what information to supply in a bid for the task. The agent with the task can choose from among the responses to make an assignment. The Contract Net protocol scales well to an open system of heterogeneous agents, but as the number of agents increases, the broadcast communication requirements can become problematic. A response to this is to maintain a more centralized registry of agents and their capabilities, which can be used flexibly to discover promising matches between agents with tasks to do and agents that can do them. Strategies that support agent registration and matchmaking (for example, (Paolucci, 2000) or www.sun.com/jini) can allow agents to find each other by describing the kinds of services that they need or provide. More generally, formalisms for communicative acts, such as FIPA (www.fipa.org), can permit a broad array of conversation policies in support of flexible interactions among heterogeneous agents. Many of these concepts are being brought together in more comprehensive frameworks for supporting heterogeneous agent-based systems, such as DARPA's Grid (coabs.globalinfotek.com).
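Returning to the Contract Net's announce-bid-award cycle, one round can be rendered as below. This is a deliberately minimal sketch (the eligibility test, bid contents, and award criterion are invented stand-ins, not the FIPA or original protocol specification):

# One announce-bid-award round of a Contract Net-style task assignment.
task = {"name": "deliver-part", "needs": "transport", "deadline": 10}

agents = [
    {"name": "A1", "skills": {"transport"}, "load": 3},
    {"name": "A2", "skills": {"transport"}, "load": 1},
    {"name": "A3", "skills": {"assembly"}, "load": 0},
]

def eligible(agent, task):
    return task["needs"] in agent["skills"]       # eligibility criterion from the announcement

def make_bid(agent, task):
    return {"bidder": agent["name"], "cost": agent["load"]}   # information supplied in a bid

bids = [make_bid(a, task) for a in agents if eligible(a, task)]
winner = min(bids, key=lambda b: b["cost"])       # the manager awards the contract
print(task["name"], "awarded to", winner["bidder"])   # -> A2

The broadcast step is where the scaling pressure appears: in an open system, the announcement goes to every agent, which is precisely what registries and matchmakers are introduced to avoid.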
3.3 More Complexity
Heterogeneity tends to emphasize the challenges that accrue when "specialist" agents need to identify each other and team up to provide broader services. Additional complications arise when agents are individually more complex, typically meaning that they are each more versatile, yet not identically so. Now, each agent must decide which of the roles it could play it should play, and must reason about other agents in terms of the alternative activities they might be engaged in, rather than the specific activity that a "specialist" could be assumed to pursue.
Scaling up to more complex agents means that teaming involves not only finding an available agent with appropriate capabilities, but also selecting from among such agents so as to pick the one whose other talents are least in demand by other teams. Thus, interactions among agents are not localized within smaller teams; rather, the "partial substitutability" of agents for each other leads to complex chains of dependencies: how some teams are formed can color which other teams will be desirable. This means that agents must be increasingly aware of the broader needs of the agent network.
Similarly, even when agents do not need to team up, but merely must coexist and stay out of each other's way, the increased versatility of each agent makes anticipating what others will be doing much more difficult. Being prepared for anything that another could choose to do might be impossible, so strategies for increasing awareness of other agents' planned activities become paramount. Such strategies can include using statistics of others' previous behaviors, using observations of them to infer their current plans, or using communication to convey information that permits agents to adequately model each other's intentions. As an example of the latter, the process by which agents that can accomplish their objectives in several different ways converge on mutually compatible plans can be viewed as a distributed constraint satisfaction process. This process involves propagating tentative plan choices among agents and, when inconsistencies are detected among the choices of some subset of agents, having some of the agents perform systematic backtracking. Increased efficiency in this process can stem from techniques that allow parallel asynchronous exploration of the space, and that can dynamically decide which agents should be asked to try alternatives based on measures of which constraints are proving most difficult to satisfy (Weiss, 1999, chapter 4; Yokoo, 2000).
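In compressed form, the propagate-and-backtrack process can be sketched as follows. This centralized rendering is for illustration only; real distributed constraint satisfaction algorithms, such as those surveyed by Yokoo, interleave asynchronous proposals and backtracking messages among the agents themselves, and all plan names and the constraint are invented:

# Agents pick one plan each; a constraint rules out incompatible choices.
plans = {
    "agent1": ["routeA", "routeB"],
    "agent2": ["routeA", "routeC"],
    "agent3": ["routeB", "routeC"],
}

def consistent(assignment):
    chosen = list(assignment.values())
    return len(chosen) == len(set(chosen))    # illustrative constraint: no shared routes

def solve(assignment, remaining):
    if not remaining:
        return assignment
    agent, rest = remaining[0], remaining[1:]
    for plan in plans[agent]:
        trial = {**assignment, agent: plan}
        if consistent(trial):
            result = solve(trial, rest)
            if result:
                return result
    return None    # exhausted options: the caller backtracks

print(solve({}, list(plans)))
# -> {'agent1': 'routeA', 'agent2': 'routeC', 'agent3': 'routeB'}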
3.4 Higher Degree of Interaction
As was previously stated, the need for coordination arises from agent interactions. As the number and complexity of agent interactions grow, coordination becomes intractable. Therefore, it isn't surprising that an effective means for addressing coordination is to reduce, or if possible eliminate, interactions. As already pointed out, when agents only have to worry about interactions with a small number of local "neighbors," then scaling to large numbers of agents is much easier. So strategies for localizing interactions (Lansky, 1990) can obviate the need for more complicated coordination strategies.
One often-used technique for controlling the degree of interaction is to impose a (relatively static) organizational structure on agents. Each agent is
given a role to play in the organization, including its own sphere of control and knowledge of agents playing related roles. Giving each agent the resources it needs to fulfill its role eliminates the need for agents to negotiate over resources, and giving each agent knowledge of the roles of other agents dictates who needs to communicate with whom and about what. An appropriate organizational structure among agents can thus simplify coordination, and permit larger, more complex agent systems to succeed in more challenging task domains. The challenge, of course, is in designing organizations for agents, or having agents design their own organizations, such that the organizations match the agent population and the needs of the task-environment (Prietula, 1998).
Sometimes, however, multiagent tasks cannot be divided into nearly independent pieces; there are some tasks that absolutely require tight interactions among agents. In the literature, examples of such tasks include the "pursuit" task, where predators need to surround their prey (Gasser, 1987), and tasks involving team activities such as combat flight operations (Tambe, 1995). For such applications, interactions are not a side-effect of individuals acting in a shared world, but rather are the purpose of the individuals' actions in the first place. Therefore, an emphasis on agent teams is appropriate, leading to frameworks where a system designer explicitly describes recipes for team behavior, with particular attention to which team members should interact, when, and how (Grosz, 1996; Tambe, 2000; Kinny, 1994). When agents must formulate plans that fit together, but for which no existing recipes are available, techniques are needed for reasoning about how the actions of agents can enable or facilitate, or can hinder or even disable, the actions of others (Decker, 1995). Merging the plans of agents, formulated individually, so as to permit the agents to successfully accomplish their activities without interfering with each other is also a useful technique (Georgeff, 1983; Ephrati, 1995; Clement, 1999).
3.5 More Dynamic
Whether viewed as a population of individuals or as a team, a multiagent system that operates in a dynamic task-environment must contend with changes in plans, goals, and conditions in the midst of execution. Tasks that previously could be carried out independently might now interact, such as when a resource becomes unusable, forcing contention for other remaining resources. Agreements that have been forged between team members might have to be revisited as some team members change their priorities or recognize that their individual intentions, or those of the team as a whole, are no longer relevant in the new context they find themselves in.
Jennings (Jennings, 1993) has characterized these issues as the challenge of having conventions about what agents should do when they begin to question their commitments due to task-environmental dynamics. A variety of conventions can be specified, including the convention that seeks to ignore dynamics entirely by insisting that agents fulfill their commitments regardless. Alternatives include allowing agents to renege on commitments if they pay some penalty, or permitting agents to abandon obsolete commitments provided that they notify team members (and thus potentially stimulate the formation of different commitments). In fact, dynamic task-environments can suggest that agents should never view their (or others') plans as being anything more than tentative. Agents could unilaterally change their minds about their plans and begin acting on new plans before reaching agreement across the team. This has the potential of leading to inefficient collective activities due to information delays and to chain reactions (even race conditions) among changes. However, under some limiting assumptions about how and when agents can make unilateral changes, iterative coordination and execution techniques (e.g., (Durfee, 1991)) can lead to flexible coordinated behavior in dynamic task-environments.
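Read operationally, a convention is a policy an agent consults when it questions a commitment. A toy sketch of the three alternatives named above (all data structures and the penalty value are invented for illustration):

# Conventions as policies applied when a commitment is questioned.
def honor(commitment, team):
    return "keep"                                   # fulfill the commitment regardless

def renege_with_penalty(commitment, team, penalty=5):
    commitment["holder"]["utility"] -= penalty      # pay a price to drop it
    return "drop"

def drop_and_notify(commitment, team):
    for member in team:                             # tell teammates, potentially
        member["inbox"].append(("dropped", commitment["goal"]))  # stimulating new commitments
    return "drop"

holder = {"utility": 10, "inbox": []}
team = [{"inbox": []}, {"inbox": []}]
commitment = {"goal": "scout-sector-4", "holder": holder}

for convention in (honor, renege_with_penalty, drop_and_notify):
    print(convention.__name__, "->", convention(commitment, team))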
3.6 More Distributed
Even when the interactions between agents requiring coordination are few and not undergoing dynamic changes, a task-environment can stress agents if the interactions requiring coordination are hard to anticipate. In particular, if agents are acting based on privately-held information about goals and methods, then it might take substantial effort to discover who is going to be interacting with whom.
One response to this is to anticipate all of the possible actions that agents might take, across all of the goals and plans that they might adopt, and to impose restrictions on what actions they can take under what conditions so as to prohibit undesirable interactions. Such "social laws" ensure that a law-abiding agent, acting in a population of other law-abiding agents, need never worry about undesirable interactions, no matter what goals and plans are being adopted (Shoham, 1994). In human terms, this is like saying that as long as all drivers obey traffic laws, they can each eventually get to their desired destinations, wherever those are, without collision.
A second response is to support the process by which agents whose individual actions might interact can efficiently find each other. When interactions are over the exchange of goods, for example, providing agents with loci (auctions) for finding each other helps. Creating agents to represent resources over which agents might contend similarly allows interacting
resource demands to be identified. Or agents might discover through experience others with whom they tend to interact, and form persistent aggregations (Azoulay-Schwartz, 2000; Brooks, 2000; Lerman, 2000).
Without identifiable contexts for aggregating, however, it could be that agents must somehow test for possible interactions against all other agents. This could be done through a centralized "coordinator" who collects information on all agents and, using its global awareness, can inform agents of the potential interactions to watch out for. In such a case, the coordinator should accept only as much information as is absolutely necessary to recognize interactions (Clement, 1999). Alternatively, agents could broadcast information to all others, so that each has sufficient awareness of the global picture. Through iterative exchanges, the overall system can cooperatively achieve its objectives (Lesser, 1981).
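The social-law idea above amounts to a filter that every law-abiding agent applies to its own options before acting; undesirable interactions are ruled out by design rather than negotiated away. A hedged sketch, with a traffic-style priority rule invented for illustration:

# A social law as a predicate applied to an agent's candidate actions.
def lawful(action, state):
    # Illustrative law: never enter a cell that an agent approaching
    # from the right is about to enter (right-of-way).
    cell = action["target"]
    return not any(other["from"] == "right" and other["target"] == cell
                   for other in state["pending"])

state = {"pending": [{"agent": "car2", "from": "right", "target": (3, 4)}]}
options = [{"move": "straight", "target": (3, 4)},
           {"move": "wait", "target": (2, 4)}]

legal = [a for a in options if lawful(a, state)]
print(legal)   # the 'straight' move is filtered out; only 'wait' remains

Note that no communication occurs: the law is computed against locally observable state, which is what makes this response attractive when interactions are hard to anticipate.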
3.7 Greater Optimality/Efficiency
Coordination that is optimal is generally desirable, though less often feasible. As was mentioned earlier, coordination can sometimes be viewed as a search through the exponential number of combinations of agents' alternative actions to find a "good enough" combination. Whereas sometimes it is enough to find a combination that does well enough (avoids conflicts among agents, or ensures eventually achieving goals), for some applications the optimal solution is sought. Optimality generally requires substantial computation (and sometimes communication) overhead; especially in dynamic task-environments (where the optimal solution can become obsolete before it is carried out) or those with many agents and/or complex interactions, a satisficing or locally-optimal solution is often acceptable.
Nonetheless, for some restricted types of coordinated decisions, optimality might be within reach. An example commanding much attention in recent years has been coordinating resource allocation decisions based on market-oriented approaches (Wellman, 1993). Through iterated rounds of bidding in an auction, agents can balance supply and demand to allocate resources so as to maximize their efficient use, under some assumptions. Active research is ongoing to extend these coordination strategies to "scale" them along other dimensions: not only to handle larger numbers of agents, but to handle higher degrees of interaction (using combinatorial auctions to allocate resources whose values depend on how they are acquired in combination) and greater dynamics (including strategies for clearing auctions without waiting for all prices to settle) (Andersson, 2000; Fujishima, 1999).
Other methods for distributed rational decision making (Sandholm, 1999) include decision-theoretic methods based on multiagent extensions of
Markov Decision Processes (Boutilier, 1999). This type of method can find an optimal policy for a multiagent system, based on a particular coordination protocol that can be employed at runtime (for example, to increase agents’ awareness of the global situation). When each agent follows its portion of the optimal policy, the expected utility of the multiagent system is maximized.
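For the resource-allocation case, the supply-and-demand balancing can be seen in even the simplest ascending auction. The sketch below (private valuations and the price increment are invented, and real market mechanisms are considerably more involved) iterates bidding rounds until one bidder remains:

# Iterated bidding for a single resource: the price rises until one bidder remains.
valuations = {"a1": 12, "a2": 9, "a3": 15}   # each agent's private value for the resource
price, increment = 0, 1
active = set(valuations)

while len(active) > 1:
    price += increment
    active = {a for a in active if valuations[a] >= price}

winner = active.pop() if active else None
print(winner, "wins at price", price)   # a3 wins once the price exceeds 12

Under the usual assumptions, the resource goes to the agent that values it most, which is the sense in which the allocation maximizes efficient use.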
3.8 More Robustness
An optimal coordination solution might break when the world deviates from the coordination strategy’s assumptions. Whether a coordination strategy can scale to domains where robust performance is difficult but necessary can thus become important. One means of increasing the robustness of a coordination solution is to build a solution that contains sufficient flexibility that agents can work around new circumstances within their original coordination agreement. For example, building slack time into scheduled activities, or avoiding committing to details of exactly what will be done and when, can leave each agent with more room to maneuver when the world doesn’t proceed according to plan. Typically, more robust coordination decisions are less efficient because they reserve resources for “fall-back” contingencies and therefore might suboptimally divide up tasks among agents for a particular situation. Coordination through organizational structures typically has this feature (Weiss, 1999, chapter 7; Prietula, 1998, chapter 3; Durfee, 1993). Alternatively, a coordination strategy might expect to monitor the execution of its solution, and repair that solution as needed. These ideas are extensions of single-agent plan monitoring and repair/replan techniques. Teamwork models, with conventions as to how to respond when continued pursuit of joint commitments is senseless, are examples of this (Kumar, 2000). Moreover, in some cases it might be possible to develop generic monitoring and recovery methods for the coordination processes themselves (Dellarocas, 2000).
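The slack-time idea can be made concrete with a small scheduling sketch (task names, durations, and the slack amount are invented): each task reserves a buffer after it, so an overrun within the buffer does not disturb the start times promised to other agents.

# Build a schedule with per-task slack; overruns within the slack are absorbed.
tasks = [("survey", 4), ("relay", 2), ("dig", 5)]
slack = 2

start, schedule = 0, {}
for name, duration in tasks:
    schedule[name] = (start, start + duration)
    start += duration + slack          # reserve a buffer after each task

def absorbs(entry, actual_duration, slack):
    planned = entry[1] - entry[0]
    return actual_duration <= planned + slack

print(schedule)                               # {'survey': (0, 4), 'relay': (6, 8), 'dig': (10, 15)}
print(absorbs(schedule["survey"], 5, slack))  # True: a one-unit overrun is absorbed

The cost of the robustness is visible in the makespan: the buffers lengthen the schedule, which is exactly the efficiency trade-off noted above.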
3.9 Lower Overheads
Application domains where communication channels are limited and where the computational resources available for coordination are minimal demand that attention be paid to reducing the overhead of coordination strategies. As communication bandwidth becomes more limited, for example, coordination decisions must be made without exchanging enough information to maintain the level of global awareness that many strategies might expect.
Techniques that involve the iterative exchange of increasingly detailed information about agents' plans and intentions provide one means of permitting time-constrained coordination, where the communication and computation overheads can be limited at the expense of the quality of the coordination solution (Clement, 1999). Alternatively, agents can choose to continue with outdated but still sufficient coordination decisions to avoid a chain reaction of coordination activities. When communication is at a premium, or might even be impossible, techniques such as using observations to model others, or using reasoning to converge on coordinated decisions (e.g., focal points), can pay dividends (Fenster, 1995).
Sometimes, the availability of coordination resources can be sporadic. Under some coordination regimes, agents can take advantage of opportunities where such resources are plentiful to build more complete models of the roles and contingent plans of each other, which can then be exploited when the agents have moved into situations where further communication and computation to coordinate is unsafe or infeasible (Durfee, 1999; Stone, 1999).
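The iterative-detail technique can be read as an anytime loop: exchange plan abstractions top-down and stop when the communication budget runs out, settling for whatever coordination quality the coarser level bought. A hedged sketch with invented levels and costs:

# Exchange increasingly detailed plan information until the budget is spent.
levels = [
    {"detail": "mission",   "cost": 1, "est_conflicts": 9},
    {"detail": "task",      "cost": 3, "est_conflicts": 4},
    {"detail": "primitive", "cost": 9, "est_conflicts": 1},
]

def coordinate(budget):
    agreed = None
    for level in levels:
        if level["cost"] > budget:
            break                      # settle for the last affordable level
        budget -= level["cost"]
        agreed = level
    return agreed

print(coordinate(budget=5)["detail"])   # 'task': the primitive level was too costly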
4. OPEN CHALLENGES
I was initially inspired to write this piece because of what I saw as a trend toward identifying scaling to large numbers of agents as the most important challenge that can be posed to a multi-agent system. My own experience was that it was easy to develop multi-agent systems consisting of hundreds or thousands of agents, so long as those agents could merrily go about their business with no concern about the activities of others. On the other hand, it could be a tremendous challenge to develop a working system made up of only a handful of agents if the degree to which their activities needed to be dovetailed – and the penalty for failing to get the dovetailing exactly right – were both very high. The take-home messages of this article could thus be viewed as: (1) there are many ways to stress a coordination strategy, each of which poses research challenges and opportunities; and (2) there are already a variety of promising ideas for designing coordination strategies, which can be computationally realized, for getting agents to work well together under a broad range of circumstances.
The preceding whirlwind tour of some coordination strategies, and of the kinds of stresses in agent population, task-environment, and solution criteria for which they are suited, should be viewed only as an introduction to the rich body of work that has gone into addressing the challenges of coordination in the many domains where it is needed. Many coordination strategies, and variations of coordination strategies, have been left out of the
preceding. Interested readers should refer to recent books on the subject of multiagent systems (for example, (Weiss, 1999; Ferber, 1999; Wooldridge, 2000)) and to journals such as Autonomous Agents and Multi-Agent Systems (published by Kluwer) and proceedings of conferences such as the past International Conference on MultiAgent Systems and the current series of the International Joint Conference on Autonomous Agents and MultiAgent Systems.
I should also emphasize that, in the preceding survey, I was not intending that each coordination strategy be pigeonholed as only addressing issues along one of the dimensions. In fact, most can be scaled along multiple dimensions, but each has its limits. The challenge facing researchers in the field is to develop a better (preferably quantifiable) understanding of exactly how far different coordination strategies can scale along the dimensions laid out here, as well as along dimensions that are still being identified as germane to the application of intelligent agent systems to increasingly challenging problems.
5. ACKNOWLEDGMENTS
As part of DARPA’s Control of Agent-Based Systems (CoABS), I have worked with several other researchers to understand the relative strengths and weaknesses of various coordination approaches. This article consolidates my take on many of those thoughts. While these colleagues should be held blameless for any oversimplifications and misrepresentations in this article, I'd like to acknowledge their contributions to my thinking along these lines. They include Craig Boutilier, Jim Hendler, Mike Huhns, David Kotz, James Lawton, Onn Shehory, Katia Sycara, Milind Tambe, and Sankar Virdhagriswaran. Milind Tambe also provided valuable feedback on an earlier version of this article. This work was supported, in part, by DARPA through Rome Labs (F30602-98-2-0142).
6. REFERENCES
Andersson, A., M. Tenhunen, and F. Ygge. "Integer Programming for Combinatorial Auction Winner Determination." Proceedings of the Fourth International Conference on MultiAgent Systems (ICMAS-2000), pages 39-46, IEEE Computer Society Press, July 2000.
Azoulay-Schwartz, R. and S. Kraus. "Assessing Usage Patterns to Improve Data Allocations via Auctions." Proceedings of the Fourth International Conference on MultiAgent Systems (ICMAS-2000), pages 47-55, IEEE Computer Society Press, July 2000.
Bojinov, H., A. Casal, and T. Hogg. "Multiagent Control of Self-reconfigurable Robots." Proceedings of the Fourth International Conference on MultiAgent Systems (ICMAS-2000), pages 143-150, IEEE Computer Society Press, July 2000.
Boutilier, C. "Multiagent Systems: Challenges and opportunities for decision-theoretic planning." AI Magazine 20(4):35-43, Winter 1999.
Brooks, C. H., E. H. Durfee, and A. Armstrong. "An Introduction to Congregating in Multiagent Systems." Proceedings of the Fourth International Conference on MultiAgent Systems (ICMAS-2000), pages 79-86, IEEE Computer Society Press, July 2000.
Clement, B. J. and E. H. Durfee. "Top-Down Search for Coordinating the Hierarchical Plans of Multiple Agents." Proceedings of the Third Conference on Autonomous Agents, pages 252-259, May 1999.
Decker, K. S. "TÆMS: A framework for analysis and design of coordination mechanisms." In G. O'Hare and N. Jennings, editors, Foundations of Distributed Artificial Intelligence, Chapter 16. Wiley Inter-Science, 1995.
Dellarocas, C. and M. Klein. "An experimental evaluation of domain-independent fault handling services in open multi-agent systems." Proceedings of the Fourth International Conference on MultiAgent Systems (ICMAS-2000), pages 39-46, IEEE Computer Society Press, July 2000.
Durfee, E. H. and V. R. Lesser. "Partial Global Planning: A coordination framework for distributed hypothesis formation." IEEE Transactions on Systems, Man, and Cybernetics SMC-21(5):1167-1183, September 1991.
Durfee, E. H. "Organisations, Plans, and Schedules: An Interdisciplinary Perspective on Coordinating AI Systems." Journal of Intelligent Systems 3(2-4):157-187, 1993.
Durfee, E. H. "Distributed continual planning for unmanned ground vehicle teams." AI Magazine 20(4):55-61, Winter 1999.
Ephrati, E., M. E. Pollack, and J. S. Rosenschein. "A tractable heuristic that maximizes global utility through local plan combination." Proceedings of the First International Conference on Multi-Agent Systems (ICMAS-95), pages 94-101, June 1995.
Estlin, T., T. Mann, A. Gray, G. Rabideau, R. Castano, S. Chien, and E. Mjolsness. "An Integrated System for Multi-Rover Scientific Exploration." Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-99), Orlando, FL, July 1999.
Fenster, M., S. Kraus, and J. S. Rosenschein. "Coordination without communication: experimental validation of focal point techniques." Proceedings of the First International Conference on Multi-Agent Systems (ICMAS-95), pages 102-108, June 1995.
Ferber, J. Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence. Addison-Wesley, Harlow, England, 1999.
Fujishima, Y., K. Leyton-Brown, and Y. Shoham. "Taming the computational complexity of combinatorial auctions: Optimal and approximate approaches." Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99), 1999.
Gasser, L., C. Braganza, and N. Herman. "MACE: A flexible testbed for distributed AI research." In M. N. Huhns, editor, Distributed Artificial Intelligence, pages 119-152, Pitman Publishers, 1987.
Georgeff, M. P. "Communication and Interaction in multi-agent planning." Proceedings of the Third National Conference on Artificial Intelligence (AAAI-83), pages 125-129, July 1983.
Gray, R. S., D. Kotz, R. A. Peterson, Jr., P. Gerken, M. Hofmann, D. Chacon, G. Hill, and N. Suri. "Mobile-Agent versus Client/Server Performance: Scalability in an Information-Retrieval Task." Technical Report TR2001-386, Dept. of Computer Science, Dartmouth College, January 2001.
Grosz, B. J. and S. Kraus. "Collaborative Plans for Complex Group Action." Artificial Intelligence 86(2):269-357, 1996.
Jennings, N. R. "Commitments and Conventions: The foundation of coordination in multi-agent systems." The Knowledge Engineering Review 8(3):223-250, 1993.
Jennings, N. R. "An Agent-based Approach for Building Complex Software Systems." Communications of the ACM 44(4):35-41, April 2001.
Kinny, D., M. Ljungberg, A. S. Rao, E. Sonenberg, G. Tidhar, and E. Werner. "Planned Team Activity." In C. Castelfranchi and E. Werner, editors, Artificial Social Systems. Springer-Verlag, Amsterdam, 1994.
Kumar, S., P. R. Cohen, and H. J. Levesque. "The adaptive agent architecture: Achieving fault-tolerance using persistent broker teams." Proceedings of the Fourth International Conference on MultiAgent Systems (ICMAS-2000), pages 159-166, IEEE Computer Society Press, July 2000.
Lansky, A. L. "Localized Search for Controlling Automated Reasoning." Proceedings of the DARPA Workshop on Innovative Approaches to Planning, Scheduling, and Control, pages 115-125, November 1990.
Lerman, K. and O. Shehory. "Coalition Formation for Large-Scale Electronic Markets." Proceedings of the Fourth International Conference on MultiAgent Systems (ICMAS-2000), pages 167-174, IEEE Computer Society Press, July 2000.
Lesser, V. R. and D. D. Corkill. "Functionally accurate, cooperative distributed systems." IEEE Transactions on Systems, Man, and Cybernetics SMC-11(1):81-96, 1981.
Paolucci, M., Z. Niu, K. Sycara, C. Domashnev, S. Owens, and M. van Velsen. "Matchmaking to Support Intelligent Agents for Portfolio Management." Proceedings of AAAI-2000.
Prietula, M. J., K. M. Carley, and L. Gasser, editors. Simulating Organizations: Computational Models of Institutions and Groups. AAAI Press/MIT Press, Menlo Park, CA, 1998.
Sandholm, T. W. "Distributed Rational Decision Making." Chapter 5 in (Weiss, 1999).
Shoham, Y. and M. Tennenholtz. "On Social Laws for Artificial Agent Societies: Off-line design." Artificial Intelligence 72(1-2):231-252, 1994.
Smith, R. G. "The Contract Net Protocol: High-Level Communication and Control in a Distributed Problem Solver." IEEE Transactions on Computers C-29(12):1104-1113, December 1980.
Stone, P. and M. Veloso. "Task Decomposition, Dynamic Role Assignment, and Low-Bandwidth Communication for Real-Time Strategic Teamwork." Artificial Intelligence 110(2):241-273, June 1999.
Tambe, M., W. L. Johnson, R. M. Jones, F. Koss, J. E. Laird, P. S. Rosenbloom, and K. Schwamb. "Intelligent Agents for Interactive Simulation Environments." AI Magazine 16(1):15-39, Spring 1995.
Tambe, M. and W. Zhang. "Towards flexible teamwork in persistent teams: extended report." Journal of Autonomous Agents and Multi-agent Systems 3(2):159-183, June 2000.
Weiss, G., editor. Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. The MIT Press, Cambridge, MA, 1999.
Wellman, M. P. "A market-oriented programming environment and its application to distributed multicommodity flow problems." Journal of Artificial Intelligence Research 1:1-23, 1993.
Wolfram, S. A New Kind of Science. Wolfram Media Inc., 2002.
Wooldridge, M. Reasoning About Rational Agents. The MIT Press, July 2000.
Yokoo, M. and K. Hirayama. "Algorithms for distributed constraint satisfaction: A review." Autonomous Agents and Multi-agent Systems 3(2):189-211, 2000.
ROLES IN MAS
Managing the Complexity of Tasks and Environments
Ioannis Partsakoulakis and George Vouros
Department of Information and Communication Systems Engineering, University of the Aegean, 83200 Samos, Greece
{jpar,georgev}@aegean.gr
Abstract: Roles have been used both as an intuitive concept for analyzing multi-agent systems and modeling inter-agent social activity, and as a formal structure for implementing coherent and robust teams. The extensive use of roles in implemented systems evidences their importance in multi-agent systems design and implementation. In this paper we emphasize the importance of roles for multi-agent systems acting in complex domains, identify some of their properties, and review work done concerning the specification and exploitation of roles in agent-oriented software engineering methodologies, in formal models of agent social activity, and in multi-agent systems that are deployed in dynamic and unpredictable domains.
1. Introduction
Multi-agent systems (MAS) comprise agents that can be cooperative or self-interested. Cooperative agents coordinate among themselves towards achieving a common goal, whereas self-interested agents have individual interacting goals. We consider that each agent in a multi-agent system is characterized by a degree of autonomy. Autonomy provides agents with the ability to decide on their own behavior. Given the autonomous nature of each agent, agents need to coordinate among themselves, or else the group quickly degenerates into a number of individuals with chaotic behavior. An effective way to achieve coordination is via imposing a specific group organization. An organization comprises roles and their interrelations. A role clusters types of behavior into a meaningful unit contributing to the group's goals. Role interrelations and interdependencies (e.g. hierarchical, class-membership, temporal and resource dependency relations) among agents can be exploited for effective agent coordination and communication. Agents may become members of such an organization if they
succeed in undertaking one or more roles that contribute towards the collective objectives.
In this paper we are mainly interested in collaborative activity. Collaboration is a special type of coordinated activity, one in which participants work jointly with each other, together performing a task or carrying out the activities needed to achieve a shared goal [10]. Coordination policies, and the way they are implemented in a MAS, impact agents' collaborative activity and communication. Generic models that have been devised, such as the SharedPlans model [10, 9], the Joint Intentions model [4, 13], and the Joint Responsibility model [12], provide the principles that underpin social activity and reasoning, and describe the necessary constructs for defining the cooperative and individual behavior of agents in social contexts. Implemented systems [22, 12, 26, 11] aim to make explicit the cooperation model upon which agents' behavior is based. The objective is to provide flexibility towards solving problems related to [12] "how individual agents should behave when carrying out their local activities in the overall context of action", "how the joint action may come unstuck", "how problems with the joint action can be repaired", "how individuals should act towards their fellow team members when problems arise" and "how agents should re-organize their local activity in response to problems with the joint action". To address these concerns, implemented systems, driven by the high-level cooperation models that they implement, employ constructs and methods such as the intentional context, common recipes [12], fixed organizations with discrete roles interchanged among agents [21], and dynamic assignment of agents to pre-specified roles in conjunction with plan monitoring and repair [26]. The aim is to provide the means for systems to track the mental state of individual agents participating in the cooperative activity in a coherent and integrated way.
The objective of this paper is to show the importance of roles for flexible problem solving by MAS that are deployed in dynamic and unpredictable environments with a high degree of interaction and distributivity. The paper aims to identify important properties of roles, to examine how roles are conceived in methodologies for implementing MAS and in formal models for social agency, and to show how they are exploited in implemented systems. The paper is structured as follows: The second section describes the importance of roles in order for multi-agent systems to deal with the complexities of a task-environment, and identifies important properties of roles. Section three reports on the way roles are conceived in the context of agent-oriented software engineering methodologies, in formal models, as well as in multi-agent system architectures and frameworks. The paper concludes with a discussion on the advantages and disadvantages of the several approaches to role specification and exploitation.
2. Importance and properties of roles
In the last few years, the range of applicability of agent technology has increased substantially. Recent multi-agent systems (MAS) are being deployed in more and more complex environments, including the RoboCup-Rescue [15] and RoboSoccer [17, 18] domains, multi-robot space explorations, battlefield simulations [26, 25], and information integration [24]. The complexity of the environment in which a multi-agent system is deployed, and the complexity of the tasks that it is expected to perform, can be assessed by studying the task-environment in three dimensions [6]: (a) the degree of interaction, (b) the dynamics of the environment, and (c) the degree of distributivity.
The degree of interaction specifies to what extent the actions and the decisions of agents impact other agents' goals and plans. Interaction results from the necessity to settle issues about (i) limited shared resources, (ii) agents' task interdependencies, and (iii) goals and tasks that require collective effort and are shared by a group of agents.
Dynamics specifies to what extent the environment changes, whether due to agents' actions or to changes in the environment itself. A high degree and unpredictability of environmental changes, agents' limited prediction capabilities, and restricted monitoring capabilities all complicate coping with the dynamics of environmental changes.
Distributivity specifies to what extent the resources available to the agents are distributed among them. Resources include knowledge about the tasks that must be performed, information about the environment and other agents, and other task-specific resources that are inherently distributed to subsets of agents. Agents' capabilities impact distributivity as well. In general, high distributivity complicates attaining consistent and coherent views of the performed tasks, agents' behavior, agents' states, and the status of the environment.
2.1 The Importance of Roles
As already defined in the introduction, roles cluster types of behavior into meaningful units contributing to achieving specific goals. Roles, with their interdependencies and relations in the context of a specific goal, specify methods (recipes) for achieving this goal under certain conditions. The importance of roles in dynamic and unpredictable environments with a high degree of interaction and distributivity is as follows:
High degree of interaction. As already stated, interaction results from the necessity of settling issues about (i) limited shared resources, (ii) agents' task interdependencies, and (iii) goals and tasks that are shared by a group of agents and require collective effort.
The degree of interaction in a team of agents can be reduced by imposing a specific organizational structure on team members, specifying the resources that each agent can use, the information that should be communicated between agents, and the goals that each agent should achieve towards the organization's collective objectives. Roles provide the appropriate level of abstraction for specifying these organizational aspects. Following this approach, coordination can be simplified and the degree of interaction can be reduced, or controlled effectively.

Environment dynamics Agents need to deliberate effectively by evaluating their options and opportunities towards achieving their shared goals, without considering the low-level details of their plans. To plan and act robustly in a dynamically and unpredictably changing environment, agents must reason about their intended behavior in an abstract way. Roles provide such an abstraction, since they aggregate the intentions that agents must adopt for the successful execution of tasks, specify the conditions that must hold for roles to apply in a context of action, and capture dependencies among the intended behaviors in a shared context. In dynamic environments, agents must exploit role specifications to deliberatively form organizational structures, revise the existing organizational structure, and deliberatively assign roles to agents. For example, an agent that has been assigned a specific role in the context of a method for achieving a task may recognize its inability to proceed due to an unpredictable lack of resources. For agents to repair their group activity, they must have an explicit specification of the role each agent plays in the group, as well as of the conditions under which a role can be assigned to an agent. Furthermore, agents need a mechanism to dynamically assign roles among themselves by exploiting role specifications and the overall context of action. Activity repair may result in revising the method employed for achieving the task and/or in further reorganization of the agents' group. In summary, dealing with environment and task dynamics in a robust way requires groups of agents to dynamically assign agents to roles and to deliberate on the organization of the group. Reorganization of the group may result not only in new assignments of agents to roles, but also in changes to the set of roles, to the set of agents forming the group, and to the number of agents that fill each specific role.

Distributivity In inherently distributed tasks and environments, agents need to deliberate on who shall perform what, so as to manage the distributed nature of the environment and task. Agents need to form groups with shared objectives and interact among themselves so as to have a coherent and integrated view of the
whole task-environment and an integrated view of the mental states of their collaborators. Roles in specific contexts of action provide abstract specifications of distributed behavioral patterns. When these roles are performed in a coordinated fashion, they can accomplish a specific objective. Explicit specification of roles, of their relations, and of their interdependencies provides a shared context for a group of agents to track the mental states of their collaborators and the tasks performed. However, for agents to plan effectively and manage distributivity in dynamic and unpredictable environments, they must manage roles effectively, deciding on the number and types of roles that should be involved, on the number of agents that should fill a role, and on the number of roles that a (group of) agent(s) should play. Summarizing the above, explicit specification of roles and of their interdependencies enables agents (a) to tame the amount of interaction required for effective group behavior by imposing an organizational structure, (b) to deal with the dynamics of the task-environment by organizing the group deliberatively and revising the existing organizational structure according to their needs, and finally (c) to manage the distributivity of the task-environment by deciding on the assignment of roles to agents.
2.2
Role Properties
Important role properties that enable agents to deal with the complexity of the task-environment are the following:
1 Explicit specification of roles, their dependencies, and their conditions in the context of specific methods for achieving goals.
2 Dynamics of assignment. Dynamic assignment of roles to agents in a deliberative way, by considering the conditions of roles, agents' capabilities, and the overall context of action, is important for group reorganization and for managing distributivity.
3 Dynamics of roles. This is important for agents to decide which roles should be employed in an organizational structure for achieving specific goals. Devising dynamic organizational structures deliberatively requires agents to decide on the roles that should be employed. The aim is to reduce interactions among agents and manage distributivity.
4 Cardinality of roles. This specifies the number of agents that should play a role, as well as the number of roles that a single agent should play. In general, having no cardinality restrictions allows agents to flexibly devise more effective organizational structures according to the needs of the tasks and environments, and to take full advantage of the capabilities and resources of agents.
5 Lifespan of roles. This characterizes the dynamics of roles. Agents may dynamically build an organizational structure whose roles do not change during group action. However, the ability to revise the assignment of roles to agents, or to deactivate existing roles, is important for agents to deal with the dynamics of the environment and to handle distributivity effectively.
These five properties characterize roles for collaborative action in highly dynamic and unpredictable environments with a high degree of interaction and distributivity, and they serve as the evaluation criteria in the remainder of the paper; a minimal sketch of a role specification carrying these properties is given below.
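For illustration only (the paper prescribes no concrete encoding; every name below is our own assumption), the five properties suggest a role specification along the following lines:

    from dataclasses import dataclass, field
    from typing import Callable, List, Optional

    @dataclass
    class RoleSpec:
        """Hypothetical role specification covering properties 1-5 above.
        'Agent' and 'Context' are placeholders for application types."""
        name: str
        # (1) explicit specification: conditions and dependencies within a
        #     method (recipe) for achieving a goal
        conditions: List[Callable[["Agent", "Context"], bool]] = field(default_factory=list)
        depends_on: List[str] = field(default_factory=list)
        # (4) cardinality: bounds on agents per role
        min_agents: int = 1
        max_agents: Optional[int] = None  # None = unbounded
        # (5) lifespan: transient roles may be deactivated at run time
        transient: bool = True

        # (2) dynamics of assignment: a deliberative, context-aware test;
        # (3) dynamics of roles is realized by choosing, at run time, which
        #     RoleSpec instances an organizational structure employs.
        def assignable_to(self, agent: "Agent", context: "Context") -> bool:
            return all(cond(agent, context) for cond in self.conditions)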
3.
Methodologies, Models and Systems
Roles have been used extensively in MAS, but there is no methodology or implemented architecture/framework that satisfies all the requirements described above for building MAS that act in dynamic and unpredictable environments. In the following we consider methodologies, models, and systems for the specification and exploitation of roles, emphasizing the properties of roles mentioned above.
3.1
Roles in AOSE Methodologies
Agent-oriented software engineering is central to realizing agent technology as a new software engineering paradigm [28]. Many researchers working on new software development techniques suitable for MAS have conceived of multi-agent systems as organizations of agents that interact to achieve common goals. Roles have been used in such methodologies as an intuitive and natural concept for defining agent organizations. For the AAII, Gaia, and MaSE methodologies that we discuss subsequently, a role is an abstract entity that is used during system analysis but has no direct realization within the implemented MAS. 3.1.1 MaSE Methodology. MaSE is a methodology for analyzing and designing heterogeneous multi-agent systems. The methodology is supported by the agentTool system [5]. The ultimate goal of MaSE and agentTool is the automatic generation of code that is correct with respect to the original system specification. The MaSE methodology comprises seven steps:
1 Capturing goals: The designer takes the initial system specification and transforms it into a structured set of system goals, depicted in a goal hierarchy diagram.
2 Applying use cases: Use cases drawn from the system requirements are structured as sequences of events and are used in translating goals into roles. The result is an initial set of roles. A role in MaSE is an abstract description of an entity's expected function.
3 Refining roles: The designer ensures that all the necessary roles have been identified and develops the tasks that define role behavior and communication patterns. Roles are captured in a Role Model specifying the roles and their interrelations. The specification of a role comprises its identity and the tasks it must perform to accomplish its goals.
4 Creating agent classes: This step results in the specification of the agent classes, which are identified from the role specifications. Agent classes are defined in terms of the roles that agents play and the conversations in which they must participate. Classes are documented in an Agent Class Diagram.
5 Constructing conversations: In this step, coordination protocols between pairs of agents are constructed.
6 Assembling agent classes: This is the step where the internals of the agent classes are created.
7 System deployment: In this final step, the designer defines the configuration of the actual system to be implemented.
The designer is expected to move back and forth between these steps to ensure that the models and diagrams produced are complete and mutually consistent. Roles in MaSE are used in the analysis phase and are substituted by agent classes in the design phase; therefore, roles are not realized in the final system implementation. A role in MaSE is an abstract specification of an entity's expected function that comprises the tasks that should be performed for the role to accomplish its goals. The specifications and interrelations of roles are defined by the designer during the design phase and do not change. Since each agent class is defined by the set of roles it plays, the lifespan of a role equals the lifespan of the agents in the system. Furthermore, the assignment of agent classes to roles is done during the design phase. Generally, as mentioned in [5], although there is a one-to-one mapping between roles and agent classes, the designer may combine multiple roles in a single agent class or map a single role to multiple agent classes.
Agents, as instances of agent classes, do not deliberate on role assignments and, therefore, are not able to revise the devised organization. Role properties in MaSE can thus be summarized as follows: roles are not explicitly specified in the final system, their assignment is fixed at design time, the set of roles is static, cardinality is determined by the designer, and a role's lifespan equals the lifespan of the agents that realize it.
Concluding the above, systems developed with the MaSE methodology, although they can control interaction by means of coordination protocols within a fixed organizational structure, cannot cope with environment dynamics, since roles are not defined explicitly in the final system realization. For instance, roles do not provide the abstractions needed for agents' robust planning and execution, and do not enable agents to deliberate on role assignment. Therefore, agents cannot revise the given organizational structure. This also affects distributivity, since agents cannot form groups on demand in a dynamic fashion. 3.1.2 Gaia Methodology. Gaia is intended to allow an analyst to move systematically from a statement of requirements to a design that is sufficiently detailed that it can be implemented directly [29]. Gaia, as stated in [28], encourages developers to think of building agent-based systems as a process of organizational design. An organization is considered to be a collection of roles that stand in certain relationships to one another. A role in Gaia is defined by four attributes: responsibilities, permissions, activities, and protocols. Responsibilities determine the functionality of the role and are divided into liveness properties and safety properties. Liveness properties describe those states of affairs that must be brought about by the agent assigned to the role. Safety properties describe those states of affairs that the agent must maintain across all states of execution of the role. In order to realize its responsibilities, a role has a set of permissions that identify the resources available to it. Activities are computations associated with a role that may be carried out by the agent without interacting with other agents. Finally, protocols define role interactions. The analysis stage of Gaia is based on the following steps:
1 Identification of the roles in the system. This gives a prototypical (informal and unelaborated) role model, defined as a set of role schemata. Each schema comprises protocols, activities, permissions, and responsibilities.
2 Identification and documentation of the protocols of interaction among roles. This results in an interaction model, which captures the recurring patterns of inter-role interaction in task performance.
3 Full elaboration of the role model, using the protocol model.

Roles in Gaia are specified using the role schemata, but they are not explicitly defined in the actual system implementation. During the design phase, role specifications guide the definition of the agent types in the system. Roles are static and long-lived, because they define a fixed organizational structure and are statically associated with specific agent types. However, one or more instances of an agent type (which plays an associated role) can be created, according to instance qualifiers introduced in the agent model. Instance qualifiers define the cardinality of roles, i.e., the number of agents of the same class that can be assigned to a specific role. However, agents cannot deliberate on the organizational structure or on the number of agents that should play a specific role. Role properties in Gaia can thus be summarized as follows: roles are not explicit in the final system, assignment is static, the set of roles is fixed, cardinality is constrained by design-time instance qualifiers, and roles are long-lived. A sketch of a Gaia-style role schema is given below.
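For illustration only (Gaia's schemata are textual templates, not code; the rendering below is ours, loosely following the coffee-making example of [29]):

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class GaiaRoleSchema:
        """Illustrative rendering of a Gaia role schema."""
        name: str
        protocols: List[str] = field(default_factory=list)    # interactions with other roles
        activities: List[str] = field(default_factory=list)   # private computations
        permissions: List[str] = field(default_factory=list)  # resources the role may access
        liveness: List[str] = field(default_factory=list)     # states to bring about
        safety: List[str] = field(default_factory=list)       # invariants to maintain

    coffee_filler = GaiaRoleSchema(
        name="CoffeeFiller",
        protocols=["InformWorkers"],
        activities=["Fill", "CheckStock", "AwaitEmpty"],
        permissions=["reads coffeeMaker", "changes coffeeStock"],
        liveness=["(Fill . InformWorkers . CheckStock . AwaitEmpty)+"],
        safety=["coffeeStock > 0"],
    )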
Concluding the above, systems developed with the Gaia methodology have a fixed organizational structure, managing interaction and distributivity in a static fashion. Such systems cannot cope with environment dynamics, since roles are not defined explicitly in the final system realization. Roles, although elaborated and defined in a detailed way during design, do not provide the abstractions needed for agents' robust planning and execution, and do not enable agents to deliberate on role assignment. 3.1.3 AAII Methodology. The AAII methodology [14] operates at two levels of abstraction, the external and the internal viewpoint. From the external
viewpoint, the system is decomposed into agents, modelled as complex objects characterized by their purpose, their responsibilities, the services they perform, the information they require and maintain, and their external interactions. From the internal viewpoint, beliefs, goals, and plans must be specified for each agent. Therefore, agents in systems built with AAII are considered to follow the Belief-Desire-Intention (BDI) paradigm. The details of the external viewpoint are captured in the agent and interaction models. Their elaboration and refinement is done in four major steps.
1 Identification of the roles that apply and elaboration of an initial agent class hierarchy. Roles can be organizational or functional, i.e., they can be related to the application or required by the system implementation, respectively.
2 Identification of the roles' responsibilities and of the services needed for fulfilling them. Services are activities that are not decomposed further and may include interaction with the external environment or other agents. Agent classes are further decomposed to the service level.
3 Identification of the interactions associated with services. This includes the identification of performatives (speech acts), information content, events and conditions to be noticed, actions to be performed, and other information requirements for interactions. This step includes the determination of the control relationships between agents. At this point the internal modeling of each agent class can be performed.
4 Final refinement of the agent and control hierarchies, and introduction of the agent instances.
The methodology for internal modeling begins from the services provided by an agent and the associated events and interactions. These define the purpose and the top-level goals of an agent. Internal modeling proceeds in two steps.
1 Analysis of the means for achieving the goals. This includes contexts, conditions, subgoals, actions, and the handling of failures. It results in a plan for achieving each goal.
2 Consideration of the beliefs that affect the appropriateness of a given plan and the manner in which it is carried out in various contexts, together with analysis of the input and output data requirements for each subgoal in a plan.
The final system realizes the agent hierarchy that is built based on roles and role interactions. This hierarchy and the corresponding roles do not change during group performance. As with the previous methodologies, roles
are not realized in the final system but drive the implementation of the agent classes. The properties of roles in AAII can be summarized as follows: roles are not explicit in the final system, their assignment is fixed at design time, the role set and hierarchy are static, and roles live as long as the agents that realize them.
Concluding the above, agents employed in systems developed with the AAII methodology comply with a rigid organizational structure and, as with the methodologies discussed above, cannot cope with environment dynamics, since roles are not defined explicitly in the final system realization. This affects robust planning and execution, and the effective management of distributivity and interaction.
3.1.4 AALAADIN Model. AALAADIN is not a specific agent methodology, but a meta-model for describing organizations of agents using the core concepts of group, agent, and role [8]. The model is supported by the MadKit platform. With AALAADIN one can describe multi-agent systems with different forms of organization, such as market-like or hierarchical organizations, and it can therefore be useful for designing MAS. An organization in AALAADIN is a framework for activity and interaction, defined through groups, roles, and their relationships. A group contains a finite set of roles that are handled by specific agents. Groups and roles are specified using the MadKit platform during system development. A group can be created by any agent, which then automatically takes the special role of group manager. The group manager is responsible for handling requests for group admission and role requests. A role in this model is an abstract representation of an agent function, service, or identification within a group. Each agent can take several roles, and each role assigned to an agent is local to a group. Furthermore, each role can be assigned to more than one agent, in which case each agent plays an instance of that role. The most interesting feature of AALAADIN is that an agent can create new groups with new roles; the structure of the system organization can therefore be constructed dynamically. Agents can create roles with transient lifetimes within a group structure. AALAADIN does not aim at cooperative behavior; it simply
provides the designer with the above-mentioned facilities for building groups of agents. Building a MAS with AALAADIN requires substantial effort from the system designer, and the properties of roles depend on the sophistication of the designed system. In AALAADIN, roles can thus be explicit and transient, assignment can be dynamic (mediated by the group manager), and both the number of roles per agent and the number of agents per role are unrestricted; how far these facilities are exploited depends on the particular design.
Concluding the above, the AALAADIN model provides developers with the ability to define systems that can cope with task-environment dynamics in terms of agent groups and roles. The actual properties of roles depend on the sophistication of the final MAS design. However, for complex environments, substantial effort is required from the system designer and developer for the MAS to realize the full range of facilities provided by roles. This is mainly because there is no specific methodological and/or development framework for the specification and exploitation of roles during the planning and execution of tasks. A minimal sketch of the group/role mechanics described above follows.
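This is illustrative Python in the spirit of the meta-model, not the MadKit API; all names are our own:

    class Group:
        """AALAADIN-style group; roles are local to the group."""
        def __init__(self, name, creator):
            self.name = name
            self.members = {creator: {"group-manager"}}  # agent -> role names
            self.manager = creator  # the creator automatically manages the group

        def request_role(self, agent, role):
            # The group manager handles admission and role requests;
            # the acceptance policy here is a placeholder.
            if self.manager.accepts(self, agent, role):
                self.members.setdefault(agent, set()).add(role)
                return True
            return False

    class Agent:
        def __init__(self, name):
            self.name = name

        def accepts(self, group, agent, role):
            return True  # trivially permissive manager, for illustration

        def create_group(self, name):
            # Any agent can create a new group, dynamically extending
            # the structure of the system organization.
            return Group(name, self)

    a, b = Agent("a"), Agent("b")
    g = a.create_group("buyers")
    g.request_role(b, "bidder")  # b now plays an instance of the role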
4.
Formal Models
There are a number of recent formal models [7, 3, 1, 19] that capture important properties of roles. In [7], roles are related to relationship types via the predicate RoleOf(a, R), which means that a is one of the roles in a relationship of type R. The adoption of a role by an agent implies the adoption of a social commitment towards another agent that has a different role in the same relationship. Therefore, the social commitments of each agent are towards a single agent. It is assumed that all roles are instantiated at the same time, each by a single agent. The proposed model does not make clear whether an agent can adopt many roles at the same time, and it does not deal with the issue of (re)assigning agents to roles. In [3], the adoption of a role by an agent forces the agent to adopt the goals (desires for potential intentions) associated with its role. This work goes further by formalizing the influence relation between roles: a role being more influential than another role for an agent translates into stronger commitments to its responsibilities. Although the framework considers the goals that agents can adopt as part of playing a role, it does not clearly distinguish between
“socially” and “internally” motivated goals. Furthermore, the authors propose the use of team plans as glue for the abstract framework provided. Team plans are pre-compiled plans that are selected for execution by a group of agents; they are considered to specify relationship types with many roles involved, implying obligations of agents adopting roles towards other agents. In other words, team plans provide a framework for group communication and coordination. The main restriction of that model is that all roles must be instantiated at the same time, each by a single agent. It is not clear whether an agent can adopt many roles at the same time. Constraints and conditions on roles are not specified explicitly, so it is not clear how agents are (re)assigned to roles. This latter approach is very close to, and extends, the way roles are used in [1]. Each role in [1] describes a major function, together with the obligations, interdictions, and permissions attached to it. Roles are associated with obligations towards other roles. Each obligation requires the obliged agent to achieve a goal under certain conditions. Conditions in this case may include external stimuli (e.g., messages received), while in [3] they are assumed (but not explicitly formalized) to include the agent's capabilities. An interesting part of the formalization is the organization of roles in an inheritance hierarchy. The main restriction is that an obligation exists between two agents in specified roles; however, each role can be filled by a number of agents. Summarizing the above, [7] and [3] follow the same approach regarding the formalization of roles. Cavedon and Sonenberg [3] are interested in the adoption of goals as a result of adopting roles and in the degree of a role's influence on the prioritization of goals. Fasli [7] is more interested in the obligations that result from the adoption of roles. No model exploits roles for coordinating agents' activities, and no model is oriented towards implementation. An implementation-oriented, advanced model of collaborative decision making and practical reasoning incorporating roles is the one proposed in [19]. The model aims to cover all aspects of the processes involved for agents to reach a joint decision. A critical aspect of the model is the social mental shaping mechanism, through which an agent's mental state is affected as a consequence of taking a role, of participating in a social relation with another agent, or as a means for an agent to affect the mental state of any other agent outside of any social relationship. In this way, the proposed model clearly distinguishes between “socially” and “internally” motivated mental attitudes. Social relationships are defined to be pairs of roles abstracted by relationship types. The authors consider roles to be sets of mental attitudes (e.g., beliefs, goals, and intentions) governing the behavior of an agent occupying a particular position within the structure of a multi-agent system. These mental attitudes can be either mandatory or optional. The notion of mandatory role-based mental attitudes, as the authors point out, is consistent with the concept of organizational
commitment. Therefore, when an agent is committed towards a group of agents to adopt a specific role that contributes to the group's objective, it is committed to adopt the mental attitudes attached to this role. Each agent adopting a role is thus committed to act according to the obligations, responsibilities, expectations, and constraints relative to this role. Agents, via the social mental shaping process, may establish new relationships within the organization in order to create new opportunities for cooperation within the group, or in order to engage other agents within the group. In other words, the establishment of new social relations in terms of roles gives agents a way to establish a potential for cooperation in the context of achieving a particular world state. On the other hand, to recognize such a potential for cooperation, agents need to believe that they have the ability to achieve the world state jointly with the others. Therefore, the assignment of a role is done dynamically, based on an agent's ability to achieve a particular state. It must be mentioned that the notion of “ability” is not restricted to the capabilities of the agent, but extends to its ability to find a way to jointly achieve a state. This implies that an agent deliberates in order to decide whether it shall play a role. Role properties, as they are conceived in the model of [19], can be summarized as follows: roles are explicitly specified as sets of mandatory and optional mental attitudes, assignment is dynamic and deliberative (based on ability), and new role-based relationships can be established at run time through social mental shaping; a schematic rendering of the mandatory-attitude reading is given below.
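One way to render this reading of role adoption in [19] schematically (the notation below is ours, not the authors' formalism):

    \forall a\,\forall r\,\forall G:\quad
      \mathit{Adopt}(a, r, G) \;\rightarrow\;
      \bigwedge_{\psi \,\in\, \mathit{Mandatory}(r)} \mathit{Commit}(a, G, \psi)

That is, if agent a adopts role r towards group G, then a becomes committed towards G to every mandatory mental attitude ψ attached to r; optional attitudes carry no such commitment.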
4.1
Roles in implemented MAS
Roles have been used in implemented multi-agent systems in order to achieve coherence in teams of cooperative agents in domains where well-coordinated activity is required. Such domains include battlefield simulations [25] and the RoboCup simulation league [16]. Furthermore, this section refers to frameworks for implementing cooperative agents following the role-oriented agent-programming paradigm [2, 20]. 4.1.1 The Karma-Teamcore framework. The Karma-Teamcore framework focuses on rapidly integrating distributed, heterogeneous agents and tasking them by providing wrappers that encapsulate general teamwork reasoning
and automatically generate the necessary coordination for robust execution [27]. According to this framework, a system developer builds a team-oriented program that consists of a team organizational hierarchy and a team (reactive) plan hierarchy. Furthermore, the system designer assigns agents to roles. Roles are used in the specification of the organizational hierarchy, which does not change during team performance. Roles are assigned to specific tasks in the plan hierarchy, and agents can play specific roles depending on their capabilities. Roles in this framework are abstract specifications of a set of activities; roles and their interrelations are therefore static. They provide a useful level of abstraction for (re)assigning agents to tasks based on monitoring and re-planning mechanisms. Such mechanisms are either generic, exploiting role relationships (AND, OR role-dependency), or partly domain-dependent, for inferring role (non-)performance. It must be pointed out that the (re)assignment of agents to roles is based on the capabilities that each role requires, without the agents considering the overall context of action. Leaf nodes in the organizational hierarchy correspond to single agents. Internal nodes correspond to groups of agents (group roles) that are defined implicitly by their successor leaf nodes. A task assigned to a group role is assigned to each member of the group corresponding to that role. Roles do not change during group action and are therefore long-lived. In summary, roles in this approach are explicit and static, reactively (re)assigned on the basis of capabilities, long-lived, and may be filled by groups of agents through group roles. A sketch of such capability-based reassignment follows.
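A minimal sketch, assuming a flat capability model (our simplification, not the Teamcore implementation):

    def reassign(roles, agents, performing):
        """Reactively (re)assign agents to roles on capability match alone.

        roles:      dict role -> set of required capabilities
        agents:     dict agent -> set of capabilities
        performing: dict role -> agent, or None where the role is vacant/failed
        """
        assignment = dict(performing)
        for role, required in roles.items():
            if assignment.get(role) is not None:
                continue  # role is being performed; nothing to repair
            busy = {a for a in assignment.values() if a is not None}
            # Pick any capable, currently unassigned agent; note that the
            # overall context of action is NOT consulted, as noted above.
            for agent, capabilities in agents.items():
                if agent not in busy and required <= capabilities:
                    assignment[role] = agent
                    break
        return assignment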
Concluding the above, Karma-Teamcore provides a framework for the specification of a fixed organizational structure in terms of the roles that agents should play for the achievement of specific goals. Roles provide a valuable abstraction for assigning tasks to agents and, therefore, a means for dealing with environment dynamics: agent abilities are monitored, plans are revised on the basis of roles, and agents are reassigned to roles. However, agents do not deliberate on role assignment (considering the overall context of action); rather, they are reactively assigned to roles based only on their capabilities. Agents cannot revise the given organizational structure. This may affect distributivity in cases where agents cannot be assigned roles from the
prespecified organizational structure because of their limited capabilities; in such cases, alternative organizations might lead the team to a successful execution of its task. As far as managing interactions is concerned, Karma-Teamcore integrates a decision-theoretic communication selectivity mechanism based on communication costs and benefits; roles, however, are not exploited for managing interactions. 4.1.2 The RoboCup Simulation Domain. RoboCup simulation [16] is a highly dynamic, real-time environment with many real-world complexities. In this domain there are two teams of agents consisting of eleven members each. The objective of a team is to win the game, i.e., to score more goals than the opponent team. In order to fulfill this objective, the team players must act in a well-coordinated and coherent fashion. Roles have been used in the best-known team architectures: that of CMUnited [23], which won the RoboCup-98 and RoboCup-99 world championships, and that of FC Portugal [21], the winner of the RoboCup-2000 world championship. A role in robotic soccer can be as simple as a position on the field. In the CMUnited architecture [23], an agent has a set of internal and external behaviors. Internal behaviors update the agent's internal state, while external behaviors reference the world and the agent's internal state and select the actions to be executed. Internal and external behaviors are both sets of condition/action pairs, where conditions are logical expressions and actions are behaviors themselves. A role is a specification of an agent's position and inter-position behavior, such as passing options. A role therefore aggregates an agent's internal and external behaviors into a logical group. Roles are static and long-lived and are grouped into formations. Agents can change formation at run time, which changes the characteristics of the roles at run time. Agents within a formation can also interchange roles, for example in order to save energy [21]. An agent undertakes one role in each formation. In these systems, roles are thus explicit but static, long-lived, grouped into formations, interchanged reactively at run time, and each filled by exactly one agent; a sketch of the formation mechanism follows.
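A minimal sketch of formations grouping roles with run-time interchange (illustrative only, not the CMUnited code):

    class Formation:
        """A formation groups the team's roles; each agent fills exactly one."""
        def __init__(self, name, roles):
            self.name = name
            self.roles = roles      # role name -> field position (x, y)
            self.assignment = {}    # agent -> role name

        def assign(self, agent, role):
            self.assignment[agent] = role

        def interchange(self, agent_a, agent_b):
            # Reactive role swap within a formation, e.g. to save energy [21].
            self.assignment[agent_a], self.assignment[agent_b] = (
                self.assignment[agent_b], self.assignment[agent_a])

    # Agents may also switch formation at run time, which changes the
    # characteristics of the roles themselves.
    f433 = Formation("4-3-3", {"goalie": (-50.0, 0.0), "left-wing": (30.0, 25.0)})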
Concluding the above, environmental changes in RoboCup are due to agents' actions and decisions. Agents in a team can reactively follow pre-specified organizational structures, which may change during play. Such organizations comprise eleven roles, one for each agent. These roles can be interchanged between agents in a reactive way (depending on the current situation). The
main point concerning RoboCup is that it provides a case study where roles have been successfully employed for coordinating agents to reactively achieve their tasks. 4.1.3 Role-Oriented Programming. A multi-agent system with a dynamic organization can change in size and structure, dynamically (re)assign agents to roles, decide on the number of agents that should fill a role, and deliberate on the roles that should structure the group. The ROPE framework [2], as well as the work reported in [20], aims at this target through the use and exploitation of roles for specifying cooperation processes. Both works place strong emphasis on the role concept. In ROPE, a role is defined as an entity consisting of a set of required permissions, a set of granted permissions, a directed graph of service invocations, and a state not visible to other agents. The service invocations describe the agent behavior and may contain an arbitrary number of alternatives. Roles may have an associated set of sub-roles that inherit granted permissions. An agent in ROPE is defined as a set of provided services. The aim in [20] is to build a generic framework for implementing collaborative agents. Roles are used to define the intended behavior of an agent within a collaborative task. Tasks are defined within a special formal structure called a multi-role recipe. The recipe constitutes the know-how of each agent and contains role specifications. Each role specification comprises the capabilities that an agent must have in order to undertake the role, a number of constraints that must be preserved during the execution of the cooperative task, an action list together with temporal and synchronization constraints among actions, and a number of effects that are achieved when the role has been executed successfully. An agent can undertake one or more roles, either in the context of the same activity or in different activity contexts. A role can also be undertaken by a group of agents. These approaches exploit most of the dynamics of roles in complex multi-agent systems. Agents may form organizational structures that are dynamic, as roles can be discarded or changed at run time. The lifespan of roles is transient (roles are used when needed and for as long as required) and the (re)assignment of roles is dynamic. In both approaches an agent can undertake more than one role in different goal contexts; in the second approach, moreover, a single role can be assigned to a group of agents, in which case a task can be delegated to a group of agents who must then conduct further planning to decide how to perform it. Roles in both role-oriented programming approaches are thus explicit, dynamic, transient, dynamically (re)assigned, and flexible in cardinality. A sketch of a multi-role recipe in this spirit follows.
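This rendering is ours; the actual structures of [2] and [20] differ in detail:

    from dataclasses import dataclass, field
    from typing import Dict, List, Set, Tuple

    @dataclass
    class RoleInRecipe:
        """Role specification within a multi-role recipe, after [20]."""
        name: str
        capabilities: Set[str]           # required to undertake the role
        constraints: List[str]           # preserved throughout execution
        actions: List[str]
        ordering: List[Tuple[str, str]]  # (a, b): action a must precede b
        effects: List[str]               # hold on successful execution

    @dataclass
    class MultiRoleRecipe:
        task: str
        roles: List[RoleInRecipe] = field(default_factory=list)

        def candidates(self, role_name: str, agents: Dict[str, Set[str]]):
            """Dynamic assignment: agents (or groups) qualify by capability;
            deliberation over the context of action sits on top of this."""
            role = next(r for r in self.roles if r.name == role_name)
            return [a for a, caps in agents.items() if role.capabilities <= caps]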
Role-oriented agent programming provides the full range of facilities for agents to manage the complexity of highly dynamic and unpredictable environments with a high degree of interaction and distributivity. Agents can deliberatively devise an organizational structure, decide on the assignment of roles to agents, and revise their decisions, in parallel with planning and executing their tasks. This gives them the ability for flexible problem solving. Moreover, it helps manage distributivity, since agents may form groups on demand in a dynamic fashion. Finally, the explicit specification and shared knowledge of roles and their interrelationships in a context of action enables agents to manage interactions.
5.
Concluding remarks
As we saw earlier, roles have been used both as an intuitive concept for analyzing multi-agent systems and modelling inter-agent social activity, and as a formal structure for implementing coherent and robust teams. In this paper we presented the role-related work on agent-based systems done in three main streams of research: agent-oriented software engineering, formal models of agent social activity, and implemented multi-agent systems deployed in complex domains. Comparing these approaches along the role properties identified in Section 2.2, we can observe that agent-oriented software engineering is at an early stage concerning the analysis and specification of systems that act in dynamic and unpredictable environments, and a lot of work remains to be done in this area. Until now, the best-known methodologies have been concerned with the analysis and design of systems with a fixed number of agents, a static structure, and fixed inter-agent relationships and dependencies. In such systems there is no need for introducing role structures dynamically and reasoning about roles. Agent-oriented methodologies address the flexibility and robustness of the resulting implemented agent systems only in a limited manner, without dealing with those aspects of the problem domains that concern the dynamic assignment of tasks to agents, the run-time selection of roles, or the reorganization of the overall system. These restrictions result from the
fact that these methodologies do not realize roles and their interrelations during the design and implementation of the multi-agent system. Recently there has been increased interest in formal models that contain the notion of role. These models are mainly concerned with how the social context in which an agent acts affects its mental state and, therefore, its behavior. Although there are advanced theoretical approaches to formalizing multi-agent system behavior by means of roles, there is no pure role-oriented formalization of collaborative activity based on role properties and the exploitation of role interdependencies. As a result, the above-mentioned models do not clearly define important properties of roles. The need is therefore great for implementation-oriented models that can help us build more robust and reliable systems for dynamic and unpredictable environments, exploiting roles for managing the complexity of the task-environment. Implementation-specific works are increasingly concerned with more complex environments and, therefore, with advanced role properties. The extensive use of roles in implemented systems attests to the need for role-oriented thinking and modelling in multi-agent design and implementation. However, there is no agent framework, implemented system, or architecture that exploits the full range of facilities provided by roles in an integrated fashion. We are currently working on the analysis and design of a generic role-based model of collaborative activity. The development of the architecture presented in [20] is a major step towards this target; its full-range development is ongoing work.
References

[1] Mihai Barbuceanu. Coordinating agents by role based social constraints and conversation plans. In Proceedings of AAAI, 1997.
[2] M. Becht, T. Gurzki, J. Klarmann, and M. Muscholl. ROPE: Role Oriented Programming Environment for Multiagent Systems. In Proceedings of the Fourth IECIS International Conference on Cooperative Information Systems, 1999.
[3] Lawrence Cavedon and Liz Sonenberg. On social commitment, roles and preferred goals. In Proceedings of the Third International Conference on Multi-Agent Systems (ICMAS), 1998.
[4] Philip R. Cohen and Hector J. Levesque. Teamwork. Nous, 25(4):487–512, 1991.
[5] Scott A. DeLoach and Mark Wood. Developing Multiagent Systems with agentTool. In C. Castelfranchi and Y. Lesperance, editors, Intelligent Agents VII, LNAI 1986, pages 46–60. 2001.
[6] Edmund H. Durfee. Scaling Up Agent Coordination Strategies. IEEE Computer, pages 39–46, July 2001.
[7] Maria Fasli. On commitment, roles, and obligations. In From Theory to Practice in Multi-Agent Systems, LNAI 2296. Springer-Verlag, 2002.
[8] Jacques Ferber and Olivier Gutknecht. A meta-model for the analysis and design of organizations in multi-agent systems. In Proceedings of the Third International Conference on Multi-Agent Systems, 1998.
[9] Barbara Grosz and Sarit Kraus. The evolution of SharedPlans. In Anand Rao and Michael Wooldridge, editors, Foundations and Theories of Rational Agencies. Kluwer Academic Press, 1999.
[10] Barbara J. Grosz and Sarit Kraus. Collaborative plans for complex group action. Artificial Intelligence, 86(2):269–357, October 1996.
[11] Merav Hadad and Sarit Kraus. SharedPlans in Electronic Commerce. In M. Klusch, editor, Intelligent Information Agents, chapter 9, pages 204–231. Springer, 1999.
[12] Nicholas Jennings. Controlling cooperative problem solving in industrial multi-agent systems using joint intentions. Artificial Intelligence, 75, 1995.
[13] D. Kinny, M. Ljungberg, A. Rao, E. Sonenberg, G. Tidhar, and E. Werner. Planned Team Activity. In C. Castelfranchi and E. Werner, editors, Artificial Social Systems, LNAI 830. 1992.
[14] David Kinny, Michael Georgeff, and Anand Rao. A Methodology and Modelling Technique for Systems of BDI Agents. In Agents Breaking Away, Seventh European Workshop on Modelling Autonomous Agents in a Multi-Agent World, MAAMAW'96, LNAI 1038. 1996.
[15] H. Kitano, S. Tadokoro, I. Noda, H. Matsubara, T. Takahashi, A. Shinjou, and S. Shimada. RoboCup-Rescue: Search and Rescue for Large Scale Disasters as a Domain for Multi-Agent Research. In Proceedings of the IEEE Conference on Man, Systems, and Cybernetics (SMC-99). 1999.
[16] H. Kitano, M. Tambe, M. Veloso, I. Noda, E. Osawa, and M. Asada. The RoboCup synthetic agents' challenge. In Proceedings of the International Joint Conference on Artificial Intelligence, 1997.
[17] Itsuki Noda. Soccer server: A simulation of RoboCup. In Proceedings of AI Symposium '95, Japanese Society for Artificial Intelligence, pages 29–34, 1995.
[18] Itsuki Noda, Hitoshi Matsubara, Kazuo Hiraki, and Ian Frank. Soccer server: A tool for research on multiagent systems. Applied Artificial Intelligence, 12:233–250, 1998.
[19] Pietro Panzarasa, Nicholas R. Jennings, and Timothy J. Norman. Formalizing Collaborative Decision-making and Practical Reasoning in Multi-agent Systems. Journal of Logic and Computation, 11(6), 2002.
[20] Ioannis Partsakoulakis and George Vouros. Roles in Collaborative Activity. In I. Vlahavas and C. Spyropoulos, editors, Methods and Applications of Artificial Intelligence, Second Hellenic Conference on AI, LNAI 2308, pages 449–460. 2002.
[21] Luís Paulo Reis, Nuno Lau, and Eugénio Costa Oliveira. Situation Based Strategic Positioning for Coordinating a Team of Homogeneous Agents. In M. Hannebauer, J. Wendler, and E. Pagello, editors, Balancing Reactivity and Social Deliberation in Multi-Agent Systems, LNAI 2103, pages 175–197. 2001.
[22] Charles Rich, Candace Sidner, and Neal Lesh. COLLAGEN: Applying Collaborative Discourse Theory to Human-Computer Interaction. AI Magazine, 22(4):15–25, 2001.
[23] Peter Stone and Manuela Veloso. Task decomposition, dynamic role assignment, and low-bandwidth communication for real-time strategic teamwork. Artificial Intelligence, 110:241–273, 1999.
[24] K. Sycara, M. Paolucci, M. van Velsen, and J. Giampapa. The RETSINA MAS Infrastructure. Technical Report CMU-RI-TR-01-05, Carnegie Mellon University, 2001.
[25] M. Tambe, K. Schwamb, and P. S. Rosenbloom. Constraints and design choices in building intelligent pilots for simulated aircraft: Extended Abstract. In AAAI Spring Symposium on Lessons Learned from Implemented Software Architectures for Physical Agents, 1995.
[26] Milind Tambe. Towards flexible teamwork. Journal of Artificial Intelligence Research, 7:83–124, 1997.
[27] Milind Tambe, David V. Pynadath, and Nicolas Chauvat. Building Dynamic Agent Organizations in Cyberspace. IEEE Internet Computing, pages 65–73, March–April 2000.
[28] Michael Wooldridge and Paolo Ciancarini. Agent-Oriented Software Engineering: The State of the Art. In P. Ciancarini and M. Wooldridge, editors, Agent-Oriented Software Engineering, LNAI 1957. 2001.
[29] Michael Wooldridge, Nicholas R. Jennings, and David Kinny. The Gaia Methodology for Agent-Oriented Analysis and Design. Autonomous Agents and Multi-Agent Systems, 3(3):285–312, 2000.
AN EVOLUTIONARY FRAMEWORK FOR LARGE-SCALE EXPERIMENTATION IN MULTI-AGENT SYSTEMS* Alex Babanov, Wolfgang Ketter, and Maria Gini Department of Computer Science and Engineering, University of Minnesota
Abstract
We discuss the construction of an evolutionary framework for conducting large-scale experiments in multi-agent systems for applications in electronic marketplaces. We describe how the evolutionary framework could be used as a platform for systematic testing of agent strategies and illustrate the idea with results from a simple supply-demand model. We further explain how to integrate the proposed framework into an existing multi-agent system and demonstrate our approach in the context of MAGNET, a multi-agent system where agents bid over complex combinations of tasks with time and precedence constraints.
Keywords: Multi-agent systems, economic agents, evolutionary methods, simulation.
1.
Introduction
Online marketplaces are gaining popularity among companies seeking to streamline their supply chains. For buyers, such marketplaces can significantly ease the process of finding, comparing, and coordinating providers, while for sellers they provide access to a much broader customer base [21]. Intelligent software agents can significantly reduce the burden of market exploration by sifting through the avalanche of information and performing the calculations needed to promptly provide a human decision maker with a refined list of alternatives. However, we believe that to exploit the true potential of electronic marketplaces, software agents need to be able to make their own decisions and adapt their strategies to the current situation.
*Work supported in part by the National Science Foundation, awards NSF/IIS-0084202 and NSF/EIA-9986042.
A major difficulty that hampers the acceptance of software agents as decision makers is the lack of systematic and accepted methods to assess and validate the agents' decisions in a multi-agent system. We are not concerned here with the broad issue of software validation; we assume that proper software design and testing methods are used in the development of the software agents. Our concern is with the methods (or lack thereof) to assess and validate the strategic decisions agents make, and their ability to adapt to changing market situations. An issue in assessing multi-agent systems is that there is not enough real-world data available to perform comprehensive testing. At the same time, analytical modeling of the majority of less-than-trivial problems is prohibitively hard. In this paper we propose to design a large-scale test environment based on an evolutionary approach to economic simulation. We specifically address the question of how to assess agent strategies in an ever-changing and heterogeneous market environment. We start by proposing in Section 2 an evolutionary approach, and we support the proposal with experimental results obtained from a simple supply-demand model. We then consider in Section 3 practical issues of building an evolutionary testing environment on top of an existing Multi-Agent System (MAS). Finally, in Sections 4 and 5 we compare our proposed approach with other existing methods and outline future work.
2.
An Evolutionary Framework for Large-scale Experimentation
A major obstacle in the way of understanding the properties of multi-agent systems is the lack of tractable data. Publicly available data are scarce and insufficient for exhaustive testing, while private data sets are expensive and not always suitable for research purposes. We propose a way of employing an evolutionary approach to economic simulation to make up for the scarcity of data. The rationale behind our choice of an evolutionary framework is that it is capable of revealing patterns of macroscopic behavior in a society of agents without requiring a complex theory of agent optimization criteria or of strategic interaction [23]. The methodology we propose is essentially derived from the application of evolutionary techniques to game-theoretical problems [18, 22, 26]. Evolutionary game theory studies the equilibria of games played by populations of players, where players are myopically rational and have conflicting interests.
The “fitness” of the players derives from the success each player has in playing the game, governed by natural selection. Agents that do not perform well, because of their strategy, will eventually disappear from the market. In the case of electronic markets the players are customer and supplier agents, and their fitness is determined by the strategies they use to secure their profit.
2.1
Reproduction, Mutation, and Introduction of New Strategies
One of the cornerstones of the evolutionary approach is the need for a large and diverse population of agents. A common way of meeting this need is to describe the agents' strategies in terms of gene sequences and to use cross-breeding and mutations to ensure the desired diversity. In large-scale multi-agent systems, however, agents can employ a variety of methodologies, such as Q-learning, neural networks, game-theoretic models, genetic algorithms, and others. It is hard to imagine that each and every strategy based on these methodologies can easily be encoded in a gene sequence. It is even harder, if not impossible, to maintain compatibility between the gene sequences of different strategies. In practice, it is difficult to come up with an encoding even for well-studied problems [13], let alone complex domains such as electronic markets. Our approach to this problem is to maintain separate “gene pools” for different types of strategies. For each type of strategy the system derives offspring by operating on the whole pool to which they belong. In our test model, which is described and examined in the following sections, an information pool is derived from statistical data. A company, represented by an agent, that receives negative profits over a certain period of time is taken out of the market. In return, the system eventually creates a new company with a variant of one of the existing strategy types. When a new company is created, the probability of selecting a particular type of strategy for that company is weighted by how well represented the strategy is in the current market. The parameters of a newly created strategy instance are chosen based on the gene pool of the corresponding strategy. To make sure that a presently unsuccessful strategy is given a chance to conquer the market at a more favorable time, the simulation maintains a repository of all strategies that were washed away from the market, and randomly reintroduces them.
Completely new types of strategies can be created by a human. These new strategy types enter the market the same way as the “retired” strategies, i.e., they are added to the list of available strategies, in the hope of acquiring a noticeable market share as soon as market conditions become favorable.
2.2
Test Model
Our test model is a continuous-time, discrete-event simulation of a society of suppliers of some service and of customers, which live and interact in a circular city of radius R. Customers appear in the city at intervals governed by a stationary Poisson process with a fixed frequency; in what follows, U[x, y] denotes a value distributed uniformly on the interval [x, y]. The distribution of customers is intentionally fixed, so that the society of suppliers has to evolve to match it. Customers appear on the market according to rules expressed in polar coordinates; a sketch of one plausible reading of these rules is given below.
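A minimal sketch, assuming uniform angular placement and a radius drawn to give uniform density over the disc; the model's actual radial and frequency rules may differ:

    import math
    import random

    LAMBDA = 1.0  # assumed fixed arrival frequency (illustrative value)
    R = 10.0      # city radius

    def next_customer():
        """Sample the next customer: inter-arrival time and polar position."""
        dt = random.expovariate(LAMBDA)         # Poisson-process inter-arrival time
        phi = random.uniform(0.0, 2 * math.pi)  # angle ~ U[0, 2*pi]
        r = R * math.sqrt(random.random())      # uniform density over the disc
        return dt, r, phi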
Several different types of suppliers are modeled by different sizes of their “factories”; bigger factories have increasingly lower production costs. Suppliers are introduced to the market by a rule similar to the one used for customers. A new supplier enters the market with a fixed price for its service. Every supplier is audited at regular time intervals and dismissed from the market if its profit becomes negative. These rules ensure that, although each particular supplier cannot adjust its price, the society of suppliers employing the same strategy will eventually evolve to find the right price by losing its least successful members. Upon entry, a customer observes a selection of suppliers and chooses the one that offers the greatest benefit, where the benefit is a linear function of the supplier's price, the distance to the customer, and the time delay due to the scheduling of other customers' tasks. The probability that a supplier of a particular type will enter the market next is proportional to the number of suppliers of its type surviving in the market. Another way to enter the market is through a small noise factor (set at 5% for the experiments in this paper): with a probability equal to the noise factor, the strategy of a newly created supplier is chosen at random among all present and retired strategies. A sketch of this entry rule follows.
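A minimal sketch of the entry rule just described (function and parameter names are our own):

    import random

    NOISE = 0.05  # noise factor used in the experiments

    def pick_entry_strategy(surviving, retired):
        """Choose the strategy type of the next supplier to enter the market.

        surviving: dict strategy -> number of suppliers currently in the market
        retired:   list of strategy types that were washed out earlier
        """
        everything = list(surviving) + list(retired)
        if not surviving or random.random() < NOISE:
            # Noise: any present or retired strategy may (re)enter.
            return random.choice(everything)
        # Otherwise weight the choice by current market representation.
        types = list(surviving)
        weights = [surviving[t] for t in types]
        return random.choices(types, weights=weights, k=1)[0]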
Hence every retired strategy has a chance to enter the market at a more favorable time. The noise also provides a way for completely new types to enter the market, as will be shown later in the experimental results. The price levels of suppliers of the same size constitute the gene pool of that supplier type. We also assume that the structure of a gene pool depends on the distance from the center of the city. Every once in a while the structure of the gene pools is recalculated as a function of type and distance. At the same time the density of the population is updated as a function of distance, and a new distribution of strategies by types is calculated. To smooth out the effects of the limited population size, all changes enter the above-described distributions with a “learning rate”; a sketch of this smoothing follows.
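A minimal sketch of such a smoothed update, with the learning rate treated as an assumed parameter:

    ALPHA = 0.1  # assumed learning rate (illustrative value only)

    def smooth_update(old, observed, alpha=ALPHA):
        """Blend a freshly recalculated distribution into the running one,
        zone by zone, damping the noise of a limited population."""
        return [(1 - alpha) * o + alpha * n for o, n in zip(old, observed)]

    # e.g., updating per-zone supplier densities:
    density = smooth_update([0.2, 0.3, 0.5], [0.4, 0.3, 0.3])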
2.3
Expectations
We expect the simulation to exhibit certain patterns of gene-pool adjustment to the market situation. It is most likely that the relative sizes of the populations of different supplier types will change with time. The price distribution and the density of the suppliers, as functions of the strategy type and the distance from the center, are likely to adapt as well. It is reasonable to expect that large suppliers will perform better near the densely populated, and therefore highly competitive, center of the city, because of their lower production costs. Smaller suppliers will survive better on the boundaries, where transportation costs become increasingly important compared with the cost advantage of the large suppliers. Consequently, the higher level of competition should drive prices and profit margins down close to the center. A reasonably evolving society of suppliers should adapt to dynamic changes in the parameters of the customer distribution. Consistent with the expectations outlined above, an increased frequency of customer arrival should increase the “habitat” of the large suppliers, while the opposite change should give leverage to the small ones. On a final note, new and retired supplier types should be able to acquire a position in an existing market by means of the noise in the supplier-type selection process. To verify our expectations we conducted several experiments with a variety of initial conditions. Two representative experiments are considered in the following.
2.4
Simulation Results: Noise Factor
In the first experiment the simulation starts out with suppliers of sizes 1 and 2. After some time, suppliers of size 3 enter the market through the 5% noise factor, meaning that initially every new supplier has a chance of about 1.67% to enter the market with a size 3 factory. This experiment models a situation in which a new strategy is designed and some suppliers try to enter the market to exploit it (or, alternatively, some existing suppliers decide to switch).
Figure 1 displays the population of the different supplier types as a function of milestones. Each milestone (m/s) stands for two million transactions in the market. In the figure, the x-axis represents the milestones and the y-axis represents the population of each particular type. Suppliers of size 3 enter the market at milestone 50 and struggle to find their place. After some time in the market, the size 3 supplier type proves itself to be competitive with size 2 suppliers. The more market share the size 3 suppliers gained, the more was lost proportionally by size 2, until a dynamic equilibrium with approximately equal populations of both sizes was reached around milestone 250. To reveal the mechanics of size 3's successful entry we examine the state of the city at milestones 50 and 250.
Figure 2 shows the state of the city just before the introduction of the strategy of owning a size 3 factory (milestone 50). There are two important observations to be made from this figure. First, size 1 suppliers dominate the market and, indeed, as we expected, tend towards the rim, while the larger size 2 suppliers operate mostly in the middle of the city. Second, the distribution of suppliers is quite uneven, with dense clusters and wide open areas situated at the same distance from the center. Figure 3, in turn, gives a bird's-eye snapshot of the city at milestone 250. We can see that suppliers of size 3 have pushed size 2 suppliers out of the center, while suppliers of size 1 still successfully inhabit the outer city zones. The left part of Figure 4 shows the state of the two gene pools for factory sizes 1 (top) and 2 (bottom) at milestone 50. The right part of this figure shows the gene pools for all three sizes at milestone 250, from size 1 at the top to size 3 at the bottom.
In each of the gene pool graphs the x-axis shows ten concentric city zones numbered from the center outward; the left y-axis and histogram bars show the population of the corresponding strategy in a particular zone relative to the whole population; and the right y-axis and error bars show the average values and standard deviations of profit margins. Figure 4 shows that size 2 suppliers tend to operate near the center of the city, while size 1 suppliers prefer the outer zones. This behavior matches our expectations, although the picture of profit margins is less clear. To get a better picture of prices and profit margins we consider the state of the gene pools at milestone 250 in the right part of Figure 4. The introduction of size 3 suppliers caused the suppliers of sizes 1 and 2 to decrease their average prices in all zones. Size 3 supplier agents have found their niche in zones one to three. We observe that size
1 and 2 suppliers converged to stable average prices for their services in all zones, as the variance is very low. It is also important to note that, although the gene pools reached a relatively stable state, the population shares continuously fluctuate, as shown in Figure 1. The high variance of the size 3 profit margin also implies that the market state may change in the future, as this strategy is still searching for the right price distribution.
2.5 Simulation Results: Changing Environment
In the second experiment we reduced the frequency of customer arrivals by 1/3 of its initial value halfway through the simulation. This was meant to emulate a loss of interest in the supplied service due to some economic factor, such as a depression or the introduction of an alternative service. One result of the experiment is depicted in Figure 5, which shows the entry probabilities for each of the three supplier types. In this figure we can observe two important effects. First and foremost, the market reaches a relatively stable state shortly after both the beginning of the simulation and the change of conditions. Second, after the change the size 3 suppliers lose a sizable part of their market share to size 1 suppliers. The lower frequency
of customer arrivals put large suppliers at a disadvantage, in accordance with our expectations.
3. Integration of the Evolutionary Approach into an Existing MAS
In this section we demonstrate a way to add an evolutionary framework to an existing MAS. As an example we use MAGNET, a multi-agent system we have designed to study agents in auctions for combinations of tasks with time and precedence constraints [9]. In essence, MAGNET is a mixed-initiative system in which intelligent software agents facilitate the deliberation process of human decision makers. It is possible, however, for research purposes, to exclude the human from the loop and let the agents autonomously select the best course of action.
3.1 MAGNET Architecture
In MAGNET we distinguish between two trading agent roles, customer and supplier (see Figure 6). The customer has a set of tasks to be performed and needs to solicit resources from suppliers by issuing Requests for Quotes (RFQs) through an agent-mediated market. MAGNET agents participate in a first-price, sealed-bid, reverse combinatorial auction over combinations of tasks with precedence relations
and temporal constraints. After the auction ends, the customer agent solves the winner determination problem and awards bids. The market agent is responsible for the coordination of tasks, i.e., distributing customers' RFQs among an appropriate selection of suppliers, collecting and timing bids, monitoring interactions during the task execution phase, etc. The data warehouse agent collects information on the transactions and makes it available, in the form of statistical data, to agents and their owners.
3.2 A Practical Example
The following house construction example illustrates how MAGNET handles problems in its domain. Figure 7 (left) shows the tasks needed to complete the construction. The tasks are represented in a task network, where links indicate precedence constraints.
The first decision the customer agent faces is how to sequence the tasks in the RFQ and how much time to allocate to each of them. For instance, the agent could reduce the number of parallel tasks, or allocate more time to tasks with higher variability in duration or to tasks whose resources are in short supply in the market. Presently MAGNET uses a simple CPM-based algorithm for generating RFQs [8]. An alternative approach based on expected utility theory is being actively researched by our group [3, 4]. A sample RFQ is shown in Figure 7 (right). Note that the time windows in the RFQ do not need to satisfy the precedence constraints; the only requirement is that the accepted set of bids satisfies them.
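To make the CPM step concrete, the sketch below derives per-task time windows from a forward and backward pass over the task network. The task data and the deadline are illustrative assumptions; this is a generic critical-path computation, not the published MAGNET algorithm.

    from graphlib import TopologicalSorter

    def cpm_time_windows(tasks, deadline):
        """tasks: {name: (duration, [predecessor names])}.
        Returns {name: (earliest start, latest finish)} time windows."""
        order = list(TopologicalSorter(
            {n: set(p) for n, (_, p) in tasks.items()}).static_order())
        earliest = {}   # forward pass: start after all predecessors finish
        for n in order:
            _, preds = tasks[n]
            earliest[n] = max((earliest[p] + tasks[p][0] for p in preds), default=0)
        latest = {}     # backward pass: finish before any successor must start
        for n in reversed(order):
            succs = [s for s in tasks if n in tasks[s][1]]
            latest[n] = min((latest[s] - tasks[s][0] for s in succs), default=deadline)
        return {n: (earliest[n], latest[n]) for n in order}

    # A hypothetical fragment of the house-construction task network.
    tasks = {
        "clear land":     (2, []),
        "lay foundation": (4, ["clear land"]),
        "build walls":    (6, ["lay foundation"]),
        "install roof":   (3, ["build walls"]),
    }
    print(cpm_time_windows(tasks, deadline=20))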
3.3 Introduction of Evolutionary Components
In order for MAGNET to operate in the evolutionary framework, we need to add components that manage the evolutionary aspects of the system and exclude human decision makers from the loop. Figure 8 shows the resulting architecture, and the following list summarizes the required changes:
The Manager generates and distributes tasks to customer agents. It observes the rate of customers' and suppliers' failures to complete
their evaluations of RFQs and bids during specified deliberation periods. The manager adjusts the frequency of issuing RFQs to keep the rate of failures reasonably low, yet not zero; a nonzero failure rate puts pressure on agents that use computationally overly intensive strategies. The frequency of generating RFQs determines the size of the market population.
The Auditor evaluates the performance of supplier agents' strategies based on the suppliers' average profit over a specified period of simulation time. Agents that make negative profit are removed from the market. Whenever the average profit in the market exceeds some specified value, the auditor introduces a new supplier agent with a strategy chosen from the pool of all strategies in the market, weighted by the number of suppliers that execute them. The auditor also maintains a pool of "retired" strategies and eventually tries to put them back in the market (see the sketch at the end of this section).
The Customer agent makes all market decisions without help from its human supervisor and reports its rate of failures to the manager.
The Supplier agent also operates without human supervision and reports its computational failures to the manager. The supplier agent coordinates its resource commitments with its own factory.
One instance of the Factory is assigned to each supplier agent to keep track of resource availability and existing commitments. The size and types of products produced in a factory are determined by the auditor upon creation of the corresponding supplier agent.
Human participants submit new strategies to the pool of possible mutations. "Mutant" strategies are introduced to the market after it reaches its dynamic steady state, i.e., after the rate of issuing RFQs by the manager stabilizes.
The Data warehouse agent collects data in the same way as in the mixed-initiative configuration. In addition, it replies to data queries from the manager, the auditor and, possibly, human observers whenever required.
The choice of this particular architecture is determined by the need to stabilize the size of the market and by our specific interest in supply-side strategies. To satisfy these requirements we fix the population of customer agents and let the supplier agents evolve to meet the demand. The demand, in turn, is limited from above by the computational capacity of the system that runs the agents' software. In case the load is overly
high, the rate of agents' failures to complete calculations signals the manager to decrease the rate of issuing RFQs, thus effectively shrinking the market. Should we decide to study demand-side strategies, the architecture could be changed so that the auditor manages an evolving population of customer agents while the manager governs resource availability on the supply side. The exact choice and composition of the evolutionary components is not formalized at present; we plan to improve the methodology as we study other prospective domains. Some candidate domains are considered in the following section, along with the approaches currently used to study them.
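The auditor's population-weighted strategy choice described above might be implemented along the following lines. This is a minimal sketch under our own naming assumptions; the chapter does not give the auditor's code, and the revival probability for retired strategies is illustrative.

    import random

    def choose_new_strategy(active, retired, revive_prob=0.1):
        """active: {strategy: number of suppliers currently executing it}.
        With a small probability, revive a retired strategy; otherwise copy
        an active strategy weighted by its current population."""
        if retired and random.random() < revive_prob:
            return random.choice(retired)
        strategies = list(active)
        weights = [active[s] for s in strategies]
        return random.choices(strategies, weights=weights, k=1)[0]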
4. Experimentation in an Evolutionary Environment
The proposed evolutionary approach to large-scale simulations is not the only way to make up for the lack of readily accessible real-world data. Two other methods widely used in the research community are analytical modeling and competitions of software agents. The major drawback of analytical modeling lies in its failure to embrace the complexity of the real world to any significant degree. Analytical methods are often useful for studying specific, usually global, properties of highly simplified models; however, they are not very helpful when the domain of interest involves many different types of agents and possible agent strategies. A comprehensive review of the successes and pitfalls of competitions, based on experiences from the Trading Agent Competition (TAC) and RoboCup, can be found in [24]. In short, competitions of intelligent software agents have proved to be a dynamic and valuable source of research material. The problematic properties include overly restrictive rules, domain-specific solutions (i.e., strategies that exploit peculiar properties of the simulation environment or the wording of the rules) and invalid evaluation criteria. In the following we focus on the important properties of the evolutionary methodology:
A heterogeneous multi-agent system is commonly governed by a multitude of parameters, many of which are continuous variables. The search space of the system is immense, rendering exhaustive systematic testing infeasible. Using an evolutionary approach allows us to search the space of parameters efficiently.
A thorough study of agent strategies requires information about the behavior of other agents in the system. The evolutionary
approach solves this problem by enclosing all the agents in a self-sufficient system, where they can observe each other's behavior.
The evolutionary approach allows for the formation of complex spatio-temporal patterns of behavior at the level of groups of agents. Examples range from the emergence of cooperation in an otherwise selfish society [1, 2], with possible formation of spatial patterns of strategic interaction [17], to natural phenomena such as fish schools [15].
Finally, a simulated evolutionary environment, like most simulations, offers facilities for controllable experimentation and systematic data collection.
4.1 The Role of Evolutionary Experimentation in Application Science
We divide the technology space into three parts by the degree of compatibility of each known or future technology with our evolutionary approach. The first subspace includes technologies that are based on some evolutionary methodology, such as genetic algorithms or cellular automata. Examples include ZIP [6], our pilot discrete-event Citysim model or, perhaps, its possible asynchronous implementation.
The second collection of technologies can be restructured to be compatible with the evolutionary paradigm. MAGNET [9], as demonstrated in the previous section, is one prospective member of this subspace, and the new 2003 revision of TAC [25] is another. In the third part of the space we count technologies that are not easily convertible to the evolutionary setup, such as the older versions of TAC. Also in this subspace are technologies whose compatibility has not yet been examined, e.g., the TAEMS framework [11, 16].
5. Related Work
Much research has been done in the last few years on designing pricing strategies for agents, assessing their performance, and adapting to changing situations [5]. Understanding collective interactions among agents that dynamically price services or goods is discussed in [14], where several pricing strategies are compared and examples are described of price wars caused by agents that dynamically change their posted price for information bundles. Because of the complexity of the problem, those experiments are limited to a small number of agents. A simulation-based approach to studying dynamic pricing strategies in finite-time-horizon markets is described in [12]. The study uses a market simulator and simple strategies; the results are evaluated in terms of overall profit, but there are so many variables in the simulation that it is hard to assess the generality of the results. Continuous double auctions have been the subject of multiple studies. Cliff's [6] Zero-Intelligence Plus trader agents have minimal intelligence, yet they have been used successfully in continuous double auctions, where they performed well even when compared to human traders [10]. The use of evolutionary methods for continuous double auctions is proposed in [19], which simulates the evolution of an agent population as the agents adapt their strategies by observing what happens in the environment. Cliff [7] uses genetic algorithms to learn the parameters that control how his trader agents evolve their pricing strategies. Along similar lines, an evolutionary system based on genetic programming is presented in [20]. The major difference between these works and the work presented here is that we are interested in understanding how the strategies of individual agents interact in the market, as opposed to studying specific types of auctions to learn auction rules. We are also interested in providing a methodology for effectively studying multi-agent systems with a large number of agents.
6. Conclusions and Future Work
Complex systems with many parameters and stochastic properties are difficult to assess. Multi-agent marketplace systems, where agents can enter and leave the market at any time, are especially hard to analyze because agent strategies depend on the behavior patterns of other agents. Yet there is no standard method for supporting systematic experiments in such systems. We have proposed building an evolutionary system with a setup that helps the system reach a dynamically stable condition. In an evolutionary system there is no fitness function; instead, there is a rule that governs the survival of society members based on their success. In our case, when an agent fails to make any profit for a period of time, the agent leaves the market and is eventually replaced with a more fit entity. Using an evolutionary system around a MAS can produce several different strategies, not only an optimal one. Surviving strategies could range from strategies that are very fast but expensive for the customer, to inexpensive strategies with long delivery delays, to strategies that depend on the size of the company. The design of the framework allows new behavior patterns to evolve over time and new strategy types to be introduced seamlessly. Our future plans include examining the conditions that allow the introduction of the evolutionary framework in other MAS and formalizing the integration procedures. We are considering applying the evolutionary framework to the year 2003 revision of the Trading Agent Competition [25] to experiment with strategies for manufacturer agents under different market conditions. We will also use the proposed approach to study strategies and patterns of strategy interaction in the context of the MAGNET system.
References
[1] R. M. Axelrod. The evolution of cooperation. Basic Books, 1984.
[2] Robert Axelrod. The complexity of cooperation. Princeton University Press, 1997.
[3] Alexander Babanov, John Collins, and Maria Gini. Risk and expectations in a-priori time allocation in multi-agent contracting. In Proc. of the First Int'l Conf. on Autonomous Agents and Multi-Agent Systems, volume 1, pages 53–60, Bologna, Italy, July 2002.
[4] Alexander Babanov, John Collins, and Maria Gini. Scheduling tasks with precedence constraints to solicit desirable bid combinations. In Proc. of the Second Int'l Conf. on Autonomous Agents and Multi-Agent Systems, Melbourne, Australia, July 2003.
[5] Christopher H. Brooks, Robert Gazzale, Rajarshi Das, Jeffrey O. Kephart, Jeffrey K. MacKie-Mason, and Edmund H. Durfee. Model selection in an information economy: Choosing what to learn. Computational Intelligence, April 2002.
[6] D. Cliff and J. Bruten. Minimal-intelligence agents for bargaining behaviors in market-based environments. Technical Report HPL-97-91, Hewlett Packard Labs, 1997.
[7] Dave Cliff. Evolutionary optimization of parameter sets for adaptive software-agent traders in continuous double auction markets. Technical Report HPL-2001-99, Hewlett Packard Labs, 2001.
[8] John Collins, Corey Bilot, Maria Gini, and Bamshad Mobasher. Decision processes in agent-based automated contracting. IEEE Internet Computing, pages 61–72, March 2001.
[9] John Collins, Wolfgang Ketter, and Maria Gini. A multi-agent negotiation testbed for contracting tasks with temporal and precedence constraints. Int'l Journal of Electronic Commerce, 7(1):35–57, 2002.
[10] Rajarshi Das, James E. Hanson, Jeffrey O. Kephart, and Gerald Tesauro. Agent-human interactions in the continuous double auction. In Proc. of the 17th Int'l Joint Conf. on Artificial Intelligence, Seattle, WA, USA, August 2001.
[11] Keith Decker. TAEMS: A framework for environment centered analysis and design of coordination mechanisms. In Foundations of Distributed Artificial Intelligence, pages 429–448, January 1996.
[12] Joan Morris DiMicco, Amy Greenwald, and Pattie Maes. Dynamic pricing strategies under a finite time horizon. In Proc. of the ACM Conf. on Electronic Commerce (EC'01), October 2001.
[13] Stephanie Forrest. Genetic algorithms: Principles of natural selection applied to computation. Science, 261:872–878, 1993.
[14] Jeffrey O. Kephart, James E. Hanson, and Amy R. Greenwald. Dynamic pricing by software agents. Computer Networks, 32(6):731–752, 2000.
[15] Janet T. Landa. Bioeconomics of some nonhuman and human societies: new institutional economics approach. In Journal of Bioeconomics, pages 95–113. Kluwer Academic Publishers, 1999.
[16] Victor Lesser, Bryan Horling, Frank Klassner, Anita Raja, Thomas Wagner, and Shelley XQ. Zhang. BIG: A resource-bounded information gathering and decision support agent. Artificial Intelligence, 118(1–2):197–244, May 2000.
[17] Kristian Lindgren. Evolutionary dynamics in game-theoretic models. In The Economy as an Evolving Complex System II, pages 337–367, 1997.
[18] Richard R. Nelson. Recent evolutionary theorizing about economic change. Journal of Economic Literature, 33(1):48–90, March 1995.
[19] Sunju Park, Edmund H. Durfee, and William P. Birmingham. An adaptive agent bidding strategy based on stochastic modeling. In Proc. of the Third Int'l Conf. on Autonomous Agents, 1999.
[20] Steve Phelps, Peter McBurney, Simon Parsons, and Elizabeth Sklar. Co-evolutionary mechanism design: a preliminary report. In Workshop on Agent Mediated Electronic Commerce, Bologna, Italy, July 2002.
[21] Charles Phillips and Mary Meeker. The B2B internet report – Collaborative commerce. Morgan Stanley Dean Witter, April 2000.
[22] David Rode. Market efficiency, decision processes, and evolutionary games. Department of Social and Decision Sciences, Carnegie Mellon University, March 1997.
[23] Thomas C. Schelling. Micromotives and Macrobehavior. W. W. Norton & Company, Inc., 1978.
[24] Peter Stone. Multiagent competitions and research: Lessons from RoboCup and TAC. In The 6th RoboCup International Symposium, 2002.
[25] TAC-02. Trading agent competition 2002. http://www.sics.se/tac/, 2002.
[26] Leigh Tesfatsion. Agent-based computational economics: Growing economies from the bottom up. Artificial Life, 8(1):55–82, 2002.
APPLICATION CHARACTERISTICS MOTIVATING ADAPTIVE ORGANIZATIONAL CAPABILITIES WITHIN MULTI-AGENT SYSTEMS
K. Suzanne Barber and Matthew T. MacMahon
The Laboratory for Intelligent Processes and Systems
The University of Texas at Austin
201 East 24th St., ACES 5.402, Austin, TX 78712
Phone: +1 (512) 471-6152
[email protected]
Abstract: Both theory and practice show that no single way of organizing always performs best; the best organization depends on context. Therefore, a group should adapt how it interacts to fit the situation. A Multi-Agent System (MAS) is a group of distributed software programs, called agents, which interact to achieve a goal. This research defines a Decision-Making Framework (DMF) as the allocation among agents of decision-making and action-execution responsibilities for a set of goals and a situational condition. An agent's ability to perform goal-driven, situation-driven adaptation of a DMF is called Adaptive Decision-Making Frameworks (ADMF). The roles the agents play in the Decision-Making Framework are called their Decision-Making Interaction Styles. This analysis scrutinizes how combinations of Decision-Making Interaction Styles and Decision-Making Frameworks affect the performance of individual agents and groups of agents in accomplishing an inter-dependent set of goals in a shared simulation under a diverse set of environmental conditions. The findings are explained in text and in a rich data visualization. The analysis details the effects of individual interaction styles, the formation of different combinations of Decision-Making Frameworks, and factors such as group size and makeup across a dynamically changing environmental simulation. The analysis illuminates the relationship between decisions about team formation and subsequent planning.
Key words:
Agent Organization, Model Sharing, Planning
1. INTRODUCTION
An agent is a system able to sense its environment, reactively or deliberatively plan actions given the inferred environmental state and a set of goals, and execute actions to change the environment [1]. Agents in a Multi-Agent System (MAS) can interact in sensing, planning, and executing. The study of Multi-Agent Systems examines the individual agent- and system-level behaviour of interacting software agents. A Multi-Agent System decision-making organization is an instantiated set of agents working together on a set of goals, the (explicit and implicit) decision-making protocols used to select how to accomplish the goals, and the coordination protocols used to execute the decided actions. That is, any given decision-making organization is a particular set of agents using particular protocols to decide and enact a particular set of goals. The term organization will be used in this paper as shorthand for Multi-Agent System decision-making organization. This paper empirically examines the relative importance of, and relationships between, several key design decisions regarding MAS decision-making organizations and the utility and performance of those organizations. This work is a natural extension of previous work [2], refining experiments to clarify the impact of previously indistinguishable factors. Martin ran experiments controlling the set of Decision-Making Frameworks, the number of agents, the difficulty of the domain problem, communication availability and the ability to sense distant environmental conditions. This work investigates the impact of the organization strategies for model sharing, sensing, and planning. A Decision-Making Framework (DMF) specifies how agents work together to decide and execute a given set of goals. A particular DMF representation has been previously defined as an assignment of variables in three sets, (D, C, G) [2]. This representation models a set of agents D deciding a set of goals for another, controlled, set of agents C, which are bound to accept sub-goals to accomplish the goal set G. Agents form a DMF for one or more goals, and an agent may participate in multiple DMFs for different goals simultaneously. This model specifies the agent's decision-making interaction style, controlling how that agent participates in the decision-making process for some goal set. For example, a "master/command-driven" DMF in which an agent, Agent1, acts as the deciding agent and other agents, Agent2 and Agent3, are controlled by Agent1 to accomplish a goal, Goal1, would be represented by the assignment (D = {Agent1}, C = {Agent2, Agent3}, G = {Goal1}). The set of DMFs covering all goals in the system is the Global
Decision-Making Framework, denoted GDMF. The GDMF is the set of (D, C, G) DMF assignments such that every goal is in exactly one DMF, though one (D, C, G) assignment may apply to multiple goals. In Martin's implemented DMF, each agent in D has exactly one vote to approve one of a set of proposed assignments of sub-goals to agents in C to achieve G [2]. A proposal with a plurality of votes is implemented, and the agents in C are equally bound to accept the sub-goals chosen by the agents in D. The voting scheme is not explicitly represented, but all multi-agent DMFs to date have used the same plurality voting. This analysis considers three Decision-Making Interaction Styles (DMIS) describing how an individual agent participates in a DMF:
Command-Driven (CD) – The agent does not make decisions about how to pursue this goal set and must obey orders from its Master agent(s).
Consensus (CN) – The agent works as a team member, sharing decision-making control and acting responsibility equally with all agents in the DMF.
Locally Autonomous (LA) / Master (M) – The agent alone makes decisions for these goals. Masters give other agents orders, while Locally Autonomous agents act alone.
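The (D, C, G) representation is compact enough to express directly in code. A minimal sketch follows, with class and field names of our own choosing:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class DMF:
        """A Decision-Making Framework: deciders D vote on sub-goal
        assignments that controlled agents C are bound to accept for goals G."""
        deciders: frozenset      # D: agents with one vote each
        controlled: frozenset    # C: agents bound by the plurality decision
        goals: frozenset         # G: goals covered by this framework

    # The master/command-driven example from the text.
    master_cd = DMF(deciders=frozenset({"Agent1"}),
                    controlled=frozenset({"Agent2", "Agent3"}),
                    goals=frozenset({"Goal1"}))

    # A GDMF partitions the system's goals across a set of DMFs.
    gdmf = [master_cd]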
In this paper, a Master or M agent means an agent using the Master Decision-Making Interaction Style for the goals under consideration; the same convention holds for Locally Autonomous (LA), Consensus (CN) and Command-Driven (CD) agents. A single Decision-Making Framework is composed of a coherent set of individual Decision-Making Interaction Styles for all participating agents (e.g., Master/Command-Driven or all-Consensus frameworks). A Global Decision-Making Framework (GDMF) is a partition of the system's goals and agents into DMFs so that, at any time, each goal is in exactly one DMF. Many Multi-Agent Systems cannot change how their agents interact to make decisions; Multi-Agent Systems with one fixed DMF have a Static Decision-Making Framework. Agents that can change DMFs have the Adaptive Decision-Making Frameworks (ADMF) capability. ADMF is a
search through the space of possible Decision-Making Frameworks for one predicted to perform well given the system's goals, agent capabilities, and perceived environmental factors. Martin showed that adapting the system's Decision-Making Frameworks to the perceived situation improves system performance [2]. Decision-Making Framework performance can be influenced by factors both internal and external to the agent and the agent system. Martin's experiments explored the effects of factors external to the MAS, including domain problem difficulty, the state of the exogenous environment, and the number of other active DMFs. This work investigates how design choices related to the MAS's internal operations impact decision-making performance. We define the Model-Sharing Framework as the pattern of world model sharing internal to the Multi-Agent System. Specifically, the full Model-Sharing Framework is a two-dimensional matrix of which agents are willing to share aspects of their world model with each other agent in the system. This paper examines Model-Sharing Frameworks at a coarser level of granularity: an agent either does not share models at all or shares only within its Decision-Making Framework. The experiments in this paper investigate how Decision-Making Framework performance is influenced by which agents share their world models (the Model-Sharing Framework) and by several planning parameters. Computational experiments evaluate the performance effects of sharing models within and across Decision-Making Frameworks. The tested planning parameters include decision procrastination and risk aversion. The following analysis examines the tradeoffs agents in a MAS make between their individual performance on their own goals and the performance of the group on system goals. The analysis scrutinizes data recorded in an experiment that tested the performance differences of different Global Decision-Making Frameworks (GDMFs) under dynamically changing situations [2]. The original experiment focused on the overall performance of the multi-agent system in each GDMF; this analysis focuses on how different Decision-Making Interaction Styles affect performance within the GDMFs across those changing situations. The resulting data can serve as a repository of empirical evidence about the impact of key design parameters (model sharing, sensing and planning strategies) on the performance of decision-making organizations. Even if the specific quantification of impact varies across domain applications, the parameters are persistent; designers must consider the association between these parameters and choices of decision-making organizations.
2. DECISION-MAKING FRAMEWORK EXPERIMENTS
The primary thrust of the experimental work is to disentangle the factors influencing DMF performance and consequently use that knowledge to design improved static DMFs or better algorithms for deploying Adaptive Decision-Making Frameworks. The Decision-Making Framework (DMF) representation (D, C, G) specifies only which agents vote on proposed sub-goal assignments to which set of agents to achieve which set of goals. The DMF representation does not specify factors such as which agents communicate their world models (the Model-Sharing Framework), how agents coordinate planning and action within a DMF, and how agents plan while accounting for the actions of agents outside their DMF. Intuitively, these factors should affect the performance of a group of agents in a DMF; work examining other human and software organizations also suggests the interaction of these factors [3]. The following experiments investigate how Decision-Making Framework performance is influenced by which agents share their world models (the Model-Sharing Framework) and by several planning parameters. Computational experiments evaluate the performance effects of sharing models within and across Decision-Making Frameworks. The tested planning parameters include decision procrastination and risk aversion. These experiments control the following parameters associated with model sharing and planning. The effect of the Model-Sharing Framework is investigated by controlling (1) whether agents share world models within their DMF and (2) whether agents can sense the pertinent information directly from the environment. The planning parameters examined are (1) the probability that agents will choose not to plan during a processing phase, given an unachieved goal (decision procrastination) and (2) the probability that agents will choose to implement a solution predicted to worsen performance given their world model (risk aversion). Table 1 outlines the design parameters investigated. Section 2.1 reviews the experiment domain, Section 2.2 details the effect of the agents' interaction styles in this domain, Section 2.3 reviews the experimental testbed, and Section 2.4 elaborates on the meaning of and motivation for each experimental variable (i.e., design parameter).
2.1 Experiment Domain
The experiment domain is naval radar interference management, where agents attempt to minimize radar interference. An agent controls the radar frequency on each simulated ship and attempts to choose frequencies not in use by other radars in the system. Radar interference primarily occurs when two radars operate in close geographical proximity at
similar frequencies. The agents must use frequency management instead of position control to manage radar interference. Each agent has one goal, "Minimize radar interference." Agents evaluate performance for this goal using the metric of average system interference, that is, the mean interference of all agents in the environment, including agents in other DMFs. Static allocation of frequency assignments may not work because the interference level is affected by changing variables (e.g., the ships' relative positions, weather, and friend or foe emitters in the environment). An agent-based approach suits this problem because: (1) information and control are distributed, in that each ship's frequency settings, interference, position information, and frequency-controlling actuators are only directly accessible on that ship, and (2) actions are interdependent, which encourages collaboration through information sharing and coordinated planning.
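The chapter does not give the interference model itself, but a plausible pairwise metric consistent with the description (interference grows as radars get closer in space and in frequency, with difficulty scaling with the square of ship separation) can be sketched as follows; the functional form and the bandwidth parameter are purely our assumptions.

    def pairwise_interference(f1, f2, distance, bandwidth=5.0):
        """Hypothetical interference between two radars: strong when the
        frequency gap is within `bandwidth` and falling off with the
        square of the distance between the ships."""
        freq_overlap = max(0.0, 1.0 - abs(f1 - f2) / bandwidth)
        return freq_overlap / max(distance, 1.0) ** 2

    def mean_system_interference(ships):
        """ships: list of (frequency, (x, y)) tuples. Average over all pairs."""
        total, n = 0.0, len(ships)
        for i in range(n):
            for j in range(i + 1, n):
                (f1, p1), (f2, p2) = ships[i], ships[j]
                d = ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5
                total += pairwise_interference(f1, f2, d)
        return total / max(n, 1)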
2.2 Implementation of Decision-Making Interaction Styles
In this experiment, the Decision-Making Interaction Styles are implemented by altering the procedure that picks a set of frequency allocations, but not the underlying frequency selection algorithm. In prior work, the Decision-Making Interaction Style also altered when and how the agents proposed frequency allocations, in addition to how one set of allocations was chosen to be implemented by the group [2]. In this experiment, the difference between Decision-Making Interaction Styles is reduced to how the agents interact over the course of processing phases. During a single processing phase, a planner may attempt to determine an acceptable way to achieve its goals, one batch of messages may be sent and received, and one frequency-setting action may be taken. The DMIS of an agent constrains how that agent interacts with other agents in its DMF, if any. The following list enumerates how the DMIS constrains action selection in this experiment.
Locally Autonomous: Locally Autonomous agents do not explicitly plan to better their own interference, just the system's average interference. A Locally Autonomous agent can act every processing phase.
Master/Command-Driven: A Master searches for an interference-free frequency setting for each agent in turn in every processing phase in which any agent in its DMF experiences interference. If the Master finds a frequency allocation that produces equal or better system interference, it changes its own frequency, if necessary, and sends frequency change commands to its Command-Driven agents. During the next processing phase, the Command-Driven agents can act.
Consensus: Agents in a Consensus DMF act in three processing phases. First, if any agent in the DMF experiences interference, all CN agents search for an interference-free frequency setting for each DMF agent in
turn, and each agent sends its plan to the other DMF agents. In the second processing phase, the agents update their local models with any communicated information and vote for the solution set predicted to produce the lowest system interference. If two solutions tie, the agents pick the solution proposed by the agent with the lowest agent identification number. If no proposed solution is predicted to be better than the status quo, the agents do not act. In the third processing phase, the CN agents tally the votes and enact the most popular proposal; if there is a tie in the vote, the agents choose not to act. Locally Autonomous and Master agents can thus act every processing phase; Command-Driven agents can act every two processing phases (one for the Master to generate and transmit the solution and one to act); and Consensus agents can act every three (one to generate proposals, one to vote, and one to act). In situations where agents believe there is interference but have no position information about other agents, all implementations act as if all other agents were so far away that no interference was possible, in other words, as if any action were better than the status quo.
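A compact rendering of the consensus vote described above, with function names of our own choosing; the tie-breaking rules follow the text (predicted-interference ties go to the lowest proposer ID, vote-count ties mean no action).

    from collections import Counter

    def pick_vote(proposals, predict):
        """Each agent votes for the proposal with the lowest predicted system
        interference; ties go to the proposal from the lowest agent ID.
        proposals: {agent_id: frequency allocation}."""
        return min(proposals, key=lambda a: (predict(proposals[a]), a))

    def tally(votes):
        """votes: list of proposer IDs voted for. Returns the winning
        proposer ID, or None when the top vote counts tie (no action)."""
        counts = Counter(votes).most_common()
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            return None
        return counts[0][0]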
2.3 Experiment Testbed
These experiments use the Sensible Agents architecture for all agents in the system, and leverage the facilities of the Sensible Agent Run-Time Environment (SARTE) and Testbed [4]. The Sensible Agent Testbed provides a set of tools for system configuration and testing, including automated generation of parameter combinations for experimental runs, both deterministic and non-deterministic simulation, and highly configurable data acquisition facilities. Decker provides a good overview of the state of the art in agent testbeds in [5], describing comprehensive testbeds such as DVMT, MACE, MICE, and Tileworld. DVMT [6] was a groundbreaking testbed, but it is limited to a distributed vehicle monitoring domain and is no longer developed. MACE [7] is a domain-neutral development framework with some evaluation facilities. MICE [8] is a generalization of the ICE domain built on top of MACE. Tileworld [9] is a simple architecture-neutral domain that can be used for controlled experiments. The CLIP and CLASP packages provide excellent evaluation facilities for agent designers [10]. The Phoenix testbed and agent framework [11] and the domain facility [12] use CLIP/CLASP for data collection and analysis. The main advantage of the Sensible Agent Testbed over CLIP/CLASP is that the SA Testbed can be used with any language that has a CORBA® library, while CLIP only works with LISP-based programs. For analysis, we use MATLAB®, although CLASP appears to be a viable alternative.
Hanks, Pollack, and Cohen give an enlightening discussion of the challenges of creating valid experiments for agent systems in simplified environments, drawing examples from the Phoenix and Tileworld testbeds [13]. The Sensible Agent Testbed fulfils their requirements for a planning testbed because it has a clean, well-defined interface, an explicit, discrete model of time, and parameterizable controls for experimentation. The SARTE environment likewise supports exogenous events, arbitrarily complex worlds, costly and inaccurate sensing, multiple agents, and the recording of performance information, including the partial achievement of goals. The other subsections of Section 2 describe why the chosen domain illuminates interesting aspects of Multi-Agent System behaviour and provide implementation information so the reader can decide if and how the concepts empirically tested here generalize beyond the testbed domain.
2.4 Experimental Variables
These experiments disentangle the influences of the planning and model-sharing factors on Decision-Making Framework performance. As noted above, the (D, C, G) representation found in [2] does not specify which agents communicate their world models or how agents plan and coordinate within and across DMFs; intuitively, these factors should affect the performance of a group of agents in a DMF [3]. This work extends previous experiments on ADMF [2] by explicitly controlling for the effects of these design parameters. How does the Model-Sharing Framework affect the performance of Decision-Making Frameworks? In prior experiments, only agents using the Consensus Decision-Making Interaction Style shared world models with one another. For some problems, the information needed to make good decisions can only be sensed locally, so agents that share information have an advantage over those that do not, regardless of the Decision-Making Framework. Therefore, to understand the performance characteristics of DMFs of agents using the Consensus DMIS, we need to separate the decision-making advantages of cooperative DMFs from the model-sharing advantages. Since agents in a Consensus DMF share some information about their views of the world implicitly via their votes, they may actually benefit less from sharing their world models than other types of DMFs. On the other hand, getting "on the same page" by sharing models may be essential to forming a valid consensus decision.
This experiment controls both model sharing within Decision-Making Frameworks and position sensing of other agents. These two parameters control the two ways agents can gain information about the state of the world: sensing and communication. The Model-Sharing Framework is logically independent of the Decision-Making Framework: the allocation of decision-making control does not imply the pattern of model sharing, and vice versa. At the same time, a system performance correlation may exist between the system's Model-Sharing Framework and its Global Decision-Making Framework. On one hand, there are obvious benefits to a controlled agent sharing information with the agents allocating it sub-goals. On the other hand, for tasks with interfering actions, all surrounding agents' actions affect performance, so giving information to all affected agents should be beneficial. In fact, an agent notifying its neighbours of its intentions may make more expensive, explicit action coordination unnecessary. An example of the utility of publishing information outside of a decision-making group is open-bid auctions, as opposed to closed-bid auctions, where the bids are secret until all are submitted [14] [15]. In previous work, different Decision-Making Frameworks used planning algorithms with different parameters [2]. Specifically, some versions were more reluctant to act; that is, some had a chance of not planning even when they had unsatisfied goals. Another difference in the planners between the DMF conditions was varying levels of risk aversion – whether the planner will sometimes take actions not believed to work towards the goal. These experiments control the planning parameters to examine the interaction between planning parameters and the utility of DMFs. The changes in the planning algorithm were made assuming that some DMFs would benefit more from them than others, and this work tests those assumptions empirically. Each parameter and its motivation is further explained in the following paragraphs. Let us call a bias not to plan for an unsatisfied intended goal decision procrastination. Decision procrastination may be a rational strategy when the expected value of any action is negative or only marginally positive. A particular instance of this is when agents' actions are likely to interfere with each other and many agents are acting independently. In this case, the fewer uncoordinated actors in the environment, the better the system should perform. Therefore, decision procrastination may be a good strategy in the naval radar frequency management domain when the agents are close to one another and more than one DMF is acting. This experiment examines when agents should be prone to procrastinate in relation to problem difficulty. Three levels of decision procrastination are tested for all types of Decision-Making Frameworks. The decision
procrastination level is implemented as the percentage of time that agents choose not to act even with unsatisfied goals. We predict that high levels of decision procrastination will be beneficial in situations where the actions of one agent may interfere with the performance of others and there are many uncoordinated actors. Let us call a bias against actions predicted to have negative utility risk aversion. An agent with high risk aversion acts cautiously, given its model of the world. On the other hand, low risk aversion, a propensity to take actions with predicted negative consequences, may be a rational strategy in situations where agents make low-quality predictions. For instance, when agents have very incomplete or erroneous world models, taking exploratory or blind actions may give the agent better, more recent information; sometimes the agent will even randomly walk into a better situation. How does performance vary with the level of risk aversion and the quality of the local model? The quality of the local perspective model is represented in this experiment by the presence or absence of position sensing and the number of acting DMFs in the environment. In this experiment, the level of risk aversion is varied against all possible DMFs in a variety of situations. We predict that risk aversion will be beneficial when agents do not or cannot share models and do not sense enough pertinent information.
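Interpreted as gates in each agent's processing phase, the two planning parameters might look like the following sketch. The parameter names and the agent methods are ours; the surrounding planner is elided.

    import random

    def maybe_act(agent, procrastination=0.10, risk_seeking=0.25):
        """One processing phase of a planner with the two tested biases.
        decision procrastination: skip planning altogether this phase;
        risk seeking (the inverse of risk aversion): enact a plan even
        when it is predicted to worsen system interference."""
        if agent.has_unsatisfied_goals() and random.random() < procrastination:
            return None                      # procrastinate: no plan this phase
        plan = agent.search_frequencies()    # hypothetical planner call
        if agent.predicted_interference(plan) >= agent.current_interference():
            if random.random() >= risk_seeking:
                return None                  # risk-averse: reject a worse plan
        return plan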
Finally, the composition of different DMFs, the Global Decision-Making Framework, was investigated. For five agents, the following GDMFs were tested, using the configurations in Table 2: (1) all Locally Autonomous (all LA); (2) one Master and the other four agents Command-Driven (M/CD); (3) all agents working in Consensus (all CN); (4) one Master controlling two Command-Driven agents, with the remaining two agents each Locally Autonomous (M/CD/LA); and (5) one DMF of a Master and two Command-Driven agents together with a DMF of two agents working in Consensus
with one another (M/CD/CN). Configurations (6) through (9) repeat these compositions with different allocations of the roles to agents. This experiment was designed to quantify the overall performance differentials between DMFs.
In previous work, different Decision-Making Frameworks also used planning algorithms with different parameters: a group of five agents each independently using the Master or Consensus planning algorithm would act differently from five agents each using the Locally Autonomous planning algorithm. Another difference from Martin's experiments is the selection of initial frequency allocations for the ships. Martin randomly generated a set of allocations and used them throughout her experiments. For these experiments, initial frequency allocations were crafted with specific characteristics to ensure that corner cases would be tested while maintaining fairness. The allocations used are listed in Table 3. The agents in this experiment are arrayed sequentially in a circular formation, with Agent 1 next to Agent 5 and Agent 2. Each individual agent should experience, in aggregate, equally bad initial interference. Therefore we use the following distributions:
1. Even Low: evenly distributed with low interference. Each agent is equally far away in the frequency spectrum from its immediate neighbours, within 1300-1350.
2. Even High: evenly distributed with high interference. Each agent is equally close in the frequency spectrum to its immediate neighbours, within 1300-1350. This uses the same frequency assignments as Even Low, but shuffled.
3. Pessimal: all agents have the same initial frequency allocation of 1300.
4-7. Clumpy 1-4: two distinct groups, one at low frequencies and one in the middle of the frequency range, rotated across the four variants for fairness.
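The following sketch generates allocations in the spirit of the distributions listed above; the exact frequencies of Table 3 are not reproduced here, so the numbers below are illustrative only.

    import random

    def even_low(n=5, lo=1300, hi=1350):
        """Evenly spaced frequencies, so circular neighbours are far apart."""
        step = (hi - lo) / (n - 1)
        return [lo + i * step for i in range(n)]

    def even_high(n=5, lo=1300, hi=1350, seed=0):
        """Same frequency set as even_low, reordered (here: shuffled) so
        circular neighbours tend to end up close in frequency."""
        freqs = even_low(n, lo, hi)
        random.Random(seed).shuffle(freqs)
        return freqs

    def pessimal(n=5, f=1300):
        """All ships start on the same frequency."""
        return [f] * n

    def clumpy(n=5, rotation=0, low=1300, mid=1325):
        """Two frequency clumps, rotated around the circle for fairness."""
        base = [low] * (n // 2) + [mid] * (n - n // 2)
        return base[rotation:] + base[:rotation]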
Additionally, previous work on Adaptive Decision-Making Frameworks examined only how ADMF affects the behaviour of the system as a whole, not the performance of individual agents. An organization may increase the aggregate performance of the system while decreasing the performance of some members; other organizations improve the performance of all members equally.
3. EXPERIMENT RESULTS
This section covers the findings of the experiment. Section 3.1 details the composition of the graphical results figures. Sections 3.2, 3.3, and 3.4 examine the individual interactions of the controlled design parameters with the Global Decision-Making Framework and the environmental situation. Section 3.5 examines the effect of altering the GDMF of the Multi-Agent System once the design parameters have been set. The following discussion uses game-theoretic language. For one GDMF to dominate another means that each agent in the dominating GDMF performs as well as or better than every agent in the dominated GDMF. Likewise, for an interaction style to dominate, each agent using that style must perform as well as or better than agents using all other styles. The following sections explore the performance characteristics of the different Decision-Making Interaction Styles.
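Since interference is a cost, "performs as well or better" means lower or equal mean interference; under that reading, the dominance test encodes directly as:

    def dominates(gdmf_a, gdmf_b):
        """gdmf_a, gdmf_b: lists of per-agent mean interference values.
        GDMF A dominates GDMF B when every agent under A does at least as
        well (has no higher interference) as every agent under B."""
        return max(gdmf_a) <= min(gdmf_b)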
3.1 Explanation of Graphical Conventions
The figures with individually plotted points (Figure 2, Figure 4, Figure 6, Figure 8, and Figure 9) in this section show the mean interference on a logarithmic
scale for each of the five agents in nine different Global Decision-Making Frameworks, plotted against the distance of the ships from the centre of the formation. "Average interference" is the mean across twenty-eight experimental runs (seven initial frequency allocations by four planning seeds) of the mean interference over time during each run. Since the ships were tested at a discrete set of distances, GDMF performance is plotted side by side within a bin for each tested distance. The placement of symbols along the x-axis within a bin represents the Global Decision-Making Framework tested, with the order from left to right corresponding to the order in Table 2. The x-axis location of the bin is the distance of the agents from the centre of the group formation. The y-axis is the mean system interference, so a lower placement indicates lower interference and better performance. The logarithmic scale of interference is appropriate because the difficulty of interference management scales with the square of ship distance from the centre of the group. The other figures in this section (Figure 3, Figure 5, and Figure 7) plot the statistical significance of the interaction between the Global Decision-Making Framework, on the x-axis, and the distance of the agents from the centre of the group, on the y-axis. The tint of each square indicates the value of the two-sided T test for significance for that combination of GDMF and agent spacing. The lighter the tint, the greater the certainty of a statistically valid difference between the performance of the two values of the experimental parameter indicated in the caption and accompanying text.
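Such a significance map could be produced with a standard two-sided t test per (GDMF, radius) cell, for example as below; the data layout is an assumption on our part.

    import numpy as np
    from scipy.stats import ttest_ind

    def significance_map(runs_a, runs_b):
        """runs_a[g][r], runs_b[g][r]: per-run mean interference samples for
        the two levels of a parameter, for GDMF g at radius r. Returns a
        matrix of two-sided t statistics to plot as tints."""
        n_gdmfs, n_radii = len(runs_a), len(runs_a[0])
        tmap = np.zeros((n_gdmfs, n_radii))
        for g in range(n_gdmfs):
            for r in range(n_radii):
                t, _ = ttest_ind(runs_a[g][r], runs_b[g][r])
                tmap[g, r] = t
        return tmap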
3.2 Model Sharing and Position Sensing
Regardless of how the agents obtain information about one another – whether it is sensed or shared – the same pattern is exhibited. Averaged across all other conditions, knowing the exact positions of the other ships does not significantly help when the ships are closely packed. Position information only helps when the ships are far enough apart that the frequency selections of nearby ships affect performance significantly more than those of relatively distant ships. Figure 2 shows the effect of the presence or absence of the ability to instantaneously sense the positions of the other ships. The Ys represent performance on experiment runs using position sensing and the Ns represent performance without position sensing. Within a bin, the performances of the nine Global Decision-Making Frameworks tested are shown, and each bin plots one radius of separation between the ships.
At zero and low separation between the ships, position sensing may even be a hindrance, which reveals a flaw in the naïve frequency selection algorithm used. As the separation between ships increases, positioning information becomes more significant in picking low-interference frequency combinations. However, at the furthest tested distance, with a radius of 74 between the ships and the group centre, position information is again less important. This U-shaped curve of influence can be explained by noting that when all ships are very close or very far apart, the exact relative distance to any one of them does not matter. When some ships are relatively close while others are far, the close ships interfere much more; only with such uneven interference sources is position sensing useful. The plot for the agents sharing their own models of ship positions is omitted for space considerations. The pattern is the same as in Figure 2, but model sharing becomes beneficial at greater radii and with less statistical significance than position sensing.
Figure 3 highlights the statistically significant differences in performance for each GDMF at each distance, averaged across all other experimental conditions. Light shades are large positive differences, dark shades are large negative differences, and intermediate shades are less significant differences. In Figure 3 the 1st (All Locally Autonomous) and 3rd (All Consensus) GDMFs do not show a statistically significant change in performance based on whether position sensing is available. All other GDMFs show a strong positive performance improvement except at the broadest ship position radius. These two GDMFs share the most and the least decision-making control across agents: the All Locally Autonomous GDMF cannot coordinate at all, even given good information, while the All Consensus GDMF can coordinate to ensure no interference even in the absence of non-local position information.
3.3 Decision Procrastination
Decision procrastination was modelled in this experiment as the percentage of time an agent would choose not to plan regardless of the situation. Three levels were tested: 0%, 10% and 25%. As Figure 4 shows, averaged over all other conditions, the medium, 10% level of procrastination is never significantly outperformed, so it dominates in this experiment. For this experiment with five agents, at the 10% decision procrastination rate we expect one randomly selected agent to refrain from planning roughly every other time step (5 × 0.1 = 0.5 non-planning agents per step). At the 0% rate, all agents must act all of the time, and at the 25% rate (5 × 0.25 = 1.25) at least one agent will usually not plan
to act in any given time step. The 10% rate is therefore a compromise: it reduces the dynamism of the environment by reducing the number of actions taken, while not over-constraining the potential to act.
This tradeoff can be seen in the qualitatively different performance curves for the 0% and 25% rates, shown respectively as 0 and 2 in Figure 4. The high and zero levels of decision procrastination flip-flop in performance. In the tightest and most spread-out formations, the 25% level of decision procrastination significantly outperforms the 0% level for all Global Decision-Making Frameworks. However, at intermediate radii of separation, agents acting with no procrastination perform as well or significantly better (at the 65 radius) than the high-procrastination agents. This can be explained by considering when large gains are likely. At the smallest and largest radii, not many actions will result in better performance; each agent that stays dormant reduces the number of active agents, thus easing the planning problem for the others. At intermediate distances, agents are less constrained by circumstance, so more acting agents do not hurt.
Figure 5 verifies that the observed change in behaviour as the average distance between ships increases is statistically significant. Across all GDMFs, while there is strong significance even at middle radii (or problem difficulties), the most significant results occur at the extrema of 0 radius and 74 radius. The light bars at the top and bottom of the figure indicate that 0% decision procrastination differs very significantly in performance from 10% decision procrastination across all Global Decision-Making Frameworks at the smallest and largest ship separation radii. The t-value plot for 0% versus 25% decision procrastination is similar but not quite as striking.
3.4 Risk Aversion
Risk aversion was represented in this experiment as the percentage of time an agent would enact a frequency selection it believed would produce worse interference. Averaged across all other conditions, we again see a crossover effect. At low radii, the highest, 25%, level of risk-seeking behaviour performs best. Conversely, in the most spread-out formations, the most risk-averse agents perform best by never selecting a solution predicted to be worse. These results, visualized in Figure 6, were as expected: when agents are close together their interference is naturally high, so poorly thought-out actions are not likely to worsen the situation. At the other extreme, when agents are far apart, few frequency selections will significantly interfere, so those few should be scrupulously avoided.
Figure 7 shows the two-sided t-values for the difference in mean performance between the 10% and 25% risk aversion levels, for a given Global Decision-Making Framework with ships at a given distance. Light tints indicate high significance in affecting performance; dark tints indicate no significant relationship. Interestingly, although GDMF 1 consists entirely of agents using the Locally Autonomous Decision-Making Interaction Style, GDMFs 4 and 8, which mix Master/Command-Driven and Locally Autonomous agents, show a noticeably more significant relationship between the risk aversion level and average system performance.
3.5 Overall Effect of Decision-Making Interaction Style and Global Decision-Making Framework

Averaged across all other factors, there is a small but significant effect of the Global Decision-Making Framework of the Multi-Agent System and the Decision-Making Interaction Style of the individual agents. However, this effect is small compared to the other tested factors. If one holds the other parameters constant, the Global Decision-Making Framework can have a large effect. For instance, Figure 8 and Figure 9 show the differences in performance for GDMFs operating with different planning parameters.
In Figure 8 and Figure 9, shape represents individual interaction styles and tint (or order within a bin) represents the Global Decision-Making Framework. One marker shape denotes a Consensus agent, * a Command-Driven agent, o a Locally Autonomous agent, and a fourth shape a Master agent. The tint and location on the x-axis within a “bin” represent the Global Decision-Making Framework: furthest left is all Locally Autonomous agents; the second and last two black GDMFs have one Master and four Command-Driven agents; the third GDMF is all Consensus agents; the fourth and eighth GDMFs have two Locally Autonomous, one Master, and two Command-Driven agents; and the fifth, sixth, and seventh GDMFs have two Consensus, one Master, and two Command-Driven agents. Where GDMFs share the same tint, the GDMFs differ in how the Decision-Making Interaction Style roles are allocated to agents, as shown in Figure 8, above.
This depiction spotlights that the allocation of roles in the GDMF affects the GDMF's performance, in addition to the composition of the GDMF. Thus a system searching for a Global Decision-Making Framework to use in a situation must consider how the Decision-Making Interaction Style roles are assigned to agents in the Multi-Agent System. Here, the agents are homogeneous, so the pertinent factor is whether all agents in a DMF are adjacent, or whether the DMF agents are interspersed among agents in other DMFs.
4. CONCLUSION
This paper presented an analysis of experimental data to determine the tradeoffs for agents and systems of agents in forming collaborative Decision-Making Frameworks (DMF). A Decision-Making Framework identifies the locus of decision-making control for a given goal and the authority of decision-makers to assign subtasks in order to achieve that goal. The roles the agents play in the Decision-Making Framework are called their Decision-Making Interaction Styles. Master agents decide goals for themselves and Command-Driven agents, Consensus agents collaborate equally to decide for the group, and Locally Autonomous agents decide and act alone. This analysis scrutinized how the combinations of Decision-Making Interaction Styles and Decision-Making Framework affected the performance of individual agents and groups of agents in accomplishing an inter-dependent set of goals in a shared simulation under a diverse set of environmental conditions. The relative performance benefits of different Decision-Making Frameworks (DMF) and constituent Decision-Making Interaction Styles (DMIS) change significantly as organizational context changes. Prior work found performance benefits by adjusting the Decision-Making Framework – adjusting the locus of decision-making control for a given goal and the authority of decision-makers to assign subtasks [2]. Within DMFs and within the MAS (within the Global DMF), this work distinguishes and clarifies specific parameters that characterize a decision-making organization and influence the performance of the DMF and the MAS. This research investigated the benefits of adjusting the Model-Sharing Framework (which determines which agents exchange models of themselves and their environments) and the planning parameters of decision procrastination and risk aversion. As game theory has found [16], the benefit of one interaction style is affected by the interaction styles employed by other agents. Some situations have network effects, where additional cooperating agents provide benefit to all, but in other situations, the fewer simultaneous actors the better. None of the interaction styles provides a universal advantage either within or across organizations. This analysis provided an intimate look at the behaviour of an implemented agent system on a difficult non-linear domain problem. This close examination brought to light some areas where the architecture implementation and theory of Decision-Making Frameworks can be expanded. The results of this investigation can further serve as a repository of design heuristics for builders of MAS systems. For instance, ongoing work is examining more egalitarian solution allocation. The theory of Decision-
Making Frameworks focuses finely on the distribution and flow of decision-making control, and needs to be linked with a representation of information exchange in groups of agents. This analysis highlighted the close relation between reasoning about how to form teams of agents and the planning and acting in those teams. Certain interaction styles enable more powerful collaborative information gathering, decision-making, and action execution, while exacting an overhead in time and communication costs. Other interaction styles constrain interaction less and are less overtly restrictive, but they do not permit combining forces, and their actions may interfere, therefore necessitating more cautious planning. These experiments show that adapting the Decision-Making Framework independently of all other parameters of the decision-making organization has a measurable and significant effect on the performance of the Multi-Agent System. The experiment series also shows that adapting the Decision-Making Framework (ADMF) is not the preeminent factor affecting MAS performance. Agents should adapt how they share information and plan before adapting the way they come to group decisions. However, once the Model-Sharing Framework and planning parameters have been adapted to the domain and likely situations, adapting the Decision-Making Framework can still increase system performance. The findings on the relative importance of adapting the Decision-Making Framework should generalize to other Multi-Agent Systems adapting other parameters of the decision-making organization. Intuitively, having the ideal decision-making structure for a situation will not help if the decision-makers collectively do not have good models of the world or good planning algorithms; if a group is made up of foolish and misinformed agents, does it really matter if they work independently, collaboratively, or under a peer dictator? Primarily, groups need adequate information exchange and the cognitive tools to solve the problems faced. Only then does group decision-making organization matter. The decision-making organization structure is simply not the low-hanging fruit in improving the performance of the decision-making system. Moreover, these experiments show that, averaged over all other decision-making organization parameters, the performance differentials between Global Decision-Making Frameworks are small. On the other hand, once the Model-Sharing Framework, the decision-procrastination rate, and the risk aversion level have been optimized, the GDMF can be a large, significant performance factor. This implies that the performance effect of the group's Global Decision-Making Framework is largely a dependent variable of the other factors. The relative performance benefits of GDMFs vary widely across the various combinations of the Model-Sharing Framework, the decision-procrastination rate, and the risk aversion level. Therefore, the GDMF cannot be
optimized independently of other decision-making organization parameters. The best Decision-Making Framework combination for a group of agents depends not only on the number of agents and goals, but also on the details of the planning and information-sharing algorithms the agents use.
5. REFERENCES
[1] G. Weiss, Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. Cambridge, MA: The MIT Press, 1999.
[2] C. E. Martin, “Adaptive Decision-Making Frameworks for Multi-Agent Systems,” dissertation, Electrical and Computer Engineering, University of Texas at Austin, Austin, TX, 2001, 324 pp.
[3] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications. Cambridge, UK: Cambridge University Press, 1994.
[4] K. S. Barber, R. M. McKay, M. T. MacMahon, C. E. Martin, D. N. Lam, A. Goel, D. C. Han, and J. Kim, “Sensible Agents: An Implemented Multi-Agent System and Testbed,” presented at the Fifth International Conference on Autonomous Agents (Agents-2001), Montreal, QC, Canada, 2001.
[5] K. S. Decker, “Distributed Artificial Intelligence Testbeds,” in Foundations of Distributed Artificial Intelligence, Sixth-Generation Computer Technology Series, G. M. P. O'Hare and N. R. Jennings, Eds. New York: John Wiley & Sons, Inc., 1996, pp. 119-138.
[6] V. R. Lesser and D. D. Corkill, “The Distributed Vehicle Monitoring Testbed: A Tool for Investigating Distributed Problem Solving Networks,” AI Magazine, vol. 4, pp. 15-33, 1983.
[7] L. Gasser, C. Braganza, and N. Herman, “MACE: A Flexible Testbed for Distributed AI Research,” in Distributed Artificial Intelligence, M. N. Huhns, Ed. San Mateo, CA: Morgan Kaufmann, 1987, pp. 119-152.
[8] E. H. Durfee and T. A. Montgomery, “MICE: A Flexible Testbed for Intelligent Coordination Experiments,” presented at the 9th International Workshop on Distributed Artificial Intelligence, Rosario, WA, 1989.
[9] M. E. Pollack and M. Ringuette, “Introducing the Tileworld: Experimentally Evaluating Agent Architectures,” presented at the Eighth National Conference on Artificial Intelligence, Boston, MA, 1990.
[10] S. D. Anderson, D. M. Hart, D. L. Westbrook, and P. R. Cohen, “A Toolbox for Analyzing Programs,” International Journal of Artificial Intelligence Tools, vol. 4, pp. 257-279, 1995.
[11] P. R. Cohen, M. Greenberg, D. Hart, and A. Howe, “Trial by Fire: Understanding the Design Requirements for Agents in Complex Environments,” AI Magazine, vol. 10, pp. 33-48, 1989.
[12] K. S. Decker, “TÆMS: A Framework for Environment Centered Analysis and Design of Coordination Mechanisms,” in Foundations of Distributed Artificial Intelligence, Sixth-Generation Computer Technology Series, G. M. P. O'Hare and N. R. Jennings, Eds. New York: John Wiley & Sons, Inc., 1996, pp. 429-448.
[13] S. Hanks, M. E. Pollack, and P. R. Cohen, “Benchmarks, Testbeds, Controlled Experimentation, and the Design of Agent Architectures,” AI Magazine, vol. 14, pp. 17-42, 1993.
[14] P. R. Wurman, M. P. Wellman, and W. E. Walsh, “The Michigan Internet AuctionBot: A Configurable Auction Server for Human and Software Agents,” presented at the Second International Conference on Autonomous Agents, Minneapolis/St. Paul, MN, 1998.
[15] H. R. Varian, “Economic mechanism design for computerized agents,” presented at the USENIX Workshop on Electronic Commerce, New York, NY, 1995.
[16] G. Zlotkin and J. S. Rosenschein, “Mechanisms for Automated Negotiation in State Oriented Domains,” Journal of Artificial Intelligence Research, vol. 5, pp. 163-238, 1996.
APPLYING COORDINATION MECHANISMS FOR DEPENDENCY RELATIONSHIPS UNDER VARIOUS ENVIRONMENTS*

Wei Chen
Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
[email protected]

Keith Decker
Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
[email protected]
Abstract
Coordination is a key functionality in multi-agent systems, and mechanisms for achieving coordinated behaviors have been well studied. One important observation has been that different mechanisms have correspondingly different performance characteristics, and that these can change dramatically in different environments (i.e., no one mechanism is best for all domains). A more recent observation is that one can describe possible mechanisms in a domain-independent way, as simple or complex responses to certain dependency relationships between the activities of different agents. Thus agent programmers can separate encoding agent domain actions from the solution to particular coordination problems that may arise. This paper explores the specification of a large range of coordination mechanisms for the common hard “enablement” (or “happens-before”) relationship between tasks at different agents. It also explores the impact of task environment characteristics on the choice and performance of these mechanisms. Essentially, a coordination mechanism can be described as a set of protocols (possibly unique to the mechanism) and an associated automatic re-writing of the specification of the domain-dependent task (expressed as an augmented HTN). The separation of general knowledge from domain-dependent knowledge is explained, and a general method to address the relationships between application domains and agent coordination is introduced. This paper also presents a concrete implementation of this idea in the DECAF agent architecture, and an initial exploration of the separation of domain action from meta-level coordination actions for eight simple coordination mechanisms.
*This work is supported by NSF Grant No. 9733004.
1. Introduction
Generally speaking, there are three ways to coordinate multi-agent systems: arrange the problem so as to avoid the need for coordination, associate coordination mechanisms with each individual agent, or construct a special agent that functions as a centralized coordinator. All of these solutions have good and bad points, and many possible realizations in different environments. If we are to make progress toward a general theory of coordination, we must look at what information is needed to enumerate and choose between these alternatives. If we are to build intelligent multi-agent systems, we must look at how to represent and reason about this information computationally. If we did this, each agent would be able to learn over time which coordination mechanism is the best for a specific situation, according to its knowledge of its capabilities and its beliefs about other agents in a dynamic environment. Previous approaches to coordination have explored different perspectives. The study of coordination over external resources is popular, and has produced several demonstrations where complex or knowledge-intensive mechanisms can be detrimental for both individuals and a system as a whole [19, 20, 23]. Other work has focussed on particular classes of coordination mechanisms, such as contracting [5] or organizational approaches [22, 25, 15]. However, not all systems depend on external resource coordination alone, and it is not clear how to integrate multiple approaches (since no single approach is likely to be good all the time in many realistic environments). In this paper we present a general approach to exploring the relationships between application domains and agent coordination. An important observation, made independently by Decker [7] and Castelfranchi [1], is that the need to coordinate comes about because of the relationships among agents' goals, the planned tasks to achieve those goals, and required resources (including not only external resources but also private resources and capabilities). Such relationships can be either positive or negative, hard (necessary) or soft. Thus an important way to express coordination mechanisms more generally is to express them with respect to these task interrelationships. Our work involves a general solution to this problem. It is based mainly on two previous achievements, TÆMS and GPGP. TÆMS (Task Analysis, Environment Modeling, and Simulation) [8] proposes a formal approach to representing tasks and their relationships in a domain-independent, quantitative way. It is based on annotating HTNs (Hierarchical Task Networks) [10] with both basic information about the characteristics over which an agent might ex-
press utility preferences (e.g., cost, quality, duration) and also information about how the execution of some action or subtask changes these characteristics at another subtask—thus setting up the potential for some coordination to occur (or not). The problem is not only to represent potential coordination in a general way, but to develop an algorithmic approach by which many different coordination mechanisms can be integrated. GPGP (Generalized Partial Global Planning) [7] is a domain-independent, extendible approach for expressing coordination mechanisms. Although inspired by PGP, GPGP is not tied to a single domain, allows more agent heterogeneity, has a wider variety of coordination mechanisms, and uses a different method of mechanism integration. The key to the integration used by GPGP is to assume each agent is capable of reasoning locally about its schedule of activities and possible alternatives. Thus coordination mechanisms of many different styles can be thought of as ways to provide information to a local scheduler that allow it to construct better schedules. This is especially useful in practice, where considerable work has been done in tuning an agent's local scheduler for a particular domain. Furthermore, it has been shown that it is “... useful to separate domain-specific problem solving and generic control knowledge” [4]. Our approach allows for such a separation of concerns: the behaviors for domain problem solving versus the behaviors for coordination. This paper concerns an implementation of these ideas using DECAF (Distributed Environment Centered Agent Framework). DECAF [14] is an architecture and set of agent construction tools for building real (non-simulated) multi-agent systems, based on RETSINA [9]. DECAF is being used to build applications in electronic commerce, bioinformatics, and plant alarm management. Unlike many API-based agent construction toolkits, DECAF provides an operating system for each agent—a comprehensive set of agent support behaviors to which the casual agent programmer has strictly limited access. The GPGP module described here is one of these agent OS-level subsystems. Section 2 analyzes the hard dependency relationship, “enables”. Section 3 describes eight specific coordination mechanisms in detail. Section 4 discusses the implementation of the mechanisms in the DECAF architecture. Section 5 states some initial experimental results. Section 6 presents a general approach to understanding the relationships between application domains and agent coordination.
2. Task Structures and the Enables Relationship
To achieve its desires, an agent has to select appropriate actions at suitable times and in the right sequence. These actions are represented by an agent's task structures. Task structures might be created in agent architectures by various means: table lookup, reactive planning, task reduction, classical HTN plan-
ning, etc. While the TÆMS representation is more completely and formally specified elsewhere (e.g., [24, 8]), the basic idea is as follows. A task represents a way of achieving some objective, possibly via the execution of some set of subtasks. Each task is related to a set of subtasks via a Quality Accumulation Function (QAF) that indicates how the execution of the subtasks affects the completion of the super task. In this paper we will stick only to AND/OR trees: an AND task indicates a set of subtasks that must be accomplished, and an OR task indicates a set of possible alternatives to be considered. Eventually these task trees end in leaf nodes that represent executable methods or actions. Actions are described by a vector of characteristics (e.g., quality, cost, and duration) that allow predictive scheduling based on some dynamic agent utility function (e.g., maximize quality subject to meeting a deadline). The result is a well-defined calculus that allows the calculation of the utility of any particular schedule of actions. The agent scheduler can be implemented with any number of algorithms (cf. comparisons in [14]). However, because of the inevitability of non-local dependencies, and the associated uncertainty of the action characteristics, the ability of the scheduler is severely limited if it cannot acquire information about when and how the non-local dependencies are to be handled. It is this information that is provided by coordination mechanisms. If the execution of a task affects, or is affected by, another task, we say there exists a relationship between these tasks. These relationships among agent tasks can be classified as hard or soft. For example, if action Act1 cannot start until action Act2 is finished, we say Act2 enables Act1, and this is a hard relationship. On the other hand, if Act1 and Act2 can be executed in either order, but doing Act2 before Act1 is “better” for Act1 (it achieves lower cost, higher quality, or shorter duration), then we say Act2 facilitates Act1, and this is a soft relationship. In any case, if a task structure (or some alternative within it) extends over tasks/actions located at different agents, then the non-local relationships (subtask, enablement, facilitation, etc.) that cross agent boundaries become potential coordination points. We will concentrate on the enablement relationship as our main concern in this paper, although the mechanisms presented here will also work for facilitation. Previous work on GPGP coordination mechanisms [7] had described them in an abstract way, which made implementation and analysis difficult. Our more recent observation is that we can specify a specific coordination mechanism generally as a set of protocols (task structures) specific to the mechanism, and a pattern-directed re-writing of the HTN [2, 3]. For example, if Act2 at Agent 2 enables Act1 at Agent 1, then one coordination mechanism (out of many) might be for Agent 1 to ask Agent 2 to do Act2, and to commit ahead of time to a deadline by which Act2 will be completed. Here the protocols are a reservation and a deadline commitment protocol, and the re-writing changes
“Act2 enables Act1” into “reserve-act enables deadline-cmt enables Act2 enables Act1”. To support this activity, an agent architecture must provide a facility for examining patterns of relationships between local tasks and non-local tasks in the current task structures, and re-writing them as required by the mechanism. This approach enables the cataloging of potential coordination mechanisms for a relationship, much clearer comparisons, and the real possibility of supporting automated coordination in an agent architecture such as DECAF (leaving aside for the moment the important question of which coordination mechanism to use for any particular relationship and context).
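The sketch below illustrates this style of pattern-directed re-writing for the reservation example just given; the Task class and the rewrite function are illustrative stand-ins, not DECAF's actual structures:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    agent: str
    enables: list["Task"] = field(default_factory=list)

def rewrite_reservation(enabler: Task, enablee: Task) -> Task:
    # Rewrite "enabler enables enablee" into
    # "reserve-act enables deadline-cmt enables enabler enables enablee".
    deadline_cmt = Task("deadline-cmt", enabler.agent, enables=[enabler])
    reserve_act = Task("reserve-act", enablee.agent, enables=[deadline_cmt])
    return reserve_act  # new head of the coordination chain

act2 = Task("Act2", "Agent2")
act1 = Task("Act1", "Agent1")
act2.enables.append(act1)
head = rewrite_reservation(act2, act1)
print(head.name, "->", head.enables[0].name)  # reserve-act -> deadline-cmt
```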
3. Coordination Mechanisms for Enabling Task Relationships
We have catalogued at least seventeen coordination mechanisms for the enables relationship. For example, if a task TB at agent B enables task TA at agent A, one could:
- Have A request that B complete TB by some deadline;
- Have B commit to a deadline of its own choosing for TB (the original PGP-inspired mechanism [7]);
- Have B send the result of TB (“out of the blue”, as it were) to A when available; or
- Have A poll for the completion of TB (our model of current hospital practice [8]).
The seventeen mechanisms are not an exhaustive list, and many are simply variations on a theme. They include avoidance (with or without some sacrifice), reservation schemes, simple predecessor-side commitments (to do a task sometime, to do it by a deadline, to an earliest-start-time, to notify or send the result directly when complete), simple successor-side commitments (to do a task with or without a specific EST), polling approaches (busy querying, timetabling, or constant headway), shifting task dependencies by learning or mobile code (promotion or demotion), various third-party mechanisms, or more complex multi-stage negotiation strategies. This paper will focus on both the absence of coordination and seven implemented mechanisms of various types. In order for these mechanisms to be applicable across task structures from any domain, the result of a mechanism is some alteration of the task specification. This might be a true structural alteration (i.e., removing or adding tasks) or an alteration of the annotations on the task structure. As an example of the latter, consider the scheduling problem imposed by a task structure that includes a non-local task. In general the local agent may have no knowledge about the
characteristics of that non-local task. Thus even though the agent may have perfect knowledge about all of its local tasks, it cannot know the combined characteristics of the complete task structure. Coordination mechanisms that rely on commitments from other agents remove this uncertainty and allow the local agent to make better scheduling decisions.
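To see why such commitments matter for the scheduling calculus, consider a toy AND/OR quality computation over a task tree containing a non-local task. This sketch assumes AND accumulates the minimum subtask quality and OR the maximum, which are common QAF choices; the chapter does not fix the exact functions:

```python
def quality(node):
    # Leaf actions carry a quality estimate; a non-local task with no
    # commitment from its owner must pessimistically be treated as 0.
    if node["type"] == "action":
        return node["quality"]
    subs = [quality(c) for c in node["subtasks"]]
    return min(subs) if node["type"] == "AND" else max(subs)

task_a = {"type": "OR", "subtasks": [
    {"type": "action", "quality": 0.8},       # SubA1: purely local alternative
    {"type": "AND", "subtasks": [
        {"type": "action", "quality": 0.9},   # SubA2 itself
        {"type": "action", "quality": 0.0},   # enabling non-local task, unknown
    ]},
]}
print(quality(task_a))  # 0.8: without a commitment, the local alternative wins
```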
3.1 Avoidable Dependency
Even the simplest coordination mechanisms take some time and effort. In the case where alternatives are available (an OR task node), one way to deal with the dependency is to remove it from consideration (all other things being equal). Only if the non-dependent alternative were to fail would the agent attempt the dependent alternative. As an example, two agents A and B have task structures as in Figure 1. Since agent A has two alternatives for completing TaskA, and one of them does not involve the dependency, one response would be to prune the dependent alternative SubA2, assuming that the local utility of completing alternative SubA1 is no less than that of completing SubA2. This utility will be some function of the various characteristics (here: quality, cost, and duration) that accompany each alternative. Although this seems like a trivial mechanism that should be used whenever possible, remember that the decision is still made from a local perspective. One can construct environments where this behavior is not in the best interest of the agents, for example, if the agents are quality maximizers only and SubA1 accesses a shared resource (see Note 1) that produces poor quality when over-utilized (i.e., the “Tragedy of the Commons”).
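A sketch of this pruning rule follows; prune_avoidable and the non_local_dep flag are hypothetical names used for illustration:

```python
def prune_avoidable(or_task, utility):
    # Keep only dependency-free OR-alternatives when the best of them has
    # at least the local utility of the best dependent alternative.
    indep = [s for s in or_task["subtasks"] if not s.get("non_local_dep")]
    dep = [s for s in or_task["subtasks"] if s.get("non_local_dep")]
    if indep and dep and max(map(utility, indep)) >= max(map(utility, dep)):
        or_task["subtasks"] = indep  # prune SubA2-style alternatives
    return or_task

task_a = {"subtasks": [{"name": "SubA1", "q": 0.8},
                       {"name": "SubA2", "q": 0.7, "non_local_dep": True}]}
prune_avoidable(task_a, utility=lambda s: s["q"])
print([s["name"] for s in task_a["subtasks"]])  # ['SubA1']
```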
3.2 Sacrifice Avoidable Dependency
Simply choosing task SubA1 to avoid the hard dependency could result in somewhat lower utility, because this alternative has a different mix of performance characteristics. For example, Agent A may desire a pizza, and may have all the ingredients and the capability to make one itself, or it could call for take-out. The first alternative may (for good cooks) result in a higher-quality result at lower cost, but take considerably more time than the second alternative. A subtle variation on the avoidable-dependency mechanism is then to avoid the dependency even if it means sacrificing perceived local utility. Although we focus on enablement in this paper, note that this mechanism may also be used for facilitation (ignoring the relationship). The mechanisms for handling Avoidable/Sacrifice-Avoidable dependencies are very fast: they have much shorter coordination time than the more complex mechanisms, and are of course easy to implement.
3.3 Coordination by Reservation
This mechanism is named after the real-world activity of reservations. For example, if you want to have dinner in a restaurant, before you go there you had better make a reservation, so that at some agreed future time you will be there and the people there will be ready to serve you. This mechanism includes both a rewriting of the local task structure and some external communication protocols. Imagine Figure 1 with task SubA1 removed (so Agent A has no alternative). The reservation mechanism includes a new protocol, instantiated as a new task structure (called “Wait”) at Agent B, that processes a message indicating a request to do a task sometime in the future. The result is a message indicating when (if ever) the task will occur. The reservation mechanism rewrites the task structure at Agent A so that a new subtask (“GPGPReservation”) is executed before SubA2; it invokes the “Wait” protocol at Agent B and then processes the return message. The result is an annotation on the non-local task that allows Agent A to predictively schedule SubA2 locally at a time after the enabling task has been completed. Coordination by Reservation works very well in complex task structures with many task levels, although it requires extra communication overhead. It is advantageous in finding the best schedule, with the lowest rate of missed deadlines and the highest quality achieved.
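A compressed sketch of the exchange is below; the functions, the fixed five-unit task duration, and the schedule representation are illustrative assumptions, not DECAF's actual protocol vocabulary:

```python
def wait_protocol(b_schedule, now):
    # Agent B's "Wait" task: answer a reservation request with a time by
    # which B commits to have completed the enabling task (here, a fixed
    # 5-unit duration after its existing commitments; an assumption).
    busy_until = max(b_schedule, default=now)
    commitment = busy_until + 5
    b_schedule.append(commitment)
    return commitment

def gpgp_reservation(non_local_task, b_schedule, now=0):
    # Agent A's injected "GPGPReservation" subtask: runs before SubA2 and
    # annotates the non-local task so the local scheduler can place SubA2
    # after the committed completion time.
    non_local_task["committed_finish"] = wait_protocol(b_schedule, now)
    return non_local_task

nlt = {"name": "SubB1"}
print(gpgp_reservation(nlt, b_schedule=[12]))  # committed_finish: 17
```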
3.4 Demotion Shift Dependency
Under the same assumption of removing SubA1 in Figure 1, we name AgentB the Predecessor and AgentA the Successor. Demotion Shift is so named because the task structure is transferred from the predecessor to the successor. The communication protocol is: AgentA detects the dependency relationship with AgentB's task, SubB1; AgentA sends a coordination request message to AgentB for task demotion; AgentB receives the message and sends back the task information and the object code for SubB1; AgentA gets the returning message and dispatches it to the GPGP module again; the GPGP module in AgentA unwraps the message and executes the object code to get the result, at the same time modifying the task structure so that the result is directed to the next execution module. Demotion Shift Dependency performs efficiently when the same tasks are requested for execution over and over again. For example, in information gathering applications, very complex wrapper code can be demoted to the requester, especially when the wrapper is unpredictably loaded. Although we do not implement it in this paper, Promotion Shift Dependency is also possible in information gathering applications, to upload tasks from a handheld agent or thin client to a large server.
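The exchange can be compressed into a few lines of pseudo-protocol; the Agent class and method names below are illustrative only:

```python
class Agent:
    # Minimal stand-in for a DECAF agent (illustrative).
    def __init__(self, tasks=None):
        self.local_tasks = dict(tasks or {})
    def send_object_code(self, task_name):
        return self.local_tasks[task_name]  # ship the callable itself

def demote(successor, predecessor, task_name):
    # Successor pulls the enabling task's object code once, then executes
    # it locally on every repeat, avoiding further coordination messages.
    if task_name not in successor.local_tasks:
        successor.local_tasks[task_name] = predecessor.send_object_code(task_name)
    return successor.local_tasks[task_name]()

agent_b = Agent({"SubB1": lambda: "enabling-result"})
agent_a = Agent()
print(demote(agent_a, agent_b, "SubB1"))  # later calls need no messages
```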
3.5 Coordination by Sending Result
Instead of sending coordination meta-information back and forth, the enabler sends the actual execution result to the enablee. This mechanism differs from the previous mechanisms in that the coordinated agents do not have to spend extra time on re-scheduling and exchanging coordination information. On the other hand, the enablee cannot do any detailed predictive scheduling, since it only has a commitment that the task will be done, not when. This mechanism is often used for simple load-balancing purposes. For example, in Figure 1, assume that Agent A chooses to do SubA2 because it represents the offloading of almost all the processing to Agent B. This mechanism is the default in DECAF.
3.6 Coordination by Polling Result
Polling has been widely used in many areas, such as network management and remote network monitoring, to attack the problem of unreliable communication protocols. The enablee agent, AgentA, asks for the result of SubB1 by polling. The enabler agent, AgentB, replies with the result only if the requested task is already completed. Here, polling means AgentA issues a new query to AgentB at some fixed periodic time interval; for example, AgentA repeatedly asks AgentB for the result every ten seconds until the result
is received. During the time between queries, AgentA may execute whatever other tasks it has so that it is not idle. Polling is good for communication protocols that are connectionless and have no guarantee of returning service. On the other hand, communication costs are clearly higher because of the continuous periodic queries issued before results are acquired.
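The polling loop itself is simple; the sketch below assumes an ask() callable that returns None until the enabler's result is ready (hypothetical names):

```python
import time

def poll_for_result(ask, poll_gap, timeout):
    # Re-query every poll_gap seconds until the enabler reports a result;
    # between polls the agent could execute other tasks instead of idling.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = ask()
        if result is not None:
            return result
        time.sleep(poll_gap)
    raise TimeoutError("enabler task did not complete")

replies = iter([None, None, "result-of-SubB1"])
print(poll_for_result(lambda: next(replies), poll_gap=0.01, timeout=1.0))
```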
3.7 Constant Headway / Timetabling
In certain situations AgentA may know or detect that AgentB executes routine tasks periodically in some fixed manner, and the repeated tasks provide inputs to some task of AgentA. Similarly, the enabling task may be executed according to some stable timetable. In either case AgentA can deduce either a maximum bound on task completion (constant headway) or a precise schedule (timetabling) without explicit coordination. We can think about this mechanism in terms of how a subway or bus system works. Subway trains are scheduled to arrive at some known periodicity (constant headway); buses arrive according to a timetable. In this situation the enablee, which could be the traveler, acquires the bus schedule in advance by picking up a timetable. At the scheduled time, the traveler comes to the bus stop just before the arrival of the bus and waits to be picked up. One problem remains: how to determine the amount of service in the period between each of the enabler's executions. Continuing the previous example, the buffer could be the number of empty seats, which is a constraint on carrying travelers. It is important to balance the buffer size so that the efficiency, correctness, and completion of executed tasks are achieved.
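Under constant headway, the enablee's deduction is a single calculation, sketched below (illustrative function and argument names):

```python
def next_service_time(now, first_run, headway):
    # Bound on the enabler's next completion, deduced without any
    # coordination messages: it runs every `headway` units from `first_run`.
    if now <= first_run:
        return first_run
    periods_elapsed = (now - first_run) // headway
    return first_run + (periods_elapsed + 1) * headway

print(next_service_time(now=7, first_run=0, headway=5))  # 10
```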
4. Implementation
The execution of coordination mechanisms can be activated either under programmer direction or left up to the agent's coordination component (see Note 2). The mechanisms can be selected in any combination. Although the order in which the mechanisms execute matters for time/performance efficiency, correct functionality is ensured. The detection of the coordination relationships is domain-independent, which is advantageous compared to the earlier approach taken in [7]. The structure of DECAF tasks explicitly reveals abstract dependency information. By parsing the KQML/FIPA messages during program execution, even the specific dependency-related agents are known to the GPGP component. Previously, a single agent in the DECAF system had a message dispatcher, planner, scheduler, agenda manager, and executor. These components work together to keep track of an agent's current status, plan selection, scheduling, and execution. Now we put a GPGP component between the planner and the scheduler (Figure 2). GPGP analyzes the structures of the tasks in the WITQ (“What-If Task Queue”). According to specific characteristics of the task structure, GPGP applies some coordination mechanism. The modified tasks are then sent to the Task Queue, from which the local scheduler chooses tasks for efficient scheduling. The dotted lines between the GPGP module and the scheduler indicate that GPGP takes advantage of the local scheduler's scheduling ability to evaluate the features of actions for a remote agent (for example, when making a commitment for either a predecessor task or a reservation request). The dotted lines from the Incoming KQML Message Queue to the GPGP module, and from the GPGP module to the Outgoing KQML Message Queue, show that GPGP takes requests from remote agents to do task evaluation work and then sends the evaluation results back to the requesting agents. The DECAF task structure representation is composed of Tasks, Actions, Non-Local Tasks (NLTs—explicitly representing inter-agent dependencies), Provision Cells, Characteristic Accumulation Functions (CAFs), and Action Behavior Profiles. It is a fairly straightforward transformation from the abstract TÆMS task structures (Figure 1) to the actual DECAF task structures (Figure 3). In DECAF (based on RETSINA), we make the abstract structures concrete by explicitly indicating the provisions (inputs) needed before a task runs, and the possible outcomes that can occur. A Provision Cell is a data structure which stores the required inputs before an action can be executed, or the outcomes of the action execution. A line between cells, such as from the outcome provision cell OK of Ask to the input provision cell IN of the NLT (Non-Local Task), means the value of the outcome of action Ask is transported to the input of the NLT by a KQML message, so that the NLT can be instantiated. The NLT cannot begin execution until the input value is filled by an incoming message. In this way, the relationships among tasks can be represented naturally with the DE-
CAF task structure, and the data flows allow an agent to intelligently use local resources such as multiple processors.
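A toy model of the provision-cell data flow is sketched below; the class names are illustrative, not DECAF's real types:

```python
class ProvisionCell:
    # Stores one required input (or one outcome) of an action.
    def __init__(self):
        self.value = None
    def filled(self):
        return self.value is not None

class NonLocalTask:
    # Enabled only when every input cell is filled, e.g. by an incoming
    # KQML message carrying another agent's outcome.
    def __init__(self):
        self.inputs = {"IN": ProvisionCell()}
    def ready(self):
        return all(cell.filled() for cell in self.inputs.values())

nlt = NonLocalTask()
assert not nlt.ready()
nlt.inputs["IN"].value = "outcome-of-Ask"  # delivered by a KQML message
assert nlt.ready()
```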
5. Experimental results

5.1 Experimental Framework
We created an experimental framework to test the performance of these mechanisms under various environmental factors. Since our coordination mechanisms focus on non-local dependencies, we reduce the complexity by using only two agents and concentrate on the relationships between their non-local tasks. Both agents have an average of nine subtasks with alternatives (OR-nodes) per request. They may choose any selected subset of the task alternatives to execute, based on various environmental factors. This experimental framework is constructed as a simulation testbed for coordination mechanisms. It can also be applied to real applications, for instance, query planning and execution in an information gathering system. For information gathering tasks, a query planning agent takes user input and decomposes it into sub-tasks distributed to multiple wrapper agents. Each wrapper agent queries remote databases, extracts the expected information, and sends the sub-
results back to an agent for result composition. One wrapper agent may need the results of other wrapper agents' queries, for either further sub-queries or further information extraction. The scheduling for a wrapper agent to access multiple information resources, the explicit enablement relationships between the reasoner and the wrapper agent, and the interdependencies among wrapper agents provide coordination mechanisms with a perfect stage to perform on. The abstract experiment structure is similar to Figure 1, except that the task structures and task characteristics are selected randomly during experimental execution. In this framework Error Rate, Task Repeat Rate, and Agent Load are some of the changing input factors that describe an environment. Multi-agent system performance is characterized by Quality Change, Communication Load, Task Execution Time, Idle Time, Deadlines Missed, and GPGP Coordination Time. In this paper we concentrate on the analysis of execution time and coordination time under various error rates and task repeat rates.
5.2 Experiment Result Analysis
Each experiment with respect to a different coordination mechanism was repeated 20 times for each environmental condition, and we report the mean performance scores. We first show the change in final quality and the communication load qualitatively.
Mechanism performance in different environments. In the table, the abbreviations have the following meanings: A = Avoidable Dependency; SA = Sacrifice Avoidable Dependency; RES = Coordination by REServation; D = Demotion Shift Dependency; SR = Coordination by Sending Result; P = Polling; CH = Constant Headway; QC = Quality Change; CL = Communication Load; CT = GPGP Coordination Time. Since quality is domain-dependent, it is reasonable to use qualitative expressions (Positive, Non-Negative, or Depends) for the Quality Change results. The Sacrifice Avoidable mechanism may sacrifice quality but ensures faster execution, and it is thus suitable for time-constrained environments. Other mechanisms usually achieve higher quality but have longer coordination time. Because the Avoidable and Sacrifice Avoidable mechanisms just alter the task structures without any non-local communication, their CL is 0. The period/headway is the key information for Constant Headway; if this information is available, agents do not need coordination communication as long as sufficient service is available. Reserva-
tion and Sending-Result are similar, differing in whether coordination meta-information (the reservation) or the actual domain result is sent back; each requires approximately two non-local communications. Polling obviously has the highest communication load value, since it issues new queries periodically, but it provides better reliability and connectionless communication. We list communication load and coordination time together because coordination time is mainly composed of coordination communication time, which usually takes seconds (more for true Internet connections), while the GPGP reasoning module uses only a few dozen milliseconds. As the table above shows, communication among the coordinating agents is not necessary for three mechanisms: Avoidable Dependency, Sacrifice Avoidable Dependency, and Constant Headway / Timetabling. We are more interested in the mechanisms for which agent communication is required, so we choose the other four mechanisms for further analysis: Reservation, Demotion, Send-Result, and Polling. Figure 4 shows how the average task execution time changes with the coordination task repeat rate. The task repeat rate characterizes how often an enabler task is requested for execution by the enablee. As expected, Demotion Shift Dependency outperforms the others if the coordination task repeat rate is above 0.4, but costs extra time when it is below 0.2. Coordination by Reservation and Coordination by Sending Result share similar performance values here. The frequency of each Polling request is selected appropriately so as to not degrade the agent's performance by polling too often. However, the overhead of the polling requests still creates a longer execution time than the reservation and result-sending mechanisms.
Figure 5 shows the performance of a particular mechanism, Polling, for various poll gaps. We define the Poll Gap as the length of the
time period between polling requests. The poll gap reflects the poll frequency: a small gap results in a higher frequency, and a larger gap in a lower frequency. As Figure 5 indicates, the dashed line marks the situation in which the task is already finished before any further poll request, that is, the poll gap is longer than the task execution time. The graph points out the overhead cost associated with too-frequent polling. Polling is more useful as a mechanism when error rates increase. As Figure 6 shows, the Polling mechanism finishes tasks much faster than the other mechanisms as the error rate gets higher. Here, “error rate” could be either task or communication error (in the experiments, it is a simulated task error rate). This is because the Polling mechanism keeps requesting results, leading to much earlier re-execution of a failed task, while the other mechanisms have to wait for a reply until they time out. As the error rate rises, the other mechanisms' performance degrades much faster, while Polling degrades only approximately linearly. From Figure 5 and Figure 6, we conclude that Polling outperforms the other mechanisms if the system is unreliable, but costs much more in communication overhead. If one carefully chooses the poll gap based on the error rate, one can optimize the performance of Polling. Different coordination mechanisms are better in different environments, and there is no single best mechanism for every situation. Carefully choosing a coordination mechanism based on environmental conditions improves multi-agent system performance.
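The near-linear degradation of Polling can be illustrated with a rough expected-completion model; the formula below assumes i.i.d. attempt failures and charges each failure the execution time plus a failure-detection delay, which is our simplification of the experiment:

```python
def expected_completion(error_rate, exec_time, detect_delay):
    # Geometric retries: expected attempts = 1 / (1 - error_rate).
    attempts = 1.0 / (1.0 - error_rate)
    failures = attempts - 1.0
    return attempts * exec_time + failures * detect_delay

# Polling detects failure after roughly one poll gap; the other
# mechanisms wait out a long timeout before re-executing.
print(expected_completion(0.3, 10.0, detect_delay=2.0))   # polling-like
print(expected_completion(0.3, 10.0, detect_delay=30.0))  # timeout-like
```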
6. Applying The Mechanisms Under Various Environments
Our first goal is to associate a set of possible coordination mechanisms with every agent. However, as we have discussed, various application environments have different features that directly affect agents' coordination behaviors. Our research objective is to enable our agents to autonomously select appropriate mechanism(s) according to the various factors of different environments. Next we introduce our approach to modelling various application domains, and our plan to modify a boosting algorithm so that agents may adapt themselves to different environments. There are some previous learning approaches in multi-agent systems [18, 21, 11], but these approaches are domain-specific and treat the various environmental factors as static. As a result, they are not appropriate for our objective, which is to study different problem spaces to understand the relationship between problem domains and coordination mechanisms. [16] is an initial approach that respects domain knowledge and develops situation-specific coordination, but that approach still sticks to a specific domain without presenting a general method suited to our objective. Here we present a general method to evaluate a domain's effect upon coordination. We have stated that there is no single best coordination mechanism for all environments, but at the same time coordination behaviors do share common features. Based on the relationships between general and domain knowledge, we claim that it is necessary to separate general coordination knowledge from domain knowledge in modelling coordination behaviors and the various environments. General knowledge describes the basic facts possibly impinging on coordination in agent systems. These facts are natural attributes of the system it-
self. For example: the number of agents in a multi-agent system; whether communication between certain agents is available, and if so, the communication load and communication frequency; the agent failure rate; the coordination task repeat rate; the communication bandwidth; the average message size; etc. These factors are basic for every kind of multi-agent system, and they reflect general features that may impact the choice of mechanisms. We represent the general coordination features with a vector whose elements each represent one general feature of the coordination system; the vector is the collection of all general environmental features. Domain-dependent knowledge represents the featured environmental factors of different application areas. The coordination mechanisms selected vary across environments. For example, in Bioinformatics the heaviest calculations and most of the time consumed are at remote gene/protein databases or analysis tools [6]. But for agent system developers, remote databases are not under direct control. What we really care about are factors like how our wrapper agents extract information from the remote databases, what the size of each query/result message is, and how these wrapper agents coordinate their query tasks so that the system provides fast and expected query results. The domain factors in a Bioinformatics system (some of which are common to most information gathering systems, but not to any arbitrary MAS) include the number of gene databases to query, the number of external analysis tools, the query repeat rate, average/maximum analysis pathways (measured in number of NLTs), query plan features, the bandwidths of the communication channels, the availability of local caching, the response times of remote databases, etc. Since there is no database holding all the information biologists may need, it is inevitable to query multiple different databases such as GenBank, SwissProt, etc. The number of databases to query matters for the system, because querying more databases may create more hits and improve query accuracy, but with longer query time as a side effect. The bandwidth of a communication channel is also a domain-dependent factor, since in Bioinformatics the size of the query result message varies from several KB to more than 10 MB, and possibly even bigger. A higher bandwidth shortens the response time for the end users, the biologists. We explain more about the domain-dependent factors by discussing the potential relationships between these factors and the coordination mechanisms. For example, when doing a BLAST query against the NCBI database, after the query is submitted, a query ID is displayed for the specific query together with a hint such as “the results are estimated to be ready in 12 minutes ...”. Our mechanism of Predecessor Deadline Commitment fits this situation very well, in that the long sequential query time is hidden by multiple queries at the same agent and the results are picked up at a fixed time in the near future. Another factor is repeated queries. Although biologists from different insti-
tutes/universities/organizations usually query entirely different data against the databases, it is very likely that people (such as professors and their students) from the same place make the same queries. In this case Coordination by Demotion Shift Dependency (for the query task structure) and Local Caching (for the final result) will save much time for repeated queries. In Bioinformatics systems the accuracy and the number of hits, which relate to quality, are important, while for other areas the domain-dependent factors are different; consider EMS (Emergency Medical Service) systems. The main concern of an EMS is to deliver ambulances to emergency sites and take victims to the nearest hospitals as soon as possible. Time is the most important characteristic to evaluate in an EMS system. The domain-dependent factors in EMS are the number of ambulances and hospitals, the alternative routes to the emergency site and the hospitals (the topology of the EMS geographical area), the time taken for the hospitals to get prepared, the traffic pattern of the area, etc. From the above examples we can see that the featured factors differ across environments. We represent the domain-dependent knowledge with a second vector, in which each element represents featured factor j of application domain i. The coordination universe is then represented by a vector combining the general and domain-dependent features.
We explicitly differentiate these in order to clarify that general knowledge and domain knowledge can be studied separately. While there are virtually unlimited factors that may be used to describe even a single application area, we select, in consultation with domain experts, the main features and rule out the trivial/irrelevant features, such as the salaries of the people who use the query system. After we vectorize the coordination universe, we need to normalize the vector so that its elements can be formatted as input for learning algorithms. It is easy to normalize the elements based on their minimum and maximum instance values. For example, the bandwidth of an Internet information gathering system ranges from zero to 10 Mbps; an instance of 3 Mbps can be normalized to the value 0.3. We represent the coordination mechanisms with a third vector. Our problem is now reduced to the following: given an input vector of environmental features and an output vector of mechanisms, find a hypothesis that predicts which mechanism should be selected for a given environment to yield the best system performance. We apply the mechanisms one at a time under certain combinations of environmental values in the coordination framework, then modify the environmental values and apply the mechanisms one at a time again. We repeat this process until we acquire a set of mappings that selects which mechanism to use under which combination of environmental values. We take this set of mappings as a training set.
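The min-max normalization from the bandwidth example is a one-liner:

```python
def normalize(value, lo, hi):
    # Min-max normalization of an environment factor into [0, 1].
    return (value - lo) / (hi - lo)

print(normalize(3e6, 0.0, 10e6))  # 3 Mbps on a 0-10 Mbps channel -> 0.3
```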
Finally, the problem reduces to the following: given pairs (v, m), where v is an instance of the environment vector and m is an instance of the mechanism vector, find a hypothesis that predicts the relationship between the input and the output. This abstract problem represents the concrete real-world problem of mapping out particular application domains and selecting appropriate coordination mechanisms for the optimization problems present in those domains. We are currently adapting the AdaBoost [17] algorithm for this abstract problem. The algorithm is shown below:

Given: $(x_1, y_1), \ldots, (x_m, y_m)$, where $x_i \in X$ and $y_i \in Y = \{-1, +1\}$.
Initialize $D_1(i) = 1/m$.
For $t = 1, \ldots, T$:
  Train a weak learner using distribution $D_t$.
  Get a weak hypothesis $h_t : X \to \{-1, +1\}$ with error $\epsilon_t = \Pr_{i \sim D_t}[h_t(x_i) \neq y_i]$.
  Choose $\alpha_t = \frac{1}{2} \ln\left(\frac{1 - \epsilon_t}{\epsilon_t}\right)$.
  Update $D_{t+1}(i) = \frac{D_t(i)\exp(-\alpha_t y_i h_t(x_i))}{Z_t}$, where $Z_t$ is a normalization factor (chosen so that $D_{t+1}$ will be a distribution).
Output the final hypothesis: $H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$.
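To make the loop concrete, here is a compact, self-contained sketch of binary AdaBoost over one-feature decision stumps; it follows the pseudocode above but is illustrative, not the authors' implementation:

```python
import numpy as np

def adaboost(X, y, T):
    # X: (m, n) feature matrix; y: labels in {-1, +1}; T: boosting rounds.
    m, n = X.shape
    D = np.full(m, 1.0 / m)           # initial distribution D_1
    ensemble = []
    for _ in range(T):
        best = None                   # exhaustive threshold-stump weak learner
        for j in range(n):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    h = np.where(X[:, j] <= thr, sign, -sign)
                    err = D[h != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)      # guard the log
        alpha = 0.5 * np.log((1.0 - err) / err)
        h = np.where(X[:, j] <= thr, sign, -sign)
        D = D * np.exp(-alpha * y * h)
        D = D / D.sum()               # renormalize (the Z_t step)
        ensemble.append((alpha, j, thr, sign))
    def H(x):                         # final weighted-majority hypothesis
        s = sum(a * (sg if x[jj] <= th else -sg) for a, jj, th, sg in ensemble)
        return 1 if s >= 0 else -1
    return H

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
H = adaboost(X, y, T=3)
print([H(x) for x in X])  # [-1, -1, 1, 1]
```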
Boosting refers to a general and provably effective method of producing a very accurate prediction rule by combining rough and moderately inaccurate rules of thumb in an accumulative manner. It is a general method to improve the accuracy of a given learning algorithm. AdaBoost is advantageous because of these features: it is easy to implement; it has only one parameter, the number of rounds T, to tune; it requires no prior knowledge about any component weak learner and can be combined with any method for finding weak hypotheses; and it has been shown to achieve accurate prediction from accumulated weak learners. Boosting technology is helpful for our objective. Using the domain-dependent elements of environment instances as input, we can find the
hard domain-dependent factors and pay special attention to them, so that an appropriate coordination mechanism is selected. We can also include the general factors as input to the algorithm. In this case, comprehensive knowledge is taken into consideration, and some general factors may be found to be particularly important for the application domain as well. There are still some points that need to be clarified when applying the above procedure to study the relationships between coordination behaviors and domain applications. First, the AdaBoost algorithm above handles the binary case, in which the label Y's value is either true or false, while the label Y in our approach represents one of seventeen coordination mechanisms. This problem can be handled by reducing the multiclass problem to a larger binary problem [12]. Second, the acquisition of the training data is a key part of our approach. It requires a large set of data from the execution of the experimental coordination framework. The good point is that the system developer does not have to explore all the possibilities for all environments, but can concentrate on an individual application domain. Finally, the developer does need some domain knowledge from experts to start the learning approach. The approach presented above attacks the problem of understanding how application domains and coordination/control techniques relate. Given the ideas above, research tasks only need to follow the steps and concentrate on different application domains. It is a logically correct and practical approach, and more experiments are needed.
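The multiclass reduction mentioned above can be as simple as training one binary classifier per mechanism; the helper below sketches a one-vs-rest labelling (one simple reduction; [12] covers more sophisticated ones):

```python
def one_vs_rest_labels(mechanism_ids, target):
    # Binary labels for "is this the target mechanism?", yielding one
    # AdaBoost problem per mechanism (illustrative reduction).
    return [1 if m == target else -1 for m in mechanism_ids]

print(one_vs_rest_labels(["RES", "P", "SR", "P"], target="P"))  # [-1, 1, -1, 1]
```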
7. Conclusion and future work

7.1 Conclusion
We have introduced a new way of thinking about coordination mechanisms—as re-writings of hierarchical task networks—that brings us closer to providing general coordination services for an agent architecture in a domain-independent manner. We also introduced several implemented coordination mechanisms and discussed their applicability to different environments. The mechanisms implemented above are on a promising path toward exploring coordination for dependency relationships. The mechanisms implemented so far can be applied to agent programs written with DECAF for different domains. The agent programmer does not have to explicitly code for coordination (although, without learning, the programmer does have to select which mechanisms to automatically instantiate for each NLT). We presented a general approach toward the understanding and automated learning of the relationships between application domains and agent coordination; however, more experimental results and a larger-scale experimental framework are needed and are under development. Different coordination mechanisms may outperform others within different kinds of environments. The Avoidable and Sacrifice Avoidable mechanisms cost
very little in terms of meta-level coordination time, so programmers are encouraged to apply them first unless they are not applicable as described earlier. The more complex coordination mechanisms result in higher quality, but require longer times. If deadlines are the main factor agents must respect, Coordination by Reservation outperforms the others. If the repeat rate of a requested task is high, the Demotion Shift mechanism saves transmission time by sending object code for further reuse. Promotion Shift has this advantage as well and saves one round of coordination communication, but requires extra domain knowledge. If the enablee agent is heavily loaded, Coordination by Sending Result distributes the computing task to the enabler agent. If the underlying network protocol is connectionless, Polling ensures the reliability of the coordination protocol with continuous queries for the earliest presence of a result, or takes recovery actions early in case of failure. By publishing a well-known timetable, the Constant Headway mechanism unburdens the enabler and enablee agents from tightly coupled coordination. We identified agent-architectural properties that are necessary for our approach: a local scheduler or some other way of reasoning about non-local commitments, and a way to generate such commitments by “what-if” questions or other means. Viewing the eventual end-product of coordination mechanisms as information for making scheduling decisions gives a consistent and general way of integrating many different coordination mechanisms in many styles. The most difficult aspect of implementation is altering an agent architecture so that planned activities can be examined, and possibly altered, before they are executed.
7.2
Future Work
In our original blueprint there are at least seventeen mechanisms for the hard dependency relationships, some of them only subtly different from each other. Most of the mechanisms can be implemented directly with task structure re-writing. Some, such as the Multistage Negotiation mechanisms, may require arbitrarily complex new tasks, but they can still be implemented through DECAF. Other mechanisms require additional technologies, such as mobile agent implementations. With the introduction of such technologies, coordination mechanisms will become more complex and more functional as well. We explored the hard enables relationship among agents. The soft relationships are a straightforward extension of our current research; a more complex extension is to subtask alternative relationships that span multiple agents. We have already begun representing soft relationships within DECAF. We believe all kinds of non-local relationships can be explored within the general DECAF HTN task structures.
Current research concentrates on the analysis of individual coordination mechanisms, but in certain environments combinations of mechanisms may well yield better performance. For example, Polling and dynamic event-driven mechanisms can be combined so that the resulting mechanisms not only meet reliability requirements, but also provide a flexible solution for time-constrained systems. Error tolerance is another important issue associated with the mechanisms. Because the coordination mechanisms execute together with a real-time scheduling component, it is desirable to select mechanisms dynamically according to changes in environmental factors, so that the MAS achieves optimal overall performance over a period of time. Finally, learning approaches will allow agent designers (in non-critical action fields, such as personal information gathering) to build agents that can experiment with various known coordination mechanisms to find the best ones to use (or avoid) in certain common situations. More experiments and a larger training data set are needed to feed the learning algorithm presented in this chapter.
Notes
1. TÆMS represents such shared resources explicitly [8].
2. This will be useful for eventual learning and exploration control; see Section 6.
References
[1] C. Castelfranchi and R. Conte. Distributed artificial intelligence and social science: Critical issues. In Foundations of Distributed Artificial Intelligence, Chapter 20, 1996.
[2] W. Chen and K. Decker. Coordination mechanisms for dependency relationships among multiple agents. In Proceedings of AAMAS 2002, Bologna, Italy, July 2002.
[3] W. Chen and K. Decker. Developing alternative mechanisms for multiagent coordination. In Intelligent Agents and Multi-Agent Systems, pages 63–76. Springer-Verlag, 2002.
[4] C. Dellarocas and M. Klein. An experimental evaluation of domain-independent fault handling services in open multi-agent systems. In Proceedings of ICMAS'00, Boston, MA, USA, 2000.
[5] R. Davis and R. G. Smith. Negotiation as a metaphor for distributed problem solving. Artificial Intelligence, 20(1):63–109, Jan. 1983.
[6] K. Decker, S. Khan, C. Schmidt, and D. Michaud. Extending a multi-agent system for genomic annotation. In M. Klusch and F. Zambonelli, editors, Cooperative Information Agents IV, pages 106–117. Springer-Verlag, 2001.
[7] K. S. Decker and V. R. Lesser. Designing a family of coordination algorithms. In Proceedings of the First International Conference on Multi-Agent Systems, pages 73–80, San Francisco, June 1995. AAAI Press. A longer version is available as UMass CS-TR 94-14.
[8] K. S. Decker and J. Li. Coordinating mutually exclusive resources using GPGP. Autonomous Agents and Multi-Agent Systems, 3, 2000.
[9] K. S. Decker and K. Sycara. Intelligent adaptive information agents. Journal of Intelligent Information Systems, 9(3):239–260, 1997.
[10] K. Erol, D. Nau, J. Hendler, and R. Tsuneto. A critical look at critics in HTN planning. In Proceedings of IJCAI-95, Montreal, Canada, Aug. 1995.
[11] C. B. Excelente-Toledo and N. R. Jennings. Learning to select a coordination mechanism. In Proceedings of AAMAS-02, pages 1106–1113, 2002.
[12] Y. Freund and R. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119–139, 1997.
[13] G. Weiss. Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence, pages 88–92. MIT Press, 1999.
[14] J. Graham and K. Decker. Towards a distributed, environment-centered agent framework. In Intelligent Agents IV, Agent Theories, Architectures, and Languages. Springer-Verlag, 2000.
[15] M. Tambe. Teamwork in real-world, dynamic environments. In Proceedings of the International Conference on Multi-Agent Systems (ICMAS-96), 1997.
[16] M. N. Prasad and V. Lesser. Learning situation-specific coordination in generalized partial global planning. In AAAI Spring Symposium on Adaptation, Co-evolution and Learning in Multiagent Systems, Stanford, Mar. 1996.
[17] R. Schapire. A brief introduction to boosting. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 1999.
[18] S. Sen, M. Sekaran, and J. Hale. Learning to coordinate without sharing information. In Proceedings of AAAI-94, Seattle, WA, USA, July 1994.
[19] S. Rustogi and M. Singh. Be patient and tolerate imprecision: How autonomous agents can coordinate effectively. In Proceedings of IJCAI-99, Stockholm, Sweden, Aug. 1999.
[20] S. Sen and N. Arora. Effect of local information on group behavior. In Proceedings of the International Conference on Multi-Agent Systems, pages 315–321, 1996.
[21] T. Sugawara and V. R. Lesser. On-line learning of coordination plans. Computer Science Technical Report 93-27, University of Massachusetts, 1993.
[22] S. Willmott and B. Faltings. The benefits of environment adaptive organizations for agent coordination and network routing problems. In Proceedings of IJCAI-99, Stockholm, Sweden, Aug. 1999.
[23] T. Hogg and B. Huberman. Controlling chaos in distributed systems. IEEE Transactions on Systems, Man, and Cybernetics, pages 1325–1332, 1991.
[24] T. Wagner, A. Garvey, and V. Lesser. Complex goal criteria and its application in design-to-criteria scheduling. In Proceedings of the Fourteenth National Conference on Artificial Intelligence, Providence, July 1997.
[25] Y. So and E. Durfee. Designing organizations for computational agents. In M. J. Prietula, K. M. Carley, and L. Gasser, editors, Simulating Organizations, pages 47–64. AAAI Press, 1998.
PERFORMANCE MODELS FOR LARGE SCALE MULTI-AGENT SYSTEMS: A DISTRIBUTED POMDP-BASED APPROACH
Hyuckchul Jung
Institute for Human and Machine Cognition, University of West Florida, USA
[email protected]
Milind Tambe
Department of Computer Science, University of Southern California, USA
[email protected]
Abstract
Given a large group of cooperative agents, selecting the right coordination or conflict resolution strategy can have a significant impact on their performance (e.g., speed of convergence). While performance models of such coordination or conflict resolution strategies could aid in selecting the right strategy for a given domain, such models remain largely uninvestigated in the multiagent literature. This chapter takes a step towards applying the recently emerging distributed POMDP (partially observable Markov decision process) frameworks, such as MTDP (Markov team decision process), in service of creating such performance models. A strategy is mapped onto an MTDP policy, and strategies are compared by evaluating their corresponding policies. To address issues of scale-up in applying the distributed POMDP-based models, we use small-scale models, called building blocks, that represent the local interaction among a small group of agents. We discuss several ways to combine building blocks for performance prediction of a larger-scale multiagent system. We present our approach in the context of DCSPs (distributed constraint satisfaction problems), where we first show that there is a large bank of conflict resolution strategies and no strategy dominates all others across different domains. By modeling and combining building blocks, we are able to predict the performance of five different DCSP strategies for four different domain settings, for a large-scale multiagent system. Thus, our approach in modeling the performance of conflict resolution strategies points the way to new tools for strategy analysis and performance modeling in multiagent systems in general.
Keywords:
Multiagent systems, Performance measurement, Distributed POMDP
1.
Introduction
In many large-scale applications such as distributed sensor networks, distributed spacecraft, and disaster response simulations, collaborative agents must coordinate their plans or actions [5, 9, 13]. While such applications require agents to be collaborative, an agent’s choice of actions or plans may conflict with its neighboring agents’ action or plan choices due to limited (shared) resources. Selecting the right action, plan, or resource to resolve conflicts, i.e., selecting the right conflict-resolution strategy, can have a significant impact on the performance of conflict resolution, particularly in a large-scale multiagent system. For instance, in distributed sensor networks, tracking targets quickly requires that agents controlling different sensors adopt the right strategy to resolve conflicts involving shared sensors. Unfortunately, selecting the right conflict resolution strategy is difficult. First, there is often a wide diversity of strategies available, and they can lead to significant variations in the rate of conflict resolution convergence in multiagent systems. For instance, when distributed agents must resolve conflicts over shared resources such as shared sensors, they could select a strategy that offers the maximum possible resources to the most constrained agents, or one that distributes resources equally among all agents requiring such resources, and so on. Each strategy may create a significant variation in the conflict resolution convergence rate [5, 13]. Furthermore, faced with certain types of problem domains, a single agent cannot immediately determine the appropriate coordination or conflict resolution strategy, because it is typically not just this agent’s actions, but rather the actions of the entire community, that determine the outcome. Performance modeling of multiagent coordination and conflict resolution could help predict the right strategy to adopt in a given domain. Unfortunately, performance modeling has not received significant attention in the mainstream multiagent research community, although within subcommunities such as mobile agents it has been considerably investigated [16]. Fortunately, recent research in distributed POMDPs (partially observable Markov decision processes) and MDPs (Markov decision processes) has begun to provide key tools to aid multiagent researchers in modeling the performance of multiagent systems [1, 15, 18]. In the context of this chapter, we will use the MTDP model [15] for performance modeling, although other models could be used. There are at least two major problems in applying such distributed POMDP models. First, while previous work has focused on modeling communication strategies within small numbers of agents [15], we are interested in strategy analysis for large-scale multiagent systems. Second, techniques to apply such models to performance analysis of conflict resolution strategies have not been investigated.
We address these limitations in the context of DCSPs (distributed constraint satisfaction problems), which is a major paradigm of research on conflict resolution [17, 19, 20]. Before addressing the limitations, we introduce DCSPs and our previous work in which we have illustrated the presence of multiple conflict resolution strategies and showed that cooperative strategies can improve performance in conflict resolution convergence. Our first contribution in this chapter is to illustrate that more strategies exist and that indeed no single strategy dominates all others. Since the best strategy varies over different domains, given a specific domain, selecting the best strategy is essential to gain maximum efficiency. Our second key contribution is to illustrate the use of MTDP to model the performance of different strategies to select the right strategy. To address the limitations in the MTDP modeling introduced above, we first illustrate how DCSP strategies can be modeled in MTDP. Next, to address scale-up issues, we introduce small-scale models called “building blocks” that represent the local interaction among a small group of agents. We discuss several ways to combine building blocks for performance prediction of a larger-scale multiagent system.
2.
Background
DCSP techniques have been used for coordination and conflict resolution in many multiagent applications such as distributed sensor networks [9]. In this section, we introduce the DCSP framework and efficient DCSP strategies.
2.1
Distributed Constraint Satisfaction Problems (DCSPs)
A Constraint Satisfaction Problem (CSP) is commonly defined by a set of n variables, X = {x1, ..., xn}, each associated with a value domain D1, ..., Dn respectively, and a set of constraints, C = {C1, ..., Ck}. A solution in CSP is a value assignment for the variables that satisfies all the constraints in C. A DCSP is a CSP in which the variables and constraints are distributed among multiple agents [19]. Formally, there is a set of m agents, A = {A1, ..., Am}, and each variable xi belongs to an agent Aj. There are two types of constraints, based on whether the variables in a constraint belong to a single agent or not: for a constraint Ck, if all the variables in Ck belong to a single agent Aj, it is called a local constraint; if the variables in Ck belong to different agents in A, it is called an external constraint.
Figure 1-a illustrates an example of a DCSP: each agent (denoted by a big circle) has a local constraint, and there is an external constraint between the two agents' variables. As illustrated in Figure 1-b, each agent can have multiple variables.1 There is no limitation on the number of local/external constraints per agent. Solving a DCSP requires that agents not only satisfy their local constraints, but also communicate with other agents to satisfy external constraints. Note that DCSPs are not concerned with speeding up centralized CSPs via parallelization; rather, the problem is assumed to be originally distributed among agents.
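For concreteness, a minimal encoding of this formulation might look as follows. This is a Python sketch under our own naming assumptions; it is not the chapter's implementation.

```python
# Minimal DCSP data structures: variables, domains, and constraints
# partitioned into local (single-agent) and external (multi-agent) ones.
# Names and representation choices are illustrative assumptions.
from typing import Callable, Dict, List

Assignment = Dict[str, int]  # variable name -> chosen value

class Constraint:
    def __init__(self, variables: List[str],
                 predicate: Callable[[Assignment], bool]):
        self.variables = variables
        self.predicate = predicate

class Agent:
    def __init__(self, name: str, domains: Dict[str, List[int]]):
        self.name = name
        self.domains = domains            # this agent's variables
        self.local_constraints: List[Constraint] = []
        self.external_constraints: List[Constraint] = []

def add_constraint(agents: Dict[str, Agent], owner_of: Dict[str, str],
                   c: Constraint) -> None:
    owners = {owner_of[v] for v in c.variables}
    if len(owners) == 1:
        agents[owners.pop()].local_constraints.append(c)   # local
    else:
        for a in owners:                                   # external
            agents[a].external_constraints.append(c)
```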
2.2
Asynchronous Weak Commitment (AWC) Search Algorithm
The AWC search algorithm is known to be the best published DCSP algorithm [19]. In the AWC approach, agents asynchronously assign values to their variables from domains of possible values, and communicate the values to neighboring agents with shared constraints. Each variable has a non-negative integer priority that changes dynamically during search. A variable is consistent if its value does not violate any constraints with higher priority variables. A solution is a value assignment in which every variable is consistent. To simplify the description of the algorithm, suppose that each agent has exactly one variable and the constraints between variables are binary. When the value of an agent’s variable is not consistent with the values of its neighboring agents’ variables, there can be two cases: (i) a good case, where there exists a consistent value in the variable’s domain; (ii) a nogood case, which lacks a consistent value. In the good case, with one or more value choices available, an agent selects a value that minimizes the number of conflicts with lower priority agents. In the nogood case, an agent increases its priority to max+1, where max is the highest priority of its neighboring agents, and selects a new value that minimizes the number of conflicts with all of its neighboring agents. This priority increase makes previously higher-priority agents select new values. Agents avoid the infinite cycle of selecting non-solution values by saving the nogood situations.
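The value-selection step just described can be sketched as follows, in a simplified Python rendering. It assumes one variable per agent and binary constraints as in the text; the helper names (`conflicts`, `record_nogood`) are hypothetical.

```python
# Sketch of one AWC value-selection step for a single agent.
# Assumes one variable per agent; conflicts(agent, value, agents)
# counts constraint violations of `value` against the given agents'
# current values; both helpers are illustrative assumptions.

def awc_step(agent, neighbors):
    higher = [n for n in neighbors if n.priority > agent.priority]
    lower = [n for n in neighbors if n.priority <= agent.priority]
    consistent = [v for v in agent.domain
                  if conflicts(agent, v, higher) == 0]
    if consistent:
        # Good case: min-conflict heuristic against lower-priority agents.
        agent.value = min(consistent,
                          key=lambda v: conflicts(agent, v, lower))
    else:
        # Nogood case: record the nogood, raise priority above all
        # neighbors, and minimize conflicts with every neighbor.
        agent.record_nogood(neighbors)
        agent.priority = max(n.priority for n in neighbors) + 1
        agent.value = min(agent.domain,
                          key=lambda v: conflicts(agent, v, neighbors))
```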
2.3
Cooperativeness-based Strategies
While AWC is one of the most efficient DCSP algorithms, real-time constraints and dynamism in multi-agent domains demand very fast conflict resolution. Thus, novel strategies were introduced for fast conflict resolution convergence to a complete solution [5]. While AWC relies on the min-conflict heuristic [8] that minimizes conflicts with other agents, the new strategies, enhanced by local constraint communication, consider how much flexibility (choice of values) a selected value gives to other agents. By considering neighboring agents’ local constraints, an agent can generate a more locally cooperative response, potentially leading to faster conflict resolution convergence. The concept of local cooperativeness goes beyond merely satisfying the constraints of neighboring agents to accelerate convergence. That is, an agent cooperates with a neighboring agent by selecting a value for its variable that not only satisfies the constraint with that neighbor but also maximizes the neighbor's flexibility (choice of values). The neighbor then has more choices for a value that satisfies its local constraints and its other external constraints, which can lead to faster convergence. To elaborate this notion of local cooperativeness, the following definitions were given in [5].
Definition 1: For a value v of an agent's variable and a set of neighboring agents B = {A1, ..., Ak}, a flexibility function is defined as f(v, B) = n1 + n2 + ... + nk, where nj is the number of values of Aj that are consistent with v.
Definition 2: For a value v, the local cooperativeness of v is defined as f(v, N), where N is the set of all of the agent's neighbors. That is, the local cooperativeness of v measures how much flexibility (choice of values) v gives to all of the agent's neighbors.
As an example of the flexibility function, suppose an agent has two neighboring agents A1 and A2, where a value v1 leaves 70 consistent values to A1 and 40 to A2, while another value v2 leaves 50 consistent values to A1 and 49 to A2. Assuming that values are ranked based on flexibility, the agent will prefer v1 (flexibility 110) to v2 (flexibility 99). These definitions of the flexibility function and local cooperativeness are applied in the cooperative strategies, defined as follows:
- Each agent selects a value based on the min-conflict heuristic (the original strategy in the AWC algorithm);
- Each agent attempts to give maximum flexibility towards its higher priority neighbors, by selecting a value that maximizes f(v, higher priority neighbors);
- Each agent attempts to give maximum flexibility towards its lower priority neighbors, by selecting a value that maximizes f(v, lower priority neighbors);
- Each agent selects a value that maximizes flexibility to all neighbors, i.e., maximizes f(v, all neighbors).
These four strategies can be applied to both the good and nogood cases. In the nogood case, neighboring agents are grouped into higher and lower agents based on the priorities before the priority increase described in Section 2.2. (Refer to [5] for detailed information.) Therefore, there are sixteen strategy combinations for each flexibility base. Since we will only consider strategy combinations henceforth, we will refer to them as strategies for short. Note that all the strategies are enhanced with constraint communication and propagation. Two exemplar strategies are:
- The original AWC strategy: the min-conflict heuristic is used for both the good and nogood cases.
- A locally cooperative strategy: for the good case, an agent is most locally cooperative towards its lower priority neighbors (note that the selected value still does not violate the constraints with higher priority neighbors); for the nogood case, an agent attempts to be most locally cooperative towards its higher priority neighbors.
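Definition 1 translates directly into code. The sketch below (Python) computes the flexibility of a candidate value with the sum base; the constraint-checking helper `is_consistent` is an assumed placeholder.

```python
# Flexibility of a candidate value v (Definition 1, sum base):
# f(v, B) = n1 + ... + nk, where nj is the number of values of
# neighbor j that are consistent with v. `is_consistent` is an
# assumed helper implementing the shared binary constraint.

def flexibility(v, neighbors, is_consistent):
    return sum(
        sum(1 for w in nb.domain if is_consistent(v, w))
        for nb in neighbors
    )

def most_cooperative_value(agent, neighbors, is_consistent):
    # The strategy that maximizes flexibility towards all neighbors.
    return max(agent.domain,
               key=lambda v: flexibility(v, neighbors, is_consistent))
```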
2.4
Experimental Evaluation
An initial principled investigation of these strategies can improve our understanding not only of DCSP strategies, but can also shed some light on how cooperative an agent ought to be towards its neighbors, and towards which neighbors. To that end, a number of DCSP experiments were performed with an abstract problem setting. Here, agents (variables) are in a 2D grid configuration (each agent is externally constrained with its four neighbors, except for agents on the grid boundary). All agents share an identical binary constraint by which a value in one agent is incompatible with a set of values in its neighboring agent. This grid configuration is motivated by research on distributed sensor networks [9], where multiple agents must collaborate to track targets. For additional justification and domains, refer to [5]. In the experiments, the total number of agents was 512 and each agent had 36 (= 6 × 6) values in its domain. In addition to the external binary constraint, agents could have a unary local constraint that restricts the legal values to a set of randomly selected values among the original 36. The evaluation followed the method used in [19]: performance is measured in terms of cycles consumed until a solution is found, and the experiments ranged over all possible strategies. In Figure 2, for expository purposes, only five strategies are presented, which does not change the conclusions in [5]. The vertical axis plots the number of cycles and the horizontal axis plots the
percentage of locally constrained agents. The performance differences between strategies were shown to be statistically significant by performing statistical tests. A key point to note is that choosing the right strategy has a significant impact on convergence. Choosing poorly (the top line in Figure 2) may lead to a significantly slower convergence rate, while an appropriate choice can lead to significant improvements in convergence. However, we may not need to consider all the strategies when selecting the best one, since a significant performance improvement can already be achieved with a single well-performing strategy.
3.
More Cooperative Strategies
While the novel cooperative strategies of Section 2 improved conflict resolution convergence, two questions remain unanswered: (i) can the same strategy be applied for conflict resolution across different domains?; (ii) in addition to the strategies defined in Section 2, are there more local-cooperativeness-based strategies? In this section, to address these two questions, we first provide more possible DCSP strategies based on local cooperativeness. Second, experimental results in different types of domains are presented. The results show that no single strategy dominates across all domains.
3.1
New Basis for Cooperativeness
While the novel value ordering strategies (proposed in [5]) improved the performance in conflict resolution, the definition of the flexibility function using summation may not be the most appropriate way to compute the local cooperativeness.
Example: suppose we are given two neighboring agents A1 and A2, where a value v1 leaves 99 consistent values to A1 and 1 to A2, while another value v2 leaves 50 consistent values to A1 and 49 to A2. According to the cooperativeness definition in Section 2.3, an agent will prefer v1 (flexibility 100) to v2 (flexibility 99). However, leaving only one value choice to A2 via v1 creates a high chance of future conflicts for A2. Here, we extend the flexibility function to accommodate different types of possible local cooperativeness.
Definition 3: For a value v and a set of agents B = {A1, ..., Ak}, a flexibility function is defined as f_op(v, B) = op(n1, ..., nk), where (i) nj is the number of values of Aj that are consistent with v, and (ii) op, referred to as a flexibility base, can be sum, min, max, product, etc.
With this new definition, for the above example, if op is set to min instead of sum, an agent will rank v2 (min 49) higher than v1 (min 1). We can apply this extended definition of the flexibility function to the cooperative strategies defined in [5]. That is, the cooperative strategy definitions hold, with the only change being in the flexibility function.
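The effect of the flexibility base on value ranking can be seen in a few lines of code. This Python sketch reuses the example's numbers; the per-neighbor counts are given directly rather than computed from constraints.

```python
# Comparing flexibility bases on the example above: v1 leaves
# (99, 1) consistent values to the two neighbors, v2 leaves (50, 49).
from math import prod

bases = {"sum": sum, "min": min, "max": max, "product": prod}
counts = {"v1": (99, 1), "v2": (50, 49)}

for name, op in bases.items():
    ranked = max(counts, key=lambda v: op(counts[v]))
    print(f"base={name:7s} prefers {ranked}")
# sum prefers v1 (100 > 99); min prefers v2 (49 > 1).
```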
3.2
New Experimental Evaluation
To evaluate these extended strategies, a number of DCSP experiments were performed with the problem setting used in [5]. To test the performance of the strategies in various problem settings, we introduce a variation in
the problem setting by modifying the external constraint: in the new setting, each agent gives less flexibility (choice of values) towards its neighboring agents for any value in its domain. Henceforth, the previous setting is referred to as the high flexibility setting, and the new setting, with less choice of values for neighbors, is referred to as the low flexibility setting. Experiments were performed in the two problem settings (the high flexibility setting and the low flexibility setting), and both sum and min were used as flexibility bases. Other than this external constraint variation, the parameters for the experiments remain the same. Each data point in the figures was averaged over 500 test runs. While all the possible strategies for each flexibility base were tried, for expository purposes only five strategies are presented in Figures 3 and 4, which does not change our conclusions. Figure 3 shows the case where min is used as the flexibility base in the high flexibility setting. Here, note that one strategy shows much worse performance than the others in some cases. Overall, in this domain, strategies with min as the flexibility base do not improve performance over the strategies with sum as the flexibility base. Figure 4 shows the case where sum is used as the flexibility base, but the problem instances are generated in the low flexibility setting. Note that the experimental results shown in Figure 2 were also based on sum as the flexibility base, but they were from the high flexibility setting. The significant difference between the high flexibility setting (Figure 2) and the low flexibility setting (Figure 4) is that the strategy that dominated the high flexibility setting is no longer the overall dominant strategy in the low flexibility setting. Furthermore, a strategy that performed well in the high flexibility setting showed worse performance than the original AWC strategy, in particular when the percentage of locally constrained agents is low. These experimental results show that different flexibility bases also have an impact on performance. For instance, choosing min as the flexibility base instead of sum may degrade performance for some cooperative strategies. Furthermore, these results clearly show that choosing the right strategy for a given domain has a significant impact on convergence. For example, when 80% of the agents are locally constrained, two strategies showed the same performance in the high flexibility setting (Figure 2), yet in the low flexibility setting (Figure 4) one showed a tenfold speedup over the other. To check the statistical significance of the performance differences, a two-tailed t-test was done for each pair of strategies at each percentage of locally constrained agents. The null hypothesis for each pair of strategies, that there was no difference between the two strategies in average cycles, was rejected, i.e., the difference is significant, with p-value < 0.01 in each case. To conclude, no single strategy dominates across different domains. As shown above, the best strategy in one domain could produce ten times worse performance
than others in another domain. Thus, to gain maximum efficiency in conflict resolution, it is essential to predict the right strategy for a given domain.
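The pairwise significance test described above is straightforward to reproduce. A sketch with Python and SciPy; the cycle arrays are placeholders standing in for the 500 runs per data point.

```python
# Two-tailed t-test between two strategies' cycle counts at one
# percentage of locally constrained agents. The arrays below are
# placeholders; in the experiments each holds 500 runs.
from scipy.stats import ttest_ind

cycles_strategy_a = [112, 98, 130, 105, 121]   # ... 500 values
cycles_strategy_b = [54, 61, 49, 70, 58]       # ... 500 values

t_stat, p_value = ttest_ind(cycles_strategy_a, cycles_strategy_b)
if p_value < 0.01:
    print(f"difference significant (p = {p_value:.4f})")
```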
3.3
Long tail distribution in Convergence
The key factor in determining the performance of strategies is in the long tail, where only a small number of agents are in conflict. Figure 5 shows the number of conflicts at each cycle for two different strategies. At the beginning of conflict resolution, both strategies show similar performance in resolving conflicts. However, the performance difference appears in the long tail: while one strategy quickly solves the given problem, the other has a long tail with a small number of conflicts
remaining unresolved. This type of long tail distribution has also been reported in many constraint satisfaction problems [4].
4.
Performance Analysis
Section 3 shows that, given the dynamic nature of multiagent systems, predicting the right strategy to use in a given domain is essential to maximize the speedup of conflict resolution convergence, and the critical factor for strategy performance is in the long tail part where a small number of conflicts exist. Here, we provide formal models for performance analysis and the mapping of DCSP onto the models, and present the results of performance prediction.
4.1
Distributed POMDP-based Model
As a formal framework for strategy performance analysis, we use a distributed POMDP model called the MTDP (Multiagent Team Decision Process) model [15]. The MTDP model has been proposed as a framework for teamwork analysis. A distributed POMDP-based model is an appropriate formal framework for modeling strategy performance in DCSPs, since it has distributed agents and the agent view in DCSPs (other agents’ values, priorities, etc.) can be modeled as observations. In DCSPs, the exact state of the system is only partially observable to an agent, since the information it receives is limited to its neighboring agents. Therefore, there is a strong correspondence between DCSPs and distributed POMDPs. While we focus on the MTDP model in this chapter, other distributed POMDP models such as DEC-POMDP [1] could be used. Here, we illustrate the actual use of the MTDP model in analyzing DCSP strategy performance. The MTDP model provides a tool for varying key domain parameters to compare the performance of different DCSP strategies, and thus select the most appropriate strategy in a given situation. We first briefly introduce the MTDP model; refer to [15] for more details. 4.1.1 MTDP model. The MTDP model involves a team of agents operating over a set of world states during a sequence of discrete instants. At each instant, each agent chooses an action to perform, and the actions are combined to effect a transition to the next instant’s world state. Borrowing from distributed POMDPs, the current state is not fully observed/known, and transitions to new world states are probabilistic. Each agent makes its own observations to compute its own beliefs, and the performance of the team is evaluated based on a joint reward function over world states and combined actions. More formally, an MTDP for a team of n agents is a tuple <S, A, P, Ω, O, R>. S is a set of world states. A = A1 × ... × An is the set of combined actions, where Ai is the set of agent i's actions. P controls the effect of agents’ actions in a dynamic environment: P(s, a, s') gives the probability of transitioning from world state s to s' under combined action a.
Ω = Ω1 × ... × Ωn is the set of combined observations. The observation function O specifies a probability distribution over the combined observations of the agent team, given the world state and combined action. Each agent's belief state is derived from its observations. R is a reward function over world states and combined actions. A policy in the MTDP model maps individual agents’ belief states to actions. A DCSP strategy is mapped onto a policy in the model; thus, we compare strategies by evaluating policies in this model. Our initial results from policy evaluation in this model match the actual experimental strategy performance results shown before. Thus, the model could potentially form the basis for predicting strategy performance in a given domain. 4.1.2 Mapping from DCSP to MTDP. In a general mapping, the first question is selecting the right state representation for the MTDP. One typical state representation could be a vector of the values of all the variables in a DCSP. However, this representation leads to a huge state space. For instance, if there are 10 variables (agents) and 10 possible values per variable, the number of states is 10^10. To avoid this combinatorial explosion in state space, we use an abstract state representation in the MTDP model. In particular, as described in the previous section, each agent can be abstractly characterized as being in a good or nogood state in the AWC algorithm. We use this abstract characterization in our MTDP model. Henceforth, the good state and the nogood state are denoted by G and N respectively. The initial state of the MTDP is the state where all agents are in the G state, since, in the AWC, an agent finds no inconsistency in its initial value until it receives the values of its neighboring agents. Note that, for simplicity, the case where agents have no violation is not considered. In this mapping, the reward function R is treated as a cost function. The joint reward (cost) is proportional to the number of agents in the N state. This reward is used for strategy evaluation based on the fact that a better performing strategy has less chance of forcing neighboring agents into the N state: as a DCSP strategy performs worse in a given problem setting, more agents will be in N states. In the AWC algorithm (the base DCSP algorithm in this chapter), each agent receives observations only about the states of its neighboring agents and its own current state. Thus, the world is not individually observable, but rather collectively observable (in the terminology of Pynadath and Tambe [15]), and hence the mapping does not directly reduce to an MDP. Initially, we assume that these observations are perfect (no message loss in communication); this assumption can be relaxed in future work with unreliable communication. The good-case and nogood-case value-selection strategies in DCSPs are mapped onto the agents' actions in the MTDP model. A DCSP strategy provides a local policy for each agent in the MTDP model; e.g., a strategy combination implies that each agent selects the action corresponding to its good-case strategy when its local state is good, and the action corresponding to its nogood-case strategy when its local state is nogood. The state transition in the MTDP model is controlled by an agent’s own action as well as its neighboring agents’ actions; the transition probabilities can be derived from the DCSP simulation.
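A minimal rendering of this abstraction might look as follows. This Python sketch follows the state encoding and cost convention described above, while the function names are our own.

```python
# Abstract DCSP->MTDP mapping: each agent is G (good) or N (nogood),
# and the joint cost is proportional to the number of N agents.
# Encoding and names are illustrative assumptions.
from typing import Tuple

State = Tuple[str, ...]  # e.g., ('G', 'N', 'G') for three agents

def initial_state(n_agents: int) -> State:
    # All agents start in G: no inconsistency is seen until
    # neighbors' values arrive.
    return ('G',) * n_agents

def cost(state: State) -> int:
    # Reward treated as cost: proportional to agents in N.
    return sum(1 for s in state if s == 'N')

def local_policy(local_state: str, strategy: Tuple[str, str]) -> str:
    # A DCSP strategy pairs a good-case and a nogood-case action.
    good_action, nogood_action = strategy
    return good_action if local_state == 'G' else nogood_action
```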
4.1.3 Building Block. While the abstract representation in the mapping above reduces the problem space, for a large-scale multiagent system, if we were to model the belief states of each agent regarding the state of the entire system, the problem space would be enormous even with the abstract representation. For instance, the number of states in the MTDP model for a system with 512 agents would be 2^512, since each agent can be either G or N. To further reduce the combinatorial explosion, we use small-scale models, called building blocks. Each building block represents the local situation among five agents in the 2D grid configuration. In the problem domains for the experiments shown in Sections 2.4 and 3.2, each agent’s local situation depends on whether it has a unary local constraint or not: each agent is either constrained (a portion of its original domain values are not allowed) under a local constraint (C) or unconstrained (U). Figure 6 illustrates some exemplar building blocks for the domain used in the experiments. For instance, Figure 6-a represents a local situation where all five agents are constrained (C), while Figure 6-b represents a local situation where the agent on the left side is unconstrained (U) but the other four agents are locally constrained (C). Note that, when the percentage of locally constrained agents is high, most building blocks will be the one shown in Figure 6-a, and a small portion will be like the ones shown in Figures 6-b, 6-c, and 6-d. As the percentage of locally constrained agents decreases, more building blocks include unconstrained agents (U), as shown in Figures 6-e and 6-f.
In each building block, as shown in Figure 7, a middle agent is surrounded by the other four neighboring agents. Thus, the state of a building block can be represented as a tuple of the five local states (e.g., <G, G, G, G, G> if all five agents are in the good (G) state). There are 32 (= 2^5) states in a building block of the MTDP model, and the initial state of a building block is <G, G, G, G, G>. Agents’ actions cause transitions from one state to another. For instance, if the agents are in state <G, G, G, G, G> (Figure 7-a) and all choose the same action, there is a certain transition probability that the next state will be the one in Figure 7-b, in which only the third agent is forced into the nogood (N) state. However, the agents may also transition to the state in Figure 7-c, in which only the fourth agent enters the N state. One may argue that these small-scale models are not sufficient for performance analysis of the whole system, since they represent only local situations. However, as seen in Figure 5, the key factor in determining the performance of strategies is in the long tail, where only a small number of agents are in conflict. Therefore, the performance of a strategy is strongly related to the local situation, where a conflict may or may not be resolved depending on the local agents’ actions. That is, without a model of the whole system, small-scale models of local interaction can suffice for performance analysis. Furthermore, while this simple model may appear limiting at first glance, it has already shown the promising results presented below.
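A building block is small enough to enumerate exhaustively. The sketch below (Python) enumerates the 32 states and evaluates the expected cost of a fixed policy over a finite horizon; the transition function is a placeholder to be estimated from DCSP simulation runs, as the text describes.

```python
# Enumerate the 2^5 = 32 states of a five-agent building block and
# evaluate a fixed policy's expected cost over a finite horizon.
# transition(state, actions) must return a distribution over next
# states; here it is a placeholder to be fitted from DCSP simulations.
from itertools import product as cart_product

STATES = list(cart_product('GN', repeat=5))  # 32 tuples like ('G','N',...)
INITIAL = ('G',) * 5

def cost(state):
    return sum(1 for s in state if s == 'N')  # cost = number of N agents

def expected_cost(policy, transition, horizon):
    # Forward simulation of the state distribution (a Markov chain
    # under a fixed policy), accumulating expected cost per step.
    dist = {INITIAL: 1.0}
    total = 0.0
    for _ in range(horizon):
        total += sum(p * cost(s) for s, p in dist.items())
        nxt = {}
        for s, p in dist.items():
            actions = tuple(policy(ls) for ls in s)
            for s2, q in transition(s, actions).items():
                nxt[s2] = nxt.get(s2, 0.0) + p * q
        dist = nxt
    return total
```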
4.1.4 Building Block Composition. While the building blocks are the basis for performance analysis, we must deal with the multiple building blocks that can exist in a given domain. Each building block has a different impact on conflict resolution convergence. The local interaction within a single building block is not expected to totally determine the performance of strategies; rather, the interactions between building blocks have a great impact on strategy performance. We propose four methods of building block composition to evaluate MTDP policies (mappings of DCSP strategies), as follows:
Single block: for a given domain, a single building block is selected: the block with the highest probability of producing nogood (N) cases within it. The performance of a strategy is evaluated based on the value of the initial state of the building block (<G, G, G, G, G>) given its mapped policy: the lower the initial state value, the better the policy (strategy).
Simple sum: for a given domain with multiple building blocks, we compute the value of each building block’s initial state. Performance evaluation of a policy is based on the summation of the initial state values of the multiple building blocks in the domain.
Weighted sum: given multiple building blocks, for each building block we compute the ratio of that building block in the domain and the value of its initial state. Performance evaluation of a policy is based on the weighted sum of the initial state values, where the weight is the ratio of each building block.
Interaction: a sequence of building blocks is considered, since the performance difference comes from the long tail part (shown in Figure 5), and their interaction is taken into account: an agent may force its neighboring agents into the nogood (N) state given an action under the policy being evaluated. For instance, two blocks, Figure 6-(b) and Figure 6-(c), may interact side by side, so that the rightmost C agent of Figure 6-(b) interacts with the leftmost C agent of Figure 6-(c). With this interaction between the two building blocks, the state of the rightmost C agent of Figure 6-(b) and its policy influence the probability that the leftmost C agent of Figure 6-(c) starts in the N state; without such interaction, it would always start in the G state.
For the interaction method, we do not have arbitrary degrees of freedom: different domains can share common building blocks, and for common building blocks the same within-block transition probabilities are applied. As we move from the first method (single block) to the fourth (interaction), we gradually increase the complexity of composition. The accuracy of performance prediction with these methods is presented in the next section. Note that the focus of our building block composition is not on computing an optimal policy, but on matching the long-tailed phenomenon shown in Figure 5. Thus, our interactions essentially imply that neighboring blocks affect each other in terms of the values of the policies being evaluated; we are not computing optimal policies that cross building block boundaries.
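The first three composition methods reduce to a few lines over per-block initial-state values. A Python sketch; `block_values` stands for the initial-state value of each block under the evaluated policy and is an assumed input.

```python
# Composition of building-block evaluations. `block_values` maps a
# block type (e.g., 'CCCCC', 'UCCCC') to the value of its initial
# state under the policy being evaluated; `ratios` gives each block
# type's share of the domain. Both are assumed inputs.

def single_block(block_values, worst_block):
    # Evaluate only the block most likely to produce nogood cases.
    return block_values[worst_block]

def simple_sum(block_values):
    return sum(block_values.values())

def weighted_sum(block_values, ratios):
    return sum(ratios[b] * v for b, v in block_values.items())

# The interaction method additionally conditions each block's initial
# state on its neighbor's boundary agent, so it requires the chained
# Markov-chain evaluation rather than a closed-form combination.
```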
4.2
Performance Prediction
To check whether the MTDP-based model can effectively predict the performance of strategies, performance analysis is done (based on the four methods of building block composition in Section 4.1.4), and the performance evaluation results from the MTDP model are compared with the real experimental results presented in the previous sections. Before showing the performance analysis results, we present the complexity of DCSP strategy performance evaluation. The complexity analysis presented below shows that observability conditions within a building block can be exploited to further reduce the complexity of policy evaluation, which makes our building block based approach more useful in practice by saving computation overhead. 4.2.1 Complexity of Performance Evaluation. Policy evaluation for a finite horizon MTDP is a computationally expensive problem, since actions are indexed by observation histories. In the worst case, the computational complexity of evaluating a single policy grows exponentially with the horizon in |S| and |Ω|, the number of states and observations respectively. However, in environments where the given policy is based only on agents’ local states, we prove that evaluating the policy reduces to evaluating a time homogeneous Markov chain, which entails low computation overhead. To elaborate, we first introduce the following definitions, followed by a theorem that proves the MTDP reduction to a Markov chain.
Markov chain ([14]): Let {X_t : t = 0, 1, 2, ...} be a sequence of random variables which assume values in a discrete (finite or countable) state space S. The sequence is called a Markov chain if Pr(X_{t+1} = j | X_0 = i_0, ..., X_{t-1} = i_{t-1}, X_t = i) = Pr(X_{t+1} = j | X_t = i) for all t and all states i_0, ..., i_{t-1}, i, j in S.
Time homogeneous Markov chain ([14]): A Markov chain is homogeneous if, for all t, Pr(X_{t+1} = j | X_t = i) does not depend on t.
Local state: the subset of features of the world state that affect the observation of an agent. A world state s in the MTDP is factored as s = <s_1, ..., s_n>, where s_i is the local state of agent i.
Local observability: each individual agent's observation uniquely determines its local state; i.e., for each observation ω_i there is a unique local state s_i such that the observation arises only in that local state.
Based on the above definitions, we show that, under certain assumptions, an MTDP reduces to a Markov chain, leading to significant computation cost savings.
Theorem: Given an MTDP <S, A, P, Ω, O, R> and a fixed policy, the MTDP reduces to a time homogeneous Markov chain on S if the following assumptions hold: Assumption 1: the environment is locally observable; Assumption 2: each agent's domain-level policy is a function of its current local state only.
Proof sketch: Assume a sequence of random variables {S_t} over the states S. First, we show that the conditional probability distribution of S_{t+1} depends only on S_t, that is, Pr(S_{t+1} | S_0, ..., S_t) = Pr(S_{t+1} | S_t). Given that the world is Markovian, the probability distribution of S_{t+1} can be computed by P(S_t, a_t, S_{t+1}), where a_t = <a_t^1, ..., a_t^n> is the joint action and a_t^i is agent i's individual action selected at state S_t. Each action is selected by the domain-level policy applied to the agent's belief state, which at state S_t is its observation history <ω_1^i, ..., ω_t^i>, where ω_t^i is agent i's observation at S_t. Because of Assumption 1 (the local state is uniquely determined by the observation) and Assumption 2 (the action is determined by the current local state only), a_t^i is a function of the local state s_t^i, which is uniquely determined by ω_t^i and hence by the current world state S_t. Therefore, because the policy is fixed, the joint action a_t is a fixed function of S_t, and Pr(S_{t+1} | S_0, ..., S_t) = P(S_t, a_t(S_t), S_{t+1}) = Pr(S_{t+1} | S_t): the conditional distribution of S_{t+1} depends only on S_t. To prove that the Markov chain is time homogeneous, we show that the transition between S_t and S_{t+1} is independent of the index t: Pr(S_{t+1} = s' | S_t = s) = P(s, a(s), s'), where P is defined by the MTDP and a(s) is determined by the state s as shown above; neither depends on t. Therefore the Markov chain is time homogeneous, and the reduction from an MTDP to a time homogeneous Markov chain is proved.
The above theorem shows that the performance evaluation of the DCSP strategies defined in Section 2.3 can be reduced to the evaluation of a Markov chain, because the DCSP strategies and their environment satisfy the theorem's assumptions as follows: DCSP strategies are based only on the local state (whether an agent is in the good (G) or the nogood (N) state); e.g., a strategy combination indicates that one component strategy is applied for the G state and the other for the N state.
The environment is locally observable, since each agent can determine its own state based on its value and the communicated values (observations) of its neighboring agents. Thus, performance evaluation of the DCSP strategies (defined in Section 2.3) can be done in O(|S|^3), where |S| is the number of states in a given MTDP, since their corresponding policies can be evaluated by the same method as value determination for an MDP: value determination in an MDP can be done in O(|S|^3) by solving a system of linear equations in which the number of variables is the number of states [6].
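Under the theorem, a policy's value can be obtained by solving the linear system of the induced Markov chain. A sketch with Python and NumPy; the use of a discount factor is one standard rendering of value determination and is our assumption, and the transition matrix and cost vector are placeholders estimated from simulation.

```python
# Value determination for the Markov chain induced by a fixed policy:
# solve V = c + gamma * T V, i.e., (I - gamma*T) V = c, which costs
# O(|S|^3). T (transition matrix) and c (per-state cost, e.g., the
# number of N agents) are placeholders estimated from simulation.
import numpy as np

def policy_value(T: np.ndarray, c: np.ndarray, gamma: float = 0.95):
    n = T.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * T, c)

# Example with the 32 building-block states:
# T = estimated_transitions(policy)   # shape (32, 32), rows sum to 1
# c = np.array([s.count('N') for s in STATES], dtype=float)
# v0 = policy_value(T, c)[STATES.index(INITIAL)]  # lower is better
```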
4.2.2 Analysis of Performance Prediction. While there can be various problem domains, in this initial investigation we focus on two special cases, where the percentage of locally constrained agents is 90% and 50% respectively (either most agents are locally constrained or half are).2 These two cases are considered for both the high flexibility setting and the low flexibility setting. Among the possible strategies defined in Section 3.1, we focus on the strategies with sum as the flexibility base that were selected in Section 2 for expository purposes. Figure 8 shows the results when 90% of the agents are locally constrained in the low flexibility setting. Figure 8-(a) (top) shows the experimental results for this case, and Figure 8-(b) (bottom) shows the performance evaluation from the MTDP model using the interaction composition method. In the performance evaluation, a lower value means better performance (indicating that fewer cycles will be needed until a solution is found). Figure 8-(a) and (b) show that the performance prediction results match the real experimental results in the problem setting considered here: the patterns of the column graphs in Figure 8-(a) and (b) correspond to each other. That is, the strategies that showed better performance in the experiments are also expected to perform better according to the performance prediction. Note that the current goal of performance prediction is mainly to predict the best performing strategy, not to predict the magnitude of the speedup difference among the given strategies.
A question remains: can the other building block composition methods provide the same predictive power as the interaction method? Figure 9 shows the performance prediction results with different composition methods for the same problem domain used in Figure 8. The rightmost part is for the interaction method shown in Figure 8-(b). The result from the single block method does not match at all. While the simple sum and weighted sum methods provide a little improvement over the single block method, they are far from matching the real experimental results (Figure 8-(a)). That is, the composition method that considers the interaction between building blocks shows the best prediction results. Note that 5 building blocks were combined in each prediction (except for the single block method, where only one selected block was used), while ensuring that the ratio of U (unconstrained) and C (locally constrained) agents was what was required for each domain. Figure 10 also shows the performance prediction results in different domains using the interaction method of building block composition. Here, the performance analysis distinguishes better performing strategies from worse performing ones, and this difference is statistically significant: the correlation coefficient between the speedup in cycles and the difference in performance evaluations was 0.83. Furthermore, the hypothesis that there is no such correlation was rejected with a p-value of 0.0001. This result illustrates that the MTDP model can be used to predict the right strategy to apply in a given situation (possibly with low computation overhead). That is, given a new domain, agents can analyze different strategies with the simple MTDP model and select the right strategy for the new domain without running a significant number of problem instances for each strategy.
Furthermore, this approach will enable agents to flexibly adapt their strategies to changing circumstances. More generally, this result indicates a promising direction for performance analysis in DCSPs, and potentially other multiagent systems.
5.
Related Work and Conclusion
Significant work in multiagent learning has focused on learning to select the right coordination strategy [13, 3]. While this goal is related to ours of choosing the right strategy, one key difference is that the learning work focuses on enabling each individual agent to select a strategy. Our focus is on the complementary goal of predicting the overall performance of the entire multiagent system, assuming homogeneous conflict resolution strategies.
In centralized CSPs, performance prediction for different heuristics has been investigated; however, these methods are based on estimating the number of nodes to be expanded during search [7]. This approach is not applicable to DCSPs, since multiple agents simultaneously investigate their search spaces. Theoretical investigations of heuristic performance have also been done in centralized CSPs [8, 10], but no theoretical investigation has been done for performance prediction in DCSPs. While there is related work on methods for composing subproblem solutions in MDPs [2, 11] and in POMDPs [12], we are interested in applying these composition techniques for performance modeling, not for computing an optimal policy. For instance, our techniques are heavily influenced by the need to capture the long-tailed phenomena in conflict resolution. To conclude, in this chapter the recently emerging distributed POMDP frameworks, such as MTDP, were used to create performance models for conflict resolution strategies in multiagent systems. To address issues of scale-up, we used small-scale models, called building blocks, that represent the local interaction among a small group of agents. We discussed several ways to combine building blocks for the performance prediction of larger-scale multiagent systems. These approaches were presented in the context of DCSPs (distributed constraint satisfaction problems), where we first showed that there is a large bank of conflict resolution strategies and no strategy dominates all others across different domains. By modeling and combining building blocks, we were able to predict the performance of five different DCSP strategies for four different domain settings, for a large-scale multiagent system. Thus, our approach points the way to new tools for strategy analysis and performance modeling in multiagent systems in general.
Notes
1. For simplification, we assume each agent has only one variable.
2. There is no significant performance difference in the 0% case.
References
[1] D. S. Bernstein, S. Zilberstein, and N. Immerman. The complexity of decentralized control of MDPs. In Proceedings of the International Conference on Uncertainty in Artificial Intelligence, 2000.
[2] T. Dean and S. Lin. Decomposition techniques for planning in stochastic domains. In Proceedings of the International Joint Conference on Artificial Intelligence, 1995.
[3] C. Excelente-Toledo and N. Jennings. Learning to select a coordination mechanism. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, 2002.
[4] C. Gomes, B. Selman, N. Crato, and H. Kautz. Heavy-tailed phenomenon in satisfiability and constraint satisfaction problems. Journal of Automated Reasoning, 24, 2000.
[5] H. Jung, M. Tambe, and S. Kulkarni. Argumentation as distributed constraint satisfaction: Applications and results. In Proceedings of the International Conference on Autonomous Agents, 2001.
[6] M. Littman, T. Dean, and L. P. Kaelbling. On the complexity of solving Markov decision problems. In Proceedings of the International Conference on Uncertainty in Artificial Intelligence, 1995.
[7] L. Lobjois and M. Lemaitre. Branch and bound algorithm selection by performance prediction. In Proceedings of the Seventeenth National Conference on Artificial Intelligence, 1998.
[8] S. Minton, M. D. Johnston, A. Philips, and P. Laird. Solving large-scale constraint satisfaction and scheduling problems using a heuristic repair method. In Proceedings of the National Conference on Artificial Intelligence, 1990.
[9] P. Modi, H. Jung, M. Tambe, W. Shen, and S. Kulkarni. A dynamic distributed constraint satisfaction approach to resource allocation. In Proceedings of the International Conference on Principles and Practice of Constraint Programming, 2001.
[10] R. Musick and S. Russell. How long will it take? In Proceedings of the National Conference on Artificial Intelligence, 1992.
[11] R. Parr. Flexible decomposition algorithms for weakly coupled Markov decision problems. In Proceedings of the International Conference on Uncertainty in Artificial Intelligence, 1998.
[12] J. Pineau, N. Roy, and S. Thrun. A hierarchical approach to POMDP planning and execution. In Proceedings of the ICML Workshop on Hierarchy and Memory in Reinforcement Learning, 2001.
[13] N. Prasad and V. Lesser. The use of meta-level information in learning situation-specific coordination. In Proceedings of the International Joint Conference on Artificial Intelligence, 1997.
[14] M. L. Puterman. Markov Decision Processes. John Wiley & Sons, 1994.
[15] D. Pynadath and M. Tambe. The communicative multiagent team decision problem: Analyzing teamwork theories and models. Journal of Artificial Intelligence Research, 2002.
[16] O. Rana. Performance management of mobile agent systems. In Proceedings of the International Conference on Autonomous Agents, 2000.
[17] M. C. Silaghi, D. Sam-Haroud, and B. V. Faltings. Asynchronous search with aggregations. In Proceedings of the National Conference on Artificial Intelligence, 2000.
[18] P. Xuan and V. Lesser. Multi-agent policies: From centralized ones to decentralized ones. In Proceedings of the International Conference on Autonomous Agents, 2002.
[19] M. Yokoo and K. Hirayama. Distributed constraint satisfaction algorithm for complex local problems. In Proceedings of the International Conference on Multi-Agent Systems, 1998.
[20] W. Zhang and L. Wittenburg. Distributed breakout revisited. In Proceedings of the National Conference on Artificial Intelligence, 2002.
INDEX AAII methodology in multi-agent systems, 141–143 AALAADIN model, 143–144 Adaptive organizational capabilities within MAS application characteristics, 175–198 decision procrastination, 189–191 decision-making framework, experiments, 179–186 decision-making interaction style consensus, 180–181 effect of, 193–194 implementation of, 180–181 locally autonomous, 180 master/command-driven, 180 experiment domain, 179–180 experiment testbed, 181–182 experimental variables, 182–186 frequency allocations for experiment, 186 global decision-making framework, 175–198 graphical conventions, explanation of, 186–187 model sharing, 184 position sensing, 187–189 risk aversion, 191–193 Aircraft service team coordination, 54–76 application, 54–56 coordination, 59 via keys, 61–69 deadlines, 59, 60 dynamic aircraft readiness, 58–61 dynamic distributed scheduling, 60 interdependent tasks, 60 quantified/value decisions, 59 simulation environment, 55 status display, 55 Alarms, 45 Alerts, 45 Asymmetric network odd loop, 87 symmetric networks, contrasted, 87–88 Asynchronous weak commitment search algorithm, distributed constraint
satisfaction problem-based approach for large scale multi-agent systems, 224 Autonomous spacecraft coordination, 7–26 autonomous operations, 19–21 autonomy technology interactions, 20 combining signals, 12 Earth Observer-1 following landsat 7, 19 execution, 21–22 coordinated measurement, 21 failure, 22 local resources, 21 recovery, 22 shared resources, 21 uncertainty, 22 ground operations, 14–19 high accuracy geolocation, 14 interferometer, 10 locations of phenomena, 9 mission concept, TechSat-21,14 multiple platform, 9–14 multiple rationales, 13–14 ORBCOMM communications structure, 16 planning/scheduling, 22–23 communication constraints, 22 computation constraints, 22–23 cooperation/negotiation, 23 failure, 23 local activities, 22 metrics, 23 recovery, 23 shared activities, 22 uncertainty, 23 science analysis, 23–24 communication constraints, 23 computation constraints, 23 cooperation/negotiation, 24 metrics, 24 signal combination, 12–13, 18–19 signal separation, 10–11, 17–18 signal space coverage, 11–12, 16–17 spacecraft interferometer, 17 spacecraft operation model, 15 StarLight Mission, 17
Boeing 737, collaborative design, 91
Boeing 777, collaborative design, 91
Boeing Sonic Cruiser, collaborative design, 92
Caregiver system, 43–54
Centralized coordination pacing, multiplexing, 49
Centralized vs. decentralized coordination, 41–76
  aircraft service team coordination, 54
    application, 54–56
    coordination, 59
    coordination via keys, 61–69
    deadlines, 59, 60
    dynamic aircraft readiness, 58–61
    dynamic distributed scheduling, 60
    interdependent tasks, 60
    quantified/value decisions, 59
    simulation environment, 55
  I.L.S.A., 43–54
    alarms, 45
    alerts, 45
    application, 43–44
    categories of responses, 45
    centralized coordination pacing, 49
    computational predictability, 50
    control algorithm, pseudocode, 50
    flexibility, 49
    global constraints, 49
    multiplexing, 49
    normal operations mode, 53
    notifications, 45
    nuisance factor, 49
    reminders, 45
    response coordinator, 51
    response planning, 44–54
Cluster augmentation “on-demand,” 14
Collaborative design, 77–94
  asymmetric network, odd loop, 87
  Boeing 737, 91
  Boeing 777, 91
  Boeing Sonic Cruiser, 92
  coloured Petri nets, interaction protocols using, 97–98
  complex systems research, 81–90
  defining collaborative design, 78–80
  destinator role, recruiting interaction protocol for, 105
  dynamics, 77–78
  e-business applications, 105–108
  environmental emergency systems, interaction protocols for, 101–105
  implementation, 108–110
  imprinting, 89–90
  influences, defining, 83–84
  initiator role
    recruiting interaction protocol for, 103
    request interaction protocol for, 98
  interaction protocol, request, 98–101
  linear vs. non-linear networks, 84–87
  model of, 79
  multiple optima utility function, 85
  networks, stuck in local optima, 82
  odd loop, dynamic attractor for, 88
  participant role, request interaction protocol for, 99
  Petri net, with conversation, policy levels, 109
  Pit Game, 105
  Pit Game interaction protocol
    for dealer role, 106
    for player role, 107
  recruiter role, recruiting interaction protocol for, 104
  request transition
    fulfilling, 100
    receiving, 100
  subdivided network, example of, 89
  subdivided networks, 89
  symmetric vs. asymmetric networks, 87–88
  target agents’ role, recruiting interaction protocol for, 105
  top-level process model, 102
  utility function, 79
Coloured Petri nets, interaction protocols using, 97–98
Communication constraints, spacecraft operations, 22, 23
Computation constraints, spacecraft operations, 22–23
Computational predictability, 50
Consumer agents, 36
Consumers, 29
Consumption, logistics network management, 31
Contract Net protocol, 122
Control algorithm, pseudocode, 50
Cooperation, negotiation, spacecraft operations, 23
Coordination mechanism application, dependency relationships, 199–220
  DECAF agent architecture, 207
    task structure, 209
  environments, applying mechanisms under, 213–217
  experiment results, analysis, 210–212
  experimental framework, 209–210
  experimental results, 209–212
  future work in, 218–219
  implementation, 207–208
  mechanisms, 203–207
    avoidable dependency, 204
    constant headway/timetabling, 207
    demotion shift dependency, 205
    polling result, coordination by, 206–207
    reservation, coordination by, 205
    sacrifice avoidable dependency, 205
    sending result, coordination by, 206
  performances of coordination mechanisms, 211, 212, 213
  task structures, 201–203
Coordination stress, in scaling-up agent-based systems, 115
  agent population properties, 115–117
  solution properties, 119–120
  task-environment properties, 117–118
DECAF agent architecture, 199–220
  large scale MAS, performance models, 221–244
Decentralized coordination, centralized coordination, compared, 41–76. See also Centralized coordination
Defining collaborative design, 78–80
Degree of interaction, in scaling-up agent-based systems, 123–124
Dependency relationships, applying coordination mechanisms for, 199–220
  coordination mechanisms, 203–207
    avoidable dependency, 204
    constant headway/timetabling, 207
    demotion shift dependency, 205
    polling result, coordination by, 206–207
    reservation, coordination by, 205
    sacrifice avoidable dependency, 205
    sending result, coordination by, 206
  DECAF agent architecture, 207
    task structure, 209
  environments, applying mechanisms under, 213–217
  experiment results, analysis, 210–212
  experimental framework, 209–210
  experimental results, 209–212
  future work in, 218–219
  implementation, 207–208
  performances of coordination mechanisms, 211, 212, 213
  task structures, 201–203
Design, collaborative. See Collaborative design
Destinator role, recruiting interaction protocol for, 105
Distributed constraint satisfaction problem-based approach for large scale multi-agent systems, 221–244
  asynchronous weak commitment search algorithm, 224
  complexity of performance evaluation
    local observability, 236
    time homogeneous Markov chain, 236
  cooperativeness, new basis for, 227–228
  cooperativeness-based strategies, 225–226
  distributed POMDP-based model, 231–235
    building block, 233–235
    interaction, building block, 235
    Markov team decision process model, 231–232
    simple sum, building block, 235
    single block, 235
    weighted sum building block, 235
  experimental evaluation, 226–230
  long tail distribution in convergence, 230–231
  performance analysis, 231–241
  performance prediction, 236
    analysis of performance prediction, 238–241
    complexity of performance evaluation, 236–238
    local state, 236
    Markov chain, 236
    theorem, 237
  related work, 241–242
Distribution
  logistics network management, 31
  in scaling-up agent-based systems, 125–126
Distributivity, roles, 136–137
Distributors, 29
District heating systems domain, 27–40
Dynamic attractor for odd loop, 88
Dynamism, in scaling-up agent-based systems, 124–125
Earth Observer-1 following Landsat 7, 19
E-business applications, agent interaction, 105–108
Elder caregiver system, 43–54
Environmental emergency systems, interaction protocols for, 101–105
Evolutionary framework, large-scale experimentation, 155–174
  application science, role of evolutionary experimentation in, 169–170
  changing environment, simulation results, 163–164
  degree of compatibility with evolutionary approach, division of technology space by, 169
  distribution, different supplier types, 161, 162
  evolutionary environment, experimentation in, 168–170
  existing multi-agent systems, integration of into, 164–166
    auditor, 167
    customer agent, 167
    data warehouse agent, 167
    evolutionary components, introduction of, 166–168
    example, 165–166
    factory, 167
    MAGNET architecture, 164–165
    mutations, 167
    supplier agent, 167
  expectations, 159
  MAGNET architecture adjusted to evolutionary paradigm, 166
  milestones, city population for different supplier types as function of, 160
  mixed-initiative MAGNET architecture, 165
  mutation, 157–158
  noise factor, simulation results, 160–163
  reproduction, 157–158
  strategy introduction, 157–158
  test model, 158–159
Formal models, 144–150
Gaia methodology, 140–141
Geolocation, high accuracy, 14
Global decision-making framework, 175–198
Heliosphere, 9
Heterogeneity, in scaling-up agent-based systems, 121–122
I.L.S.A., 43–54
  application, 43–44
  categories of responses, 45
    alarms, 45
    alerts, 45
    notifications, 45
    reminders, 45
  centralized coordination pacing, multiplexing, 49
  computational predictability, 50
  control algorithm, pseudocode, 50
  flexibility, 49
  global constraints, 49
  multiplexing, 49
  normal operations mode, 53
  nuisance factor, 49
  response coordinator, 51
  response planning, 44–54
Imprinting, 89–90
Influences, defining, 83–84
Information integrity, in spacecraft operations, 9
Information rates, 9
Initiator role
  recruiting interaction protocol for, 103
  request interaction protocol for, 98
Interaction
  degree of, in scaling-up agent-based systems, 123–124
  protocol, request, 98–101
Interferometer, spacecraft, 10, 17
Ionosphere atmosphere, 9
Karma-Teamcore framework, 146–148
Large scale multi-agent systems, distributed constraint satisfaction problem-based approach for, 221–244
  asynchronous weak commitment search algorithm, 224
  cooperativeness, new basis for, 227–228
  cooperativeness-based strategies, 225–226
  distributed POMDP-based model, 231–235
    building block composition, 233–235
    interaction, building block, 235
    Markov team decision process model, 231–232
    simple sum, building block, 235
    single block, building block, 235
    weighted sum, building block, 235
  experimental evaluation, 226–230
  long tail distribution in convergence, 230–231
  performance analysis, 231–241
  performance prediction, 236
    complexity of performance evaluation, 236–238
    local state, 236
    theorem, 237
Large-scale experimentation, evolutionary framework, 155–174
  application science, role of evolutionary experimentation in, 169–170
  changing environment, simulation results, 163–164
  degree of compatibility with evolutionary approach, division of technology space by, 169
  distribution, different supplier types, 161, 162
  evolutionary environment, experimentation in, 168–170
  existing multi-agent systems, integration of into, 164–166
    auditor, 167
    customer agent, 167
    data warehouse agent, 167
    evolutionary components, introduction of, 166–168
    example, 165–166
    factory, 167
    MAGNET architecture, 164–165
    mutations, 167
    supplier agent, 167
  expectations, 159
  MAGNET architecture adjusted to evolutionary paradigm, 166
  milestones, city population for different supplier types as function of, 160
  mixed-initiative MAGNET architecture, 165
  mutation, 157–158
  noise factor, simulation results, 160–163
  reproduction, 157–158
  strategy introduction, 157–158
  test model, 158–159
Linear vs. non-linear networks, 84–87
Lithosphere, 9
Local optima, networks stuck in, 82
Local resources, spacecraft operations, 21
Logistics network management, 27–40
  characterization, problem domain, 30–32
    consumption, 31
    distribution, 31
    production, 31
  consumer agents, 36
  consumers, 29
  distribution, 29
  distributors, 29
  district heating systems domain, 27–40
  input, 20
  producer, 29
  producer agents, 36
  quality of service, surplus production, for semi-distributed approach, 38
  redistribution, 34
  redistribution agents, 36
  simple supply network, hourglass shape of, 29
  simulation results, 37–38
  simulator, production, 32–38
    consumption, 34
    distribution, 34
    production, 33
  suppliers, 29
  supply chain network, 29–30
  tier suppliers, 29
MAGNET architecture, 164–165
  adjusted to evolutionary paradigm, 166
  mixed-initiative, 165
Magnetosphere, 9
Markov team decision process model, distributed constraint satisfaction problem-based approach, large scale multi-agent systems, 221–244
MaSE methodology, 138–140
Multiple optima utility function, 85
Multiple platform, spacecraft operations, 9–14
Multiplexing, 49
Nano-satellite magnetospheric constellation orbits, 11
Negotiation, in spacecraft operations, 24
Non-linear networks, linear networks, compared, 84–87
Notifications, 45
Odd loop
  asymmetric network, 87
  dynamic attractor for, 88
Optimality, in scaling-up agent-based systems, 126–127
ORBCOMM communications structure, 16
Overheads, reducing, in scaling-up agent-based systems, 127–128
Participant role, request interaction protocol for, 99
Passive radiometry mission, 14
Petri net
  coloured, interaction protocols using, 97–98
  with conversation, policy levels, 109
Pit Game, 105
  interaction protocol
    for dealer role, 106
    for player role, 107
Problem domain characterization, logistics network management, 30–32
Producer agents, 36
Production, logistics network management, 31
Quality of service, surplus production, for semi-distributed approach, 38
Radar mission, 14
Recruiter role, recruiting interaction protocol for, 104
Redistribution, 34
  agents, 36
Reduced overheads, in scaling-up agent-based systems, 127–128
Reminders, 45
Request transition
  fulfilling, 100
  receiving, 100
Response coordinator, 51
RoboCup simulation domain, 148–149
Robust execution, in spacecraft operation, 20
Robustness, in scaling-up agent-based systems, 127
Role oriented programming, 149–150
Roles in multi-agent systems, 133–154
  AAII methodology, 141–143
  AALAADIN model, 143–144
  degree of interaction, 135–136
  distributivity, 136–137
  environment dynamics, 136
  formal models, 144–150
  Gaia methodology, 140–141
  implemented multi-agent systems, 146
    Karma-Teamcore framework, 146–148
    RoboCup simulation domain, 148–149
    role oriented programming, 149–150
  importance of roles, 135–137
  MaSE methodology, 138–140
  role properties, 137–138
Scaling-up agent coordination strategies, 113–132
  agents, 121
  challenges, 128–129
  characteristics of coordination strategies, 120–128
  complexity, 122–123
  Contract Net protocol, 122
  coordination stress, 115
    agent population properties, 115–117
    solution properties, 119–120
    task-environment properties, 117–118
  distribution, 125–126
  dynamism, 124–125
  heterogeneity, 121–122
  interaction, degree of, 123–124
  optimality/efficiency, 126–127
  overheads, reduced, 127–128
  robustness, 127
Scaling-up agent-based systems, 113–132
  challenges, 128–129
  characteristics of coordination strategies, 120–128
  complexity, 122–123
  Contract Net protocol, 122
  coordination stress, 115
    agent population properties, 115–117
    solution properties, 119–120
    task-environment properties, 117–118
  distribution, 125–126
  dynamism, 124–125
  heterogeneity, 121–122
  interaction, degree of, 123–124
  optimality/efficiency, 126–127
  overheads, reduced, 127–128
  robustness, 127
Science analysis, in spacecraft operations, 20
Service teams, aircraft, coordination of, 54–76
Signal combination, in spacecraft operations, 12–13, 18–19
Signal location, in spacecraft operations, 9
Signal space coverage, in spacecraft operations, 11–12, 16–17
Simple supply network, hourglass shape of, 29
Simulator, production, logistics network management, 32–38
  consumption, 34
  distribution, 34
  production, 33
Spacecraft, autonomous, coordination of, 7–26
  autonomous operations, 19–21
  autonomy technology interactions, 20
  celestial sphere, 9
  cluster augmentation “on-demand,” 14
  combining signals, 12
  Earth Observer-1, following Landsat 7, 19
  execution, 21–22
    coordinated measurement, 21
    failure, 22
    local resources, 21
    recovery, 22
    shared resources, 21
    uncertainty, 22
  ground operations, 14–19
  heliosphere, 9
  high accuracy geolocation, 14
  information integrity, 9
    predictability, 9
  information rate, 9
  interferometer, 10
  ionosphere atmosphere, 9
  lithosphere, 9
  locations of phenomena, 9
  magnetosphere, 9
  mission concept, TechSat-21, 14
  multiple platform, 9–14
  multiple rationales, 13–14
  nano-satellite magnetospheric constellation orbits, 11
  ORBCOMM communications structure, 16
  passive radiometry mission, 14
  planning/scheduling, 22–23
    communication constraints, 22
    computation constraints, 22–23
    cooperation/negotiation, 23
    failure, 23
    local activities, 22
    metrics, 23
    recovery, 23
    shared activities, 22
    uncertainty, 23
  radar mission, 14
  robust execution, 20
  science analysis, 20, 23–24
    communication constraints, 23
    computation constraints, 23
    cooperation/negotiation, 24
    metrics, 24
  signal combination, 12–13, 18–19
  signal isolation, 9
  signal location, 9
  signal separation, 10–11, 17–18
  signal space coverage, 11–12, 16–17
  spacecraft interferometer, 17
  spacecraft operation model, 15
  StarLight Mission, 17
StarLight Mission, 17
Subdivided network, 89
  example of, 89
Supply chain network, 29–30
Surplus production, quality of service, for semi-distributed approach, 38
Symmetric vs. asymmetric networks, 87–88
Target agents’ role, recruiting interaction protocol for, 105
TechSat-21 mission concept, 14
Tier suppliers, 29
Utility function, collaborative design, 79