Springer Tracts in Advanced Robotics Volume 11 Editors: Bruno Siciliano · Oussama Khatib · Frans Groen
Springer Berlin Heidelberg New York Hong Kong London Milan Paris Tokyo
J.-H. Kim D.-H. Kim Y.-J. Kim K.-T. Seow
Soccer Robotics With 205 Figures and 12 Tables
Professor Bruno Siciliano, Dipartimento di Informatica e Sistemistica, Università degli Studi di Napoli Federico II, Via Claudio 21, 80125 Napoli, Italy, email: [email protected]
Professor Oussama Khatib, Robotics Laboratory, Department of Computer Science, Stanford University, Stanford, CA 94305-9010, USA, email: [email protected]
Professor Frans Groen, Department of Computer Science, Universiteit van Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands, email: [email protected]
STAR (Springer Tracts in Advanced Robotics) has been promoted under the auspices of EURON (European Robotics Research Network)
Authors Dr. Jong-Hwan Kim Dept. of Electrical Engineering and Computer Science Korea Advanced Institute of Science and Technology (KAIST) 373-1 Gusong-dong, Yusong-gu Daejeon 305-701 Republic of Korea
Dr. Dong-Han Kim Dept. of Electrical Engineering and Computer Science Korea Advanced Institute of Science and Technology (KAIST) 373-1 Gusong-dong, Yusong-gu Daejeon 305-701 Republic of Korea
Dr. Yong-Jae Kim Intelligent Robot Lab. Inst. of Intel. Syst. Mechatronic Center Samsung Electronics Co. Maeton 3-dong Paldal-gu, Suwon-si Gyeonggi-do 442-742 Republic of Korea
Dr. Kiam-Tian Seow School of Computer Engineering Nanyang Technological University Nanyang Avenue Singapore 639798 Singapore
ISSN 1610-7438 ISBN 3-540-21859-9
Springer-Verlag Berlin Heidelberg New York
Library of Congress Control Number: 2004104485 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in other ways, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable to prosecution under German Copyright Law. Springer-Verlag is a part of Springer Science+Business Media springeronline.com © Springer-Verlag Berlin Heidelberg 2004 Printed in Germany The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Digital data supplied by authors. Data-conversion and production: PTP-Berlin Protago-TeX-Production GmbH, Germany Cover-Design: design & production GmbH, Heidelberg Printed on acid-free paper 62/3020Yu - 5 4 3 2 1 0
Editorial Advisory Board
EUROPE
Herman Bruyninckx, KU Leuven, Belgium
Raja Chatila, LAAS, France
Henrik Christensen, KTH, Sweden
Paolo Dario, Scuola Superiore Sant’Anna Pisa, Italy
Rüdiger Dillmann, Universität Karlsruhe, Germany
AMERICA
Ken Goldberg, UC Berkeley, USA
John Hollerbach, University of Utah, USA
Lydia Kavraki, Rice University, USA
Tim Salcudean, University of British Columbia, Canada
Sebastian Thrun, Carnegie Mellon University, USA
ASIA/OCEANIA
Peter Corke, CSIRO, Australia
Makoto Kaneko, Hiroshima University, Japan
Sukhan Lee, Sungkyunkwan University, Korea
Yangsheng Xu, Chinese University of Hong Kong, PRC
Shin’ichi Yuta, Tsukuba University, Japan
Foreword
At the dawn of the new millennium, robotics is undergoing a major transformation in scope and dimension. From a largely dominant industrial focus, robotics is rapidly expanding into the challenges of unstructured environments. Interacting with, assisting, serving, and exploring with humans, the emerging robots will increasingly touch people and their lives. The goal of the new series of Springer Tracts in Advanced Robotics (STAR) is to bring, in a timely fashion, the latest advances and developments in robotics on the basis of their significance and quality. It is our hope that the wider dissemination of research developments will stimulate more exchanges and collaborations among the research community and contribute to further advancement of this rapidly growing field. This monograph, written by Jong-Hwan Kim, Dong-Han Kim, Yong-Jae Kim and Kiam-Tian Seow, forms an introduction to the field of Soccer Robotics. Soccer Robotics has become an important research area with different competing initiatives. It integrates mechatronics, computer science and artificial intelligence techniques to create real-world autonomous systems that are more than just fun to watch. Soccer Robotics also forms a test bed for the system integration of autonomous systems, comparing different approaches in various competitions with different levels of distributed perception and collaboration. Soccer Robotics opens the route towards collaborating autonomous robot systems in a real-world adversarial setting. The focus of this monograph is on the FIRA framework of Soccer Robotics, in particular MiroSoT, which uses a central overhead camera to overview the whole soccer field, and a central control of the robots. The monograph gives a complete description of the different aspects needed to create a soccer team. It describes the hardware aspects, the computer vision needed, navigation, action selection, basic skills and game strategy. 
These aspects are described at an undergraduate level, and up to a junior graduate level, which makes the monograph suitable as a textbook and a must for everyone who wants to enter MiroSoT robotics. A fine addition to the series!
Amsterdam, February 2004
Frans Groen STAR Editor
Preface
Autonomous robots which are adaptable, communicative and objective-oriented, and intelligent multi-agent robotic systems in general, are so evidently complex that it has become increasingly necessary to find a domain that can serve as an integrated framework for the complementary purposes of research and education. Robot soccer is one such suitable domain that is representative of intelligent multi-agent robotic systems, in which multiple robotic agents (or simply, multiple robots) need to cooperate in an adversarial environment to achieve specific objectives. It is a game based on the modified rules of human soccer and is played on a scaled-down soccer field, in which two soccer robot teams compete by attempting to move a ball into the opponent team’s goal. The team with the higher score at the end of regulation time wins. Technically, robot soccer is a competitive game that makes heavy demands in all the key areas of robot technology, namely, mechanics, control, sensors, communication, and intelligence. On the one hand, it spurs wide-ranging multidisciplinary research work by providing a comprehensive test bed that facilitates the concrete demonstration and performance evaluation of new ideas and concepts. On the other hand, it captivates as an educational tool that helps students better understand and appreciate the scientific knowledge and technological developments in an inherently multidisciplinary setting of intelligent multi-agent robotic systems. Since its inception in 1995, robot soccer has evolved into a recognized area of its own. This area, called Soccer Robotics, is a subfield of AI Robotics that offers a challenging domain for research and education in a large spectrum of issues integrating the problems of sensing, deciding and acting that are of relevance to the development of complete autonomous agents in general. 
The hope in Soccer Robotics, of course, is that by discovering how to get a team of robots to sense with acuity, decide collaboratively and act in coordination within the limited context of a soccer game, it will be possible to use the same techniques and technologies to build robots that carry out other more useful tasks. The development of this subfield is actively supported through the Micro-Robot Soccer Tournament (MiroSoT) and Simulated-Robot Soccer Tournament (SimuroSoT) Categories of the FIRA Cup, an international event organized by the Federation of International Robot-soccer Association (FIRA, http://www.fira.net). FIRA Cup, held annually since 1996, has
been the ‘examination’ ground for the testing of new techniques and technologies integrated in the game of robot soccer, and has provided much excitement and entertainment for all those who participated. This new book Soccer Robotics is intended to be a comprehensive introduction to the field of soccer robotics, emphasizing breadth of coverage and accessibility of the material to readers with possibly different backgrounds. Its key feature is the emphasis placed on a robot soccer-programming framework that integrates all the key areas of robot technology. Until now, these areas had been treated mainly in separate books or in research literature only, outside the arena of soccer robotics. A substantial portion of this book is based on the first author’s lectures EE006 Robot Soccer System at the Korea Advanced Institute of Science and Technology in the period July 13 - August 14, 1998. The material on robot soccer originated with the KAIST postgraduate theses of Dong-Han Kim (2003,1998), Yong-Jae Kim (2003), Hyun-Sik Shim (1998), Mun-Soo Lee (2000), Heung-Soo Kim (1997) and others, together with joint publications with the first author. The experimental robot soccer system program for Small League MiroSoT that supplements this book has been developed with the help of many students in the first author’s Robot Intelligence Technology (RIT) Lab at KAIST, while the simulator package for Large League SimuroSoT has been developed by Bing-Rong Hong’s research team at Harbin Institute of Technology, P.R. China. Both are available for free download from the FIRA website http://www.fira.net. Soccer Robotics is written as a textbook for practical courses at the undergraduate level, and up to the first-year graduate level. It is useful for researchers and practising engineers interested in trying out new techniques in the domain of robot soccer. This book is also suitable for anyone interested in learning and developing robot soccer systems for edutainment purposes. 
For those interested in participating in either the MiroSoT or SimuroSoT categories of the annual FIRA Cup and other robot-soccer championship events, the material in this book will provide a firm foundation for the development of robot soccer systems to competitive standards. The book will be of interest to scientists, engineers and students in a variety of disciplines besides AI Robotics, where the use of robot soccer as a test bed is relevant: sensors (including computer vision), control, communication, multiagent systems and artificial intelligence. To review the chapters briefly: Chapter 1 defines the multi-agent framework of soccer robotics in terms of the three commonly accepted primitives of AI robotics, namely, SENSE, DECIDE and ACT. The goals of soccer robotics in research and education of intelligent multi-agent robotic systems are explained. The various categories of robot soccer created by FIRA, an international regulating body for robot soccer, are described. The classification of robot soccer systems for MiroSoT is also examined.
Chapter 2 presents the basic theoretical background on the mechanical motion of mobile robots, with emphasis on the kinematics of a two-wheel MiroSoT robot. The essentials of hardware and firmware needed to build a two-wheel MiroSoT robot with IR or RF communication are covered in sufficient detail. Chapter 3 focusses on the (visual) SENSE primitive; in particular, it presents how the postures of target objects in robot soccer can be computed using centralized vision techniques. The basics of computer vision are first introduced. Real examples are then provided to highlight the practical considerations in building a good vision system for a MiroSoT team. Chapter 4 focusses on the DECIDE and ACT primitives. A hybrid control architecture is introduced that integrates the three primitives of SENSE, DECIDE and ACT in a hierarchy of four interacting levels, namely, role, action, behaviour and execution. To expose the technical challenges involved, example strategies are given at the role level and action level. Action designs for robot soccer, to be implemented at the behaviour level, are classified and explained. An overview of classical PID control, applicable at the behaviour level and execution level, follows. Finally, two different navigation methods, applicable at the behaviour level, are presented. Chapter 5 motivates the importance of the various aspects of intelligence, namely, search and evolution, knowledge representation and inference, and learning and adaptation, as needed by the DECIDE and ACT primitives. It then demonstrates, by example, the applicability of Petri nets, Q-learning, neural networks, evolutionary programming and fuzzy logic to robot soccer under the MiroSoT category. Each of these soft-computing paradigms makes concrete at least one of the abstract aspects of intelligence. 
For each paradigm, one or two examples are provided that address some key issues at specific hierarchical levels of the hybrid control architecture introduced in Chapter 4. Chapter 6 introduces a host software model for the MiroSoT robot soccer system. An overview of the programming framework for robot soccer is then presented, in which a number of the robot soccer concepts described in earlier chapters are illustrated through example ‘C’ programs that form the key functions of a robot soccer system for Small League (3-a-side) MiroSoT. Chapter 7 complements the real-system programming framework presented in the previous chapter with a computer-simulated system programming framework. It presents the core simulator system and programming framework for Large League (11-a-side) SimuroSoT. Example ‘C++’ code is provided for illustration. As do all authors of technical work, we wish to acknowledge the many contributors on whose work our own presentation is partly based. The list of references gives some indication of those to whom we are in debt. On a more
personal level, we would expressly like to thank Hyun-Sik Shim, Myung-Jin Jung, Heung-Soo Kim, Kuk-Hyun Han, Kui-Hong Park, Ming Yu-Chi, Jun-Su Jang, Kang-Hee Lee, Jayyati Ghoshal and many other students in the RIT Lab who have contributed to this book in a wide range of invaluable ways. This book would never have been possible without the funding that came from a variety of sources to support the research and development work in soccer robotics, and the writing of this book. The first author would like to acknowledge each of these agencies: LG, Samsung, POSCO, KOSEF and MRDEC. The last author would like to acknowledge the award of a ‘Brain Korea 21’ Institute Fellowship in 2002 that supported his joint research and authorship at KAIST. Finally, the authors are indebted to Dr. Thomas Ditzinger, Engineering Editor at Springer Verlag, for editorial assistance and the quality production of this book.
KAIST, Daejeon, Korea, Jan 2003    Jong-Hwan Kim
KAIST, Daejeon, Korea    Dong-Han Kim
Samsung Electronics, Suwon, Korea    Yong-Jae Kim
NTU, Singapore    Kiam-Tian Seow
Contents
Foreword
Preface
List of Figures
List of Tables

1. Soccer Robotics
  1.1 Introduction
    1.1.1 Agents, Multi-agent Systems, and AI Robotics
    1.1.2 Cooperative Robot Teams
    1.1.3 Domain Characteristics
    1.1.4 Robot Soccer
  1.2 The Goals of Soccer Robotics
    1.2.1 Test Bed for Robotics Research and Development (R&D)
    1.2.2 Educational Tool for AI Robotics
    1.2.3 FIRA Robot World Cup
    1.2.4 Technology Transfer to New Useful Tasks
  1.3 Fundamental Motion Benchmarks for Robot Soccer
    1.3.1 Striking the Ball
    1.3.2 Passing the Ball to Another Robot
    1.3.3 Striking a Moving Ball
    1.3.4 Passing a Moving Ball to a Moving Robot
    1.3.5 Dribbling the Ball Past Obstacles
  1.4 Categories of Robot Soccer
    1.4.1 MiroSoT and NaroSoT
    1.4.2 SimuroSoT
    1.4.3 RoboSoT and KheperaSoT
    1.4.4 HuroSoT
  1.5 The MiroSoT Robot Soccer System
  1.6 Classification of MiroSoT Robot Soccer Systems
    1.6.1 Command-Based Robots
    1.6.2 Action-Based Robots
    1.6.3 Intelligence-Based Robots
  1.7 Purpose of This Book
  Notes on Selected References

2. Robot Soccer System: Hardware and Firmware Components
  2.1 Introduction
  2.2 Mobile Robots
    2.2.1 Mechanical Movement Mechanisms
    2.2.2 Kinematics of a Two-Wheel Robot
    2.2.3 Basic Motion Control: A Circular Path Analysis
  2.3 A Two-Wheel Command-Based Soccer Robot
    2.3.1 Microcontroller
    2.3.2 DC Motors and Auxiliary Components
    2.3.3 Motor Driving and Circuits
    2.3.4 Velocity and Duty Cycle
    2.3.5 Communication
    2.3.6 Power System
    2.3.7 Other Considerations
  Notes on Selected References

3. How to Sense? Use Computer Vision Techniques
  3.1 Introduction
  3.2 Vision Basics
    3.2.1 Computer Vision
    3.2.2 Vision System Operations
    3.2.3 Sampling, Pixel, and Quantization
    3.2.4 Gray Scale, Binary, and Colour Images
    3.2.5 Colour Models
  3.3 Binary Image Processing
    3.3.1 Thresholding
    3.3.2 Computing Geometric Properties
    3.3.3 Labelling
    3.3.4 Labelling Algorithm 1: Recursive
    3.3.5 Labelling Algorithm 2: Sequential, 4-Connectivity
    3.3.6 Size Filtering
  3.4 Vision System For MiroSoT Robot Soccer
    3.4.1 System Processing
    3.4.2 Image and Physical Coordinates on MiroSoT Playground
    3.4.3 Example 1: System Hardware
    3.4.4 Example 2: Vision Processing
    3.4.5 Example 3: Information Extraction
    3.4.6 Example 4: Window Tracking for Fast Vision Processing
  Notes on Selected References

4. How to Decide and Act? Use Intelligent Systems and Control Techniques
  4.1 Introduction
  4.2 Hybrid Control Architecture
    4.2.1 Role Level: The Who Issue
    4.2.2 Action Level: The What Issue
    4.2.3 Behaviour Level: The How Issue
    4.2.4 Execution Level: The Motion Issue
  4.3 Example Strategies
    4.3.1 Action-Level Strategy
    4.3.2 Role-Level Strategy
  4.4 Design of Robot Soccer Actions
    4.4.1 Base Class: Primitive
    4.4.2 Attacker Class: Shoot
    4.4.3 Defender Class: Push
    4.4.4 Goalkeeper Class: Block
  4.5 Control Basics
    4.5.1 PID Control
  4.6 Unified Navigation Control
    4.6.1 Control of a Two-Wheel Robot
    4.6.2 Univector Field Method
    4.6.3 Limit-Cycle Method
  Notes on Selected References

5. How to Improve Intelligence? Use Soft Computing Techniques
  5.1 Introduction
  5.2 Intelligence Basics
    5.2.1 Search and Evolution
    5.2.2 Knowledge Representation and Inference
    5.2.3 Learning and Adaptation
  5.3 Petri Nets
    5.3.1 Petri Net Structure and Graph
    5.3.2 Petri Net Markings
    5.3.3 Rules for Petri Net Execution
    5.3.4 Example 1: Role Level
    5.3.5 Example 2: Action Level
  5.4 Q-Learning: A Model-Free Reinforcement Learning Method
    5.4.1 Standard Reinforcement Learning
    5.4.2 Q-Learning
    5.4.3 Example 1: Role Level
  5.5 Neural Networks
    5.5.1 A Simple Neuron
    5.5.2 Neural Network Structure
    5.5.3 Learning
    5.5.4 Example 2: Action Level
  5.6 Evolutionary Programming
    5.6.1 The EP Process
    5.6.2 EP and GAs
    5.6.3 EP and ES
    5.6.4 Example: Behaviour Level
  5.7 Fuzzy Logic and Control
    5.7.1 Fuzzy Logic
    5.7.2 A Fuzzy Controller
    5.7.3 Fuzzy Inference
    5.7.4 Defuzzification
    5.7.5 Example: Behaviour Level
  Notes on Selected References

6. Robot Soccer System: Software Components and Programming
  6.1 Introduction
  6.2 MiroSoT Host Software Model
    6.2.1 Modular Software
    6.2.2 Modular Design Rules
  6.3 Programming Framework: An Overview
  6.4 Basic Skill Functions
    6.4.1 Velocity()
    6.4.2 Angle()
    6.4.3 Position()
    6.4.4 Shoot()
  6.5 Applied Skill Functions
    6.5.1 Kick()
    6.5.2 Goalie()
    6.5.3 AvoidBound()
  6.6 Game Strategy Development
    6.6.1 Zone-Defence Strategy
    6.6.2 Univector Field Navigation
    6.6.3 Limit-Cycle Navigation
  Notes on Selected References

7. Simulated Robot Soccer
  7.1 Introduction
  7.2 Client-Server Architecture
    7.2.1 Server Side
    7.2.2 Client Side
  7.3 Kinematics Models
    7.3.1 For the Ball
    7.3.2 For the Robot
  7.4 How To Run the Simulator
    7.4.1 System Requirements
    7.4.2 Server Program
    7.4.3 Client Program
  7.5 Client Programming
    7.5.1 Basic Program Structure
    7.5.2 Attack Direction
    7.5.3 System-Defined Variables and Constants
    7.5.4 Velocity() and Position() Functions
    7.5.5 Example Game Strategy Programs: NormalGame()
  Notes on Selected References

Appendix

A. Programming the PIC16C73/73A Microcontroller
  A.1 On-chip PWM Programming for Robot Motion Control
  A.2 On-chip USART Programming for Robot Communication

B. Reference Manual for an Experimental MiroSoT System
  B.1 Vision System: Set-up and Initialization
    B.1.1 Build Program Executable Code
    B.1.2 Run the Program
    B.1.3 Set Camera Image
    B.1.4 Set Ball Colour
    B.1.5 Set Robot Team ID Colour
    B.1.6 Set Robot ID Colour
    B.1.7 Set Playground Boundary
    B.1.8 Set Robot Size
    B.1.9 Set Pixel Size
    B.1.10 Save Vision Settings
    B.1.11 Open Vision Settings File
    B.1.12 Set Auto Colours
    B.1.13 Change Colour
  B.2 System Library
    B.2.1 The SENSE Category
    B.2.2 The Communication Category
    B.2.3 The DECIDE-and-ACT Category

References
Index
List of Figures

1.1 Hardware entities of a robot soccer team
1.2 Basic set-up for the ‘dribbling the ball past obstacles’ test
1.3 FIRA robot soccer: off-board/centralized vision categories
1.4 FIRA robot soccer: simulation category
1.5 FIRA robot soccer: onboard/distributed vision categories
1.6 FIRA robot soccer: humanoid category
1.7 General set-up for robot soccer (MiroSoT Category)
1.8 A general SENSE-DECIDE-ACT block diagram for robot soccer
1.9 Command-based robot soccer system
1.10 Action-based robot soccer system
1.11 Intelligence-based robot soccer system
2.1 Different types of ‘move by rolling’ mechanism for a mobile robot
2.2 The robot’s posture
2.3 Different wheel assemblies resulting in different ICR-axes
2.4 Kinematics of a robot
2.5 Computation of the robot’s unique ICR
2.6 Circular path and angle of turning
2.7 Rotational velocity profile of the two wheels
2.8 Hardware architecture of a soccer robot
2.9 Hardware of a soccer robot
2.10 Normal operational sequence of a microcontroller
2.11 Handling an interrupt request
2.12 Supporting architecture for a microcontroller
2.13 Scheme separating address and data signals
2.14 Input-output of a chip selector
2.15 An example CS equation
2.16 A system address map and corresponding CS equations
2.17 Selecting the clock frequency
2.18 Device interfacing
2.19 Pin Configuration of PIC16C73/73A microcontroller
2.20 A DC motor ‘package’ and its exploded view
2.21 Working principle of a DC motor
2.22 Rotational velocity and current versus torque
2.23 Torque about the wheel axis
2.24 Two mechanical designs of motor-wheel assembly
2.25 H-Bridge circuit for motor driving
2.26 Amplified PWM signals with different duty cycles
2.27 PWM-based operations of H-bridge circuit
2.28 Graph showing the relationship between the wheel rotational velocity $\bar{\omega}_G$ and the PWM data $W^+$ required to attain it
2.29 IR communication using ASK and IrDA1.0 methods
2.30 Module block diagrams implementing the base band method for IR communication
2.31 Generic module for IR base band communication
2.32 A game set-up for teams using IR base band communication
2.33 IR transmission coverage
2.34 Circuit block diagram of a transceiver
2.35 Communication message format
2.36 RF communication using the FSK method
2.37 An ALLINTEK ARFM-424 RF communication module
2.38 An ALLINTEK RF transceiver circuit for use by the host computer
2.39 An RF transceiver circuit for use by each team robot (within dotted lines)
2.40 Communication message format
2.41 Examples of power regulation IC chips
3.1 Basic architecture of a computer vision system
3.2 An n × m image grid
3.3 The RGB colour cube
3.4 A gray level image and its resulting binary images using different thresholds
3.5 An image and its histogram
3.6 An example showing the centre position $(\bar{x}, \bar{y})$ of an image
3.7 Finding the orientation of the object
3.8 A binary image and its labelled connected components
3.9 The 4- and 8-neighbourhoods of a pixel at square position [i, j]
3.10 Examples of a 4-path and an 8-path
3.11 An example illustrating the workings of the sequential connected components algorithm on an image
3.12 A noisy binary image and its resulting image after application of a size filter ($A_f = 8$)
3.13 A high-level flow chart showing vision processing as a software component of a robot soccer host-system program
3.14 On mapping the image and physical coordinate points in the playground
3.15 Frame grabber (Media camp 7 plus)
3.16 Example 1: Labelled components image
3.17 A robot’s colour patch layout
3.18 Computing a robot’s posture
3.19 Basic window tracking
4.1 A hybrid control architecture for robot soccer (MiroSoT Category)
4.2 Situational problems encountered by role-level assigner
4.3 Basis areas of manoeuvre for zone defence strategy (Small League MiroSoT Category)
4.4 Formations for Middle League MiroSoT
4.5 Wandering action
4.6 SweepBall action
4.7 Shoot action
4.8 Cannon Shoot action
4.9 Position To Shoot action
4.10 PushBall action
4.11 Position To PushBall action
4.12 BlockBall action
4.13 Block diagram representation of a PID controller
4.14 Robot soccer situation: A robot (in white) should kick the ball (round) avoiding an opponent robot (in grey)
4.15 Robot modeling
4.16 Available velocity region
4.17 Component forces for generating a potential field
4.18 A potential field
4.19 A univector field
4.20 Univector field for obstacle avoidance by a point object
4.21 Modified univector field for obstacle avoidance by a robot while moving towards a target point g
4.22 Phase portrait of a limit-cycle
4.23 Navigation using the limit-cycle method
4.24 Multiple obstacle situation
4.25 Decision of rotational direction
4.26 Navigation example
4.27 Local minima with two overlapped obstacles
4.28 Extended navigation method
4.29 Robot soccer example
5.1 A Petri net structure and its graph
5.2 A Petri-net graph for role assignment supervision
5.3 A Petri-net supervisor for role assignment: transition firings and token redistributions
5.4 Role selection: who should attack?
5.5 A role-action structure for Petri net supervision and control
5.6 Four key situations for the defending robot
5.7 A Petri-net graph for defending robot control
5.8 Predictor of target point of the ball
5.9 A Petri-net graph for goalkeeping robot control
5.10 A Petri-net control for robot goalkeeping: a transition firing and token redistributions
5.11 The standard reinforcement-learning model
5.12 States of the attacking robot
5.13 States of the defending robot
5.14 States of the ball
5.15 A Q-learning state $(ra_3, \theta_1, rd_2, b_5)$ of a role-level strategy for robot soccer
5.16 Schematic diagram of a simple neuron
5.17 Two sigmoid functions, with $\lambda_1 = 0$, $\lambda_2 = 1$
5.18 A feedforward neural network
5.19 Structure of the proposed ASM
5.20 The process of selecting aji f s
5.21 A one-a-side MiroSoT game
5.22 Four situation variables characterizing ball possession
5.23 Four situation variables representing the team (or home) robot’s winning score against the opponent robot and the risk level of conceding a goal
5.24 Structure of the feedforward neural network
5.25 Pseudocode of algorithm EP
5.26 Grid net of the function approximator
5.27 Simple example of grid net
5.28 Illustration of membership functions for fuzzy values. The upper diagram shows the membership functions of cold, moderate, and hot. The middle diagram shows the membership functions for cold and moderate; the lower diagram shows the membership functions for cold or moderate.
5.29 A fuzzy PD controller
5.30 Illustration of fuzzy inference with two rules using the min-max rule
5.31 Shooting from the left side when the line connecting the ball to the opponent goal is on the right
5.32 Variables for relative posture characterization
5.33 Overall fuzzy control structure for the Shoot action
5.34 Desired output
5.35 Membership functions
5.36 $\theta_d$ sampled at each input region in the vicinity of the ball at (0, 0)
5.37 Obstacle avoidance scheme
5.38 Membership functions
5.39 Membership functions
5.40 FLC for obstacle block
5.41 Membership functions
6.1 Host software model for a MiroSoT team
6.2 Overall program structure
6.3 My Strategy()
6.4 Angle or turning control
6.5 Position control
6.6 Problem of oscillation about $\theta_e = \pm 90^\circ$ with Position()
6.7 Graph of $V_c$ against $d_e$
6.8 Different paths of the robot for shooting the ball in different desired directions
6.9 Geometric relationships for calculating the robot’s desired heading angle $\theta_d$ at its online position (x, y)
6.10 Problem of chattering with Shoot()
6.11 A modified univector field for solving the problem of chattering
6.12 A strategy for the Kick() function
6.13 Mapping playground subareas to desired directions of ball kick
6.14 State S0: the desired point (pos[0], pos[1]) behind the ball the specified robot should move to.
6.15 In state S0: to switch to state S1 if the robot is at less than a distance of $D_c$ from the desired point (pos[0], pos[1]). Note that in this figure, the illustrated C0-condition ‘$\sqrt{dx \times dx + dy \times dy} < D_c$’ is true.
6.16 In state S1: to switch to state S2 if the angle error is less than a specified value $A_d$. Note that in this figure, the illustrated C1-condition ‘$|\theta_e| < A_d$’ is false.
6.17 State S2: the desired point (pos[0], pos[1]) the specified robot should reach in order to move through the ball’s position at a certain positive velocity.
6.18 In state S2: to switch to state S0 if the robot is at less than a distance of $D_f$ from the ball, or is behind the ball.
6.19 Areas of surveillance for the Goalie() function
6.20 Desired position (estimate x, estimate y) of goalkeeping robot when the ball is in far-distance area, with $y = \frac{dy}{dx} \cdot x$, where dy = PositionOfBall[1] − 130/2 and dx = PositionOfBall[0] − 15.
6.21 Desired position (estimate x, estimate y) of goalkeeping robot when the ball is in middle-distance area
6.22 Desired position (estimate x, estimate y) of goalkeeping robot when the ball is in near-distance area
6.23 The two Y-coordinate boundaries of the goalkeeping robot
6.24 Numbered exception-situations and the desired turning directions for the goalkeeping robot
6.25 Labelled exception-situations characterized by specified bounds (DISTANCE BOUND, ANGLE BOUND) and wall locations, namely top and bottom walls, left and right walls
6.26 Team robots’ assigned areas according to the zone-defence strategy
7.1 Client-server platform for SimuroSoT
7.2 Internal architecture of client-server based simulator for robot soccer programming
7.3 SimuroSoT simulation display
7.4 SimuroSoT client interface
7.5 SimuroSoT client connection window
A.1 Using CCP in PWM mode for motor driving and USART for data reception
A.2 Switch for assigning Team ID
B.1 Visual C++ window environment
B.2 MiroSoT system user-interface
B.3 Colour-setting buttons
B.4 Default colour setting
B.5 Colour parameter-setting window
B.6 Default camera image
B.7 Adjusting Brightness
B.8 Adjusting Hue
B.9 Adjusting Saturation
B.10 Setting ball colour
B.11 Ball colour control Window
B.12 Well-set ball colour
B.13 Selecting a robot on the real-image screen
B.14 Setting a robot team ID colour
B.15 Setting robot ID colour
B.16 Setting boundary of playground
B.17 Setting robot size
B.18 Setting the pixel size of colours with the Set Pixel Size button
B.19 Saving settings with [Save Vision File]
B.20 Importing settings with [Open Vision File]
B.21 Zooming in after clicking on Auto Set Colour button
B.22 Setting the team ID colour with auto colour setting
B.23 Setting robot ID colour with Auto Colour Setting
B.24 Setting the ID number for a robot
B.25 Robot colour 2
B.26 Menu for changing colours
B.27 Change colour parameter
B.28 Display function
B.29 Set-colour function
B.30 Vision function
B.31 Communication function
List of Tables

1.1 Robot primitives defined in terms of inputs and outputs
1.2 Robot primitives and system entities for robot soccer (MiroSoT Category)
2.1 Some motor and gear characteristics (source: datasheets)
2.2 Pin Configuration of ALLINTEK ARFM-424 module
2.3 Some batteries and their characteristics
4.1 Robot primitives and hierarchy levels defined for robot soccer (MiroSoT Category)
5.1 Algorithm EP for offline training of univector fields
5.2 Representation of the fuzzy PD controller as a table
5.3 Rules for the destination block
5.4 Rules for the obstacle block, FLC1 (left) and FLC2 (right)
5.5 Rules for right wheel
6.1 Program functions for robot soccer system (MiroSoT category)
1. Soccer Robotics
1.1 Introduction

Soccer robotics is an emerging field that combines artificial intelligence and mobile robotics with the popular sport of soccer. In essence, it studies how mobile robots can be built and trained to play a game of soccer. It arises out of a need to find a domain that can serve as an integrated framework for the complementary purposes of research and education in multi-agent robotics. Indeed, robot soccer, a roboticized version of human soccer, has gained worldwide acceptance as an interesting and challenging domain for studying and investigating a large spectrum of issues of relevance to the development of complete autonomous robots in multi-robot systems. Since its official founding in 1997, the Federation of International Robotsoccer Association (FIRA, http://www.fira.net), an international non-profit regulating body for robot soccer, has been actively promoting the development of soccer robotics.

This chapter presents the motivations and an overview of the FIRA framework for soccer robotics as a subfield of AI robotics. More specifically, it presents the ‘what’ of intelligent multi-agent robotic systems and robot soccer systems. The basic terminology and concepts used in AI robotics and multi-agent systems are defined. The domain characteristics inherent in robot soccer, and of significant relevance to the study of multi-agent robotic systems, are examined. The goals of soccer robotics in research and education of intelligent multi-agent robotic systems are explained. The various categories of robot soccer created by FIRA are described. The classification of robot soccer systems for the Micro-Robot Soccer Tournament (MiroSoT) is examined. MiroSoT is an important category of robot soccer, especially from the perspective of edutainment, and is the focus of this book.
J.-H. Kim, D.-H. Kim, Y.-J. Kim, K.-T. Seow: Soccer Robotics, STAR 11, pp. 1-26, 2004. © Springer-Verlag Berlin Heidelberg 2004

1.1.1 Agents, Multi-agent Systems, and AI Robotics

The field of intelligent multi-agent robotic systems is concerned with the study of building (artificially) intelligent robotic agents. An intelligent mobile robotic agent is an autonomous physical robot situated in a real environment, and is reactive, proactive and communicative. To elaborate, an autonomous robot requires minimal or no human intervention. In order to satisfy its design objectives, the robotic agent (or simply, the robot) is

• reactive in that it can perceive its environment and respond in a timely fashion to changes that occur in it;
• proactive in that it is capable of taking its own initiative; and
• communicative in that it is able to interact with other agents, possibly including humans.

Reaction, proaction and communication are the means by which a robot interacts with its environment. A robot’s intelligence is said to emerge from such interactions. In general, therefore, intelligence is not a property of the robot in isolation, but rather a result of interplay with its environment. A software agent shares the same means of interaction. Perhaps the most crucial aspect that sets a robotic agent apart from a software agent is embodiment: a robot has an individual physical presence (or individual body), unlike a software agent. This spatial reality has implications for the robot’s dynamic interactions with the environment that cannot be simulated faithfully.

A robot perceives its environment through sensors and acts upon that environment through actuators, usually after some decisive reasoning. Accordingly, the artificial intelligence (AI) in the robot, the degree of which is exhibited by the way it behaves when taking actions, can be organized naturally in terms of the three commonly accepted primitives of AI robotics, namely, SENSE, DECIDE and ACT, with ACT further subcategorized into Intelligent Control and Actuation. If a robot’s function collects information from its sensors and produces an output for use by its other functions, then the function falls in the SENSE category. If the function takes in information (either from its sensors or from its own knowledge about the application domain and environment) and selects an action for the robot to perform, the function is in the DECIDE category. Functions which produce output commands to the motor actuators fall into the ACT:Control category.
Functions that drive the robot hardware to produce physical motion fall into the ACT:Actuation category. Functions under the categories of DECIDE and ACT:Control constitute the core of a robot’s intelligence. Table 1.1 defines these primitives in terms of the inputs and outputs for each primitive.

Table 1.1. Robot primitives defined in terms of inputs and outputs

ROBOT PRIMITIVE    INPUT                                      OUTPUT
SENSE              Sensor Data                                Sensed Information
DECIDE             Information (Sensed and/or Cognitive)      Selected Actions
ACT:Control        Sensed Information and Selected Actions    Actuation Commands
ACT:Actuation      Actuation Commands                         Physical Motion

A multi-agent robotic system is said to be formed when two or more robots are situated in the same environment. When a group of robots in a multi-agent system come together to share a common ultimate objective, they are said to form a team. In other words, a team is always objective-oriented. Team members (or teammates) cooperate via interactions to achieve the ultimate objective. Other robots in the same environment that have objectives opposing the team’s ultimate objective are the team’s opponents.

1.1.2 Cooperative Robot Teams

Broadly speaking, the multi-agent robotic problems of SENSE, DECIDE and ACT fall within the general framework of building cooperative robot teams. Ideally, a robot team should satisfy the system design requirements of robustness and fault tolerance, reliability, flexibility or adaptability, coherence and scalability. In this section, we examine each requirement to emphasize its significance in robot teams.

Robustness and Fault Tolerance. Robustness refers to the ability of a system to gracefully degrade in the presence of partial system failure. The related notion of fault tolerance refers to the ability of a system to detect and compensate for partial system failures. To achieve robustness and fault tolerance, cooperative teams need to minimize their vulnerability to individual robot outages.

To achieve this design requirement, first, one must ensure that critical control behaviors are distributed across as many robots as possible rather than being centralized in one or a few robots. This complicates the issue of action selection among the robots, but results in a more robust multi-robot cooperation system since the failure of one robot does not jeopardize the system’s objective entirely. Second, one must ensure that an individual robot does not rely on orders from a higher-level robot to determine the appropriate actions it should employ. Relying on one or a few coordinating robots makes the team much more vulnerable to individual robot failures. Instead, each robot should be able to perform some meaningful tasks, up to its physical limitations, even when all other robots have failed. And third, one must ensure that robots have some means for redistributing tasks among themselves when some robots fail. This characteristic of
task reallocation is essential for a team to achieve its objective in a dynamic environment.

Reliability. Reliability refers to the dependability of a system, i.e., whether it functions properly each time it is utilized. As an example of a reliability problem, consider a situation in which two robots, $r_1$ and $r_2$, have two tasks, $t_1$ and $t_2$, to perform. Let us assume that they negotiate for task allocation, which results in robot $r_1$ performing task $t_1$ and robot $r_2$ performing task $t_2$. Further, suppose that robot $r_1$ experiences a mechanical failure that neither robot $r_1$ nor robot $r_2$ can detect. While robot $r_1$ attempts in vain to complete task $t_1$, robot $r_2$ successfully completes task $t_2$. However, although robot $r_2$ is also able to complete task $t_1$, it does nothing further since it expects robot $r_1$ to complete task $t_1$. Thus, as a team, the robots never achieve the objective of completing the two tasks. One would probably not call such a team reliable, since a mere reallocation of the tasks would have led to achieving the objective.

Flexibility or Adaptability. The term flexibility or adaptability refers to the ability of team members to modify their actions as the environment or robot team changes. Ideally, a cooperative team should be responsive to changes in individual robots’ skills and performances as well as in the dynamic environment. In addition, the team should not rely on a prespecified group composition in order to achieve its objective.

The capabilities of the team robots can change over time due to learning, which should enhance performance, or due to mechanical or environmental causes, which may reduce or increase a robot’s success at certain tasks. Team members should respond to these changes in performance levels by taking over tasks that are no longer being adequately performed or by relinquishing those tasks better executed by others.
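The reliability example and the idea of taking over tasks that are no longer being adequately performed can be illustrated with a small sketch. It is not a method from this book: the function name `reallocate`, the `min_progress` threshold and the `skill` table are hypothetical, and the sketch assumes, unlike the failure scenario above, that each task's progress can be observed by the team.

```python
def reallocate(assignment, progress, skill, done, min_progress=0.1):
    """Return a new task -> robot assignment (illustrative sketch only).

    assignment: dict task -> robot currently responsible for it
    progress:   dict task -> observed progress rate of the current owner
    skill:      dict (robot, task) -> observed ability in [0, 1]
    done:       set of tasks already completed
    """
    # Robots that are still making adequate progress on an open task are busy.
    busy = {robot for task, robot in assignment.items()
            if task not in done and progress[task] >= min_progress}

    new_assignment = {}
    for task, owner in assignment.items():
        if task in done:
            continue  # nothing left to allocate
        if progress[task] >= min_progress:
            new_assignment[task] = owner  # owner is performing adequately
            continue
        # Owner is stalled: look for a free teammate able to do the task.
        candidates = [r for (r, t), s in skill.items()
                      if t == task and r != owner and r not in busy and s > 0]
        if candidates:
            best = max(candidates, key=lambda r: skill[(r, task)])
            new_assignment[task] = best
            busy.add(best)
        else:
            new_assignment[task] = owner  # no one can take over yet
    return new_assignment


# The scenario from the text: r1 is stalled on t1, r2 has finished t2.
skill = {("r1", "t1"): 0.9, ("r2", "t1"): 0.8,
         ("r1", "t2"): 0.5, ("r2", "t2"): 0.9}
print(reallocate({"t1": "r1", "t2": "r2"},
                 {"t1": 0.0, "t2": 1.0}, skill, done={"t2"}))
# {'t1': 'r2'}
```

Basing the decision on observed progress rather than on the negotiated allocation is what lets $r_2$ pick up $t_1$ here; a real team would additionally need a way to measure that progress.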
Each robot must decide which task to undertake based on the actual rather than predetermined performance of the team robots for all the tasks. Robots must also exhibit flexibility in their action selection in response to the dynamic nature of their environment. Obviously, in real world environments, some changes that occur cannot be attributed to the actions of any robot team member or members. Rather, outside forces not under the influence of the robot team affect the state of the environment throughout the course of execution. These effects may be either destructive or beneficial, leading to an increase or decrease in the workload of the robot team members. The robot team should therefore be flexible in its action selections, opportunistically adapting to environmental changes that eliminate the need for certain tasks, or activating other tasks that a new environmental situation requires. Finally, the flexibility requirement also deals with the ease of deploying robot teams in various applications. The human designer should be given the liberty to form teams as desired from subsets of the available robots. However, different groups of robots may be useful for different objectives and
thus the team composition can vary from one team to another. The aim is to have these robots perform reasonably well in their teams the very first time they are grouped together, without requiring any robot to have prior knowledge of the abilities of the other team members. However, over time, we want a given team of robots to improve its performance by having each robot learn how the presence of other specific robots on the team would affect its own behavior. For example, a robot that prefers to clean floors but can also empty the garbage, should learn that in the presence of another robot that is only capable of cleaning the floors, it should automatically take the role of emptying the garbage. Coherence. Coherence refers to how well the team performs as a whole in terms of whether the actions of individual agents are purposefully combined toward some unifying objective. Typically, coherence is measured by criteria such as the quality of the solution or the efficiency of computing the solution. Efficiency considerations are particularly important in teams of heterogeneous robots whose capabilities overlap, in that different robots are able to perform the same task but with quite different performance characteristics. In a highly efficient team, the team robots select tasks such that the overall team performance is as close to the optimal as possible. A team in which robots pursue conflicting actions or duplicate one another’s actions cannot be considered a highly coherent team. A coherent team, however, need not be totally free of all possible conflicts. Rather, the team robots must be able to resolve conflicts as they arise. As a simple example, conflicts do occur whenever multiple robots physically share the same workspace. Although they have the same high-level objective, they may at times try to occupy the same position in space, giving rise to positioning conflicts that need to be resolved. 
Clearly, robot teams exhibiting low coherence are of limited use in solving practical engineering problems. Achieving high coherence is therefore an important design objective in building cooperative robot teams. Scalability. Scalability refers to the ease with which a team can admit more team members so as to improve the overall team performance in working on its problem tasks. As an example of a scalability problem, consider a situation in which two robots, r1 and r2 , clean up some toxic wastes. During execution, two robots, r3 and r4 , are added. The extra robots, r3 and r4 , know about the existing robots, r1 and r2 . However, the existing robots do not know about these extra robots. Thus, several problems can occur, such as a degradation in team efficiency due to the interference among the robots and an increase in both communication collisions and job reallocations. To cope with these problems, considerations for scalability are necessary. In particular, the issues of complexity that arise as the team size increases must be mitigated if one is to produce a highly scalable team.
1.1.3 Domain Characteristics

Robot teams can be deployed in many application domains. The following domain characteristics present challenging problems of SENSE, DECIDE and ACT for building such distributed agent systems:

1. Cooperative domains are those in which a group of agents shares a common objective.
2. Adversarial domains are those in which there are agents with opposing objectives.
3. Real-time domains are those in which success depends on acting in time in response to a dynamically changing environment. A dynamically changing environment is one that has agents actively operating on it, and making changes in ways generally beyond the control of any individual agent.
4. Noisy domains are those in which the agents cannot accurately perceive the environment they are situated in, nor can they accurately affect it.

These characteristics are inherent in the real world. Interestingly, the domain of robot soccer has these characteristics, making it particularly appropriate yet enjoyable for studying the problems of multi-agent robotic systems. Robot soccer is a game based on the modified rules of human soccer, and is played on a scaled-down soccer field. In a game of robot soccer, two robot soccer teams compete by attempting to move a ball into the opponent team’s goal. The team with the higher score at the end of regulation time wins.

1.1.4 Robot Soccer

Through the game of robot soccer, soccer robotics studies how multiple soccer-playing robots on each team could be built to cooperate in an adversarial environment to achieve specific objectives. The domain of robot soccer has the following characteristics:

• Independent agents with the same well-defined high-level objective of scoring as many goals as possible to win the match: teammates.
• Agents with the conflicting, well-defined high-level objective of counter-scoring as many goals as possible to win the match: opponents.
• A need for real-time decision-making.
• Sensor and actuator noise.
Note that in a competitive setting, the intermediate or low-level objectives of teammates and opponents can differ indeterminately. The different low-level objectives are usually associated with different roles such as defending and attacking. A teammate or opponent can dynamically assume these roles in accordance with its own team’s strategy.
The robots are assumed to have, at their disposal, the following resources:

• Sensors that provide partial, noisy information about the environment.
• The ability to process sensory information in order to update a world model (of the environment).
• Noisy actuators that affect the environment.
• Low-bandwidth, unreliable (wireless) communication capabilities.

In order to cooperate well in such a domain, soccer robots must perform real-time visual recognition and tracking of moving objects, collaborate with teammates (to decide their roles of attack or defence in a dynamic game situation), navigate in coordination with teammates and in counteraction against opponents, and strike the ball in the goal-ward direction. All these demand robots that are efficient (functioning under time and resource constraints), reactive and proactive (deciding what actions to take based on strategic reasoning of the game situation, and perhaps learning and adapting from experience), communicative (as part of collaborating and coordinating with one another when deciding what actions will accomplish the low-level objectives that are beyond an individual’s capabilities) and autonomous (sensing, deciding and acting as an independent system). The point to emphasize is that all these capabilities must be well integrated into a single and complete robot soccer system. Soccer robotics studies how such integrated robots can be built, using different approaches from those employed in separate research disciplines.

This book provides course material for developing a soccer team of independent robots that can cooperate and work towards the ultimate objective in a complex, real-time, noisy, and adversarial environment. As a quick introduction, the system set-up considered is a robot soccer team that consists of micro-robots, a global vision system, a communication module and a host computer. Fig. 1.1 shows the hardware composition of the robot soccer team.
More information on this game is given in Section 1.4.1. The rules of the game are given on the FIRA website http://www.fira.net. To build such a system successfully, this book uses a common architecture and basic robot hardware design, but emphasizes a robot soccer-programming framework on which to learn how to write programs to incorporate intelligence into the system, i.e., to integrate the abilities in a soccer team of robots to play different roles and utilize different strategies and control techniques in their behavior.
Fig. 1.1. Hardware entities of a robot soccer team

1.2 The Goals of Soccer Robotics

The goals of soccer robotics are closely linked to those of the Federation of International Robot-soccer Association (FIRA), officially founded in June 1997 (FIRA started as an organizing committee of the first Micro-Robot World Cup Soccer Tournament, held in November 1996 at KAIST, Daejeon, Korea). FIRA actively promotes the game through organizing tournaments,
conferences, workshops and other activities, and is an international non-profit regulating body that created the various categories of competitions and devised the game rules to reflect the state of the art in robot technology.

1.2.1 Test Bed for Robotics Research and Development (R&D)

The domain of robot soccer is real-time, noisy, adversarial and cooperative, and thus provides a representative test bed for study and investigative research on the real issues of SENSE, DECIDE and ACT for intelligent multi-agent robotic systems. Although realistic simulation environments exist, it is important to evaluate physical robotic systems in order to address the full complexity of the issues. The test bed enables direct comparisons between two different teams of robots, namely, by pitting them against each other in competitions such as those organized by FIRA. These competitions offer an independent measure of progress in intelligent multi-agent robotics. Different approaches to the same problem of robot soccer are demonstrated and evaluated in an environment with rules specified by an independent committee, rather than in a laboratory setting engineered to produce the most favorable but possibly biased results.

1.2.2 Educational Tool for AI Robotics

The current study and research into building intelligent multi-agent robotic systems necessitate an integrated approach to problem solving in computing, science and engineering. Students in this area come from a broad base of computer, control and electrical engineering, computer science, mathematics, biology, physics, biomedical engineering and neuroscience. Robot soccer not only provides an experimentation test bed for multi-agent robotic systems research, but also a useful education tool for practical courses in this area. The domain of robot soccer is sufficiently complex yet accessible, with standard game rules well-defined and regulated by FIRA. Technically, the game of robot soccer makes heavy demands in all the key areas of robot technology, namely, mechanics, control, sensors, communication, and intelligence. It thus provides a sound educational, integrated project to progress students to real-world problem solving. Students will have the unique opportunity to focus on an easily understood standard problem where a wide range of these technologies would need to be developed, examined and integrated in a multidisciplinary and teamwork setting. The four key education areas involved are discussed below.

Integrated knowledge. Students quickly become aware in a project on robot soccer that specialist intellectual knowledge from many pure discipline areas, such as mathematics, physics, electronics, computing, etc., must be brought together into an integrated whole. Complex systems are indeed complex, and to study their evolution and control requires an investment and application of a vast amount of multidisciplinary knowledge, forcing a teamwork approach.

Teamwork. A project on robot soccer is undertaken typically by interdisciplinary teams, rather than by a single individual.
This introduces participating students to a teamwork approach to problem solving which is quite lacking in many current educational systems that encourage individual problem solving within specific discipline areas. Real world issues. Through their involvement in the development of their multi-robot system, students are quickly brought into the realities of working with real-time evolution of complex, nonlinear physical systems. Unlike the many textbook problems studied and examined by students in an undergraduate curriculum, either by theory or through computer simulations where information and modelling are exact, real world problems are inherently complex systems where the modelling process is never exact, input and output data have a certain degree of uncertainty, parameter measurements may be imprecise and uncertain, and definitive control evaluation, for example, is impossible to achieve.
Computer programs that run simulation models with exact input and output data very well may not, for example, run on a micro-robot due to its limitations in onboard processor and memory requirements. Students must learn to overcome these difficulties in practical applications and students lacking this hands-on experience greatly underestimate these real world issues. Critical thinking and creativity. The involvement and participation in building a robot soccer system brings to the student a sense of creativity and critical thinking necessary for student transition to the professional worker/researcher in our technological world. 1.2.3 FIRA Robot World Cup One basic goal of soccer robotics is to take the spirit of science and technology in AI robotics to the laymen and the younger generation, worldwide. In line with this, FIRA has its flagship event, the FIRA Cup, held annually since 1998. FIRA Cup has its predecessor in the Micro-Robot World Cup Soccer Tournament, held in 1996 and 1997 at KAIST, Korea, and is an international competition that seeks to fulfill the dual purposes of research evaluation and edutainment (education plus entertainment). Current information on this event can be found on the FIRA website http://www.fira.net. 1.2.4 Technology Transfer to New Useful Tasks The game of robot soccer allows researchers to discover and learn how to get a team of robots to sense with acuity, decide collaboratively and act in coordination within the limited context of a soccer game. The hope is that it will be possible to use or modify the same techniques and technologies to build robots that carry out other more useful tasks in industries. This should extend eventually to robots capable of working for or with humans in their envisaged roles as personal, service or field agents. 
Below, we enumerate these roles in various human-oriented applications which are currently subjects of intensive research and development, but which could potentially develop into gigantic enterprises, as personal computer businesses are today, to serve the needs of our 21st century society:

1. Personal Robots: homeostasis and utility oriented, e.g., as household servants, subject tutors (education) and pets (entertainment).
2. Service Robots: occupation oriented, e.g., as hospital nursing and surgery assistants, museum tour guides, restaurant waiters and service-on-demand driverless taxis (intelligent transportation).
3. Field Robots: intensive-labour or hazardous-mission oriented, e.g., as construction workers, farmers and military agents.
Central to these human-oriented applications is the need for ease and safety in operating these robots after switching them on. They should be commanded intuitively even by non-experts via multi-modal interfaces such as voice, speech, graphics and vision, and would dynamically adapt to ever-changing environmental conditions. Personal and service robots should be fail-safe while field robots should be survivable. As a test bed that even laymen can easily identify with, we believe that robot soccer can help fire the imagination in terms of scientifically imitating human abilities, both mental and physical, leading to the creation of powerful technological ideas initially directed at developing a soccer team of fully autonomous humanoid robots capable of winning against the human world soccer champion team. It is envisaged that the technological know-how developed in the process could culminate in overcoming challenging issues facing the design and development of personal, service and field robots. A lot of the accumulated technology would have high potential for transfer, towards eventually realizing robotic agents capable of carrying out other more useful tasks in human-oriented roles.
1.3 Fundamental Motion Benchmarks for Robot Soccer

The fundamental motion of robot soccer is defined as the possibility of a robot starting from an arbitrary posture (x, y, θ) and moving to another posture (x', y', θ') in the minimum time period possible, t, where (x, y) is the coordinate position of the centre of the robot in a Cartesian frame, and θ is the angle of the robot’s heading in that frame. The need for fundamental motion is clear: soccer robots must be able to move from where they are to different strategic postures in time during a game. If this cannot be done with a minimum level of accuracy and reliability, any (cooperative) game strategy will be too overwhelmed by randomness for its real effectiveness to be ascertained. For this reason, FIRA has outlined a series of benchmark tests. For the purpose of performance evaluation, each benchmark test standardizes and makes explicit a (level of) skill that assumes the fundamental motion. The extent to which the robots measure up to these motion benchmarks will significantly influence the extent to which a proposed game strategy can be smoothly coordinated in execution. These individual skills and strategic teamwork are fundamental in the overall performance of a robot soccer team. The following briefly describes the benchmark tests.

1.3.1 Striking the Ball

The Ball-Striking test requires a robot to move from (x, y, θ) to strike a stationary ball at (x', y') to make the ball pass through (x'', y''). A time period could also be added.
In other words, the robot should be able to start from anywhere to strike a ball placed anywhere to make it go in any specified direction. This is a basic requirement to play soccer competently. It is necessary for the most elementary tasks such as kicking off, taking goal kicks and taking penalties.

1.3.2 Passing the Ball to Another Robot

The Ball-Passing test requires a robot at (x, y, θ) to strike a stationary ball at (x', y') so that another robot starting from (x'', y'', θ'') can strike the ball to pass through (x''', y'''). This is actually a very difficult test to pass. However, the skill of ball passing is the absolute minimum for robots to be able to coordinate as teammates to move the ball around from one team robot to another. Without this skill, no game strategy beyond schoolboy ‘kick-and-run’ can be effectively executed.

1.3.3 Striking a Moving Ball

The Moving Ball-Striking test requires a robot to move from (x, y, θ) to strike a moving ball at (x', y', θ') to make the ball pass through (x'', y''). This test reflects a real game situation, in which the ball is moving most of the time, and sometimes it is moving quite fast, exceeding 1 m/s. To hit a moving ball, the vision system must be working well at the highest possible image sampling rate, since the decision-making module has to know the speed of the ball to predict where it will be when the robot hits it.

1.3.4 Passing a Moving Ball to a Moving Robot

The Passing a Moving Ball test requires a robot at (x, y, θ) to strike a moving ball at (x', y', θ') to be struck at (x'', y'') by another robot starting from (x''', y''', θ''') to pass through (x'''', y'''').

1.3.5 Dribbling the Ball Past Obstacles

Perhaps the most challenging, this test requires a robot to manoeuvre with the ball past a series of obstacles, without colliding with any one of them. The robot needs to plan and move with the ball through a zigzag or winding course that avoids obstacles.
These obstacles simulate the opponent team robots and they could be stationary or moving. A basic test set-up, as shown in Fig. 1.2, has one test robot, one ball and two stationary obstacles. One obstacle is placed directly behind the other, at (xo1, y) and (xo2, y), with xo1 < xo2. The Cartesian x-distance in between the obstacles is just long enough to place two imaginary robots, rotating about their individual centres, to form a straight line (through the centres of these four objects). The test robot is placed at (x, y, θ), in front of the obstacle at (xo1, y), with x < xo1; the Cartesian x-distance in between them is just long enough to place one imaginary robot rotating about its own centre to form a straight line. All imaginary robots replicate the test robot. The ball is placed directly in front of the test robot. The test then requires the robot at (x, y, θ) to push the ball around and through the two obstacles in an ‘S’-like path to a good posture behind the obstacle at (xo2, y).

Fig. 1.2. Basic set-up for the ‘dribbling the ball past obstacles’ test (X and Y axes with origin 0; Robot at (x, y, θ), Obstacle1 at (xo1, y), Obstacle2 at (xo2, y))
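The geometry behind the ball-striking tests of Sections 1.3.1 and 1.3.3 can be sketched in a few lines. This is only an illustration under simplifying assumptions (the ball moves at constant velocity between frames, with no friction or rebounds, and a robot strikes the ball by driving through it along the line from ball to target); the function names and the `standoff` parameter are hypothetical and are not defined by FIRA or by this book.

```python
import math


def strike_posture(ball, target, standoff):
    """Posture (x, y, theta) from which to strike the ball towards target.

    The robot is placed `standoff` metres behind the ball, on the line
    through the target and the ball, heading towards the target.
    """
    bx, by = ball
    tx, ty = target
    theta = math.atan2(ty - by, tx - bx)   # heading from ball to target
    x = bx - standoff * math.cos(theta)
    y = by - standoff * math.sin(theta)
    return x, y, theta


def estimate_velocity(prev_pos, curr_pos, frame_period):
    """Ball velocity (vx, vy) from two consecutive vision frames."""
    return ((curr_pos[0] - prev_pos[0]) / frame_period,
            (curr_pos[1] - prev_pos[1]) / frame_period)


def predict_ball(pos, velocity, dt):
    """Ball position after dt seconds, assuming constant velocity."""
    return pos[0] + velocity[0] * dt, pos[1] + velocity[1] * dt
```

For a moving ball, one would first estimate the velocity from successive frames, predict where the ball will be at the expected time of impact, and then compute the strike posture for that predicted position. A shorter frame period gives a fresher velocity estimate, which is one reason the image sampling rate matters.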
1.4 Categories of Robot Soccer In this section, we briefly describe the categories of robot-soccer created by FIRA. Each category is a tournament with a defined set of operating rules and physical or simulated conditions that lend R&D focus on certain aspects of developing robot soccer systems, but in a competition setting that people from all walks of life can easily understand and enjoy. The FIRA Cup is one such major event that runs these tournaments on an annual basis. The detailed game rules of each category are available on the FIRA website http://www.fira.net. These categories, taken together, reflect the state of the art in AI robotics from the robot soccer perspective. They are by no means fixed, and will evolve in tandem with the R & D progress in robot technology.
1.4.1 MiroSoT and NaroSoT

The Micro-Robot Soccer Tournament (MiroSoT) and the Nano-Robot Soccer Tournament (NaroSoT) are the two categories of robot soccer that use a vision camera overlooking the playground to enable global (i.e., off-board and centralized) vision processing. The set-up of an overhead camera emulates an accessible environment, i.e., one in which complete information about the environment can be obtained if one wishes to. It considers the fact that in current vision technology, cameras mounted onboard the robots cannot deliver the same quality of position information as simply as an overhead camera. The intention of the overhead vision camera is to simplify the process of gathering visual information so that the main focus can be placed on the other two components, DECIDE and ACT, of the system. These two categories are of interest to those whose research or edutainment programmes would be held back by distributed (or localized) vision problems without the simple expediency of an overhead camera.

There are two leagues in MiroSoT, namely, a small league (3-a-side) and a middle league (5-a-side); in each league, the number of robots specified for a team includes the goalkeeper. The size of each robot is limited to a cubic box of 7.5 cm × 7.5 cm × 7.5 cm. The small league made its debut in 1996, while the middle league made its debut in 2001. Fig. 1.3(a) shows the snapshot of a game of Middle League MiroSoT.

There is currently only one league in NaroSoT, namely, a middle league (5-a-side). This league made its debut in 1998. The size of each robot is smaller, and is limited to a rectangular box with a square base of 4 cm × 4 cm and a height of 5.5 cm. Except for the size of the robot and the playground, the game rules for Middle League MiroSoT and Middle League NaroSoT are quite similar. Fig. 1.3(b) shows the snapshot of a game of Middle League NaroSoT.
Once the fundamental problems of robot motion can be efficiently solved, it is hoped that a full 11-a-side league would eventually emerge under this set-up. This is one good reason for introducing NaroSoT: to put more robots on the playground, a practical option is to make the robots smaller rather than the playground bigger. The playground for the Middle League MiroSoT is the largest. Experience has shown that up to this size, robots can be quite easily accessed and positioned from the sidelines for games and experiments. Anything larger would require people to step on the playground, and make the playground ungainly and difficult to move. Besides, the camera only needs to be positioned about 2 metres above the playground to capture the full view of the playground without excessive optical distortion due to the lens. A larger playground would require the camera to be positioned higher than most conventional buildings would feasibly allow. MiroSoT is the focus of this book. More will be said of MiroSoT from Section 1.5 onwards.
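The relation between playground size and camera height can be made concrete with a rough calculation. The sketch below assumes an ideal pinhole camera centred above the playground and looking straight down; the function name is hypothetical, and the 2.2 m width and 60° field of view used in the note that follows are illustrative figures, not values given in this book.

```python
import math


def required_camera_height(view_width_m, fov_deg):
    """Height (m) at which a downward-looking pinhole camera with the
    given full field-of-view angle just covers `view_width_m` metres."""
    half_angle = math.radians(fov_deg) / 2.0
    return (view_width_m / 2.0) / math.tan(half_angle)
```

Under these assumptions, a 2.2 m wide view with a 60° lens needs a height of about 1.9 m, and the required height grows in proportion to the playground width. A wider-angle lens would lower the camera but generally increases optical distortion, which is the trade-off noted above.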
Fig. 1.3. FIRA robot soccer: off-board/centralized vision categories. (a) MiroSoT; (b) NaroSoT
1.4.2 SimuroSoT The Simulated-Robot Soccer Tournament (SimuroSoT) is MiroSoT played in a computer simulated environment. Without robot and vision hardware, the problems of sensing and acting are reduced to non-issues, and it becomes possible to focus on game strategy development for the two bigger leagues in SimuroSoT, namely the middle league (5-a-side) and the large or full league (11-a-side). Both leagues made their official debuts in 2001. The SimuroSoT framework consists of a network of three computers. One computer is configured as a server that simulates vision processing and physical motion of the robots and the ball, with a monitor screen that displays the game situation (i.e., the two competing team robots on a MiroSoT-simulation playground). The other two computers serve as clients, each assigned to a different competing team. Each client loads and runs a program that executes the designed game strategy of the team. Dynamic information on postures
Fig. 1.4. FIRA robot soccer: simulation category (SimuroSoT)
(i.e., the coordinate positions and directions of movement) of the robots and the ball is passed as input to the client programs, which then execute the necessary team game strategy and pass control information back to the server computer, which updates its monitor display accordingly. Fig. 1.4 shows the screen snapshot of a simulation game in Middle League SimuroSoT. As a computer simulation game, this category is decidedly a competitive test of complex strategy development (the DECIDE component) using advanced AI techniques.

1.4.3 RoboSoT and KheperaSoT

The Robot Soccer Tournament (RoboSoT) made its debut in 2001. Each team can consist of one, or up to three robots of which one can be the goalkeeper. The size of each robot is limited to a rectangular box with a square base of 20 cm × 20 cm and a height of 40 cm. This differentiates it from a similar but one-a-side tournament, the Khepera robot Soccer Tournament (KheperaSoT), which made its debut in 1998. Khepera is the name of a commercially available robot; it is much smaller, has a vertical cylindrical shape, and has also been used for other research and education purposes in autonomous mobile robotics. Although KheperaSoT is primarily intended for Khepera robots, any similar cylindrical robot is allowed as long as its base diameter does not exceed 6 cm. Fig. 1.5 shows the respective snapshots of a game of RoboSoT and KheperaSoT.

A major difference of these two categories from the MiroSoT/NaroSoT categories is that in the physical set-up, the overhead camera is abandoned in favour of vision processing systems onboard each team robot. With this onboard/distributed vision-based set-up, the environment becomes inaccessible
Fig. 1.5. FIRA robot soccer: onboard/distributed vision categories. (a) RoboSoT; (b) KheperaSoT
in that each robot will always only have a partial view of the environment. Therefore, this set-up offers a unique opportunity to promote research and development in distributed (or local) vision processing as an equally important aspect of robotic agents. These categories also require the team robots to be endowed with the local capabilities to decide and act. In other words, the SENSE-DECIDE-ACT components are (required to be) distributed in the team robots, making them truly autonomous as individual agents. Because of increased complexity, the level of play in these two categories, as demonstrated in past international tournaments thus far, has been inferior to the MiroSoT/NaroSoT categories, but they are much closer to the research goals of intelligent multi-agent robotic systems. Below, we briefly discuss some non-trivial problems of local sensing and local decision-making, not found in centralized vision-based systems, that teams in the FIRA RoboSoT and KheperaSoT categories must contend with.
• In local visual sensing, at the individual agent level, object recognition becomes a very difficult problem due to occlusion of objects and the fact that the same object to be tracked can appear very different in terms of shape and size from different views of the camera on-board a robot; these different views are due to the relative motion of the environmental objects and the robot. At the team level, a mobile robot now needs to dynamically decide which view to focus on in relation to its team members’ views, in order to fulfill the team’s overall visual needs. Besides, broadcasting of time-critical visual information to all other team members over an inherently bandwidth-limited communication network is often infeasible without sacrificing real-time requirements. To overcome this, each agent would need to determine which team members need the information it acquires, and send it only to them. All these problems have technical implications in the development of robot navigation techniques aimed at attaining a robot’s desired posture while avoiding obstacles in its way.
• In local or distributed decision-making, the moving robot would need to keep track of its own position and obtain critical information from its team members to complement its own. Based on such information, it can then determine its current ability with respect to all the available actions. It needs to constantly update this capability status and other information in order to work cooperatively (i.e., in collaboration and coordination) with its team members, to arrive at good joint decisions made in real-time, and under various trade-offs to achieve different objectives.

Clearly, from the perspective of teamwork, central to the problems of local visual sensing and local decision-making is the general problem of cooperation.
The idea of cooperation presents numerous challenges as it has to be carried out among distributed robotic agents with limited sensing capabilities, over a bandwidth-limited communication network. This is an important area of active research, and for a start, the interested reader might want to refer to the survey paper [1].

1.4.4 HuroSoT

Fig. 1.6 shows some robot participants at the first Humanoid Robot Soccer Tournament (HuroSoT). A category at the stage of infancy, HuroSoT made its debut in 2002. It is the only category in which the robots assume the primitive form of humans, from which the term humanoid is derived. These robots do not move on wheels, but are biped (i.e., ‘two legged’), admitting critical problems of dynamic motion control and balancing not found in the other categories. The HuroSoT initiative aims to stimulate and promote research and development in humanoid (biped) robotics. Because of the current state of the art, this game is still quite far away from an actual soccer game. Presently, the competition is organized as a series
Fig. 1.6. FIRA robot soccer: humanoid category. (a) HuroSoT
of tests, including robot dash, penalty kick and obstacle avoidance. Robot dash is a sprint event; in penalty kick, the robot participants have to kick a ball into an empty goal; and in obstacle avoidance, they have to avoid obstacles simulating stationary opponent players. These tests are to be seen as preparations for humanoid robot soccer in subsequent years of development. The format of the HuroSoT will actively evolve in tandem with the state-of-the-art developments in humanoid robotics. Although still a distant dream, it is hoped that the game will eventually evolve into one with two competing humanoid robot teams playing a full game of soccer.
1.5 The MiroSoT Robot Soccer System
Fig. 1.7 shows the general set-up for the Micro-Robot Soccer Tournament (MiroSoT). The set-up is a combination of mobile (small-sized) micro-robots, a vision camera overlooking the playground connected to a centralized host computer, and a wireless communication module connecting the host computer to the robots. The fact that visual sensing is achieved by a video camera that overlooks the complete playground offers an opportunity to get a global view of the dynamic game situation. This set-up may have simplified the sharing of information among multiple robots, but it still presents a real challenge for reliable and real-time processing of the movement of multiple moving objects, namely, the ball, the team robots as well as the robots on the opposing team.
In a well-defined processing cycle, the global vision system perceives the dynamic game situation and processes the image frames, giving the postures
Fig. 1.7. General set-up for robot soccer (MiroSoT Category)
of each robot and the ball (the SENSE functionality); the decision-making program module, given this information, uses its strategic knowledge to decide what action each robot has to take next (the DECIDE functionality); the intelligent control module, based on the selected action or sensed information, determines the actuation commands, which are communicated to the individual robots; the robots translate them into physical motion using their resident actuation routines (the ACT functionality).
To adapt to the dynamic game situation, the team robots under control should always change their desired postures, and reactively determine the appropriate paths towards these desired postures. (Of course, what is ‘desired’ is decidedly a subjective opinion of the system designer. In general, such an opinion is coded in the system as a strategy formulated in terms of the positions of the other team robots, the opponent team robots and the ball.) In order not to miss critical visual information that helps the team robots adapt in time to the dynamic game situation, the moving objects in the playground must be tracked as closely and as accurately as possible, and this necessarily requires fast sampling of the image frames of the game situation captured by the vision camera. Clearly, the processing cycle time (of sensing, deciding, controlling and communicating) must not exceed this sampling time in order not to miss any image frame. To satisfy this, the processing cycle time must be kept correspondingly short for a fast (image frame) sampling rate. Keeping this processing cycle time short while attempting to increase the overall intelligence of the system is a challenging constraint to handle in robot soccer, and in real-time multi-agent robotic systems in general.
Table 1.2 maps the robot primitives of SENSE-DECIDE-ACT onto the robot soccer domain and the possible system (hardware) entities; the
Table 1.2. Robot primitives and system entities for robot soccer (MiroSoT Category)
PRIMITIVE | ENTITY | INPUT | OUTPUT
SENSE [S] | Vision Camera and Host Computer (Vision System) | Field of View | Robots and Ball Postures
DECIDE [D] | Host Computer or Robots | Robots and Ball Postures and/or Strategic Knowledge | Selected Actions
ACT [A]: Control [C] | Host Computer or Robots | Robots and Ball Postures and Selected Actions | Desired Wheel Velocities
ACT [A]: Actuation [M] | Robots | Desired Wheel Velocities | Physical Motion
ACT [A]: Actuation [M] | Robots (with Feedback Control) | Physical Motion | Current Wheel Velocities
wireless communication that always exists between the host computer and the micro-robots is implicit. Fig. 1.8 shows a typical block diagram of an organization relating these primitives. There are many ways to design and implement this organization. The flexibility is due to one salient feature of the organization, namely, it does not enforce the cyclic operational sequence of SENSE, DECIDE, ACT, but allows DECIDE to modify the SENSE and ACT couplings as needed. It is easy to see from the block diagram why intelligent control is sometimes called outer-loop control and actuation is sometimes called inner-loop control. In outer-loop (robot) control, regardless of the techniques used, there are conceptually two major steps, namely, desired posture generation and posture control.
Fig. 1.8. A general SENSE-DECIDE-ACT block diagram for robot soccer (blocks: DECIDE; ACT, comprising Intelligent Control with Posture Generation and Control, and Actuation with Motors and Encoders; SENSE; REAL WORLD. Signals: selected actions, desired postures of objects, desired wheel velocities, PWM signals, motor rotations, wheel velocities, status of actions, postures of objects, field of view)
In inner-loop control, encoders (together with some auxiliary devices) are needed to quantify the physical motion of the wheels (input) and measure their current velocities (output) as feedback information; these devices are covered in Chapter 2. Strictly speaking, an encoder is a motion sensor and thus comes under the SENSE primitive. However, for conceptual neatness as laid out in Table 1.2, and as it is better to present motion sensors as an integral part of the MiroSoT robot hardware, we classify motion sensing as part of actuation under the ACT primitive. The SENSE primitive is viewed exclusively as sensing the real outer-world or environment in which the motors run. The actual entity for each primitive depends on the system of play used, which we elaborate next.
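As a rough illustration of the processing cycle described in this section, the following Python sketch runs one SENSE-DECIDE-ACT pass and compares the elapsed time with the image frame sampling period. The function names, the stubbed behaviours and the 30 frames-per-second rate are assumptions made for this example only; they are not taken from the experimental robot soccer program.

```python
import time

# Assumed camera sampling period (30 frames per second); not a value from the text.
FRAME_PERIOD_S = 1.0 / 30.0


def sense(frame):
    """SENSE: return the postures (x, y, theta) of the robots and the ball.

    Hypothetical stub: the 'frame' passed in already holds the postures.
    """
    return dict(frame)


def decide(postures):
    """DECIDE: select an action for each robot (stub: every robot chases the ball)."""
    return {name: "go_to_ball" for name in postures if name != "ball"}


def control(postures, actions):
    """ACT (control): turn selected actions into desired (left, right) wheel velocities.

    Hypothetical stub: every robot is commanded to stand still.
    """
    return {name: (0.0, 0.0) for name in actions}


def run_cycle(frame, send):
    """Run one processing cycle and report whether it fits in one frame period."""
    start = time.perf_counter()
    postures = sense(frame)
    actions = decide(postures)
    commands = control(postures, actions)
    send(commands)  # stands in for the wireless link to the robots
    elapsed = time.perf_counter() - start
    return commands, elapsed, elapsed <= FRAME_PERIOD_S
```

In a real system the third return value would be monitored continuously: whenever it is false, at least one image frame has been missed.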
1.6 Classification of MiroSoT Robot Soccer Systems
The MiroSoT set-up supports three systems of play, namely, the command-based robot system, the action-based robot system and the intelligence-based robot system. Each system has a centralized (software) component for (visual) sensing [S], and is characterized by whether the other two key components of deciding [D] and acting [A] are centralized in the host computer or distributed in the team robots. Here, intelligent control [C] and actuation [M], the subcategories of primitive A, are used to help characterize or classify the robot soccer systems. The following describes the system classification in some detail, and also discusses the relative advantages and disadvantages of the systems.
1.6.1 Command-Based Robots
In the command-based robot system, the S-D-C components are centralized in the host computer; only the M component is distributed in the (hardware of the) team robots. In other words, the core intelligence (due to D and A:C) of the system resides in the host computer, as depicted in Fig. 1.9. In this system, each robot is similar to an RC-car (radio-controlled car), but is under intelligent control resident in the host computer. Only a one-way wireless communication link is required, to transmit actuation commands from the host computer to the team robots. Of the three systems of play, this system is the most economical one to build: no visual sensor is mounted onboard the robots, and the major efforts are focussed on writing software programs for strategic cooperation (and communication) amongst the robots. It is, however, also the one with the heaviest computational load in the host computer, although this is increasingly becoming a non-critical issue because of the availability of low-cost but powerful personal computers. This system should be preferred by many who have some knowledge of multi-agent systems and computer vision, and whose primary aim is to learn the basics of robot soccer or participate and win in a major tournament.
Fig. 1.9. Command-based robot soccer system (figure label: Command)
1.6.2 Action-Based Robots
In the action-based robot system, the S-D components are resident in the host computer but the A component is distributed in the team robots. In other words, the core intelligence is split between the D component in the host computer and the A:C subcomponent distributed in the robots, as depicted in Fig. 1.10. In the command-based robot system, the control functions reside in the host computer. In this system, they are embedded in each robot, thereby reducing the host computer’s computational load. As in the command-based robot system, only a one-way wireless communication link is required, but it is used instead to transmit visual information or action commands (i.e., selected actions to take) from the host computer to the team robots. This system is analogous to a human instructing a well-trained dog. Here, the human instructor plays the role of the host computer and the dog plays the role of the robot. The human only needs to give instructions without having to know the detailed movements of the dog. This makes it easier for the instructor since the dog knows how to act autonomously, such as avoiding collision with any obstacle. This system should be favoured by those whose research emphasis is on robot control programming to build soccer robots individually capable of executing a selected action autonomously.
Fig. 1.10. Action-based robot soccer system (figure label: Action)
1.6.3 Intelligence-Based Robots
In the intelligence-based robot system, only the S component remains centralized in the host computer; the D-A components are distributed in the team robots. In other words, the core intelligence of the system is distributed in the robots, as depicted in Fig. 1.11. The computational load in this system is the most distributed and well-balanced between the individual robots and the host computer, with no computational load heavily concentrated in any one hardware entity. In principle, therefore, this system is the most scalable in terms of the number of team robots. It is also the closest to the notion of an intelligent multi-agent robotic system. Not only is a one-way wireless communication link required to transmit visual information from the host computer to the team robots; two-way wireless communication links are also needed to enable cooperative exchange of strategic information among the team robots. But many research issues remain open on how the team robots can effectively communicate and cooperate to achieve their ultimate objective. As a result, such robots are currently difficult, if not impossible, to build. This system is a good infrastructure for researchers who intend to use robot soccer as a test bed to develop or improve the techniques of multi-agent systems and distributed agent communication for real-world applications, but under the assumption that the operating environment is accessible, as provided for by the overhead vision camera.
Fig. 1.11. Intelligence-based robot soccer system (figure label: Posture)
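The classification of Section 1.6 can be summarized as a small lookup table. The Python sketch below records, for each system of play, where the S, D, C and M components reside and what the host computer transmits to the robots, as stated in the text. The dictionary layout and the helper name host_components are illustrative assumptions, not part of any FIRA software.

```python
HOST = "host computer"
ROBOT = "team robots"

# Location of each component: S (sense), D (decide), C (control), M (actuation).
SYSTEMS = {
    "command-based": {
        "S": HOST, "D": HOST, "C": HOST, "M": ROBOT,
        "host_sends": "actuation commands",
    },
    "action-based": {
        "S": HOST, "D": HOST, "C": ROBOT, "M": ROBOT,
        "host_sends": "visual information or action commands",
    },
    "intelligence-based": {
        "S": HOST, "D": ROBOT, "C": ROBOT, "M": ROBOT,
        "host_sends": "visual information",
    },
}


def host_components(system):
    """Return the components that are centralized in the host computer."""
    return [c for c in "SDCM" if SYSTEMS[system][c] == HOST]


for name in SYSTEMS:
    print(name, "->", "-".join(host_components(name)), "in the", HOST)
```

Reading the table row by row shows the trend described above: from the command-based to the intelligence-based system, components move one at a time from the host computer to the robots.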
1.7 Purpose of This Book
This book presents course material that discusses in detail the concept, design and construction of appropriate vision algorithms (SENSE), strategies
(DECIDE), micro-robots (the hardware for sensing and movement) and control and actuation algorithms (ACT) devised for a command-based robot soccer system for MiroSoT. The cooperative strategies include the robots organizing themselves in formations, engaging in zone defence and continually switching specific roles to achieve their common objective. The intelligent control techniques to carry out some designed actions include novel navigation methods of a robot manoeuvring towards a desired posture, without colliding with any other robot. The general architectures, techniques and tools available for developing a robot soccer system are described, and their applicability to the various aspects of robot soccer is demonstrated as far as possible. The ‘C’ source code of an experimental (command-based) robot soccer program (for MiroSoT) is available for download from the FIRA website http://www.fira.net. The details of all major algorithms implemented in the robot soccer program are presented throughout the book to enable students to learn and build their own robot soccer systems from scratch, and up to competitive standards. A reference manual for the experimental system is given in Appendix B; the information provided therein should shorten the learning curve in the practical design and development of a MiroSoT robot soccer system.
Notes on Selected References
The textbook [2] presents an excellent introduction to AI robotics under the SENSE-DECIDE-ACT paradigm. For an advanced treatment of the same material, refer to the book [3]. The DECIDE primitive is referred to as the PLAN primitive in these two books. A special column, The Robot Competition Corner, of the International Journal of Robotics and Autonomous Systems, Elsevier Science, publishes information about the robot soccer competitions organized by FIRA since 1996, with their highlights and results, alongside information on other competitions. A special session, Entertainment Robotics, of the 1997 IEEE International Conference on Robotics and Automation, presented a list of papers [4, 5, 6, 7] reporting early research efforts on robot soccer. Two special journal issues contain selected papers from workshop participants of the inaugural event MiroSoT’96 [8] and MiroSoT’97 [9]. A survey on multi-agent systems from the robot soccer perspective is presented in [10]. Two books, written in Korean and published under the auspices of FIRA, provide a comprehensive introduction to robot soccer systems [11] and cover the engineering know-how of building MiroSoT systems [12]. Material on the benchmark tests for the fundamental motion of robot soccer is taken from [13].
2. Robot Soccer System: Hardware and Firmware Components
2.1 Introduction
In a robot soccer game, individual team robots must be capable of communicating and moving in real time. That a soccer robot can receive and execute various motion commands, driving itself into proper postures at the right times, is crucial to the success of physical team coordination. Such real-time capabilities need to be enabled by the hardware and firmware for a soccer robot, integrated in a suitable robot architecture. This chapter presents the necessary background on the mechanical motion and basic architecture of a mobile robot. On the former, some mechanical structures for robot mobility via ‘rolling’ are examined; the kinematics of a differential-drive (two-wheel) robot is then presented. On the latter, the essentials of electronic hardware and firmware needed to build a differential-drive robot of a MiroSoT system are covered in sufficient detail.
2.2 Mobile Robots
A mobile robot is defined as one capable of locomotion on a surface solely through the actuation of a movement mechanism on which the robot is mounted, and which is in contact with the surface. This definition encompasses every robot equipped with a movement mechanism, including a wheel-based robot, a six-legged walking robot that assumes the shape of an insect, and a two-legged robot that assumes the shape of a human.
2.2.1 Mechanical Movement Mechanisms
The characteristic movements exhibited by a robot depend on the design of the mechanical structure for its movement mechanism. It is therefore important to choose the right type of structure for a robot’s movement mechanism in a given application. Here, we concentrate only on the different types of ‘move by rolling’ mechanism for a robot, as shown in Fig. 2.1. Fig. 2.1(a) shows a commonly used differential-drive wheel-based mechanism. A robot on this mechanism can turn smoothly. However, the robot
J.-H. Kim, D.-H. Kim, Y.-J. Kim, K.-T. Seow: Soccer Robotics, STAR 11, pp. 27-70, 2004. © Springer-Verlag Berlin Heidelberg 2004
Fig. 2.1. Different types of ‘move by rolling’ mechanism for a mobile robot: (a) wheel, (b) caterpillar, (c) omni-directional
tends to slip easily because of the small contact areas its wheels make with the floor surface. Fig. 2.1(b) shows a caterpillar-based mechanism. A robot on this mechanism can move in a straight line easily, but is unable to negotiate sharp bends. Fig. 2.1(c) shows a 3-wheel mechanism, with the rim of each wheel lined with small bar rollers placed parallel to the rotation axis. Unlike the previous two mechanisms, a robot on this mechanism can move freely in any direction with proper rotation of the three wheels. A robot with this characteristic mechanism is called an omni-directional mobile robot. Such a movement mechanism is, however, difficult to construct due to its structural complexity. The mechanisms shown in Fig. 2.1(a) and Fig. 2.1(b) are the most commonly used in general, and are also popular with designers of soccer robots for the FIRA tournament series (except HuroSoT). Wheel Mobile Robots. In this section, we analyze how different wheel assembly designs for a wheel mobile robot can affect the turning ability of a robot that has a rigid body. A wheel assembly is a movement mechanism or device which provides or allows motion between its mounted robot and the surface, on which each wheel is intended to have a single point of rolling contact. The mechanical flexibility (to move and turn) determines if the robot is capable of attaining an intended posture, depicted in Fig. 2.2 and defined by P as follows:
$$P = \begin{bmatrix} x \\ y \\ \theta \end{bmatrix}, \tag{2.1}$$
where (x, y) is the coordinate position of the robot’s centre in the Cartesian X1-Y1 frame, and θ, called its heading angle, is the angle of orientation of the robot in the X1-Y1 frame, defined as the angle that the robot’s heading in the X2-direction makes with the X1-axis. By convention, the heading angle increases counter-clockwise.
Fig. 2.2. The robot’s posture (axes X1, Y1 and X2, Y2; heading angle θ)
In order not to clutter the analysis, some simplifying assumptions are made: The navigation floor of the robot (i.e., the floor on which the robot moves) is a flat horizontal surface; (the plane surface of) each wheel of the robot is perpendicular to the floor. The condition of pure rolling is also assumed. Pure rolling refers to rolling without slipping in any direction, including the direction of motion. Conceptually speaking, a slip is said to occur if in the consecutive positions of a moving wheel-based robot, a point on the circumference of each wheel comes into contact with more than one point on the floor. Most wheels, with the exception of the Swedish wheel, satisfy this condition within reasonable tolerances. A Swedish wheel is designed specially with embedded ball rollers in its rim to admit the freedom of moving in any direction, including that perpendicular to the plane surface of the wheel.
To explain more concisely and without ambiguity, the following terminology needs to be defined:
• A wheel’s surface normal refers to the imaginary line that is perpendicular to (the cross section of) the wheel surface, and passes through the centre of the wheel surface.
• A robot’s instantaneous centre of rotation (ICR) refers to the intersection point of the surface normals of any two wheels of the robot. That it is instantaneous suggests that the ICR can shift as the robot moves.
• A robot’s unique ICR refers to the common ICR at which the surface normals of all the robot’s wheels intersect.
• A robot’s unique ICR-axis refers to the line that is perpendicular to the navigation floor and passes through the robot’s unique ICR.
For a robot with two or more wheels, more than one ICR may exist, but the robot can turn provided a unique ICR exists (with respect to its wheel assembly). This states the mechanical principle on which a wheel-based robot turns. The unique ICR-axis is the vertical line about which the robot can turn. Clearly, the further the unique ICR is from the robot, the less pronounced is the curvature of the turn. We now analyze four robots with different wheel assemblies, as shown in Fig. 2.3.
Fig. 2.3. Different wheel assemblies resulting in different ICR-axes
Fig. 2.3(a) shows a robot with a unique ICR. The robot’s wheel assembly is similar to that of an automobile; the robot can turn left about the unique ICR-axis which exists on its left-hand side. Fig. 2.3(b) shows a robot on two parallel wheels, with the centres of these wheels aligned, resulting in a common surface normal. For this wheel assembly, the unique ICR can exist at any point on this line, and is determined (marked off on the line) by the turning speed induced by the difference in the velocities of the two wheels.
Fig. 2.3(c) shows a robot with no unique ICR. Any robot with this wheel assembly cannot move in any direction and always stays ‘locked’ in any position it is placed. Fig. 2.3(d) shows an assembly similar to that of Fig. 2.3(b) but with the wheels’ centres not aligned. The unique ICR in this case is, in principle, located at infinity since the two parallel wheel surface normals are deemed to intersect there. This implies that the robot cannot turn and can only move straight ahead. When building a more compact robot (e.g. a NaroSoT robot), the wheel assembly shown in Fig. 2.3(d) may be chosen. Robot turning is possible in practice because the two wheels attached to the same mechanical body always create some slip. The slip is due to the ‘pull and push’ that results whenever the wheels move at different velocities. But since the left side is not symmetrical to the right side, an analysis to determine the unique ICR is difficult.
2.2.2 Kinematics of a Two-Wheel Robot
We now analyze the kinematics of a MiroSoT soccer robot that moves on a two-wheel mechanism as shown in Fig. 2.3(b). Robot kinematics refers to a robot in motion, analyzed in terms of the mathematical relations between the robot’s position and its wheels’ velocities, without reference to force and mass. Consider a MiroSoT soccer robot as shown in Fig. 2.4. If the rotational velocities of the left and the right wheels are $\omega_L$ and $\omega_R$ respectively, then assuming no slipping of the wheels, the wheel velocities at the respective contact points are
$$V_L = r\,\omega_L, \qquad V_R = r\,\omega_R, \tag{2.2}$$
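where $r$ is the radius of the wheel. Let $\nu$ be the velocity of the robot at its centre and $\omega$ be the turning velocity of the robot (about its unique ICR-axis). Then, $\nu$, $\omega$, $\omega_R$ and $\omega_L$ have the following relationship:
$$\nu = \frac{V_L + V_R}{2} = r\,\frac{\omega_L + \omega_R}{2}; \qquad \omega = \frac{V_R - V_L}{L} = r\,\frac{\omega_R - \omega_L}{L}, \tag{2.3}$$
where $L$ is the distance between the two wheels.
Fig. 2.4. Kinematics of a robot (axes X, Y with origin 0; robot at (x, y) with heading angle θ; velocities VL, VR and v)
$[x\ y\ \theta]^T$ (for notational convenience, also denoted by $(x, y, \theta)$) and $[\nu\ \omega]^T$ have the following relationship:
$$\dot{P} = \begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} \cos\theta & 0 \\ \sin\theta & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \nu \\ \omega \end{bmatrix}. \tag{2.4}$$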
Eq. (2.4) is the kinematics equation of the robot, where
$$U = \begin{bmatrix} \nu \\ \omega \end{bmatrix} \tag{2.5}$$
is the control vector input. Along the line orthogonal to the plane of the two parallel wheels (i.e., the line passing through the centres of these two centrally-aligned wheels), the resultant component velocity is given by
$$H \cdot \dot{P} = \begin{bmatrix} \sin\theta & -\cos\theta \end{bmatrix} \begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix} = \dot{x}\sin\theta - \dot{y}\cos\theta, \tag{2.6}$$
where $H$ is the unit vector orthogonal to the plane of the wheels. Obviously, there is no motion along this line, hence:
$$H \cdot \dot{P} = 0. \tag{2.7}$$
Eq. (2.7) is called the nonholonomic constraint of the robot. It can be rewritten as:
$$\tan\theta = \frac{\dot{y}}{\dot{x}}. \tag{2.8}$$
Now, Eq. (2.4) has 3 component equations but only two input control variables $\nu$ and $\omega$ (obtainable from Eq. (2.3), given the related wheel velocities $V_R$ and $V_L$). The nonholonomic constraint is not an independent equation as it can be obtained by combining the first two component equations of Eq. (2.4). This explains why, in general, no control solution is guaranteed to move the robot from a given posture $(x, y, \theta)$ to a desired posture $(x', y', \theta')$.
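The relations in Eqs. (2.2) to (2.4) and the constraint of Eq. (2.7) can be exercised numerically. The Python sketch below is only an illustration: the function names, the forward-Euler integration step and the sample wheel radius and wheel separation are assumptions introduced here, not part of the robot soccer program.

```python
import math


def wheel_to_body(omega_l, omega_r, r, L):
    """Eqs. (2.2)-(2.3): wheel rotational velocities -> (nu, omega) of the robot."""
    v_l, v_r = r * omega_l, r * omega_r      # Eq. (2.2)
    nu = (v_l + v_r) / 2.0                   # velocity of the robot centre
    omega = (v_r - v_l) / L                  # turning velocity, counter-clockwise positive
    return nu, omega


def posture_rate(theta, nu, omega):
    """Eq. (2.4): time derivative of the posture (x, y, theta)."""
    return nu * math.cos(theta), nu * math.sin(theta), omega


def euler_step(posture, nu, omega, dt):
    """Advance the posture by one forward-Euler step of length dt (an assumed scheme)."""
    x, y, theta = posture
    x_dot, y_dot, theta_dot = posture_rate(theta, nu, omega)
    return x + x_dot * dt, y + y_dot * dt, theta + theta_dot * dt


def nonholonomic_residual(theta, x_dot, y_dot):
    """Left-hand side of Eq. (2.7); zero when there is no sideways motion."""
    return x_dot * math.sin(theta) - y_dot * math.cos(theta)


# Assumed sample values: wheel radius 0.02 m, wheel separation 0.07 m.
nu, omega = wheel_to_body(omega_l=8.0, omega_r=12.0, r=0.02, L=0.07)
posture = (0.0, 0.0, 0.0)
for _ in range(100):
    posture = euler_step(posture, nu, omega, dt=0.01)
print(posture)
```

Any velocity produced by posture_rate gives a zero residual in nonholonomic_residual, which is the numerical counterpart of the statement that Eq. (2.7) follows from the first two component equations of Eq. (2.4).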
Fig. 2.5. Computation of the robot’s unique ICR (labels: ICR, R, R1, R2, L, VL, VR)
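Instantaneous Turning Radius. Consider the two-wheel robot shown in Fig. 2.5. As this robot has a rigid body,
$$V_L = R_2\,\omega, \qquad V_R = R_1\,\omega. \tag{2.9}$$
Since $L$ is the distance between the two wheels, it is easy to deduce from Fig. 2.5 that:
$$R_1 = R - \frac{L}{2}; \qquad R_2 = R + \frac{L}{2}, \tag{2.10}$$
where $R$ is the turning radius of the robot. Substituting $R_1$, $R_2$ of Eq. (2.10) into Eq. (2.9) and eliminating $\omega$ gives $R = \frac{L}{2}\,\frac{V_L + V_R}{V_L - V_R}$, where the left wheel is the outer (faster) wheel and $\omega$ is the magnitude of the turning velocity. Under the sign convention of Eq. (2.3), in which $\omega = (V_R - V_L)/L$ is positive for a counter-clockwise turn, the turning radius $R = \nu/\omega$ becomes a signed quantity, given by:
$$R = \frac{L}{2}\left(\frac{V_L + V_R}{V_R - V_L}\right). \tag{2.11}$$
The robot moves in a straight line if $V_L = V_R$, implying from Eq. (2.11) that $R \to \infty$. It turns about its own centre if $R = 0$, implying, from Eq. (2.11), that $V_L = -V_R$.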
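A minimal Python sketch of Eq. (2.11) is given below, including the two limiting cases just mentioned. The function name and the choice to return infinity for straight-line motion are assumptions of this example; the sign of the result follows the counter-clockwise-positive convention of Eq. (2.3).

```python
import math


def turning_radius(v_l, v_r, L):
    """Eq. (2.11): signed turning radius from the two wheel velocities.

    Positive for a counter-clockwise turn (v_r > v_l), negative for a
    clockwise turn, and infinite when both wheels run at the same velocity.
    """
    if v_l == v_r:
        return math.inf  # straight-line motion: the ICR lies at infinity
    return (L / 2.0) * (v_l + v_r) / (v_r - v_l)


L = 0.08  # assumed wheel separation in metres, for illustration only
print(turning_radius(1.0, 1.0, L))   # straight line
print(turning_radius(-1.0, 1.0, L))  # spin about the robot's own centre
print(turning_radius(1.0, 3.0, L))   # counter-clockwise arc
```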
2.2.3 Basic Motion Control: A Circular Path Analysis
Consider the two-wheel robot shown in Fig. 2.6. Suppose we want the robot to move from one position to another along a circular path. To analyze circular motion control, it is convenient to denote the relative coordinate positions of the robot by (R, 0) and (R, ϕ), which specify its current and desired coordinate positions respectively. ϕ is called the robot’s turning angle about its unique ICR-axis.
Fig. 2.6. Circular path and angle of turning (labels: D, R, ϕ)
The robot is assumed to be stationary at (R, 0), say at time $t_0$. In applying the wheel (rotational) velocity set $(\omega_R, \omega_L)$ (equivalent to the control input $[\nu\ \omega]^T$ by Eq. (2.3)), the robot’s motor-driven wheels will first accelerate to attain and then cruise at these applied velocities, before decelerating to a halt at the desired (R, ϕ). This gives rise to a velocity profile which, for a simple analysis, is reasonably approximated by a piece-wise linear function, as shown in Fig. 2.7. Based on this profile, the length of the circular path $D$ is computed with reference to Eq. (2.3), as follows:
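$$\begin{aligned} D &= \int_{t_0}^{t_3} \nu \, dt \\ &= \int_{t_0}^{t_3} \frac{V_L + V_R}{2} \, dt \\ &= r \int_{t_0}^{t_3} \frac{\omega_L + \omega_R}{2} \, dt \\ &= \frac{1}{2}\, r \left( \int_{t_0}^{t_3} \omega_L \, dt + \int_{t_0}^{t_3} \omega_R \, dt \right) \\ &= \frac{1}{4}\, r\, (\omega_L + \omega_R)(t_3 - t_0 + t_2 - t_1). \end{aligned} \tag{2.12}$$
Fig. 2.7. Rotational velocity profile of the two wheels: (a) left wheel, (b) right wheel (axes: ω versus time, with breakpoints t0, t1, t2, t3 and cruise velocities ωL, ωR)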
By Eqs. (2.2) and (2.11), the turning radius $R$ is given by:
$$R = \frac{L}{2}\left(\frac{V_L + V_R}{V_R - V_L}\right) = \frac{L}{2}\left(\frac{\omega_L + \omega_R}{\omega_R - \omega_L}\right). \tag{2.13}$$
Hence, the desired turning angle ϕ (in radians) is determined by:
$$\varphi = \frac{D}{R} = \frac{r}{2L}\,(\omega_R - \omega_L)(t_3 - t_0 + t_2 - t_1). \tag{2.14}$$
Note that only when the ratio of the two wheel velocities is constant, such as if $t_0 = t_1$ and $t_2 = t_3$, will the robot move in a perfectly circular path. As a final remark, circular motion control is fundamental in some navigation methods such as a recently proposed limit-cycle method; the essence of this method comes from the observation that when a robot is moving towards a specified position, it can avoid colliding with any obstacle by turning clockwise or counter-clockwise around the obstacle. More will be said of this method in subsequent chapters.
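To make Eqs. (2.12) to (2.14) concrete, the following Python sketch evaluates the path length D, the turning radius R and the turning angle ϕ for a trapezoidal velocity profile. The function name, its argument order and the numerical values are assumptions chosen for illustration; omega_l and omega_r denote the cruise velocities of the profile in Fig. 2.7.

```python
import math


def circular_path(omega_l, omega_r, r, L, t0, t1, t2, t3):
    """Return (D, R, phi) for the piece-wise linear profile of Fig. 2.7.

    omega_l, omega_r: cruise rotational velocities of the wheels (rad/s)
    r: wheel radius, L: distance between the wheels
    t0..t3: start, end of acceleration, start of deceleration, stop
    """
    span = (t3 - t0) + (t2 - t1)
    D = 0.25 * r * (omega_l + omega_r) * span          # Eq. (2.12)
    phi = r / (2.0 * L) * (omega_r - omega_l) * span   # Eq. (2.14)
    if omega_l == omega_r:
        R = math.inf                                   # straight-line motion
    else:
        R = (L / 2.0) * (omega_l + omega_r) / (omega_r - omega_l)  # Eq. (2.13)
    return D, R, phi


# Assumed sample values, not taken from the book.
D, R, phi = circular_path(10.0, 20.0, r=0.02, L=0.08, t0=0.0, t1=1.0, t2=3.0, t3=4.0)
print(D, R, phi)
```

For any input with unequal wheel velocities, the returned values satisfy ϕ = D/R up to rounding, which is a quick consistency check on the three formulas.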
2.3 A Two-Wheel Command-Based Soccer Robot
Fig. 2.8 shows the hardware architecture of a two-wheel command-based soccer robot. The essential components of the architecture are
1. a microcontroller (with supporting logic devices and associated circuitry),
2. two motors (with gears and wheels) and two motor drivers,
3. a wireless communication (receiver) module and a power system.
For educational purposes, we omit inner-loop control (see Fig. 1.8), which would have required encoders (and encoder counters) for the feedback sensing needed to determine the robot’s current wheel velocities. Fig. 2.9 shows three views of a physical robot built using this hardware architecture. The robot comprises two printed circuit boards (PCBs), a rechargeable battery, two small DC motors (including gears and wheels) and a mechanical housing frame. Mounted on the bottom PCB are the power regulator and the IC chip for driving the motors. Mounted on the top PCB are the microcontroller, the wireless communication module and the associated circuitry. The communication module is a receiver that establishes
Fig. 2.8. Hardware architecture of a soccer robot
a one-way link with a transmitter at the host computer. The rechargeable battery supplies power to the motor-driving and other onboard logic circuitry. The (gear box accompanying the) motor used for this robot design has a gear ratio of 130 : 1 (motor rotations : wheel rotations).
2.3.1 Microcontroller
This section focusses on the microcontroller, which is the main processing unit in the robot hardware. We shall discuss the following aspects of a microcontroller:
• basic functions,
• supporting architecture,
• data and address buses,
• chip selector,
• clock and
• interfacing.
Finally, we briefly describe a microcontroller chip, and elaborate the use of the microcontroller’s on-chip PWM (pulse width modulation) for actuating the motors, and its on-chip serial communication interface for receiving data.
Basic Functions. There are 4 basic functions in a microcontroller, namely reset, instruction fetch, instruction execution and service interrupts.
1. Reset: This function sets the program counter (PC) to the starting address held in the reset vector. You may think of an address as a pointer to a memory location. The PC holds the physical address of the next instruction to be read from memory after the current instruction is executed. The reset function also initializes the other internal registers to default values.
2. Instruction Fetch: The number of machine operations required for a single instruction fetch varies, depending on the type of microcontroller. In general, fetching an instruction is divided into 4 basic operations. First,
Fig. 2.9. Hardware of a soccer robot
the function writes the current PC value into the memory address register (MAR), and hence onto the address bus. Second, it sends out a read control signal on the control bus, upon which the requested data from the addressed memory is output on the data bus. Third, it checks the memory buffer register (MBR) to see if the requested data has been latched into it. Fourth, it reads the data from the MBR. Each machine operation takes up one timing state, with a duration of 2 or 3 clock pulses.
3. Instruction Execution: This function stores the instructions read from the MBR (as done by the instruction fetch function) in the instruction register (IR). It then decodes and executes the opcode contained in the IR. An opcode is a machine code executable by the processor in the microcontroller to carry out a corresponding instruction.
4. Service Interrupts: This function checks for interrupt requests. An interrupt request occurs if a higher priority task arrives. Upon such a request, it suspends the running program by storing the program’s states (current memory address held in PC and other data) in a stack and then setting the PC with the data in the interrupt vector; this data is the starting physical address of the interrupt handling routine. After the interrupt handling routine has been executed, this function restores the suspended program’s states from the stack, which enables the instruction fetch and execution functions to continue on the original program.
The cycle in which the basic functions except service interrupts are carried out in a microcontroller is shown in Fig. 2.10.
Fig. 2.10. Normal operational sequence of a microcontroller (blocks: Reset, with internal registers’ values initialized; Instruction Fetch; Instruction Execution, comprising Decode and Execute)
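The four basic functions listed above can be mimicked by a toy simulation. The Python sketch below is a teaching model only: the class name, the four made-up opcodes (ADD, MARK, RETI, HALT) and the single accumulator register are assumptions of this example and do not correspond to the instruction set of any real microcontroller.

```python
class ToyMicrocontroller:
    """Toy model of reset, instruction fetch, instruction execution and interrupt service."""

    def __init__(self, memory, reset_vector=0, interrupt_vector=None):
        self.memory = memory                      # list of (opcode, operand) pairs
        self.reset_vector = reset_vector          # start address of the main program
        self.interrupt_vector = interrupt_vector  # start address of the handler
        self.reset()

    def reset(self):
        """Reset: load the PC from the reset vector and set registers to defaults."""
        self.pc = self.reset_vector
        self.acc = 0
        self.stack = []
        self.trace = []
        self.pending_interrupt = False
        self.halted = False

    def fetch(self):
        """Instruction fetch: PC -> MAR, addressed memory -> MBR, then advance the PC."""
        mar = self.pc
        mbr = self.memory[mar]
        self.pc += 1
        return mbr

    def execute(self, ir):
        """Instruction execution: decode the opcode held in the IR and carry it out."""
        opcode, operand = ir
        if opcode == "ADD":
            self.acc += operand
        elif opcode == "MARK":
            self.trace.append(operand)            # visible side effect of the handler
        elif opcode == "RETI":
            self.pc, self.acc = self.stack.pop()  # restore the suspended program's state
        elif opcode == "HALT":
            self.halted = True
        else:
            raise ValueError(f"unknown opcode: {opcode}")

    def service_interrupt(self):
        """Service interrupts: save the program state, then jump to the handler."""
        if self.pending_interrupt and self.interrupt_vector is not None:
            self.stack.append((self.pc, self.acc))
            self.pc = self.interrupt_vector
            self.pending_interrupt = False

    def step(self):
        self.service_interrupt()
        self.execute(self.fetch())


# Main program at addresses 0-2, interrupt handling routine at addresses 3-4.
program = [("ADD", 1), ("ADD", 2), ("HALT", 0), ("MARK", "handler ran"), ("RETI", 0)]
mcu = ToyMicrocontroller(program, reset_vector=0, interrupt_vector=3)
mcu.step()                    # executes ADD 1
mcu.pending_interrupt = True  # an interrupt request arrives
while not mcu.halted:
    mcu.step()
print(mcu.acc, mcu.trace)
```

The main program resumes at the instruction it was about to fetch when the request arrived, because the handler's final instruction restores the saved PC and accumulator from the stack.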
The sequence in the microcontroller’s handling of an interrupt request is shown in Fig. 2.11.
Supporting Architecture. Fig. 2.12 shows the basic architecture set up to support the functionality of a microcontroller. The main integrated circuit (IC) chips and buses (interconnections in which the electrical signals flow) are shown. Here, it suffices to know that an IC chip is a microelectronic semiconductor device consisting of many interconnected transistors and other components. As the name implies, the data bus carries data bits, and the address bus carries address bits. Each bit is a digital signal representing
2.3 A Two-Wheel Command-Based Soccer Robot
binary logic 0 or 1. Typically, a microcontroller has an 8-bit data bus (D7-D0) and a 16-bit address bus (A15-A0), with D0 and A0 referring to the least significant bits, and D7 and A15 referring to the most significant bits of the respective buses. The byte A15-A8 is called the upper-byte address and the byte A7-A0 is called the lower-byte address.
[Figure: Requests interrupt service; stores all program states in stack; gets start address of interrupt handling routine from interrupt vector; executes interrupt handling routine; restores all program states from the stack, and continues with program execution]
Fig. 2.11. Handling an interrupt request
[Figure: microcontroller connected to the ROM and RAM through the data bus and the address bus, with Read/Write control signals and a chip selector]
Fig. 2.12. Supporting architecture for a microcontroller
The ROM (read-only memory) is used to store programs (the ‘firmware’) and the RAM (random-access memory), to store data. To read and write from the memories, the microcontroller needs to generate the read and write control signals respectively. These control and address signals ‘direct’ the chip selector accordingly: To read, the chip selector enables either the RAM or ROM where data or instruction is to be read from the addressed memory location; to write, the chip selector enables the RAM where data is to be written into the addressed memory location. The chip selector and associated logic circuitry are designed to enable only one memory chip at a time. In practice, we select the ROM and RAM such that the total memory size is twice that estimated for an application program and data. The ROM, RAM and chip selector are essential devices in a microcontroller architecture. The microcontroller chip may or may not have all these devices built-in. The 89C52 IC chip is an example of a microcontroller that has an internal 8-Kbyte ROM but no chip selector. The 80C196 IC chip is an example of one that has neither a ROM nor a chip selector. In selecting
2. Hardware Components
a microcontroller, due considerations must be given to the extra peripherals and associated logic circuits that may need to be added externally. Address and Data Buses. Typically, a microcontroller with an 8-bit data and 16-bit address is compactly designed with its lower-byte address A7-A0 multiplexed with the 8-bit data D0-D7. In other words, the 8-bit data and the lower-byte address share the same bus, thus labelled as AD0-AD7 in Fig. 2.13. This design is possible because during instruction fetch, the address bits generated are put on the bus A15-A8-AD7-AD0 first before the data bits are output on the lower-byte bus AD7-AD0. The lower-byte address bits are on the bus AD7-AD0 only during the active ALE (Address Latch Enable) signal generated by the microcontroller chip. Thus, this ALE signal can be employed to separate the data and address signals, using the address latch scheme as shown in Fig. 2.13.
[Figure: the microcontroller’s multiplexed lines AD0-AD7 drive the data bus and the inputs of an 8-bit latch enabled by ALE; the latch outputs A0-A7 and the microcontroller lines A8-A15 together form the address bus]
Fig. 2.13. Scheme separating address and data signals
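The address latch scheme of Fig. 2.13 can be mimicked in a few lines. The sketch below is an idealized model, not a description of any particular latch IC: the class and function names are hypothetical, timing is reduced to two phases, and ALE is treated as active-high.

```python
# Idealized model of an 8-bit transparent latch gated by ALE.
# Names and the two-phase timing are assumptions made for illustration.

class AddressLatch:
    def __init__(self):
        self.q = 0                      # latch outputs A7-A0

    def update(self, ad_bus, ale):
        if ale:                         # ALE active: outputs follow AD7-AD0
            self.q = ad_bus & 0xFF
        return self.q                   # ALE inactive: outputs hold


def read_cycle(memory, address):
    """Read one byte over a bus whose lower address byte is multiplexed with data."""
    latch = AddressLatch()
    upper = (address >> 8) & 0xFF       # A15-A8 have their own lines

    # Phase 1: ALE active, the lower-byte address is on AD7-AD0.
    latch.update(address & 0xFF, ale=True)

    # Phase 2: ALE inactive, AD7-AD0 now carries the data byte.
    full_address = (upper << 8) | latch.q
    data = memory[full_address]
    held = latch.update(data, ale=False)   # data on the bus does not disturb the latch
    assert held == (address & 0xFF)
    return full_address, data


memory = {0x7E42: 0xA5}                 # hypothetical memory content
print(read_cycle(memory, 0x7E42))       # prints: (32322, 165)
```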
To elaborate, in this scheme, when the ALE signal goes active, the 8-bit latch is enabled such that its output signals, labelled A7-A0, ‘follow’ the signals on the bus AD7-AD0 that contains the lower-byte address. When the ALE signal turns inactive thereafter, the latch output holds this lower-byte address; during this time, the upper-byte address remains on the bus A15-A8, and the data bits are put on the AD7-AD0 bus, hence separating the data signals from the address signals that ‘point’ at the memory location where the data is being read from or written into.
Chip Selector. As mentioned, a chip selector enables only one device at any instant, and disables all the other devices connected to it. Depending on the microcontroller used, a chip selector may not need to be added externally on the microcontroller board: microcontrollers such as the AM188 and 80296 chips have a chip selector built in.
[Figure: a chip selector with control signal and address inputs, and chip select signal outputs CS1 to CS8]
Fig. 2.14. Input-output of a chip selector
Fig. 2.14 shows the inputs and outputs of a chip selector. As shown, the chip selector receives at its inputs control signals such as Read/Write (RD/WR) and the address signals, usually the upper byte A15-A8, from the microcontroller, and generates chip-select (CS) digital signals at its outputs. Each CS output is connected to exactly one device chip. So, for the architecture in Fig. 2.12, two CS signals are needed, one each for the RAM and ROM. A CS output signal can be made to go active (in order to enable the connected device) according to a logic CS equation expressed in terms of the logic variables for the inputs. An example of a CS equation is shown in Fig. 2.15. The CS output signal is said to be active-low if it is functional (with respect to exclusive chip select) at logic 0, and active-high if it is functional at logic 1.
!CNT1_CS = !A15 & A14 & A13 & A12 & A11 & A10 & A9 & !A8 & !RD;
(A15 to A8 = 0 1 1 1 1 1 1 0, with RD asserted; CNT1_CS is active-low for addresses 7E00H~7EFFH)
Note: Symbol ‘!’ denotes the ‘NOT’ logic operator and ‘&’ the ‘AND’ operator.
Fig. 2.15. An example CS equation
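The CS equation of Fig. 2.15 can be checked by brute force over the whole 16-bit address space. The following is a minimal sketch: the function names are hypothetical, and the RD line is modelled as active-low (level 0 while a read is in progress), as the `!RD` term suggests.

```python
# Brute-force check of the example CS equation
#   !CNT1_CS = !A15 & A14 & A13 & A12 & A11 & A10 & A9 & !A8 & !RD
# Function names are hypothetical; rd is the level of the active-low RD line.

def bit(value, n):
    """Return bit n of value (0 or 1)."""
    return (value >> n) & 1


def cnt1_selected(address, rd):
    """True when CNT1_CS is driven to its active (logic 0) level."""
    a = [bit(address, n) for n in range(16)]
    return bool((not a[15]) and a[14] and a[13] and a[12] and a[11]
                and a[10] and a[9] and (not a[8]) and (not rd))


# Addresses that enable the counter while a read is in progress (RD = 0).
selected = [addr for addr in range(0x10000) if cnt1_selected(addr, rd=0)]
print(f"{selected[0]:04X}H to {selected[-1]:04X}H")   # prints: 7E00H to 7EFFH
```

With RD inactive (level 1) the equation never selects the device, so the counter responds to read accesses only.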
To ensure that only one CS signal is active at any instant, the logical ‘AND’ing of the activation conditions of any two CS equations must always be at logic 0. As a design principle, the CS equation, being expressed partly in terms of the upper-byte address bits A15-A8, should define a unique upper-byte address or a unique set of upper-byte addresses that enables the connected device such as a ROM, RAM, PPI (programmable peripheral interface) or counter, and thus uniquely situates the device in a block of full addresses A15-A0 in the system (memory) address map (0000H-FFFFH). An example of an address map ‘partitioned’ by a set of CS equations is given in Fig. 2.16. The set of CS equations has to be designed and programmed to configure the internal hardware logic of the chip selector. Generally speaking, a decoder may be used as a chip selector but programmable array logic (PAL) and generic array logic (GAL) are preferred because they are programmable and physically more compact. Note however that a PAL can be programmed only once, whereas a GAL is re-programmable. Programming methods for PAL and GAL may differ depending on the compilers used.
Address map:
0000H-7CFFH   ROM
7D00H-7DFFH   PPI
7E00H-7EFFH   Counter1
7F00H-7FFFH   Counter2
8000H-FFFFH   RAM
Equations:
!RAM_CS  = A15;
!PPI_CS  = !A15 & A14 & A13 & A12 & A11 & A10 & !A9 & A8;
!CNT1_CS = !A15 & A14 & A13 & A12 & A11 & A10 & A9 & !A8 & !RD;
!CNT2_CS = !A15 & A14 & A13 & A12 & A11 & A10 & A9 & A8 & !RD;
!ROM_CS  = !A15 & PPI_CS & CNT1_CS & CNT2_CS;
Fig. 2.16. A system address map and corresponding CS equations
Clock. The operations in a microcontroller are synchronized by running digital signals known as the clock. As a result, the processing speed of a microcontroller depends on the frequency of the clock. There are two methods to generate clock signals, as shown in Fig. 2.17.
[Figure: left, a crystal connected across XTAL1 and XTAL2 of the microcontroller, with capacitors C1 and C2 to GND; right, an external oscillator signal fed to XTAL1, with XTAL2 not connected (NC)]
Fig. 2.17. Selecting the clock frequency
The first method (shown on the left-hand side) is a circuit consisting of a crystal that drives the internal oscillation circuit. The frequency of the crystal determines the clock speed. The second method (shown on the right-hand side) uses an external oscillator as the microcontroller’s clock; this method is used when a clock is needed to synchronize the operational timings of two or more IC chips.
Interfacing. To add a peripheral device chip, the following aspects need to be considered:
• number of address and data bits required,
• whether a device’s chip select signal (CS) is active-high or active-low,
• whether a device’s reset signal is generated by software or hardware.
[Figure: a device chip connected to the data bus, with an output, control inputs RD, WR, CS and Reset, and address inputs A0 and A1 from the address bus]
Fig. 2.18. Device interfacing
Consider the example shown in Fig. 2.18. This example shows a device chip interfaced with an 8-bit data, a 2-bit address, two control signals RD and WR, the reset signal and a chip select signal CS. The CS signal is from the chip selector; the reset signal, used to initialize the device, and all the other signals are from the microcontroller.
As a note of precaution, care must be taken when interfacing with analog devices because most analog devices tend to draw large instantaneous currents. Drawing a large instantaneous current from a digital device could damage the device. It is highly advisable to use only the driver (logic) circuits recommended by the manufacturer for interfacing analog devices such as DC motors.
Microcontroller PIC16C73/73A. A microcontroller chip is typically a central-processing unit (CPU) integrated with modules for motion control and communication purposes. The PIC16C73/73A microcontroller is one such chip; the pin configuration is shown in Fig. 2.19.
OSC1/CLKIN   : Oscillator crystal input/external clock source input.
OSC2/CLKOUT  : Oscillator crystal output.
MCLR         : Master clear (reset) input. This pin is active-low to reset the device.
RA0-RA5      : Data PORTA is a bi-directional I/O port. It can also be used for analog input.
RB0-RB7      : Data PORTB is a bi-directional I/O port. PORTB can be software programmed for internal weak pull-up on all inputs.
RC0-RC7      : Data PORTC is a bi-directional I/O port. RC1 can also be configured as CCP2 output pin, RC2 as CCP1 output pin; RC6 as USART asynchronous transmit pin and RC7 as USART asynchronous receive pin.
VSS          : Ground reference for logic and I/O pins.
VDD          : Positive supply for logic and I/O pins.
Fig. 2.19. Pin Configuration of PIC16C73/73A microcontroller
The PIC16C73/73A device has 192 bytes of RAM and 22 I/O pins. In addition, several peripheral features are available, including: three timers/counters, two Capture/Compare/PWM (CCP) modules, two (serial) Universal Synchronous Asynchronous Receiver Transmitter (USART) modules and a 5-channel high-speed 8-bit A/D converter. The USART module is also known as the Serial Communication Interface (SCI). Appendix A elaborates on the use of its CCP and USART modules for motion control and communication, respectively.
2.3.2 DC Motors and Auxiliary Components
Broadly speaking, the hardware of a soccer robot comprises a mechanical and an electrical subsystem. Motors are the main components of the mechanical subsystem. DC (Direct Current) motors are commonly used in soccer robots because of their relatively smaller sizes and lower costs compared to stepper motors. DC motors, however, usually need additional components for good motion control performance. Besides the DC motors, motor drivers and wheels, the DC motor subsystem is usually also equipped with gears and encoders (with encoder counters). Encoder pulse signals, measuring the motor rotation angle, are fed back and used to calculate the motor (and hence the wheel) velocities, and the gear boxes are used to increase the output torque of the DC motors (needed to rotate the wheels). Motor driver chips are needed to increase the current output of the PWM signals (generated by the microcontroller) to drive the DC motors.
To build a compact soccer robot, one can generally reduce the size of the printed circuit board by using SMD (Surface Mount Device) components and through optimal layout of circuits. However, not many options are available when it comes to adding DC motors to the robot hardware. This is because a DC motor usually has to be integrated as a mechanical ‘package’ comprising the motor, a gearbox and an encoder, but individual components that are compatible with one another are available in only a few fixed sizes. Fig. 2.20 shows an exploded view of such a DC motor ‘package’.
[Figure: exploded view showing the encoder, the motor and the gear box]
Fig. 2.20. A DC motor ‘package’ and its exploded view
Working Principle of a DC Motor. A rotor and a stator constitute a DC motor. The stator consists of two permanent bar magnets with opposite polarity facing each other, creating a magnetic field B in between, as depicted in Fig. 2.21. A current I flowing through a commutator brush (not shown) to the rotor of length l will result in opposite forces F exerted on it in the directions indicated in Fig. 2.21, and governed by
\[ F = B \cdot I \cdot l. \tag{2.15} \]
[Figure: a rotor of length l carrying current I between the N and S poles of the stator, with opposite forces F acting on it]
Fig. 2.21. Working principle of a DC motor
The direction of the force F follows Fleming’s left-hand rule: Stretch out the thumb, first and second fingers of your left hand to be mutually perpendicular; then, with the first Finger pointing in the direction of the magnetic Field and the seCond finger in the direction of the Current, the Thumb is pointing in the direction of the Thrust or force. The opposite forces F rotate the rotor that ‘cuts’ the magnetic field continually, inducing, according to Faraday’s law, a back-emf (see footnote 2) in the rotor, which can be shown to be given by
\[ E = k_{bemf} \cdot \omega_M, \tag{2.16} \]
where $E$: back emf, $k_{bemf}$: back emf constant, $\omega_M$: rotational velocity of (motor) rotor.
Hence, by Kirchhoff’s voltage law,
\[ V = E + I \cdot r_a, \tag{2.17} \]
where $V$: voltage applied, $I$: rotor current, $r_a$: armature resistance.
V is the voltage of the amplified PWM signals output by the motor driver; the motor driving circuit will be discussed in Section 2.3.3. Finally, the torque generated, i.e., the ‘angular’ force due to the opposite forces F about the (rotor’s) axis of rotation, and the current flow I are related by
\[ \tau_M + \tau_r = k_T \cdot I, \tag{2.18} \]
where $\tau_M$: motor torque, $\tau_r$: frictional torque, $k_T$: torque constant.
Footnote 2: The electromotive force (abbreviated ‘emf’) is an archaic term for an induced electric potential (i.e., an induced voltage).
The parameters of Eqs. (2.16) to (2.18) are documented in the manufacturers’ catalogues on DC motors.
Torque. Combining Eqs. (2.16) and (2.17) with Eq. (2.18), we get two graphs, one of rotational velocity ωM versus motor torque τM and one of current I versus τM, as shown in Fig. 2.22.
[Figure: two plots against torque τM (mNm): current I (mA) and rotational velocity ωM (rpm)]
Fig. 2.22. Rotational velocity and current versus torque
The graphs show that there is a tradeoff between motor torque and velocity; increasing the drive current I leads to an increase in the motor torque τM , but a decrease in the motor velocity ωM , and conversely. Most soccer robots move on two wheels. In order to drive the wheels, a force FG is needed against a frictional force Fr , as depicted in Fig. 2.23.
[Figure: a wheel of radius r on a wheel shaft of radius ro, with the wheel-driving force FG and the frictional force Fr]
Fig. 2.23. Torque about the wheel axis
The formulae for the wheel-driving force $F_G$ and frictional force $F_r$ are as follows:
\[ F_r = \mu \cdot m \cdot g, \qquad F_G = F_r \cdot \frac{r_o}{r}, \tag{2.19} \]
where $\mu$: frictional constant, $m$: mass of robot, $g$: gravitational constant, $r$: radius of the wheel, $r_o$: radius of the shaft.
The torque $\tau_M$ generated by the motor and the torque $\tau_G$ applied to the wheel (attached to it) are related by
\[ \frac{\tau_M}{\tau_G} = \frac{1}{N \cdot \eta_G}, \tag{2.20} \]
where $N : 1$: gear ratio, $\eta_G$: gear (‘torque-transfer’) efficiency.
Now, it can be shown that
\[ \tau_G = \frac{1}{2} \cdot F_G \cdot r = \frac{1}{2} \cdot \mu \cdot m \cdot g \cdot r_o. \tag{2.21} \]
Thus, substituting Eq. (2.21) into Eq. (2.20), we get
\[ \tau_M = \frac{1}{2} \left( \frac{\mu \cdot m \cdot g}{N \cdot \eta_G} \right) \cdot r_o. \tag{2.22} \]
Eq. (2.22) shows that the motor torque $\tau_M$ is directly proportional to the radius $r_o$ of the wheel shaft.
Power Consumption. It is important that the power supplied by the batteries can last through one half of the game since it is convenient to change batteries only during the half-time interval. The battery power that is consumed by a DC motor is the product of the voltage V across the motor and the current I flowing through it. Rewriting Eq. (2.18), the current I is given by
\[ I = \frac{\tau_M + \tau_r}{k_T}. \tag{2.23} \]
Define $I_o$ by
\[ I_o = \frac{\tau_r}{k_T}. \tag{2.24} \]
Then Eq. (2.23) becomes
\[ I = \frac{\tau_M}{k_T} + I_o. \tag{2.25} \]
$I_o$ is known as the no-load current as $I = I_o$ when there is no load, i.e., $\tau_M = 0$. Some DC motor catalogues do not specify the $\tau_r$ values; in such cases, the no-load current $I_o$ can be easily obtained (by measurement) for Eq. (2.25).
Linear and Rotational Velocity. The gear ratio $N : 1$ relates the velocities of the motor and the wheel as follows:
\[ \frac{\nu_G}{\nu_M} = \frac{\omega_G}{\omega_M} = \frac{1}{N}, \tag{2.26} \]
where $\nu_M$: motor linear velocity, $\omega_M$: motor rotational velocity, $\nu_G$: wheel linear velocity, $\omega_G$: wheel rotational velocity.
Combining Eqs. (2.16) and (2.17) and rearranging, the rotational velocity $\omega_M$ in rotations per minute (rpm) is given by
\[ \omega_M = \frac{1}{k_{bemf}} \cdot (V - I \cdot r_a) \ \text{rpm}. \tag{2.27} \]
Since $\omega_M$ (and hence $\omega_G$) is in rpm and not rad/s, the wheel (linear) velocity $\nu_G$ in cm/s is given by
\[ \nu_G = r \cdot \frac{2 \cdot \pi \cdot \omega_G}{60} = \left( \frac{2 \cdot \pi \cdot r}{60} \right) \cdot \omega_G \ \text{cm/s}, \tag{2.28} \]
where $r$ in cm is the radius of the wheel. Substituting Eq. (2.26) into Eq. (2.28), we have
\[ \nu_G = \left( \frac{2 \cdot \pi \cdot r}{60} \right) \cdot \omega_G = \left( \frac{2 \cdot \pi \cdot r}{60 N} \right) \cdot \omega_M \ \text{cm/s}. \tag{2.29} \]
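The relations in Eqs. (2.20), (2.21), (2.25), (2.27) and (2.29) chain together naturally when sizing a motor, and can be collected into small helper functions. The sketch below is illustrative only: the function names are hypothetical, SI units are assumed for torque (Nm), current (A) and resistance (ohm), and the back-emf constant is assumed to be given in V/rpm so that Eq. (2.27) yields rpm directly.

```python
import math

G = 9.8  # gravitational constant in m/s^2


def wheel_torque(mu, mass_kg, shaft_radius_m):
    """Eq. (2.21): tau_G = 1/2 * mu * m * g * r_o, in Nm."""
    return 0.5 * mu * mass_kg * G * shaft_radius_m


def motor_torque(tau_g, gear_ratio, gear_efficiency):
    """Eq. (2.20): tau_M = tau_G / (N * eta_G), in Nm."""
    return tau_g / (gear_ratio * gear_efficiency)


def motor_current(tau_m, k_t, i_noload):
    """Eq. (2.25): I = tau_M / k_T + I_o, with k_T in Nm/A and currents in A."""
    return tau_m / k_t + i_noload


def motor_speed_rpm(voltage, current, r_a, k_bemf):
    """Eq. (2.27): omega_M = (V - I * r_a) / k_bemf, with k_bemf in V/rpm."""
    return (voltage - current * r_a) / k_bemf


def wheel_speed_cm_s(omega_m_rpm, gear_ratio, wheel_radius_cm):
    """Eq. (2.29): nu_G = 2 * pi * r / (60 * N) * omega_M, in cm/s."""
    return 2 * math.pi * wheel_radius_cm / (60 * gear_ratio) * omega_m_rpm
```

Keeping each equation in its own function makes it easy to substitute the values of a candidate motor and gearbox from a datasheet and compare the results against the required torque, current and velocity.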
Size. As explained, DC motors and their compatible gearboxes and encoders are individually available, but in only a few fixed sizes. As a result, placing the motors, gearboxes and encoders within a confined cubic space of 7.5cm × 7.5cm × 7.5cm for a MiroSoT robot becomes a challenging mechanical design problem. No systematic approach exists, and this design is, at best, done with a lot of engineering ingenuity. Fig. 2.24 shows two sample designs that meet the size specifications for MiroSoT. The design on the left-hand side uses tendons (timing belts or chains) wrapped around a pair of wheels on each side; the design on the right-hand side uses gears to connect the gearbox (or gear head) to the encoder on each side.
Fig. 2.24. Two mechanical designs of motor-wheel assembly
How to Select a Motor Package: An Example. Consider a two-wheel MiroSoT robot with the following features.
1. 12V - 550mA power supplied by NiMH batteries.
2. Electronic circuitry draws a maximum current of 300mA. This means that each DC motor to be selected can draw a maximum current of (250/2) mA or 125mA.
3. Wheel radius r = 2cm, wheel shaft radius ro = 0.25cm.
4. Mass of robot m = 0.4kg.
In this example, we want to select two (identical) motor packages for the robot, such as the one shown in Fig. 2.20. We assume that we have already selected two identical encoders, each with a length of 10mm. Suppose the maximum (linear) wheel velocity $\nu_G|_{max}$ desired is at least 1m/s. Then, by Eq. (2.28),
\[ \omega_G|_{max} \geq \frac{60 \cdot 100\,\text{cm/s}}{2 \cdot \pi \cdot 2\,\text{cm}} = 477.5\,\text{rpm} \approx 480\,\text{rpm}. \]
Assume that the floor’s frictional constant $\mu$ is 0.43. Then the output torque $\tau_G|_{min}$ that a motor (with gear) must at least produce is
\[ \tau_G|_{min} = \frac{1}{2} \cdot \mu \cdot m \cdot g \cdot r_o = \frac{1}{2} \cdot 0.43 \cdot 0.4\,\text{kg} \cdot 9.8\,\text{m/s}^2 \cdot 0.0025\,\text{m} = 0.0021\,\text{Nm} = 2.1\,\text{mNm}. \]
Finally, assume that the mechanical design of the robot can accommodate two identical motor packages, each having a total length not exceeding 60mm; in other words, the length Lm of each package minus its encoder must not exceed 50mm. By the values stated or determined above, each identical motor (plus gear) to be selected needs to meet the following requirements.
Criterion                         Requirement
Power consumption P               ≤ 12V, ≤ 125mA
Rotational velocity ωG|max        ≥ 480rpm
Torque τG                         ≥ 2.1mNm
Length Lm                         ≤ 50mm

Let’s consider Escap’s 16C18-205 DC motor and B16C27 gearbox that have the characteristics as shown in Table 2.1.
Table 2.1. Some motor and gear characteristics (source: datasheets)
(a) 16C18-205 DC Motor
Characteristic                    Unit            Value
Measuring Voltage                 V               12
No-load Speed                     rpm             17300
Stall Torque                      mNm             1.2
Average No-load Current Io        mA              9
Typical Starting Voltage          V               0.15
Max. Continuous Current           A               0.16
Max. Continuous Torque            mNm             0.96
Max. Angular Acceleration         10^3 rad/s^2    59
Back-emf Constant kbemf           V/1000rpm       0.66
Torque Constant kT                mNm/A           6.3
Terminal Resistance ra            ohm             65
(b) B16C27 Gearbox
Characteristic                    Value
Ratio N                           27
Efficiency ηG                     0.73
Length (with 16C18) Lm            33.7mm
Mass m                            6g
Using a B16C27 gearbox, the minimum motor torque $\tau_M|_{min}$ required is
\[ \tau_M|_{min} = \frac{\tau_G|_{min}}{N \cdot \eta_G} = \frac{2.1\,\text{mNm}}{27 \cdot 0.73} = 0.1065\,\text{mNm}. \]
The 16C18-205 DC motor can generate a maximum continuous torque of 0.96mNm, which is greater than the required torque of 0.1065mNm. The required torque is also much less than the stall torque of 1.2mNm, defined as the minimum torque at or beyond which the motor will stall. Thus, in this aspect of torque generation, the 16C18-205 DC motor is a good choice. Using a B16C27 gearbox, the minimum current $I|_{min}$ required to produce $\tau_M|_{min}$ is
\[ I|_{min} = \frac{\tau_M|_{min}}{k_T} + I_o = \frac{0.1065\,\text{mNm}}{6.3\,\text{mNm/A}} + 9\,\text{mA} = 25.9\,\text{mA}. \]
This current of 25.9mA is much less than the maximum of 125mA that can be supplied, and much less than the maximum continuous current of 160mA allowed. Thus, in this aspect of current drive, the 16C18-205 DC motor is a good choice. With this 16C18-205 DC motor, the maximum motor rotational velocity $\omega_M|_{max}$ is
\[ \omega_M|_{max} = \frac{1}{k_{bemf}} \cdot (V|_{max} - I|_{min} \cdot r_a) = \frac{1}{0.66\,\text{V/1000rpm}} \cdot (12\,\text{V} - 25.9\,\text{mA} \cdot 65\,\text{ohms}) = 15631\,\text{rpm}. \]
The corresponding maximum wheel rotational velocity $\omega_G|_{max}$ using a B16C27 gearbox is
\[ \omega_G|_{max} = \frac{\omega_M|_{max}}{N} = \frac{15631\,\text{rpm}}{27} = 579\,\text{rpm}. \]
This means that the wheels of the robot can rotate at up to 579rpm, which is beyond the required maximum of 480rpm. By the selection criteria, the combination of a 16C18-205 DC motor and a B16C27 gearbox is an acceptable choice for the MiroSoT soccer robot.
2.3.3 Motor Driving and Circuits
The PWM (Pulse Width Modulation) method is one of the many methods used to drive DC motors. Referring to Fig. A.1, the amplified PWM signal V output by the motor driver can rotate (the rotor of) the DC motor at a velocity ωM that is proportional to the applied voltage V, according to Eq. (2.27). Why is a motor driver needed? To serve two purposes, as explained below.
1. For Current Amplification
Eq. (2.18) shows that the motor torque τM is proportional to the current drive I. Referring to Fig. A.1, the PWM signal produced by the microcontroller has a low current output. Without amplification, this current is too weak to generate a sufficient torque τM to rotate the motor. The motor driver amplifies this signal to produce one with a stronger current drive for this purpose.
2. For Enabling Clockwise and Counter-clockwise Rotation
[Figure: an H-bridge of four switches SW1 to SW4 between Vcc and GND, with the motor M (terminals + and -) connected across the bridge; the switches are controlled by the PWM logic signals and a NOT gate]
Fig. 2.25. H-Bridge circuit for motor driving
Consider the basic circuit of a motor driver, connected to a DC motor as shown in Fig. 2.25. The basic circuit is an H-bridge circuit, with 4 transistors acting as switches. Motor driver IC chips such as the L293 and L298 from SGS-Thomson contain two such H-bridges per chip. Some typical amplified PWM signal waveforms V output at the terminals of the DC motor by this circuit are shown in Fig. 2.26. To produce the output waveforms shown, the H-bridge circuit ‘switches’ the direction of the instantaneous current flow I in the motor according to the logic level of the PWM signals input from the microcontroller. The following explains how it works: Referring to Fig. 2.27(a), when the periodic PWM signal goes to logic 1, SW1 and SW3 are turned on, and SW2 and SW4 are turned off; the result is a current flow I in the direction indicated. The opposite occurs when the PWM signal goes to logic 0, as illustrated in Fig. 2.27(b). Hence the waveforms shown in Fig. 2.26.
[Figure: three amplified PWM waveforms V versus time, each switching between Vp and -Vp, with their average voltages marked: STOP (zero average voltage), CLOCKWISE (positive average voltage) and COUNTER-CLOCKWISE (negative average voltage)]
Fig. 2.26. Amplified PWM signals with different duty cycles
[Figure: (a) PWM signal at logic 1: SW1 = ON, SW2 = OFF, SW3 = ON, SW4 = OFF, current flow I through the motor M; (b) PWM signal at logic 0: SW1 = OFF, SW2 = ON, SW3 = OFF, SW4 = ON, current flow -I through the motor M]
Fig. 2.27. PWM-based operations of H-bridge circuit
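The switching behaviour of Fig. 2.27 and the resulting average voltages of Fig. 2.26 can be reproduced with a short idealized model. In this sketch the function names are hypothetical, the switches are treated as lossless, and the PWM signal is represented as a list of equally spaced logic samples over one period.

```python
# Idealized H-bridge: logic 1 closes SW1 and SW3, logic 0 closes SW2 and SW4.
# Function names and the sampled-signal representation are assumptions.

def h_bridge(pwm_level, v_p):
    """Return the switch states and the motor terminal voltage for one PWM level."""
    if pwm_level == 1:
        switches = {"SW1": True, "SW2": False, "SW3": True, "SW4": False}
        voltage = +v_p          # current flows in the direction I
    else:
        switches = {"SW1": False, "SW2": True, "SW3": False, "SW4": True}
        voltage = -v_p          # current flows in the direction -I
    return switches, voltage


def average_voltage(pwm_samples, v_p):
    """Average motor voltage over one PWM period given as logic samples."""
    return sum(h_bridge(level, v_p)[1] for level in pwm_samples) / len(pwm_samples)


print(average_voltage([1, 1, 0, 0], 12.0))   # 50% duty cycle, prints: 0.0
print(average_voltage([1, 1, 1, 0], 12.0))   # 75% duty cycle, prints: 6.0
print(average_voltage([1, 0, 0, 0], 12.0))   # 25% duty cycle, prints: -6.0
```

A 50% duty cycle therefore corresponds to the STOP waveform, while duty cycles above and below 50% give average voltages of opposite signs.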
2.3.4 Velocity and Duty Cycle
Referring to Fig. 2.26 again, the average voltage $\bar{V}$ of the amplified PWM signal V is
\[ \bar{V} = \frac{1}{T_{PWM}} \int_{n T_{PWM}}^{(n+1) T_{PWM}} V \, dt = \frac{1}{T_{PWM}} \cdot (T_{ON} - T_{OFF}) \cdot V_p = (2d - 1) \cdot V_p, \tag{2.30} \]
for an arbitrary $n \geq 0$, where $d = T_{ON} / T_{PWM}$, $0 \leq d \leq 1$, is the duty cycle of the PWM signal. Note that ideally, $V_p = V_{cc}$. This average voltage $\bar{V}$ applied across the motor, with an average current $\bar{I}$ flowing through it, determines the resultant DC motor rotational velocity $\bar{\omega}_M$ according to the following formula.
\[ \bar{\omega}_M = \frac{1}{k_{bemf}} \cdot (\bar{V} - \bar{I} \cdot r_a) \ \text{rpm}. \tag{2.31} \]
This formula follows directly from Eq. (2.27). Since $|\bar{V}| \gg |\bar{I} \cdot r_a|$, $\bar{V}$ (dominantly) determines, by its magnitude, how fast the DC motor rotates, and by its sign (+ or -), its direction of rotation (clockwise or counter-clockwise, respectively). Substituting Eq. (2.30) into Eq. (2.31), we obtain
\[ \bar{\omega}_M = \frac{1}{k_{bemf}} \cdot \left( (2d - 1) \cdot V_p - \bar{I} \cdot r_a \right) \ \text{rpm} \approx \frac{1}{k_{bemf}} \cdot (2d - 1) \cdot V_p \ \text{rpm}. \tag{2.32} \]
Hence the resultant DC motor velocity (and rotational direction, as indicated by its sign) can be altered by changing the duty cycle d of the input PWM signal, as illustrated in Fig. 2.26. By combining Eqs. (2.26) and (2.32) and rewriting, we have
\[ \bar{\omega}_G \approx \omega_G|_{max} \cdot (2d - 1), \quad \text{where } \omega_G|_{max} = \frac{V_p}{N \cdot k_{bemf}}. \tag{2.33} \]
Hence the resultant wheel velocity (and rotational direction, as indicated by its sign) can likewise be altered by changing the duty cycle d of the input PWM signal.
Practical Implementation of Motor Actuation: An Example using PIC16C73/73A Microcontroller. Consider an example of motor actuation using the PIC16C73/73A microcontroller introduced in Section A.1. Then, by setting the integer values (PR2) and W as binary numbers in registers PR2 and CCPRxL:CCPxCON<5:4> of Eqs. (A.1) and (A.2) respectively, we get
\[ d = \frac{T_{ON}}{T_{PWM}} = \frac{W}{4A}, \quad \text{where } A = (\text{PR2}) + 1. \tag{2.34} \]
With $0 \leq d \leq 1$, and by substituting Eq. (2.34) into Eq. (2.33), we obtain
\[ \bar{\omega}_G = \omega_G|_{max} \cdot \left( \frac{W}{2A} - 1 \right) \ \text{rpm}, \quad \text{where } 0 \leq W \leq 4A. \tag{2.35} \]
By Eq. (2.35), the ‘delimiting values’ of $\bar{\omega}_G$ (in rpm) follow.
\[ \bar{\omega}_G = \begin{cases} -\omega_G|_{max} & \text{if } W = 0, \\ 0 & \text{if } W = 2A, \\ \omega_G|_{max} & \text{if } W = 4A. \end{cases} \]
The value of A is fixed a priori in the robot’s microcontroller program. Thus, wheel actuation at a dynamically changing desired velocity is reduced to computing and recomputing the velocity data W to generate the required pulse width of the PWM signals that drive the corresponding motor.
Strictly speaking, for a given integer value of A, Eq. (2.35) holds only for those values of $\bar{\omega}_G$ for which the values of the integer variable W exist. So, a more general equation that holds for all values of $\bar{\omega}_G$ should have the error $\Delta W$ included, as follows:
\[ \bar{\omega}_G = \omega_G|_{max} \cdot \left( \frac{W + \Delta W}{2A} - 1 \right) \ \text{rpm}, \quad \text{where } 0 \leq W \leq 4A. \tag{2.36} \]
Rewriting Eq. (2.36), we get
\[ \underbrace{\bar{\omega}_G}_{\text{specified}} = \underbrace{\omega_G|_{max} \cdot \left( \frac{W}{2A} - 1 \right)}_{\text{actual}} + \underbrace{\omega_G|_{max} \cdot \frac{\Delta W}{2A}}_{\text{error}} \ \text{rpm}. \tag{2.37} \]
For a specified velocity $\bar{\omega}_G$, the integer value of W should be determined with minimum error $|\Delta W| \leq 0.5$ (ideally $|\Delta W| = 0$). Therefore, for a more precise arbitrary velocity setting, the value of A should be fixed as large as possible (but up to $2^8$ as allowed by the 8-bit register PR2 plus 1). This will provide a more finely divided denominator range over which the integer W determined using Eq. (2.37) has a higher probability of corresponding more closely (if not exactly) to the specified velocity $\bar{\omega}_G$.
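The effect of the choice of A on the velocity error can be seen numerically. The sketch below inverts Eq. (2.36) for a specified wheel velocity, rounds to the nearest integer W, and evaluates the actual velocity from Eq. (2.35). The function names are hypothetical, `delta_w` stands for the error term of Eq. (2.36), and the 579 rpm maximum is simply an illustrative value.

```python
# Quantization of the PWM data W for a specified wheel velocity.
# Function names are hypothetical; delta_w is the error term of Eq. (2.36).

def pwm_data(omega_g, omega_g_max, a):
    """Return (W, delta_w) such that W + delta_w satisfies Eq. (2.36)."""
    w_exact = 2 * a * (omega_g / omega_g_max + 1)    # equals W + delta_w
    w = min(max(round(w_exact), 0), 4 * a)           # nearest integer in [0, 4A]
    return w, w_exact - w


def actual_velocity(w, omega_g_max, a):
    """Eq. (2.35): wheel velocity actually commanded by the integer W."""
    return omega_g_max * (w / (2 * a) - 1)


# Compare a coarse and a fine setting of A for the same specified velocity.
for a in (16, 256):
    w, delta_w = pwm_data(100.0, 579.0, a)
    error = 100.0 - actual_velocity(w, 579.0, a)
    print(a, w, round(delta_w, 3), round(error, 3))
```

With the larger A, the same rounding error of at most 0.5 in W translates into a much smaller velocity error, since the error term in Eq. (2.37) is divided by 2A.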
Dead Zone and Saturation. There is a dead zone range of $\pm z_d$, where $z_d \geq 0$, at around $W = 2A$, within which the rotational velocity $\bar{\omega}_G$ is 0. Motor saturation occurs at $|\bar{\omega}_G| = \omega_{sat} \leq \omega_G|_{max}$. Both phenomena are due to the inherent motor characteristics. Incorporating these into Eq. (2.36), we have a piece-wise function for $\bar{\omega}_G$ (in rpm) as follows:
\[ \bar{\omega}_G = \begin{cases} \min\left( \omega_G|_{max} \cdot \left( \frac{W^+ - z_d}{2A} - 1 \right), \ \omega_{sat} \right) & \text{if } W^+ \geq (2A + z_d), \\ 0 & \text{if } |W^+ - 2A| \leq z_d, \\ \max\left( \omega_G|_{max} \cdot \left( \frac{W^+ + z_d}{2A} - 1 \right), \ -\omega_{sat} \right) & \text{if } W^+ \leq (2A - z_d); \end{cases} \tag{2.38} \]
where $W^+ = W + \Delta W$ and $-z_d \leq W^+ \leq 4A + z_d$. The actual values of $z_d$ and $\omega_{sat}$ depend on the motor used. Equivalently, we have a piece-wise function for $W^+$ as follows:
\[ W^+ = \begin{cases} \min\left( 2A \cdot \left( \frac{\bar{\omega}_G}{\omega_G|_{max}} + 1 \right) + z_d, \ 2A + W^+_{sat} \right) & \text{if } \bar{\omega}_G > 0, \\ 2A & \text{if } \bar{\omega}_G = 0, \\ \max\left( 2A \cdot \left( \frac{\bar{\omega}_G}{\omega_G|_{max}} + 1 \right) - z_d, \ 2A - W^+_{sat} \right) & \text{if } \bar{\omega}_G < 0; \end{cases} \tag{2.39} \]
where $W^+_{sat} = 2A \cdot \frac{\omega_{sat}}{\omega_G|_{max}} + z_d$.
The graph for Eq. (2.38) or Eq. (2.39) is shown in Fig. 2.28. The host computer program for a command-based robot soccer system needs to compute $W^+$ using Eq. (2.39) and send W as wheel velocity data to a respective team robot for its motor actuation.
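A host program could implement the dead-zone and saturation compensation along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the numeric values of the dead zone, saturation velocity and maximum velocity are made up for illustration, and rounding W+ to the nearest integer W before transmission is an assumption about how the error term would be handled.

```python
# Dead-zone and saturation model, Eqs. (2.38) and (2.39).
# Function names and all numeric values below are illustrative assumptions.

def wheel_velocity(w_plus, omega_g_max, omega_sat, a, z_d):
    """Eq. (2.38): wheel velocity (rpm) obtained for PWM data W+."""
    if w_plus >= 2 * a + z_d:
        return min(omega_g_max * ((w_plus - z_d) / (2 * a) - 1), omega_sat)
    if w_plus <= 2 * a - z_d:
        return max(omega_g_max * ((w_plus + z_d) / (2 * a) - 1), -omega_sat)
    return 0.0                                    # inside the dead zone


def pwm_data_plus(omega_g, omega_g_max, omega_sat, a, z_d):
    """Eq. (2.39): PWM data W+ required for a specified wheel velocity."""
    w_sat = 2 * a * omega_sat / omega_g_max + z_d
    if omega_g > 0:
        return min(2 * a * (omega_g / omega_g_max + 1) + z_d, 2 * a + w_sat)
    if omega_g < 0:
        return max(2 * a * (omega_g / omega_g_max + 1) - z_d, 2 * a - w_sat)
    return 2 * a


A, Z_D, OMEGA_MAX, OMEGA_SAT = 256, 10, 579.0, 500.0
w_plus = pwm_data_plus(50.0, OMEGA_MAX, OMEGA_SAT, A, Z_D)
w = round(w_plus)                                 # integer data sent to the robot
print(w, wheel_velocity(w, OMEGA_MAX, OMEGA_SAT, A, Z_D))
```

Because Eq. (2.39) is the inverse of Eq. (2.38) outside the dead zone and below saturation, feeding the unrounded W+ back into `wheel_velocity` returns the specified velocity, and requests beyond the saturation velocity are clamped to it.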
[Figure: plot of the wheel rotational velocity ω̄G (vertical axis, from -ωG|max to ωG|max) against the PWM data W+ (horizontal axis), showing the dead zone around W+ = 2A and saturation. Note: Only the range [2A − W+sat, 2A + W+sat] of W+ is valid.]
Fig. 2.28. Graph showing the relationship between the wheel rotational velocity ω̄G and the PWM data W+ required to attain it
2.3.5 Communication
In this section, we discuss two communication means and the associated methods. One method uses radio frequency (RF) while the other uses infrared (IR) light. These methods are applicable to our purpose of sending actuation command data such as the desired velocity data from the host computer to the individual team robots.
IR is an alternative medium frequently used in the remote switching of TV channels and in many other consumer electronic products. IR supports communication with directivity; hence communication can be easily localized to a targeted area. IR transmitter/receiver circuits are simpler and available on an IC chip that is usually smaller than an RF communication module. But IR only supports short distance communication and is sensitive to fluctuations in environmental lighting.
RF is a medium of communication that supports long distance and multichannel communication, with no directivity (i.e., there is no need to position and direct (or ‘point’) the transmitters at the receivers). But building an RF communication module is generally difficult because RF transmitter/receiver circuits are quite complex and require a certain degree of engineering knowhow. In building the communication module for a robot soccer system, a commercially available RF module is preferred. In employing RF communication, the selected carrier frequency should preferably not fall within or near the (carrier) frequency bands allocated for commercial use (e.g. cellular phone and pager). This is to avoid possible interference or ‘jamming’ of signal transmissions.
Associated with each communication method is a set of communication protocols; a protocol is a standard set of rules that determines how computer-based and robotic agents communicate with one another across the medium. When these agents communicate with one another, they exchange a series of messages. To understand and act on these messages, the agents must agree on what a message means. A protocol has its rules described in terms of the format that a message must take and the systematic procedure in which agents must exchange messages within the context of a particular activity, such as sending and receiving messages across the medium. The message exchange procedure attempts to ensure that the electronic messages are correctly formatted and transmitted from the originating agent to the destination agent. Agents of different types are able to communicate with one another on a certain activity, in spite of their differences, when they agree to use an appropriate communication protocol that offers a standard format and message exchange procedure.
IR: Communication Circuit and Protocol. To use IR as the means of communication, two methods are available. One is the ASK (Amplitude Shift Key) method and the other is the base band method. These methods follow the standards set by the Infrared Data Association (IrDA, http://www.irda.org/). IrDA is an international non-profit organization that creates and promotes interoperable, low cost infrared data interconnection standards. In the ASK method, data is transmitted on a carrier signal and in the base band method (IrDA1.0), it is transmitted by switching the transmitter on-and-off. Fig. 2.29 illustrates how the serial data is transmitted using these methods.
[Figure: a serial data bit stream and the corresponding signals transmitted by the ASK and IrDA1.0 methods]
Fig. 2.29. IR communication using ASK and IrDA1.0 methods
W
2. Hardware Components
Referring to Fig. 2.29, when the data bit is logic 0, the ASK method generates and transmits the corresponding digital signals at a frequency of 500kHz; when the data bit is logic 1, it does not generate any signal. In the base band (IrDA 1.0) method, when the data bit is logic 0, the method generates a corresponding pulse for the first $\frac{3}{16}$ of the one bit time $T_b$; when the data bit is logic 1, it does not generate any pulse. In the rest of this section, we focus on the base band (IrDA 1.0) method.
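The base band rule just described can be illustrated with a small simulation. The following is only a sketch: the function names and the choice of representing one bit time by a fixed number of samples are assumptions made for illustration, not part of the IrDA standard or of the robot hardware.

```python
def irda_encode(bits, samples_per_bit=16):
    """Return 0/1 LED drive samples for a sequence of data bits.

    One bit time T_b is represented by `samples_per_bit` samples
    (an illustrative assumption). A logic 0 produces a pulse during
    the first 3/16 of the bit time; a logic 1 produces no pulse.
    """
    if samples_per_bit % 16 != 0:
        raise ValueError("samples_per_bit must be a multiple of 16")
    pulse_len = 3 * samples_per_bit // 16
    out = []
    for bit in bits:
        if bit == 0:
            out.extend([1] * pulse_len + [0] * (samples_per_bit - pulse_len))
        else:
            out.extend([0] * samples_per_bit)
    return out


def irda_decode(samples, samples_per_bit=16):
    """Recover the data bits: a pulse anywhere in a bit window means logic 0."""
    bits = []
    for start in range(0, len(samples), samples_per_bit):
        window = samples[start:start + samples_per_bit]
        bits.append(0 if any(window) else 1)
    return bits


data = [0, 1, 1, 0, 0]
assert irda_decode(irda_encode(data)) == data
```

A real decoder would detect pulse edges rather than scan aligned windows; the aligned-window decoder is used here only to keep the round trip easy to follow.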
Fig. 2.30. Module block diagrams implementing the base band method for IR communication: (a) transmitter module, in which serial data output by the host computer through an external data dispatcher passes through an encoder (3/16th pulse width modulator) and a transmitter circuit (LED driver) to the IR transmitter (LED); (b) receiver module, in which the signal from the IR receiver (PIN) passes through a receiver circuit (amplifier and quantizer) and a decoder (edge detector and pulse width demodulator) to give the serial data input to the USART of the robot’s microcontroller
Fig. 2.30 shows the block diagrams of the transmitter and receiver modules that implement the base band method. These module block diagrams, which are self-explanatory, can be realized using an HSDL7001 encoder/decoder IC chip and an HSDL1000 LED driver/receiver IC chip in a generic module for data transmission (at the host computer) or reception (at the team robot), as shown in Fig. 2.31. We turn our attention to the experimental robot soccer system. This system uses the base band method for IR communication. As explained on page 280 in Section A.2, without carrier signals at different frequencies to identify the teams, the two teams need to share the same IR communication module, set up as a transceiver (a data ‘dispatcher’) connected to several IR transmitter modules, as shown in Fig. 2.32.
[Fig. 2.31: circuit schematic of the generic IR communication module built around the HSDL7001 and HSDL1000 IC chips]
Fig. 2.32. A game set-up for teams using IR base band communication
Because of the directivity of IR radiation and the small maximal view angle $\theta_t$ or $\theta_r$ (about 14°) of the IR transmitters and receivers, several IR transmitters are needed to cover the playground area. The MiroSoT game set-up allows the IR transmitter modules depicted in Fig. 2.32 to be placed 2m above the 150cm × 130cm playground, and hence four IR transmitter modules are required to sufficiently cover the whole playground area; the plan view of their positions is shown in Fig. 2.33. This coverage ensures that a robot can receive data at any position on the playground. Fig. 2.34 shows the circuit block diagram of the transceiver. The transceiver receives the data packets from (the host computer of) each team and transmits them out serially (i.e., one packet at a time) but simultaneously through the four IR transmitter modules. Each data packet is a communication message with the format as defined in Fig. 2.35.
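The need for four transmitters can be checked with a short calculation. This is a rough sketch under assumptions that the text does not state: the 14° view angle is taken as the half-angle of the transmitter's cone measured from the vertical, each transmitter points straight down from a height of 2m over the centre of one quadrant of the playground, and the receiver is treated as being at floor level with its own view angle ignored.

```python
import math


def coverage_radius(height_m, half_angle_deg):
    """Radius of the circle lit on the floor by a downward-pointing cone."""
    return height_m * math.tan(math.radians(half_angle_deg))


def farthest_corner(length_m, width_m):
    """Distance from the centre of a rectangle to any of its corners."""
    return math.hypot(length_m / 2, width_m / 2)


radius = coverage_radius(2.0, 14.0)            # roughly 0.50 m
whole_field = farthest_corner(1.5, 1.3)        # one transmitter over the centre
one_quadrant = farthest_corner(0.75, 0.65)     # one transmitter per quadrant

# A single transmitter cannot reach the corners of the whole playground,
# but it does reach the corners of a quadrant (with very little margin).
print(f"radius={radius:.3f} m, field={whole_field:.3f} m, quadrant={one_quadrant:.3f} m")
```

Under these assumptions the margin for a quadrant is only a few millimetres, so in practice the placement shown in Fig. 2.33 leaves little room for mounting error.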
Fig. 2.33. IR transmission coverage
The simple procedure of the broadcast protocol used for data send and receive is as follows:
• The team host computer transmits the messages continually to the team robots through the transceiver.
• The IR receiver module onboard each of the 3 team robots receives and decodes the messages as they arrive; the microcontroller stores these messages in its memory automatically via its configured hardware logic. The microcontroller program examines every message stored for the following conditions.
  1. Start of message: the 0th to 2nd bytes each contain the value 0xFF;
  2. Team ownership: the 3rd byte contains its team ID;
  3. Data delimiter: the 6th, 9th and 12th bytes each contain the value 0xAA.
• If all the conditions are true, it proceeds to extract the left-wheel and right-wheel velocity data from the $(3i + 4)$th and $(3i + 5)$th bytes respectively, where $i \in [0, 2]$ is the unique robot ID.
Data Communication from Host Computer to Team Robots. Now, consider a command-based robot soccer system with the host computer program continually deciding and sending the velocity data $W$ (for each wheel of each PIC16C73/73A microcontroller-based robot) through the send and receive communication protocol introduced above. Recall that $W$ determines the duty cycle as in Eq. (2.34). In the experimental robot soccer system, the duty cycle of the PWM signal has a resolution of at most 10 bits, i.e., $0 \le (\mathrm{CCPRxL}:\mathrm{CCPxCON}\langle 5{:}4\rangle)_2 \le 2^{10} - 1$.
Fig. 2.34. Circuit block diagram of a transceiver
0xFF | 0xFF | 0xFF | TID | $H_L^0$ | $H_R^0$ | 0xAA | $H_L^1$ | $H_R^1$ | 0xAA | $H_L^2$ | $H_R^2$ | 0xAA
Format Definition
• The first 3 bytes of 0xFF (‘0x’ denotes hexadecimal) indicate start of message.
• Byte TID contains a team ID 0x0F or 0xF0 that denotes Team A or Team B, respectively.
• Byte $H_L^i$ and byte $H_R^i$, for $0 \le i \le 2$, contain, respectively, the unsigned velocity data for the left and right wheels of the team robot with robot ID $i$.
• Byte 0xAA indicates the end of the velocity data pair for each team robot.
Fig. 2.35. Communication message format
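The message format of Fig. 2.35 and the acceptance conditions of the broadcast protocol can be sketched as follows. The function and constant names are hypothetical, and the code runs on a PC rather than on the robot's microcontroller; it is meant only to make the byte positions concrete.

```python
START = 0xFF      # start-of-message byte
DELIM = 0xAA      # end of each robot's velocity pair
TEAM_A = 0x0F
TEAM_B = 0xF0


def build_message(team_id, wheel_bytes):
    """Assemble the 13-byte message.

    wheel_bytes is a list of three (HL, HR) pairs indexed by robot ID.
    """
    msg = [START, START, START, team_id]
    for hl, hr in wheel_bytes:
        msg += [hl, hr, DELIM]
    return bytes(msg)


def extract_velocity(msg, team_id, robot_id):
    """Return (HL, HR) for robot_id, or None if the message is rejected."""
    if len(msg) != 13:
        return None
    if any(msg[k] != START for k in (0, 1, 2)):      # start of message
        return None
    if msg[3] != team_id:                            # team ownership
        return None
    if any(msg[k] != DELIM for k in (6, 9, 12)):     # data delimiters
        return None
    return msg[3 * robot_id + 4], msg[3 * robot_id + 5]


message = build_message(TEAM_A, [(10, 20), (30, 40), (50, 60)])
print(extract_velocity(message, TEAM_A, 1))   # velocity bytes of robot 1
print(extract_velocity(message, TEAM_B, 1))   # other team: rejected
```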
In other words, the velocity data $W$ needs at least 10 bits but the protocol defined admits only one byte for velocity data. The simple approach used is for the host computer to send byte data $H$, given by
$$H = \frac{W}{4}, \text{ rounded to the nearest integer} \le 2^8 - 1 \text{ (i.e., 255)}.$$
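A minimal sketch of this scaling is given below. Two details are assumptions of the sketch rather than statements from the text: results are capped at 254, and a result equal to the reserved delimiter value 0xAA is moved to the adjacent value 169, so that the special codes 0xFF and 0xAA of the message format never appear as velocity data.

```python
RESERVED = {0xFF, 0xAA}   # special codes of the message format


def velocity_to_byte(w):
    """Map 10-bit velocity data W (0..1023) to one byte H = W/4, rounded."""
    if not 0 <= w <= 1023:
        raise ValueError("W must fit in 10 bits")
    h = (w + 2) // 4          # integer form of rounding W/4 to nearest
    h = min(h, 254)           # assumption: cap below the reserved 0xFF
    if h in RESERVED:         # assumption: step off the reserved 0xAA
        h -= 1
    return h


def byte_to_velocity(h):
    """Robot side: recover the approximate velocity data as 4H."""
    return 4 * h


for w in (0, 100, 679, 1023):
    h = velocity_to_byte(w)
    print(w, h, byte_to_velocity(h))
```

The round trip loses at most a few counts of resolution, which is the price of fitting 10-bit data into one byte.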
The microcontroller program in a team robot with robot ID $i$ would then, upon receiving each of its team communication messages, extract and compute the pair of desired velocity data, $4H_L^i$ and $4H_R^i$. Finally, the byte values 0xFF (255) and 0xAA (170) are special codes defined in the communication format. As a design principle, the host computer program does not use them for velocity data. Hence $H \in \{j \mid 0 \le j \le 254 \text{ and } j \ne 170\}$.
RF: Communication Circuit and Protocol. For RF communication, a commercial communication module, the ARFM 424/447 from ALLINTEK (http://www.allintek.co.kr), can be used. The module implements the frequency-shift keying (FSK) method. In FSK, the binary values (0 and 1) are represented by two different frequencies near the carrier frequency.
Fig. 2.36. RF communication using the FSK method (the data bits 0, 1, 0, 0, each lasting one bit time $T_b$, and the resulting FSK signal)
As depicted in Fig. 2.36, at time $t$, the resulting FSK signal $s(t)$ is
$$s(t) = \begin{cases} A_o \cos(2\pi f_1 t) & \text{if } \mathrm{data}(t) = 1, \\ A_o \cos(2\pi f_2 t) & \text{if } \mathrm{data}(t) = 0, \end{cases} \qquad (2.40)$$
where $A_o$ is an amplitude constant, and typically, $f_1 = f_c + \Delta f$ and $f_2 = f_c - \Delta f$, i.e., they are offset from the carrier frequency $f_c$ by an amount $\Delta f$.
Like most commonly used commercial modules, the ARFM 424/447 transceiver module has a fixed frequency of either 424MHz or 447MHz. The communication mode is half-duplex, i.e., at any one time the module either transmits or receives signals, but not both. This communication module can send data at up to 40Kbps. Some electrical characteristics of this module are as follows:
Supply Power: DC 3.3V - 5V,
Current Consumption: 45mA Max,
Input/Output Impedance: 50 Ohms.
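Returning to Eq. (2.40), the FSK signal can be generated sample by sample as in the sketch below. The function name, the sampling approach and the low frequencies used in the example are assumptions chosen for illustration; the real module works around a 424MHz or 447MHz carrier and performs the modulation in hardware.

```python
import math


def fsk_samples(bits, fc, delta_f, bit_time, sample_rate, amplitude=1.0):
    """Sample s(t) of Eq. (2.40) for a sequence of data bits.

    A bit of 1 selects f1 = fc + delta_f and a bit of 0 selects
    f2 = fc - delta_f. Time t runs continuously across bits, so the
    phase may jump at bit boundaries, exactly as the equation implies.
    """
    n_per_bit = int(round(bit_time * sample_rate))
    samples = []
    for k, bit in enumerate(bits):
        f = fc + delta_f if bit == 1 else fc - delta_f
        for n in range(n_per_bit):
            t = (k * n_per_bit + n) / sample_rate
            samples.append(amplitude * math.cos(2 * math.pi * f * t))
    return samples


# Illustrative values only: a 1 kHz "carrier" offset by 200 Hz.
signal = fsk_samples([0, 1, 0, 0], fc=1000.0, delta_f=200.0,
                     bit_time=0.01, sample_rate=48000.0)
print(len(signal))
```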
Fig. 2.37 shows the appearance and physical dimensions of the real product. The pin configuration is listed in Table 2.2.
Table 2.2. Pin Configuration of ALLINTEK ARFM-424 module
Pin No. | Function | Description
1 | RXD | Receiving Data (TTL level)
2 | TXD | Transmitting Data (TTL level)
3 | RXE | Reception Enable Pin, ’H’ Enable / ’L’ Disable
4 | TXE | Transmission Enable Pin, ’H’ Enable / ’L’ Disable
5 | VCC | DC Power, 3.2V to 5.0V
6 | GND | Ground
7 | ANT | Transmitting Data RF Output Pin, Antenna Port Impedance: 50 Ohms
8 | AGND | ANT Ground
Designed in combination with some auxiliary components, the ARFM-424 transceiver module can be used in a command-based robot soccer system as follows:
1. At the host computer. Fig. 2.38 shows a commercially available RF transceiver circuit for use (as a transmitter module) by the host computer. The circuit is set (via a mode switch) to modulate and transmit the signals from the host computer via the attached RF antenna.
2. On each team robot. Fig. 2.39 shows an RF transceiver circuit for use (as a receiver module) by each team robot. The RF communication circuit is set (via a mode switch) to demodulate the signals received from the host computer via the attached RF antenna.
3. Communication protocol. Fig. 2.40 shows the message format of a protocol for RF communication. The first byte is a dummy byte for frequency locking and the second byte indicates start of message. Two bytes, $H_L^i$ and $H_R^i$, for $i \in [0, 2]$ (a 3-a-side team), contain, respectively, the unsigned velocity data for the left and right wheels of the team robot with ID $i$. The byte mode-$i$ is reserved for extended use as a control mode for the team robot with ID $i$.
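The RF message layout described in item 3 can be sketched in the same way as the IR message. The text does not give the values of the dummy and header bytes or of the mode bytes, so the defaults below (0x55, 0xFF and 0) are purely hypothetical placeholders, as are the function names.

```python
def build_rf_message(wheel_bytes, modes=(0, 0, 0), dummy=0x55, header=0xFF):
    """Assemble: dummy, header, then (HL, HR, mode) for robots 0, 1 and 2.

    dummy and header defaults are hypothetical; the source does not
    specify their values.
    """
    msg = [dummy, header]
    for (hl, hr), mode in zip(wheel_bytes, modes):
        msg += [hl, hr, mode]
    return bytes(msg)


def robot_fields(msg, robot_id):
    """Return (HL, HR, mode) for the robot with the given ID."""
    base = 2 + 3 * robot_id      # skip the dummy and header bytes
    return msg[base], msg[base + 1], msg[base + 2]


rf_message = build_rf_message([(10, 20), (30, 40), (50, 60)], modes=(0, 1, 0))
print(robot_fields(rf_message, 1))
```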
Fig. 2.37. An ALLINTEK ARFM-424 RF communication module: (a) photo image; (b) physical dimensions
Fig. 2.40. Communication message format: dummy | header | $H_L^0$ | $H_R^0$ | mode-0 | $H_L^1$ | $H_R^1$ | mode-1 | $H_L^2$ | $H_R^2$ | mode-2
2.3.6 Power System
The choice of a power source is crucial from the hardware point of view. Batteries constitute a significant percentage of the total weight of a MiroSoT soccer robot. On the one hand, larger batteries usually supply higher voltages (and hence currents), but this may be at the expense of attaining good speed as they might contribute excessive weight and size to the overall mechanical design of the robot. On the other hand, smaller batteries might not be able to supply the robot with a constant current that is sufficient to drive its logic and motor circuitry, or that lasts long enough to sustain the required autonomy (during the game). Besides, the resulting under-weight robot might not be favourable strategically, since the fairly rugged nature of the game means that it can be easily ‘pushed off’ its posture during the game. As a guideline from the viewpoint of sustained robot autonomy, the batteries must supply sufficient current through one half of the two-half game duration.
Functionally, a battery converts chemical energy into electrical energy. There are basically two types of battery, namely, the rechargeable and non-rechargeable (i.e., ‘used once’) types. For a robot soccer system, it is more economical to use rechargeable batteries.
Table 2.3. Some batteries and their characteristics (a dash marks an entry that is not available)
Battery Chemistry | Recharge | Energy Density (Whr/kg) | Cell Voltage | Typical Capacity (Ah) | Internal Resistance (ohms) | Comments
Alkaline | No | 130 | 1.5 | AA: 1.4, C: 4.5, D: 10 | 0.1 | Most common primary battery
Lead-acid | Yes | 40 | 2.0 | C: 1.2-120 | 0.006 | Available in a wide variety
Lithium | No | 300 | 3.0 | A: 1.8, C: 5, D: 14 | 0.3 | Excellent energy density, high unit cost
Mercury | No | 120 | 1.35 | Coin: 0.19 | 10 | -
NiCd | Yes | 38 | 1.2 | AA: 0.5, C: 1.8, D: 4 | 0.009 | Low internal resistance, available from many sources
NiMH | Yes | 57 | 1.3 | AA: 1.1, 4/3A: 2.3 | - | Better energy density than NiCd, expensive
Zinc-air | No | 310 | 1.4 | - | - | High energy density but not widely available, limited range of sizes
Carbon-zinc | No | 75 | 1.5 | D: 6 | - | Inexpensive but obsolete
Table 2.3 lists some commercially available batteries and their characteristics. The capacity of a battery is quantified in terms of amp-hours (Ah) or milliamp-hours (mAh). For example, a battery rated at 500mAh can supply 500mA for one hour. Therefore, if a supply of 300mA is needed for 5 hours, a battery with 1500mAh is appropriate.
When the motors start up or reverse direction, they could draw large transient currents from the power system, leading to a possibly large instantaneous drop in the supply voltage. Thus, in using a common power system for the logic circuits and the motors of a soccer robot, batteries with the right capacities need to be chosen to ‘withstand’ the maximum transient currents that the motors might draw. As a guideline, a rechargeable battery needs to have a capacity (in mAh) numerically at least one third of the (total) maximum instantaneous current (in mA) that can be drawn. For example, if the maximum instantaneous current is 3000mA, the battery capacity must be at least 1000mAh. Otherwise, the battery supply voltage will drop instantaneously and excessively when such a current occurs, leading to battery ‘breakdown’ and/or logic circuit reset. Overall, in selecting the batteries for a soccer robot’s power system, one needs to balance the economic cost with the capacity (in mAh) and total weight of the batteries.
Power Regulation. It is not always possible to use IC chips, such as those of the 74HC series, that can operate over a wide (discrete) range of supply voltages (i.e., the ‘$V_{cc}$’s). Most IC chips are powered by applying a fixed $V_{cc}$ voltage. Thus, for an electrical circuit system containing several IC chips powered by different voltages (e.g. 5V and 12V), power regulation is needed. Two types of power regulator are commonly used; as illustrated in Fig. 2.41, a linear regulator ‘steps down’ a supply voltage while a DC-DC converter ‘steps up’ the voltage; a DC-DC converter can also be used to reverse the polarity of the voltage. Besides voltage level transformation, a power regulator importantly ensures that its output voltage is constant regardless of some inevitable fluctuations in the supply voltage or the load (i.e., the electrical currents drawn by the system circuitry) due to, for example, the start up or direction reversal of the motors in the soccer robot. The logic circuits need to operate under a constant applied voltage, and hence this voltage is best provided by the regulator output (see Fig. 2.8).
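Returning to the two battery-sizing guidelines given before the power regulation discussion, the arithmetic can be written out as a short sketch. The function names are hypothetical, and combining the two guidelines by taking the larger requirement is an assumption of this sketch rather than a rule stated in the text.

```python
def capacity_for_runtime(current_ma, hours):
    """Capacity (mAh) needed to supply a steady current for a given time."""
    return current_ma * hours


def capacity_for_transient(peak_current_ma):
    """Guideline: capacity (mAh) at least one third of the peak current (mA)."""
    return peak_current_ma / 3


def required_capacity(current_ma, hours, peak_current_ma):
    """Assumption of this sketch: satisfy both guidelines at once."""
    return max(capacity_for_runtime(current_ma, hours),
               capacity_for_transient(peak_current_ma))


print(capacity_for_runtime(300, 5))       # the 300mA for 5 hours example
print(capacity_for_transient(3000))       # the 3000mA transient example
print(required_capacity(300, 5, 3000))
```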
2.3.7 Other Considerations
We conclude with some mechanical design considerations, listed as follows:
1. The battery compartment in the robot should allow easy replacement of batteries since this is often done during the half-time interval of the game.
2. The robot’s wheels should be designed to have good friction with the contact surface; this is to minimize slips and hence reduce position errors.
Notes on Selected References
The two companion books [14, 15] provide a reasonably good reference on PIC microcontrollers and the auxiliary devices for various applications.
Fig. 2.41. Examples of power regulation IC chips: (a) a linear regulator (7805, with a 7~35 volts battery supply at IN and VCC at OUT); (b) a DC-DC converter (MAX631, with a 2~5 volts supply and a 5 volts output at VOUT)
3. How to Sense? Use Computer Vision Techniques
3.1 Introduction
In a robot soccer game, real-time information about the robots’ and the ball’s dynamically changing coordinate positions and directions of movement is vital. In a MiroSoT set-up, such information is obtained using a real-time vision system which consists of a vision algorithm continually processing the digital images captured by a vision board that receives the analog images from a video camera overlooking the complete playground. The FIRA MiroSoT rules specify well defined colours for different objects in the playground, and these are used as major cues for object detection. Vision processing in a robot soccer system is therefore colour-based. The performance of a vision system for robot soccer is gauged in terms of its rate and accuracy in determining the objects’ coordinate positions and directions of movement in the playground. Assuming that a good and fast enough vision processing algorithm exists, the former is limited by the image sampling rate of its vision processing board (also called an image frame grabber); the latter is limited by the quality of the digital images it processes. Most commercial vision boards provide about 30 image frames per second, implying that new image frames are sampled once every $\frac{1}{30}$s (or about 33.3ms). Thus, in a robot soccer system, if the system processing cycle time (of vision processing, deciding, controlling and communicating) does not exceed this frame sampling time, a maximum of 30 actuation commands per second can be received by the team robots. But each sampled image frame is interlaced with an even and an odd field, and each field is captured at $\frac{1}{60}$s (about 16.7ms) intervals. So, if individual image fields are processed instead, and the system processing cycle time does not exceed this field sampling time, the maximum rate is doubled to 60 actuation commands received per second.
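The timing argument above amounts to a few lines of arithmetic, sketched below. The function name is hypothetical, and the treatment of a cycle time that exceeds the sampling period (the rate is simply bounded by the slower of the two) is a simplifying assumption; the text only covers the case where the cycle time fits within the sampling period.

```python
FRAME_RATE_HZ = 30.0
FRAME_PERIOD_S = 1.0 / FRAME_RATE_HZ      # about 33.3 ms per frame
FIELD_PERIOD_S = FRAME_PERIOD_S / 2.0     # about 16.7 ms per field


def command_rate_upper_bound(cycle_time_s, use_fields):
    """Upper bound on actuation commands per second.

    If the processing cycle fits within the sampling period, the bound
    is the frame (or field) rate; otherwise it is limited by the cycle
    time itself (an assumption of this sketch).
    """
    period = FIELD_PERIOD_S if use_fields else FRAME_PERIOD_S
    return 1.0 / max(period, cycle_time_s)


print(round(command_rate_upper_bound(0.030, use_fields=False)))  # frames
print(round(command_rate_upper_bound(0.015, use_fields=True)))   # fields
```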
This higher rate, however, is achieved at the cost of lower accuracy because the individual image fields processed are clearly of lower resolution than an image frame.
This chapter focusses on the (visual) SENSE primitive; it covers the basics of vision processing systems most relevant to the study of robot soccer systems, and presents how the postures of target objects in robot soccer can be computed using centralized vision techniques. Real examples then follow to study several specialized aspects of real vision systems as used by previous FIRA Cup MiroSoT teams. These examples highlight the practical considerations in building a good vision system for a MiroSoT team.
J.-H. Kim, D.-H. Kim, Y.-J. Kim, K.-T. Seow: Soccer Robotics, STAR 11, pp. 71-101, 2004. © Springer-Verlag Berlin Heidelberg 2004
3.2 Vision Basics
3.2.1 Computer Vision
Computer or machine vision studies how useful information about a scene can be extracted from the images of the scene. The information refers to the features of objects found in the scene. Examples of an object’s features include its position, heading angle, contours and colours. The kind of information to be extracted depends on the application. Besides mobile robotics, examples of vision applications include medical diagnosis and weather forecasting based on satellite images, to mention a few.
Related Disciplines. Some fields related to computer vision include image processing, computer graphics, pattern recognition and artificial intelligence. A comprehensive discussion of these related fields is beyond the scope of this book, but it is noteworthy that many techniques from these fields have significant bearings on computer vision, as briefly mentioned below.
1. The output of image processing is an image which is either an enhanced, compressed, de-blurred or focus-corrected version of the input image; image processing is therefore useful in the early stages of a vision system, since it could be used to enhance particular information and suppress noise in the original image frame.
2. Computer graphics studies how images can be generated from geometric primitives (image synthesis), and is therefore an inverse of computer vision, which is concerned with estimating geometric primitives and other features of objects from images (image analysis). Graphics techniques such as those for curve and surface representations are applicable to approximating object contours in computer vision.
3. Pattern recognition studies how numerical and symbolic data are classified as patterns. Many statistical and syntactical techniques developed for classifying patterns play an important role in computer vision for object recognition.
4. Artificial intelligence studies how intelligent systems are built as well as the computational aspects of intelligence. Techniques from artificial intelligence can be used to analyze scenes by constructing a symbolic representation based on the features of the scene objects obtained by vision processing.
In fact, artificial intelligence techniques play such important roles in all aspects of computer vision that vision is often considered a subfield of artificial intelligence.
Fig. 3.1. Basic architecture of a computer vision system
3.2.2 Vision System Operations
Images are two-dimensional (2-D) projections of the three-dimensional (3-D) scene. The information of a scene is therefore not directly available. To extract it, a vision system requires high-level knowledge about the objects in the scene, and low-level knowledge on image formation, namely projection geometry and the physics of light. Projection geometry determines where an arbitrary point in the scene is located in the image (display) plane; the physics of light determines the brightness of a point in the image plane as a function of scene illumination and surface properties. Fig. 3.1 shows a basic architecture of a computer vision system. The knowledge on image formation is built mainly into the vision hardware (i.e., the camera and frame grabber) while the application knowledge about the scene, such as the models of and the relationships among the objects which could be found in the scene, is coded into the computer vision algorithm.
Through its optical lens, the camera projects a 3-D scene onto a 2-D intensity image. The illumination or brightness of this image is cast on the light-sensitive photocells of the CCD¹ (Charge Coupled Device), an image sensor that converts and outputs these photo intensities as a continuous (electrical) charge signal. Alternative image sensors include CMOS (Complementary Metal Oxide Semiconductor), CID (Charge Injection Device) and PDA (Photo Diode Array). For robot soccer systems, CCD cameras are popularly used.
The frame-grabber digitizes the analog charge signal output by the CCD camera into evenly time-spaced integer data points (that constitute the sensed 2-D digital images), and stores them in memory as 2-D image arrays under computer control. The host computer program then calls upon a resident vision algorithm to process the stored digital images.
¹ CCD is a semiconductor technology used to build light-sensitive electronic devices such as cameras and image scanners. Such devices may detect either colour or black-and-white. Each CCD chip consists of an array of light-sensitive photocells. The photocell is sensitized by giving it an electrical charge prior to exposure.
3.2.3 Sampling, Pixel, and Quantization
Images are classified as either stationary or dynamic. Images captured in a robot-soccer game are dynamic, and they are either black and white (gray scale) or coloured, depending on the type of overhead camera used. A real image is a continuous tone picture, and is output by the image sensor as a continuous signal waveform² that represents brightness. The frame grabber digitizes this waveform by sampling³ using its scanner and quantizing using its analog-to-digital (A/D) converter to transform it into a 2-D array of integer data points or samples. Depending on the type of camera and optical filters used, each sample (for gray scale images) or a vector of ‘neighbourhood’ samples (for colour images) constitutes what is called a picture element or pixel for short.
Sampling selects the evenly time-spaced ‘data points’ on the charge signal waveform; each data point indicates the original intensity value of the selected or sampled signal point. Quantization assigns each real-value data point an integer-value number to represent the intensity level for computer storage.
Graphically, as depicted in Fig. 3.2, a 2-D $n \times m$ image array, which has $n$ rows and $m$ columns of pixels, may be conveniently represented as a grid of equal-sized squares. For each pixel, denoted by $a$, $a[i, j]$ stores an intensity level quantized to an integer value for a gray scale (one-channel) image; for a colour (multi-channel) image, it stores a vector of intensity levels, one for each basis colour channel and each quantized to an integer. The basis colours depend on the colour model used. $[i, j]$ refers to the pixel position, which is a square in the grid, where $i$, $1 \le i \le n$, is the row index and $j$, $1 \le j \le m$, is the column index. The grid represents the image plane and each grid square is said to be occupied by a pixel.
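The two digitizing steps described in this section can be illustrated as follows. This is a minimal sketch on a one-dimensional signal: the function names, the normalized intensity range of 0 to 1 and the truncating quantizer are assumptions made for illustration, not a description of any particular frame grabber.

```python
def sample(signal, duration, n_samples):
    """Take evenly time-spaced samples of a continuous signal.

    `signal` is any function of time t returning a real-valued intensity.
    """
    step = duration / n_samples
    return [signal(k * step) for k in range(n_samples)]


def quantize(value, bits=8, v_max=1.0):
    """Map a real intensity in [0, v_max] to an integer level.

    With 8 bits the levels are 0..255; out-of-range inputs are clipped.
    """
    levels = 2 ** bits
    clipped = min(max(value, 0.0), v_max)
    return min(int(clipped / v_max * levels), levels - 1)


def ramp(t):
    """A hypothetical brightness waveform rising linearly from 0 to 1."""
    return t


row = [quantize(v) for v in sample(ramp, 1.0, 4)]
print(row)   # four quantized samples of the ramp
```

Increasing `n_samples` corresponds to a higher pixel resolution, and increasing `bits` to a finer intra-pixel (intensity) resolution.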
The sampling rate of a chosen frame grabber determines the image array size or pixel resolution, i.e., how many pixels the digitized image will have. The quantization range or intra-pixel (intensity) resolution, limited by the finite word size of the computer (usually set at 8 bits), determines how many levels are available to represent the intensity level of a signal sample point.
² Strictly speaking, this waveform is not analog, but a dense spectrum of signal points each indicating the level of light intensity in a photocell of the exposed CCD (photocell) array.
³ Note that the image sensor’s photocell ‘scanning’ rate is an integer multiple of this sampling rate.
Fig. 3.2. An $n \times m$ image grid
3.2.4 Gray Scale, Binary, and Colour Images
For a gray scale image, the pixel is frequently represented as an unsigned 8-bit integer, and hence the quantization range is [0, 255], with 0 corresponding to black, 255 corresponding to white and shades of gray distributed over the middle values. If the pixel is represented by a one-bit integer, the quantization range is [0, 1]; in this case, the image formed is called a binary image. For colour images based on the RGB colour model, each pixel $a[i, j] = [R\ G\ B]^T$, where $R$, $G$ and $B$ are the integer values for its red, green and blue components; if the component channels are each quantized to an 8-bit integer, $2^8 \times 2^8 \times 2^8$ (or about $1.68 \times 10^7$) different colours can be represented for a pixel.
It should be clear that the quality of digital images obtained depends on the image pixel and intra-pixel resolutions, as well as the environmental lighting conditions. In many practical vision applications, the sampling and quantizing rates are predetermined due to the limited choice of available cameras (or image acquisition hardware). It is however important to know the effects of sampling and quantizing rates on retaining information in digital images; this is discussed in the book [16].
3.2.5 Colour Models
Visible light has a wavelength ranging from 400nm to 700nm. Colours are created by mixing different visible lights. A colour model (or colour space) is a way of representing these colours and their relationship to one another.
The RGB Colour Model. The RGB colour space consists of the three additive primaries: red, green, and blue. Spectral components of these colours combine additively to produce a resultant colour. In what follows, the colours are normalized (i.e., their values lie between 0 and 1.0). This is easily accomplished by dividing the colour by its maximum
value allowed by the quantization range. For example, an 8-bit colour is normalized by dividing by 255. The RGB model is represented by a 3-dimensional cube with red, green and blue at the corners on each axis, as shown in Fig. 3.3. Black is at the origin. White is at the opposite corner of the cube. The gray scale follows the line from black to white. In a 24-bit colour graphics system with 8 bits per colour channel, red is (255,0,0). On the colour cube, it is (1,0,0).
Fig. 3.3. The RGB colour cube, with corners Black = (0,0,0), Red = (1,0,0), Green = (0,1,0), Blue = (0,0,1), Yellow = (1,1,0), Cyan = (0,1,1), Magenta = (1,0,1) and White = (1,1,1)
It often becomes necessary to convert an RGB image into a gray scale image. To convert an image from RGB colour to gray scale, the following equation is used.
$$\text{Gray scale intensity} = 0.299R + 0.587G + 0.114B. \qquad (3.1)$$
This equation comes from the NTSC⁴ standard for luminance. Another common conversion from RGB colour to gray scale is a simple average.
$$\text{Gray scale intensity} = 0.333R + 0.333G + 0.333B. \qquad (3.2)$$
This is used in many applications.
⁴ National Television System Committee, a committee that sets colour television standards which are used in America, Korea and Japan.
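Eqs. (3.1) and (3.2) translate directly into code. The sketch below uses hypothetical function names and applies the conversions to a single pixel; an image would be converted by applying the same function to every pixel.

```python
def gray_ntsc(r, g, b):
    """Gray scale intensity by the NTSC luminance weights of Eq. (3.1)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def gray_average(r, g, b):
    """Gray scale intensity by the simple average of Eq. (3.2)."""
    return 0.333 * r + 0.333 * g + 0.333 * b


def normalize(r, g, b, max_value=255):
    """Scale an RGB triple to the range 0..1 of the colour cube."""
    return r / max_value, g / max_value, b / max_value


pure_red = (255, 0, 0)
print(normalize(*pure_red))       # red on the colour cube
print(gray_ntsc(*pure_red))       # red looks fairly dark in luminance
print(gray_average(*pure_red))
```

The two conversions differ most for saturated colours: Eq. (3.1) weights green most heavily, whereas Eq. (3.2) treats the three channels alike.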
Other Colour Models. Different image processing systems use different colour models for different reasons. The colour picture publishing industry uses the CMY colour model. Colour CRT monitors and most computer graphics systems use the RGB colour model. Systems that must manipulate hue, saturation, and intensity separately use the HSI colour model. The YIQ and YUV (or YCbCr) colour models are used respectively in NTSC and PAL⁵ video for broadcast television. Many commercial frame grabbers use the RGB model, but some use the YUV model, such as the one in the example system of Section 3.4.3. Other colour models, based on human vision, include XYZ, LAB and LUV; these were proposed by the Commission Internationale de l’Eclairage (CIE), which means the International Commission on Illumination.
Relationship between Colour Models. Relationships exist to convert from one colour model to another and back. Listed below are some matrix equations that directly convert the RGB model to the YIQ, YUV and YCbCr models, respectively.
$$\begin{bmatrix} Y \\ I \\ Q \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.596 & -0.275 & -0.321 \\ 0.212 & -0.528 & 0.311 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}. \qquad (3.3)$$
$$\begin{bmatrix} Y \\ U \\ V \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.331 & 0.500 \\ 0.500 & -0.419 & -0.081 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}. \qquad (3.4)$$
$$\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.299 & -0.587 & 0.886 \\ 0.701 & -0.587 & -0.114 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}. \qquad (3.5)$$
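The matrix equations (3.3) to (3.5) can be applied to a pixel with a few lines of code. The sketch below copies the coefficients exactly as printed above; the names `RGB_TO_YIQ`, `RGB_TO_YUV`, `RGB_TO_YCBCR` and `convert` are hypothetical, and plain tuples are used instead of a matrix library to keep the example self-contained.

```python
RGB_TO_YIQ = ((0.299, 0.587, 0.114),
              (0.596, -0.275, -0.321),
              (0.212, -0.528, 0.311))

RGB_TO_YUV = ((0.299, 0.587, 0.114),
              (-0.169, -0.331, 0.500),
              (0.500, -0.419, -0.081))

RGB_TO_YCBCR = ((0.299, 0.587, 0.114),
                (-0.299, -0.587, 0.886),
                (0.701, -0.587, -0.114))


def convert(matrix, rgb):
    """Multiply a 3x3 conversion matrix by an (R, G, B) column vector."""
    return tuple(sum(c * v for c, v in zip(row, rgb)) for row in matrix)


white = (1.0, 1.0, 1.0)           # normalized RGB
print(convert(RGB_TO_YUV, white))
print(convert(RGB_TO_YCBCR, (1.0, 0.0, 0.0)))
```

For a gray pixel (equal R, G and B) the two colour-difference components of the YUV and YCbCr results come out at essentially zero, which is a quick sanity check on the coefficients.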
⁵ Phase Alternating Line, another set of colour television standards used in West Germany, the United Kingdom, parts of Europe, South America, parts of Asia and Africa.
3.3 Binary Image Processing
Binary images have only 2 (gray level) intensity levels, 0 and 1. Compared to gray scale or colour images, binary images require less memory storage and can be processed more quickly, but clearly contain a lot less information about the scenes they represent. However, many techniques developed for binary vision systems are also applicable to vision systems which use gray scale or colour images.
A convenient way to represent an object in a gray scale or colour image is to use its mask. The mask of an object is a binary image in which the pixel values of the object are 1 and those of the background are 0. After an object has been ‘separated’ from the background, its geometric properties such as size, position and orientation may be required for decision making. These features can be computed from its binary image. In other words, many of the basic concepts and processing techniques of computer vision are found in the simpler domain of binary image processing. This section introduces and explains these concepts and techniques through the processing of binary images.
3.3.1 Thresholding
Thresholding is a method to convert a gray scale image into a binary image so that objects of interest are separated from the background. For thresholding to be effective in object-background separation, it is necessary that objects and background have sufficient contrast and that the intensity levels of either the objects or the background are known. Thresholding is therefore an applicable vision technique for a MiroSoT system since the rectangular playground is black, providing a good contrast with the 5cm high white side walls and with the blue or yellow patch worn on top of each soccer robot as a team identification (ID) colour to visually identify the team it belongs to.
For a gray scale image $F$, let $F[i, j]$ be the original intensity level of its pixel (referenced as $a[i, j]$ in the grid of Fig. 3.2) and $F_T[i, j]$ be the thresholded value of the same pixel predicated on a criterion $C$ on $F[i, j]$. To obtain a binary image $B$ (for which its pixel intensity level $B[i, j] = F_T[i, j] \in \{0, 1\}$), the thresholding function for $F_T[i, j]$ is defined by
$$F_T[i, j] = \begin{cases} 1 & \text{if } C(F[i, j]), \\ 0 & \text{otherwise.} \end{cases} \qquad (3.6)$$
Depending on the application knowledge, the criterion function $C(F[i, j])$ can take one of the following forms.
1. $F[i, j] \le T_u$, for some upper bound threshold $T_u$. This is used to separate darker colour objects (lower gray levels) from the lighter colour background (higher gray levels).
2. $F[i, j] \ge T_l$, for some lower bound threshold $T_l$. This is used to separate lighter colour objects (higher gray levels) from the darker colour background (lower gray levels).
3. $T_1 \le F[i, j] \le T_2$, for some threshold range $[T_1, T_2]$. This is used to separate objects with intensity values in the range $[T_1, T_2]$ from the background known to be outside this range.
4. $F[i, j] \in Z$, where $Z$ is a union of several disjoint threshold ranges. This is a general thresholding scheme used to separate objects with intensity levels that may come from several disjoint ranges from the background known to be outside all these ranges.
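The thresholding function of Eq. (3.6) and the four criteria listed above can be sketched as follows. The function names are hypothetical, and the image is represented as a plain list of rows of gray levels purely for illustration.

```python
def threshold(image, criterion):
    """Apply Eq. (3.6): 1 where the criterion holds, 0 otherwise."""
    return [[1 if criterion(v) else 0 for v in row] for row in image]


def upper_bound(t_u):
    """Criterion 1: F[i, j] <= T_u (dark objects, light background)."""
    return lambda v: v <= t_u


def lower_bound(t_l):
    """Criterion 2: F[i, j] >= T_l (light objects, dark background)."""
    return lambda v: v >= t_l


def in_range(t_1, t_2):
    """Criterion 3: T_1 <= F[i, j] <= T_2."""
    return lambda v: t_1 <= v <= t_2


def in_ranges(ranges):
    """Criterion 4: F[i, j] lies in a union Z of disjoint ranges."""
    return lambda v: any(lo <= v <= hi for lo, hi in ranges)


gray = [[10, 60],
        [130, 250]]
print(threshold(gray, lower_bound(100)))
print(threshold(gray, in_range(50, 150)))
```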
Fig. 3.4 shows a gray level image and its resulting binary images obtained by using different thresholds.
Fig. 3.4. A gray level image and its resulting binary images using different thresholds: (a) image; (b) Tl = 48; (c) T1 = 21, T2 = 48
Automated thresholding of images is often the first step in the analysis of images in computer vision systems. Many available thresholding techniques utilize the intensity distribution in an image and the knowledge about the objects of interest for selecting a proper threshold value automatically. To elaborate briefly, consider an image of a MiroSoT robot in the playground and its histogram, as shown in Fig. 3.5; an image histogram is an intensity
distribution plot showing the number of image pixels for each gray scale level. The MiroSoT robot appears as a bright square in the image and it lies in the gray level range [140, 180]. From the ‘valley’ between the two peaks in the histogram, it is thus clear that setting the threshold T (either lower or upper bound) to 135 will quite distinctly separate the square from the background.
Fig. 3.5. An image and its histogram
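As a rough illustration of reading a threshold off the ‘valley’ of a histogram, the sketch below counts pixels per gray level and returns the least populated level between two peaks. The function names, the requirement that the caller supplies the two peak levels, and the tiny 8-level test image are all assumptions of this example; it is not one of the automated thresholding techniques referred to in the text.

```python
from typing import List


def histogram(image: List[List[int]], levels: int = 256) -> List[int]:
    """Number of pixels at each gray level 0 .. levels-1."""
    counts = [0] * levels
    for row in image:
        for v in row:
            counts[v] += 1
    return counts


def valley_between(counts: List[int], peak_lo: int, peak_hi: int) -> int:
    """Gray level with the fewest pixels strictly between two known peaks.

    On ties the lowest such gray level is returned.
    """
    return min(range(peak_lo + 1, peak_hi), key=lambda g: counts[g])


# Hypothetical image with 8 gray levels: dark peak at 1, bright peak at 6.
image = [[1, 1, 1, 1],
         [2, 2, 3, 4],
         [4, 5, 5, 6],
         [6, 6, 6, 1]]
counts = histogram(image, levels=8)
print(counts)                          # [0, 5, 2, 1, 2, 2, 4, 0]
print(valley_between(counts, 1, 6))    # 3
```

Supplying the peak positions by hand mirrors the point made in the text that a proper threshold is usually chosen with knowledge of the application domain.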
One should note that although the thresholding techniques available are useful tools, a proper threshold value is usually selected on the basis of human experience with the application domain. 3.3.2 Computing Geometric Properties In this section, we assume that a thresholding technique has yielded a binary image B of size n × m (n rows, m columns). This image has only one object; the pixel values of the object are 1 and those of the background are 0. Size. The area A occupied by the object in binary image B is given by
\[ A = \sum_{i=1}^{n} \sum_{j=1}^{m} B[i,j]. \tag{3.7} \]
Position. The position of an object in an image plays an important role in robot soccer. In MiroSoT, the objects, namely, the robots and the ball, appear on a known surface - the rectangular playground - and the position of the overhead camera is known with respect to the playground. In this case, an object’s position in the image determines its spatial location (in the playground). The position of an object in an image may be defined using the centre of area of the object image. Though other methods such as ‘using a rectangle
to enclose the object image’ may be used, the centre of area of the object image is a point and is relatively insensitive to noise in the whole image. The following equation provides formulae to compute the centre position (x̄, ȳ) of the object in binary image B with respect to the image Cartesian x-y plane.

\[ \bar{x} = \frac{\sum_{i=1}^{n} \sum_{j=1}^{m} j \cdot B[i,j]}{A}, \qquad \bar{y} = \frac{\sum_{i=1}^{n} \sum_{j=1}^{m} i \cdot B[i,j]}{A}. \tag{3.8} \]
To illustrate, consider the image in Fig. 3.6. Note the origin of the image Cartesian x-y plane and the directions of the image x-y axes.

Fig. 3.6. An example showing the centre position (x̄, ȳ) of an image

In this example, A = 13. The centre position (x̄, ȳ) is calculated using Eq. (3.8) as follows:

\[ \bar{x} = \frac{5 \times 4 + 6 \times 5 + 7 \times 4}{13} = 6.0, \qquad \bar{y} = \frac{4 \times 2 + 5 \times 3 + 6 \times 3 + 7 \times 3 + 8 \times 2}{13} = 6.0. \]
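Eqs. (3.7) and (3.8) translate directly into code. The sketch below is illustrative only: the function names are assumptions, and the 13-pixel test image is a hypothetical one constructed to have the same per-column and per-row pixel counts as the worked example (it is not claimed to be the image of Fig. 3.6).

```python
from typing import List, Tuple


def area(binary: List[List[int]]) -> int:
    """Eq. (3.7): number of object pixels."""
    return sum(sum(row) for row in binary)


def centre(binary: List[List[int]]) -> Tuple[float, float]:
    """Eq. (3.8) with 1-based row index i and column index j.

    Returns (x_bar, y_bar): x_bar averages column indices, y_bar row indices.
    """
    a = area(binary)
    if a == 0:
        raise ValueError("image contains no object pixels")
    sum_j = sum(j * v for row in binary for j, v in enumerate(row, start=1))
    sum_i = sum(i * v for i, row in enumerate(binary, start=1) for v in row)
    return sum_j / a, sum_i / a


# Hypothetical 10 x 10 image: columns 5, 6, 7 hold 4, 5, 4 object pixels and
# rows 4 .. 8 hold 2, 3, 3, 3, 2 object pixels, as in the worked example.
pixels = [(4, 5), (4, 6), (5, 5), (5, 6), (5, 7), (6, 5), (6, 6),
          (6, 7), (7, 5), (7, 6), (7, 7), (8, 6), (8, 7)]
toy = [[0] * 10 for _ in range(10)]
for i, j in pixels:
    toy[i - 1][j - 1] = 1

print(area(toy))     # 13
print(centre(toy))   # (6.0, 6.0)
```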
In general, the calculated x̄ and ȳ may not be integers, and usually lie between the column and row indices of two pixel positions. This does not imply, however, that the calculated position is more accurate than the resolution of the pixel positions.

Orientation. Calculating the orientation of an object is a little more complicated than calculating its position. For an object which is circular, its orientation is not unique. An object’s orientation is unique if it is elongated, such as that shown in Fig. 3.7.

Fig. 3.7. Finding the orientation of the object (x-axis along array column j, y-axis along array row i)
The angle θ between the thick line and x-axis in Fig. 3.7 is defined as the object’s orientation. This thick line, called the object’s orientation line, is the least squares fit of the positions of all the pixels of the object (or simply called object points) in binary image B, i.e., it is the line that best fits the object points in that the sum of the squared distances between the object points and the line is minimized. Formally, to find the equation of such a line from which the orientation information about the object can be obtained, minimize λ², the sum of the squared perpendicular distances of all object points from the line, given by

\[ \lambda^2 = \sum_{i=1}^{n} \sum_{j=1}^{m} d_{ij}^2 \cdot B[i,j], \tag{3.9} \]
where d_ij is the perpendicular distance from an object point [i, j] to the thick line. To avoid numerical problems when the line is nearly vertical, represent the line in polar coordinates:

\[ \rho = x \cos\theta_o + y \sin\theta_o. \tag{3.10} \]

Referring to Fig. 3.7, θo is the orientation of the normal to the thick line with respect to the x-axis, and ρ is the normal (and of course, shortest) distance between the line and the origin. The normal distance d between an arbitrary Cartesian coordinate (x, y) and the line characterized by Eq. (3.10) satisfies the following equation.

\[ d^2 = (x \cos\theta_o + y \sin\theta_o - \rho)^2. \tag{3.11} \]
Plugging Eq. (3.11) for every image point into Eq. (3.9) (the minimization criterion), we get

\[ \lambda^2 = \sum_{i=1}^{n} \sum_{j=1}^{m} (x_{ij} \cos\theta_o + y_{ij} \sin\theta_o - \rho)^2 \cdot B[i,j], \tag{3.12} \]
where (x_ij, y_ij) is the Cartesian coordinate point of pixel position [i, j] (i.e., pixel at the i-th row and j-th column). The characteristic equation model (ρ, θo) of the line that best fits the object points can be obtained by minimizing λ², done as follows: In Eq. (3.12), set the derivative of λ² with respect to ρ to zero. Then solving for ρ yields

\[ \rho = \bar{x} \cos\theta_o + \bar{y} \sin\theta_o, \tag{3.13} \]

which shows that the regression line passes through the centre (x̄, ȳ) of the object points. Define (x̃_ij, ỹ_ij) = (x_ij − x̄, y_ij − ȳ). Substituting these definitions and Eq. (3.13) into Eq. (3.12), we get

\[ \lambda^2 = a \cos^2\theta_o + b \sin\theta_o \cos\theta_o + c \sin^2\theta_o, \tag{3.14} \]

where

\[ a = \sum_{i=1}^{n} \sum_{j=1}^{m} \tilde{x}_{ij}^2 \cdot B[i,j], \qquad b = 2 \sum_{i=1}^{n} \sum_{j=1}^{m} \tilde{x}_{ij} \tilde{y}_{ij} \cdot B[i,j], \qquad c = \sum_{i=1}^{n} \sum_{j=1}^{m} \tilde{y}_{ij}^2 \cdot B[i,j]. \]
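Eq. (3.14) can be rewritten as

\[ \lambda^2 = \frac{a + c}{2} + \frac{(a - c) \cos 2\theta_o}{2} + \frac{b \sin 2\theta_o}{2}. \tag{3.15} \]

In Eq. (3.15), by setting the derivative of λ² with respect to θo to zero, we get

\[ \tan 2\theta_o = \frac{b}{a - c}. \tag{3.16} \]

Therefore,

\[ \theta_o = \frac{1}{2} \arctan\left(\frac{b}{a - c}\right). \tag{3.17} \]

It follows easily that

\[ \theta = \theta_o + \frac{\pi}{2} = \frac{1}{2} \arctan\left(\frac{b}{a - c}\right) + \frac{\pi}{2}. \tag{3.18} \]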
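The orientation computation can be sketched as follows. Two points here are assumptions of this example rather than statements from the text: the function takes a list of (x, y) object-point coordinates, and it uses the two-argument arctangent `atan2(b, a - c)` instead of the single-argument form. Eq. (3.16) is satisfied by both the minimum and the maximum of λ², and the `atan2` variant is a common way of picking the minimizing line direction directly; the result is reduced modulo π because a line and its reverse have the same orientation.

```python
import math
from typing import List, Optional, Tuple


def orientation(points: List[Tuple[float, float]]) -> Optional[float]:
    """Angle in [0, pi) of the least squares line through the object points.

    Returns None when the orientation is undefined (b = 0 and a = c).
    """
    n = len(points)
    if n == 0:
        return None
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    # Second-order terms a, b, c as defined below Eq. (3.14).
    a = sum((x - x_bar) ** 2 for x, _ in points)
    b = 2 * sum((x - x_bar) * (y - y_bar) for x, y in points)
    c = sum((y - y_bar) ** 2 for _, y in points)
    if math.isclose(b, 0.0, abs_tol=1e-12) and math.isclose(a, c, abs_tol=1e-12):
        return None
    theta = 0.5 * math.atan2(b, a - c)   # direction of the fitted line
    return theta % math.pi               # orientation is defined modulo pi


print(orientation([(0, 0), (1, 1), (2, 2)]))          # about 0.785 (pi/4)
print(orientation([(0, 0), (1, 0), (0, 1), (1, 1)]))  # None: no unique orientation
```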
To fix the angular reference, let θ ∈ (−π, π]. Note that if b = 0 and a = c, the object’s orientation is undefined.

3.3.3 Labelling

Given a binary image, we need to group all spatially close pixels of value 1 into connected components that distinctly represent the different objects. This is done using a component labelling algorithm, which finds all connected components in the image and assigns a unique label, usually an integer, to all pixels in the same component. Fig. 3.8 shows an example of component labelling of a binary image derived using some thresholding technique. The dark pixels (see Fig. 3.8(a)) have ‘1’ values and the others have ‘0’ values. There are a total of 4 connected components and they are given label values of 1, 2, 3 and 4 (see Fig. 3.8(b)) through some labelling algorithm.

Computing the geometric properties (such as size, position and orientation) of each object, as covered in Section 3.3.2, is usually done as its component is labelled. Other properties that can be computed are the perimeter (number of pixels at the perimeter) and the compactness of the object image. Compactness is defined by

\[ \text{Compactness} = \frac{4\pi \cdot \text{Area}}{\text{Perimeter}^2}, \]

where ‘Area’ refers to the object image area defined by A in Eq. (3.7). A circular object has a compactness value of 1; usually, objects with more complex shapes have smaller values. If the shapes of objects are known, as in robot soccer, compactness and perimeter values are helpful in finding and recognizing them.

Computing object properties can be easily integrated into the labelling algorithm. However, in order not to clutter the idea of component labelling, this section only presents the intrinsic labelling algorithms. The notion of spatial proximity has to be made precise before two labelling algorithms can be clearly presented. For this purpose, some definitions are introduced first.
Fig. 3.8. A binary image and its labelled connected components: (a) binary image; (b) labelled binary image
Neighbours. Consider a digital image represented on a grid of squares, each representing an image pixel. In this representation, a pixel has a common boundary with each of four other pixels, i.e., it shares every side of its square with one different pixel. It also shares each corner of its square with one of four additional pixels. We say that two pixels are 4-neighbours if they share a common boundary, and are 8-neighbours if they share at least one corner. As shown in Fig. 3.9, for a pixel at square position [i, j] in a grid, its four 4-neighbours are at [i + 1, j], [i − 1, j], [i, j + 1], [i, j − 1] and its eight 8-neighbours are at [i + 1, j + 1], [i + 1, j − 1], [i − 1, j + 1], [i − 1, j − 1], plus the positions of all of its 4-neighbours. A pixel is said to be 4-connected to its 4-neighbours and 8-connected to its 8-neighbours.

Paths. A path from the pixel at [i0, j0] to the pixel at [in, jn] is a position sequence of pixels [i0, j0], [i1, j1], [i2, j2], · · · , [in, jn] such that any two consecutive pixels in the sequence, at [ik, jk] and [ik+1, jk+1], 0 ≤ k ≤ n − 1, are neighbours. If the consecutive pixels at [ik, jk] and [ik+1, jk+1] are 4-neighbours for all k, the path is a 4-path; if they are 8-neighbours, the path is an 8-path. Simple examples of these are shown in Fig. 3.10.

Foreground. The set of all unity valued pixels in a binary image is called the foreground and is denoted by S.
Fig. 3.9. The 4- and 8-neighbourhoods of a pixel at square position [i, j]: (a) 4-neighbours; (b) 8-neighbours

Fig. 3.10. Examples of a 4-path and an 8-path: (a) 4-path; (b) 8-path
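The neighbour and path definitions can be made concrete with a short sketch. The names `N4`, `N8`, `neighbours` and `is_path` are assumptions of this example; positions are 1-based [i, j] pairs as in the text, and neighbours falling outside the image are simply dropped.

```python
from typing import List, Sequence, Tuple

Pos = Tuple[int, int]

# Row/column offsets of the 4-neighbours and of the 8-neighbours.
N4: List[Pos] = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8: List[Pos] = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def neighbours(i: int, j: int, n_rows: int, n_cols: int,
               offsets: Sequence[Pos]) -> List[Pos]:
    """Neighbour positions of [i, j] that lie inside an n_rows x n_cols image."""
    return [(i + di, j + dj) for di, dj in offsets
            if 1 <= i + di <= n_rows and 1 <= j + dj <= n_cols]


def is_path(pixels: Sequence[Pos], offsets: Sequence[Pos]) -> bool:
    """True if every pair of consecutive positions are neighbours."""
    return all((i2 - i1, j2 - j1) in offsets
               for (i1, j1), (i2, j2) in zip(pixels, pixels[1:]))


print(neighbours(1, 1, 3, 3, N8))             # [(2, 1), (1, 2), (2, 2)]
print(is_path([(1, 1), (2, 2), (2, 3)], N8))  # True: an 8-path
print(is_path([(1, 1), (2, 2), (2, 3)], N4))  # False: the first step is diagonal
```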
Connectivity. A pixel p ∈ S is said to be connected to q ∈ S if there is a path from p to q consisting entirely of pixels of S. For any three pixels p, q, r ∈ S, the following properties are satisfied.
1. Pixel p is connected to itself (reflexivity).
2. If p is connected to q, then q is connected to p (symmetry).
3. If p is connected to q and q is connected to r, then p is connected to r (transitivity).
In other words, mathematically, connectivity is an equivalence relation.

Connected Components and Spatial Proximity. A subset of S in which each pixel is connected to all other pixels is called a connected component. Points over the same object surface project onto spatially close pixels in an image, and this is captured by the concept of a connected component.
In the following, we present two basic algorithms for finding and labelling connected components in a binary image. One is recursive while the other is sequential. A recursive algorithm is very time inefficient on a sequential processor, and so is usually implemented on parallel processors. A sequential algorithm takes less computation time and memory.

3.3.4 Labelling Algorithm 1: Recursive

Recursive Connected Components Algorithm
1. Initialize label L = 1.
2. Scan the binary image to find an unlabelled pixel p ∈ S and assign it label L. If no such pixel remains, stop.
3. Recursively assign the label L to all pixels q ∈ S that p is connected to, and only to those pixels.
4. Set L := L + 1.
5. Go to Step 2.
Given below is a pseudocode Label(r,c) that implements Step 3 of the algorithm using 4-connectivity.

    Label(r,c):
    // Begin
    Store(r,c,L);
    If p[r][c-1] is 1 and unlabeled, Label(r,c-1);
    If p[r][c+1] is 1 and unlabeled, Label(r,c+1);
    If p[r-1][c] is 1 and unlabeled, Label(r-1,c);
    If p[r+1][c] is 1 and unlabeled, Label(r+1,c);
    // End
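A runnable counterpart of the recursive algorithm might look as follows. This is a sketch under stated assumptions: the function name `label_recursive`, the 0-based indexing and the explicit bounds checks are choices made for this example. Python's default recursion limit makes it suitable only for small images, which is in line with the remark that the recursive scheme is inefficient on a sequential processor.

```python
from typing import List


def label_recursive(p: List[List[int]]) -> List[List[int]]:
    """Label 4-connected components of binary image p; 0 means unlabelled."""
    n, m = len(p), len(p[0])
    label = [[0] * m for _ in range(n)]

    def visit(r: int, c: int, current: int) -> None:
        label[r][c] = current                       # Store(r, c, L)
        for dr, dc in ((0, -1), (0, 1), (-1, 0), (1, 0)):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < n and 0 <= cc < m
            if inside and p[rr][cc] == 1 and label[rr][cc] == 0:
                visit(rr, cc, current)              # Label(rr, cc)

    current = 1                                     # Step 1
    for r in range(n):                              # Step 2: scan for a seed
        for c in range(m):
            if p[r][c] == 1 and label[r][c] == 0:
                visit(r, c, current)                # Step 3
                current += 1                        # Step 4
    return label


image = [[1, 1, 0, 0],
         [0, 1, 0, 1],
         [0, 0, 0, 1]]
print(label_recursive(image))
# [[1, 1, 0, 0], [0, 1, 0, 2], [0, 0, 0, 2]]
```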
Label(r,c) is a recursive function. Store(r,c,L) assigns label L to a pixel at [r,c]. p[r][c] stores the value of the pixel at row r and column c. Hence, Label(r,c) labels the pixel at [r,c] and then, recursively, every unity-valued pixel that is 4-connected to it.

3.3.5 Labelling Algorithm 2: Sequential, 4-Connectivity

Sequential Connected Components Algorithm
1. Scan the binary image from left to right, top to bottom.
2. If a pixel at [i, j] is in S (i.e., its pixel value is 1), then for the two pixels at [i − 1, j] and [i, j − 1]:
   a) If one has been assigned a label (say, L1) and the other is not labelled, assign label L1 to the pixel at [i, j]. (Note that only pixels which have unity values are labelled.)
   b) If both have been assigned the same label (say, L2), assign label L2 to the pixel at [i, j].
   c) If both are assigned but with different labels, then assign the label of the pixel at [i − 1, j] to that at [i, j], and record the two labels in the equivalence table (as equivalent labels).
   d) If neither pixel is labelled, assign a new label L to the pixel at [i, j] and record it in the equivalence table.
3. If not all pixels in S are labelled, go to Step 2.
4. Determine the lowest-valued label for each equivalent-label set in the equivalence table.
5. Scan the image again and replace each label by the lowest-valued label in its equivalent-label set.

Equivalent labels are different label values assigned to pixels of the same connected component. Equivalent labels constitute an equivalent-label set, and these sets constitute an equivalence table. The algorithm requires two scans of the image. In the first scan (Steps 1-3), the connected components are found, and all the labels of the pixels in one component are put into its equivalent-label set. In the second scan (Steps 4-5), each label assigned to a pixel in the image is replaced by the lowest-valued label in its equivalent-label set.

Given below is a pseudocode that implements Step 2 of the algorithm, but with details of the equivalent label recording in Step 2 abstracted away. It assumes that the elements of array label are all initialized to 0; this means that all the pixels of a binary image are initially unlabelled. It also assumes that L is initialized to 1 once, before the scan starts.

    // Begin
    If p[r][c] == 1 {
        // Step 2a: only the upper pixel is labelled
        if (label[r-1][c] > 0 && label[r][c-1] == 0)
            label[r][c] = label[r-1][c];
        // Step 2a: only the left pixel is labelled
        if (label[r-1][c] == 0 && label[r][c-1] > 0)
            label[r][c] = label[r][c-1];
        // Steps 2b and 2c: both are labelled; take the upper label
        // (in Step 2c the two labels are also recorded as equivalent)
        if (label[r-1][c] > 0 && label[r][c-1] > 0)
            label[r][c] = label[r-1][c];
        // Step 2d: neither is labelled; start a new label
        if (label[r-1][c] == 0 && label[r][c-1] == 0) {
            label[r][c] = L;
            L = L+1;
        }
    }
    // End

Fig. 3.11 shows an example of the algorithm’s workings on an image. The equivalence table derived after Steps 1-3 is ET = {ES1, ES2}, and each equivalent-label set contains equivalent labels, namely, ES1 = {1, 3} and ES2 = {2, 4}. All labels in a set belong to the same connected component. After Step 4, the lowest-valued labels in ES1 and ES2 are 1 and 2 respectively. In Step 5, all the pixels with labels 3 and 4 are reassigned labels 1 and 2, respectively. The final outcome is two connected components, each uniquely identified by a different label assigned to all its pixels.
Fig. 3.11. An example illustrating the workings of the sequential connected components algorithm on an image
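The two-scan algorithm, including the equivalence recording that the pseudocode abstracts away, can be sketched as below. The equivalence table is represented here by a simple union-find `parent` list whose roots are always the lowest label of each set; that representation, the function name and the test image are assumptions of this example, not the book's implementation.

```python
from typing import List


def label_sequential(p: List[List[int]]) -> List[List[int]]:
    """Two-scan, 4-connectivity labelling of binary image p (0 = background)."""
    n, m = len(p), len(p[0])
    label = [[0] * m for _ in range(n)]
    parent = [0]          # parent[k] links label k towards its set's lowest label

    def find(k: int) -> int:
        while parent[k] != k:
            k = parent[k]
        return k

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)   # lowest label stays the root

    # First scan (Steps 1-3): provisional labels and equivalences.
    for r in range(n):
        for c in range(m):
            if p[r][c] != 1:
                continue
            up = label[r - 1][c] if r > 0 else 0
            left = label[r][c - 1] if c > 0 else 0
            if up and left:                     # Steps 2b and 2c
                label[r][c] = up
                if up != left:
                    union(up, left)
            elif up or left:                    # Step 2a
                label[r][c] = up or left
            else:                               # Step 2d
                parent.append(len(parent))
                label[r][c] = len(parent) - 1

    # Second scan (Steps 4-5): replace each label by the lowest equivalent one.
    for r in range(n):
        for c in range(m):
            if label[r][c]:
                label[r][c] = find(label[r][c])
    return label


image = [[1, 0, 1, 0, 1],
         [1, 0, 1, 0, 1],
         [1, 1, 1, 0, 0]]
print(label_sequential(image))
# [[1, 0, 1, 0, 3], [1, 0, 1, 0, 3], [1, 1, 1, 0, 0]]
```

In this image the U-shaped object first receives labels 1 and 2, which are found to be equivalent at its bottom-right pixel; the second scan merges them into label 1, while the separate object keeps label 3.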
3.3.6 Size Filtering

Noise is inherent in computer vision. Some ‘extraneous’ components in an image could appear due to noise arising from the unstable resolution of the camera and the uneven illumination in environmental lighting. The high irregularities of noise often result in many scattered noise components in the (labelled) binary image, but these components are usually small and have ragged contours.
In many applications such as robot soccer, the objects of interest have connected components that are individually of greater sizes (i.e., areas in terms of the number of pixels in them) than the biggest noise component. Therefore, one may use what is called size filtering to remove noise after component labeling. This involves changing all the pixel values of a component from 1 to 0 if the component area is less than an appropriately selected size filter Af . This simple filtering mechanism has been found to be very effective in removing noise. Fig. 3.12 shows an example.
Fig. 3.12. A noisy binary image and its resulting image after application of a size filter (Af = 8): (a) noisy image; (b) noise filtered image
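Size filtering on a labelled image takes only a few lines. In this sketch the function name `size_filter` and the test image are assumptions; components whose area is less than the filter value are set to 0, and the surviving components keep their labels.

```python
from collections import Counter
from typing import List


def size_filter(label: List[List[int]], a_f: int) -> List[List[int]]:
    """Remove every labelled component whose area is less than a_f pixels."""
    sizes = Counter(v for row in label for v in row if v > 0)
    return [[v if v > 0 and sizes[v] >= a_f else 0 for v in row]
            for row in label]


labelled = [[1, 1, 0, 2],
            [1, 1, 0, 0],
            [0, 0, 3, 3]]
print(size_filter(labelled, 3))
# [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
```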
3.4 Vision System For MiroSoT Robot Soccer

The purpose of the vision system in robot soccer is to compute the robots’ postures and the ball’s position through the processing of the situation images in the playground during the game. A robot’s direction of movement, or heading direction, is indicated by the robot’s heading angle, which is the angle its heading direction makes with the x-axis of the Cartesian x-y frame. In this section, we outline the basic steps in MiroSoT vision processing and the relation between a physical coordinate and its corresponding image coordinate, and provide examples on several practical but specialized aspects of the real vision systems used by previous FIRA Cup MiroSoT teams.

3.4.1 System Processing

The basic steps involved in MiroSoT vision processing are as follows:
1. Calibrate the vision system for colour recognition of target objects with respect to a colour model (we shall assume that the YUV colour model is used in our description):
   • This amounts to determining and setting the intensity ranges [Ymin, Ymax], [Umin, Umax] and [Vmin, Vmax] of every colour used for the target objects under the YUV colour model. According to the FIRA MiroSoT rules, each robot soccer team is distinguished by a team colour, either yellow or blue, shown on patches placed on top of its team robots. Additional colours can be placed to uniquely identify each individual robot of a team. The target objects in robot soccer are the robots and the ball; thus the YUV intensity ranges of each different colour for the team ID patch, robot ID patch and the ball need to be determined and set.
2. Run the vision system. When the vision system is running, it performs the following steps in a cyclic fashion.
   a) Obtain a binary image of target objects from a captured YUV-colour image.
      • By applying a thresholding technique against the YUV intensity ranges.
   b) Obtain a labelled connected components image from the binary image.
      • By applying a labelling algorithm to the binary image.
   c) Remove noise in the labelled connected components image.
      • By performing size filtering of the labelled connected components image.
   d) Determine the (centre) positions of the remaining connected components.
      • By applying Eq. (3.8) to each connected component (with other components ‘virtually’ removed).
   e) Recognize target objects from all the remaining connected components and compute the postures of the target robots.
      • Through various approaches that exploit the geometry of the target objects (i.e., the robots and the ball) and the layout of the colour patches on top of each robot.

During a soccer match, Steps 2a - 2e of vision processing (the SENSE functionality) are executed continually and, as shown in the flow chart of Fig. 3.13, work in a closed loop in conjunction with the DECIDE and ACT:Control functionalities.

Fig. 3.13. A high-level flow chart showing vision processing as a software component of a robot soccer host-system program (boxes: Start_Game, Set_Colour_Ranges, Grab_Image, Scan_Image, Label_Image, Locate_Objects, DECIDE and Generate CONTROL, Send_Command, Interrupt?, Stop_AllRobots, Stop_Game)

3.4.2 Image and Physical Coordinates on MiroSoT Playground

Before proceeding to the examples, we formalize the simple relationship between a physical coordinate point (x, y) in the 150cm × 130cm playground and the corresponding point (x^j, y^i), which is the image coordinate of the pixel at [i, j] (i.e., in the i-th row and j-th column) in its n × m image (i.e., an image of n rows and m columns of pixels). Note that in the image coordinate point (x^j, y^i), x^j and y^i correspond to column j and row i of the pixel, respectively. The physical and image coordinate frames are depicted in Fig. 3.14. The two frames are displaced such that the top-left corner of the playground has the image coordinate point (Jmin, Imin), Jmin ≥ 0 and Imin ≥ 0; and the bottom-right corner has the image coordinate point (Jmax, Imax), Jmax ≤ m and Imax ≤ n. Then, for an arbitrary image coordinate point (x^j, y^i), the corresponding physical coordinate point is (x, y), given by

\[ x = \frac{x^j - J_{min}}{J_{max} - J_{min}} \times 150\,\text{cm}, \qquad y = \frac{I_{max} - y^i}{I_{max} - I_{min}} \times 130\,\text{cm}. \tag{3.19} \]

The intra-pixel spatial resolution of the image is defined by
\[ \frac{L_g}{J_{max} - J_{min}} \ \text{cm per pixel (column-wise)}, \qquad \frac{B_g}{I_{max} - I_{min}} \ \text{cm per pixel (row-wise)}, \tag{3.20} \]

where the length Lg and breadth Bg of the playground are 150 cm and 130 cm, respectively. In general, the lower this intra-pixel spatial resolution value is (i.e., the fewer centimetres each pixel covers), the higher the accuracy of the computed physical position (x, y). Note that elsewhere in this book, we rely on context rather than notation to indicate whether a coordinate (x, y) is an image point or a physical point.

Fig. 3.14. On mapping the image and physical coordinate points in the playground

3.4.3 Example 1: System Hardware

Most participating teams of the previous FIRA Cups used similar vision hardware, differing only in the various vision software algorithms for computing the robots’ postures. There are many commercially available vision boards that support very high sampling rates, but these teams used the relatively cheaper boards that sample at 30 image frames per second.
Fig. 3.15. Frame grabber (Media camp 7 plus)
The hardware components of an example vision system are as follows:
1. Frame grabber: Media camp7 Plus (see Fig. 3.15) - DOOIN electronics
   a) Input: Video 1 / Video 2 / SVHS / TV.
   b) Real image capture: 240 × 320, 30 frames/sec.
   c) Video signal: NTSC / PAL.
2. Overhead camera: PULNiX TMC-7
   a) 768(H) × 494(V) resolution.
   b) Controllable shutter speed: 1/30 - 1/10,000 second.
   c) Lens: 8mm F1.3.
3. Host computer: Pentium PC.

The vision board uses the YUV colour model in 4 : 1 : 1 format. This ratio means that 4 consecutive colour-filtered signal point values (sampled, quantized and stored) in a row of the 2-D signal array constitute a pixel in the 2-D pixel array of the image captured; of these, two signal points are for the Y constituent colour, and one each is for the U and V constituent colours. Hence, the actual pixel resolution of the image is not 240 × 320, but 240 × 80.

3.4.4 Example 2: Vision Processing

This example illustrates the steps of vision processing for a MiroSoT team, up to and including Step 2d (see Section 3.4.1). The hardware in Example 1 is used.
1. Calibrate the vision system for colour recognition of target objects.
   • The calibration steps of determining and setting the YUV intensity ranges for the different colours of the team ID patch, robot ID patch and the ball are as follows:
   a) Store a sampled image into memory, and display it on the host monitor screen.
   b) Do a scan of the area where an object (team robot or ball) with the target colour is.
   c) Scrutinize the Y, U and V constituent colour values of pixels in that area to determine the maximum and minimum values of each constituent colour for the target colour. To lessen any deviation due to noise, these maximum and minimum values are adjusted and readjusted several times, possibly on a different team robot patch, until each and every image pixel of the object (or that part of the object) which has the target colour appears on the monitor screen in that colour.
   d) Store the finalized maximum and minimum values in computer memory.
   e) Repeat Steps 1c-1d for each different target colour.
2. Run the vision system
   a) Apply a thresholding technique against the YUV intensity ranges. In this process, the Y, U and V constituent values of every pixel in the 2-D colour image frame are checked to see if they fall in the corresponding constituent ranges of any target colour. If so, the value of the pixel is changed to a unity value, and its position is stored in association with the matched target colour.
   b) Apply a labelling algorithm to the binary image. The labelling algorithm used modifies Step 2 of the sequential labelling algorithm introduced in Section 3.3.5. The modified step is given below.
      If a pixel at [i, j] has unity value, then do the following:
      i. If position [i, j] is within the intra-component distance of an existing representative pixel of a component (say, with label L), then
         A. assign label L to the pixel,
         B. increment the pixel count for this component by 1.
      ii. Else,
         A. include the pixel as a representative of a new component and assign it a new component label,
         B. create and initialize the pixel count for this new component to 1.
      A component representative pixel is defined as the first unity-valued pixel found for a new component. A pixel at [i, j] is said to be within intra-component distance dc of a component representative pixel at [ir, jr] (and therefore is also a pixel of the component) if

      \[ \sqrt{(i - i_r)^2 + (j - j_r)^2} \le d_c. \]

      This intra-component distance is user-specified.
      The resulting algorithm is faster than the sequential algorithm but can erroneously merge separate objects into one bigger component if the objects come very close to one another. It assumes that the sizes of the target objects are known and will remain constant in the images captured; this assumption is, however, easily satisfied since the robots and the ball are of fixed sizes, and the overhead camera is viewing the whole playground vertically downward from a fixed height. Fig. 3.16 shows an example of a connected components image labelled by this modified algorithm.

Fig. 3.16. Example 1: Labelled components image (four components, 1 to 4, each drawn with its representative pixel and its component pixels)
   c) Do size filtering of the labelled connected components image. The size filter Af needs to be set. Suppose Af = 5; then component 4 in Fig. 3.16 will be filtered out (i.e., removed).
   d) Determine the (centre) positions of the remaining connected components. After filtering, three labelled connected components in Fig. 3.16, namely 1, 2 and 3, remain. Their centre coordinates (with respect to the image x-y axes) can be calculated using Eq. (3.8), and converted to physical coordinates using Eq. (3.19). For instance, for connected component 1, its centre position (x̄_1^j, ȳ_1^i) can be computed as follows:

\[ \bar{x}_1^j = \frac{3 \times 3 + 4 \times 3 + 5 \times 2}{8} = 3.875, \qquad \bar{y}_1^i = \frac{3 \times 2 + 4 \times 3 + 5 \times 3}{8} = 4.125. \]
The image pixel resolution is n × m, where n = 13 rows and m = 21 columns. Suppose that Imin = Jmin = 0, Imax = 13 and Jmax = 21, i.e., the playground and its image boundaries coincide. Then, by Eq. (3.19), the physical coordinate (x̄_1, ȳ_1) of the centre position of component 1 is given by

\[ \bar{x}_1 = \frac{3.875}{21} \times 150\,\text{cm} \approx 27.68\ \text{cm}, \qquad \bar{y}_1 = \frac{13 - 4.125}{13} \times 130\,\text{cm} = 88.75\ \text{cm}. \]
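The modified labelling step of this example and the image-to-physical conversion of Eq. (3.19) can be sketched together as follows. The function names, the return values and the small test image are assumptions of this example; the distance test uses the Euclidean distance to the representative pixel, and a pixel is attached to the first representative that is close enough.

```python
import math
from typing import List, Tuple


def label_by_representative(p: List[List[int]], d_c: float):
    """Attach each unity-valued pixel to the first representative within d_c.

    Positions are 1-based [i, j]. Returns (label image, representatives,
    pixel counts); component k + 1 has representative reps[k].
    """
    reps: List[Tuple[int, int]] = []
    counts: List[int] = []
    label = [[0] * len(row) for row in p]
    for i, row in enumerate(p, start=1):
        for j, v in enumerate(row, start=1):
            if v != 1:
                continue
            for k, (ir, jr) in enumerate(reps):
                if math.hypot(i - ir, j - jr) <= d_c:   # step i
                    label[i - 1][j - 1] = k + 1
                    counts[k] += 1
                    break
            else:                                       # step ii: new component
                reps.append((i, j))
                counts.append(1)
                label[i - 1][j - 1] = len(reps)
    return label, reps, counts


def image_to_physical(x_j: float, y_i: float, j_min: float, j_max: float,
                      i_min: float, i_max: float,
                      length_cm: float = 150.0, breadth_cm: float = 130.0):
    """Eq. (3.19): image point (x_j, y_i) to physical point (x, y) in cm."""
    x = (x_j - j_min) / (j_max - j_min) * length_cm
    y = (i_max - y_i) / (i_max - i_min) * breadth_cm
    return x, y


binary = [[1, 1, 0, 0, 0, 0],
          [1, 1, 0, 0, 0, 1]]
labels, reps, counts = label_by_representative(binary, d_c=2)
print(reps, counts)                                  # [(1, 1), (2, 6)] [4, 1]
print(image_to_physical(3.875, 4.125, 0, 21, 0, 13))
```

The second call reproduces the worked conversion for component 1, giving approximately (27.68, 88.75) in centimetres.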
3.4.5 Example 3: Information Extraction Following up on Example 2, this example shows how to recognize the target objects and compute the posture of each team robot. This addresses Step 2e of the vision process (see Section 3.4.1).
Fig. 3.17. A robot’s colour patch layout (two layouts, (a) and (b), each showing a robot ID colour patch and a team ID colour patch)
Two commonly used colour patch layouts for the top of a robot are as shown in Fig. 3.17. In Fig. 3.17(a), the placement of the robot ID colour square-patch on the upper left quadrant (on the square top of the robot), with the team ID
colour square-patch on the lower right quadrant, makes it possible to draw the robot’s orientation line through the centres of the square patches and the robot square top, and visually set the robot’s heading direction as shown in Fig. 3.18.
Fig. 3.18. Computing a robot’s posture
For the layout in Fig. 3.17(b), the robot’s orientation line can be drawn by applying the general technique introduced in Section 3.3.2 (on page 82) to the team ID colour hexagon-patch (placed obliquely on the square top of the robot). The placement of the robot ID colour triangle-patch on the upper right-hand corner, with the ‘base’ of the triangle-patch parallel to the orientation line, then provides a means to help visually set the robot’s heading direction.

In this example, the layout as shown in Fig. 3.17(a) is used. The target objects are recognized from the labelled connected components image as follows:
1. The Ball. Identified as the connected component with the largest number of pixels.
2. Team Robot. Identified from a pair of connected components, one representing a team ID colour patch and the other a robot ID colour patch. Let Dr and Dt contain the labels of components with robot ID colours and team ID colours, respectively, and do be the intra-object distance; do is user-specified. Then, the pairing is done using ‘Nearest Neighbour Association’, as follows:
   For each connected component L^r_p ∈ Dr, do the following:
   a) For each connected component L^t_q ∈ Dt,
      • compute the distance d_pq between the centre points (x^r_p, y^r_p) and (x^t_q, y^t_q) of components L^r_p and L^t_q, respectively, using

      \[ d_{pq} = \sqrt{(x^t_q - x^r_p)^2 + (y^t_q - y^r_p)^2}. \]

   b) Select component L^t_s for which
      • d_ps = min{d_pq | for all L^t_q, given L^r_p}.
   c) If the shortest distance d_ps ≤ do, then
      • pair up the components L^r_p and L^t_s.

After associating the (labelled) connected components with the target objects, each target robot’s posture can be computed. Referring to Fig. 3.18, the image coordinates (xr, yr) and (xt, yt) are, respectively, the centre points of the robot ID and team ID square colour patches that uniquely identify the team robot. The image coordinate (x, y) of the robot’s centre point can be calculated as follows:

\[ x = \frac{x_r + x_t}{2}, \qquad y = \frac{y_r + y_t}{2}. \tag{3.21} \]

To find the heading angle θ, the orientation angle θo should be computed first; this can actually be determined based on the general technique introduced in Section 3.3.2 (on page 82) by treating the team ID and robot ID colour patches as one whole, and using some means for direction indication. In this example, a simpler and perhaps more practical method is used by computing θo as follows:

\[ \theta_o = \tan^{-1}\left(\frac{y_{rt}}{x_{rt}}\right), \quad \text{where } y_{rt} = (y_r - y_t) \text{ and } x_{rt} = (x_r - x_t). \tag{3.22} \]

Note that y_rt and x_rt can be positive or negative, and their signs together define the heading direction of the robot. Fixing θo ∈ (−π, π],

\[ \theta_o \in \begin{cases} [0, \frac{\pi}{2}] & \text{if } y_{rt} \ge 0 \text{ and } x_{rt} \ge 0, \\ (\frac{\pi}{2}, \pi] & \text{if } y_{rt} \ge 0 \text{ and } x_{rt} < 0, \\ (-\frac{\pi}{2}, 0) & \text{if } y_{rt} < 0 \text{ and } x_{rt} > 0, \\ (-\pi, -\frac{\pi}{2}] & \text{if } y_{rt} < 0 \text{ and } x_{rt} \le 0. \end{cases} \]

θ ∈ (−π, π] is then obtained as follows:

\[ \theta = \begin{cases} 2\pi + (\theta_o - \frac{\pi}{4}) & \text{if } \theta_o \in (-\pi, -\frac{3}{4}\pi], \\ \theta_o - \frac{\pi}{4} & \text{otherwise.} \end{cases} \tag{3.23} \]
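The pairing and posture steps can be sketched in Python as follows. Several points are assumptions of this example: the function names, the use of list indices in place of component labels, and the use of `math.atan2`, which resolves the quadrant of θo from the signs of y_rt and x_rt in one call. The patch centres are assumed to be expressed in a frame whose y-axis points upward, as in Fig. 3.18 (for instance after conversion with Eq. (3.19)).

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def pair_patches(robot_centres: List[Point], team_centres: List[Point],
                 d_o: float) -> List[Tuple[int, int]]:
    """Nearest Neighbour Association: (robot patch index, team patch index)."""
    pairs = []
    if not team_centres:
        return pairs
    for p, (xr, yr) in enumerate(robot_centres):
        dists = [math.hypot(xt - xr, yt - yr) for xt, yt in team_centres]
        s = min(range(len(dists)), key=dists.__getitem__)   # nearest team patch
        if dists[s] <= d_o:
            pairs.append((p, s))
    return pairs


def posture(robot_centre: Point, team_centre: Point) -> Tuple[float, float, float]:
    """Robot centre (x, y) and heading angle theta from the two patch centres."""
    (xr, yr), (xt, yt) = robot_centre, team_centre
    x, y = (xr + xt) / 2, (yr + yt) / 2       # Eq. (3.21)
    theta_o = math.atan2(yr - yt, xr - xt)    # Eq. (3.22), quadrant-aware
    theta = theta_o - math.pi / 4             # Eq. (3.23)
    if theta <= -math.pi:                     # theta_o in (-pi, -3*pi/4]
        theta += 2 * math.pi
    return x, y, theta


robots = [(10.0, 10.0), (50.0, 50.0)]
teams = [(52.0, 48.0), (12.0, 12.0), (100.0, 100.0)]
print(pair_patches(robots, teams, d_o=5.0))   # [(0, 1), (1, 0)]
print(posture((1.0, 1.0), (0.0, 0.0)))        # centre (0.5, 0.5), heading about 0
```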
3.4.6 Example 4: Window Tracking for Fast Vision Processing To achieve the maximum of 30 actuation commands communicated per second using the hardware of Example 1, the system processing cycle time (of vision processing, deciding, controlling and communicating) must not exceed the frame sampling time of about 33ms. In an attempt to achieve this, this example presents a specialized method, called ‘window tracking’, that can effectively reduce the vision processing time. The window tracking method significantly reduces the time required to locate the target objects in the image frame. This is because, instead of always scanning the whole image frame to determine the postures of the target objects, it only scans a much smaller rectangular window area in the image frame for each object estimated or predicted to be located within the window. This is done by first scanning an image frame to determine the initial centre position of each target object in the image. A small rectangular window area is then defined for each object in the image, enclosing the object completely and fixing its window centre at the centre position of the object. Using an appropriate window size, the object often remains within the window in the next image frame, and hence only this window area needs to be scanned to determine the object’s new position and heading angle as captured in this subsequent frame. This basic method is illustrated in Fig. 3.19, where robot image 0 is in the initial image frame, and every robot image z, z ≥ 1 is in a subsequent image frame, and lies inside the window area z centred at the centre coordinate of the previous robot image z − 1. 
However, an object could ‘escape’ from such a window due to a collision or abrupt acceleration. One way to mitigate this problem is to predict the image position of each target object in the next frame from its current velocity, which is first calculated from its positions and heading angles in the previous and the current image frames. The window area for the object is then defined with its centre fixed at the object’s predicted centre position (for the next frame), instead of at the object’s centre position in the current frame, as in the basic method. This modification has been found to be very effective.

Fig. 3.19. Basic window tracking
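A minimal sketch of the predictive variant is given below. It is an illustration rather than the implementation used in the example system: the names `Window` and `predict_window` are hypothetical, the velocity is assumed constant over one frame and is estimated from the two most recent centre positions only (heading angles are ignored here), and the window is simply clamped to the image borders.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Window:
    """Axis-aligned search window in pixel coordinates (inclusive bounds)."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def predict_window(prev_xy: Tuple[float, float],
                   curr_xy: Tuple[float, float],
                   half_size: int,
                   frame_w: int,
                   frame_h: int) -> Window:
    """Centre the next search window on a constant-velocity prediction."""
    # Displacement over one frame serves as the velocity estimate (pixels/frame).
    vx = curr_xy[0] - prev_xy[0]
    vy = curr_xy[1] - prev_xy[1]
    # Predicted centre of the object in the next frame.
    cx = curr_xy[0] + vx
    cy = curr_xy[1] + vy
    # Clamp the window so that it never leaves the image frame.
    return Window(
        x_min=max(0, int(round(cx - half_size))),
        y_min=max(0, int(round(cy - half_size))),
        x_max=min(frame_w - 1, int(round(cx + half_size))),
        y_max=min(frame_h - 1, int(round(cy + half_size))),
    )
```

If the object is not found inside the predicted window, a natural fallback (again an assumption, not stated in the text) is to rescan the whole frame, as in the initialization step of the basic method.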
Notes on Selected References The vision requirements for robot soccer have also been examined by other researchers [17, 18, 19, 20]. Computer vision has been an area of active research. The book [21] provides a good introduction to vision. Aspects of digitization are covered in detail in books on image processing [16]. There are several excellent references on image processing [16, 22] and computer graphics [23, 24] that contain information useful to computer vision. The reader interested in knowing more about pattern recognition should refer to [25]. There are also many good books on artificial intelligence, including [26, 27, 28].
4. How to Decide and Act? Use Intelligent Systems and Control Techniques
4.1 Introduction In robot soccer, the game situation in the playground is typically read in terms of the robots’ postures and the ball’s position. Using real-time information about this dynamically changing game situation, the host computer program of a command-based robot soccer team needs to continually decide the role and action of each team robot, and to direct each robot to perform the selected action. The purpose, of course, is to get the team robots to exhibit some artificial form of cooperation, manifested by their coordinated movements, ball passing, running into proper postures and ball shooting in the goal-ward direction as often as the opportunity arises in the course of the match.

Performing an action requires one or more skills executed in a procedure; each skill a robot executes in turn registers an observable effect called behaviour¹; hence a skill is often termed behavioral. In robot soccer, an example of a behaviour is a soccer robot slowing down to a halt in executing an actuation skill for a simple ‘stop’ action. Such a behaviour is emergent in the sense that it is generally not known a priori, since it depends on the conditions under which the robot executes the behavioral skill.

Naturally, good strategies are needed to decide the roles and actions of the team robots during the game. But to be effective, they need to be organized into an architecture for proper management and control. The general performance of the host system depends on the selected control architecture, designed actions and embedded behavioral skills. The majority of control architectures found in the literature can be classified into two types, namely, hierarchical and behaviour-based. The fundamental differences between them lie in the way their computational processes are interconnected, and the type of data these processes handle.
The hierarchical architecture decomposes a system into processes connected in series, with each converting one type of data to another. In other words, the data is said to be ‘transformed’ as it flows from the (visual) sensors to a series of perception processes, to a decision-making process, and through a selected action process consisting of a series of skill processes that run the actuators.

¹ The absence of any effect due to a skill is called a null behaviour.
J.-H. Kim, D.-H. Kim, Y.-J. Kim, K.-T. Seow: Soccer Robotics, STAR 11, pp. 103-140, 2004. © Springer-Verlag Berlin Heidelberg 2004
The behaviour-based architecture is centred on arbitration mechanisms that select one of the possible actions or behavioral skills to perform in reaction to changes in the game situation. As explained later, a hybrid of these two control architectures is recommended to facilitate the design of the host system for a MiroSoT team.

From the control architecture perspective, the SENSE primitive has the distinct functionality of providing information feedback and can therefore be conceptually isolated as an independent system component that is referenced implicitly in the architecture. However, one cannot in general talk about the functionality of the DECIDE primitive without incorporating the functionality of the ACT primitive. This should be clear once we realize that an important consideration of a control architecture lies in its ability to decide which action to take next in a dynamically changing environment.

This chapter focusses on the DECIDE and ACT primitives. To reflect the conceptual dependence of DECIDE on ACT, it covers both primitives, unified through a hybrid control architecture of relevance to the study and development of robot soccer systems. That the SENSE primitive is an integral part of the architecture is implicit. The architecture introduced integrates the three primitives of SENSE, DECIDE and ACT in a hierarchy of four interacting levels, namely, role, action, behaviour and execution. To expose the technical challenges involved, example strategies at the role level and action level are presented. Action designs for robot soccer, to be implemented at the behaviour level, are classified and explained. An overview of classical PID control, applicable at the behaviour level and execution level, follows. Finally, two different navigation methods, applicable at the behaviour level, are presented.
4.2 Hybrid Control Architecture The recommended control architecture for a MiroSoT team is a hybrid of hierarchical and (reactive) behaviour-based structures. The former facilitates modularity of design, whether top-down or bottom-up, while the latter enables high reactivity in the dynamic game situation. The proposed hybrid control architecture, shown in Fig. 4.1, combines these two useful features. Table 4.1 shows the mapping of the hierarchy of the hybrid control architecture onto the robot primitives discussed in Chapter 1; the hierarchy has four levels, and the function of each level is summarized. The reader should relate Fig. 4.1 to the block diagram of Fig. 1.8 on page 22. The rest of Section 4.2 elaborates on the four levels in detail.
Fig. 4.1. A hybrid control architecture for robot soccer (MiroSoT Category)
4.2.1 Role Level: The Who Issue At this level is a role assigner that chooses a role and an area of manoeuvre in the playground for each robot according to the game situation and the role-level strategy. Two self-explanatory but significant situations that a MiroSoT robot system designer would need to consider are depicted in Fig. 4.2.

4.2.2 Action Level: The What Issue The team robots should take appropriate actions according to their different roles as attacker, defender and goalkeeper. At this level is an action selection mechanism that selects an appropriate action for each team robot in an assigned role, according to the online game situation and the adopted action-level strategy. The action-level strategy for each role can be designed as a supervised discrete-event system modelled by a formalism such as Petri nets or automata. Each state in the modelled strategy is a ‘decision point’ that should capture a significant discrete game situation, at which only a subset of actions is available, from which the mechanism selects one for the team robot in the assigned role. The strategy is updated from one state to another in response to the completion of the selected action, or the occurrence of a significant event in the game situation. The design of a robot’s actions will be presented later, in Section 4.4.

Table 4.1. Robot primitives and hierarchy levels defined for robot soccer (MiroSoT Category)

ROBOT PRIMITIVE          HIERARCHY LEVEL   FUNCTION
DECIDE [D]               1 Role            Decide if each robot should be defender, attacker or goalkeeper.
DECIDE [D]               2 Action          Select action of each robot (e.g. shooting, blocking, pushing).
ACT [A]: Control [C]     3 Behaviour       Move with or without obstacle avoidance to perform selected actions.
ACT [A]: Actuation [M]   4 Execution       Actuate motors with or without control.

4.2.3 Behaviour Level: The How Issue In this hybrid architecture, every action needs to be performed through the execution of certain skills to bring about the intended behaviours. In robot soccer, the behavioral skills need to be implemented to bring about the behaviours of moving with and without obstacle avoidance. The obstacles of interest include the ball, team robots and opponent robots. This level embeds the behaviour-based architecture for reactive control, in that under the scope of a selected action, the posture control continually generates the desired wheel velocities for each team robot, to bring about the resultant move behaviours in the robots in direct response to detected changes in the game situation.
Fig. 4.2. Situational problems encountered by role-level assigner (robots labelled Defender, Attacker and Opponent team)
4.2.4 Execution Level: The Motion Issue This level implements velocity control that drives the motors of each team robot. A two-wheel MiroSoT robot exercises a move behaviour by driving its motors through a skilful combination of left-wheel and right-wheel velocities input from the behaviour level. These possibly different and dynamically changing reference wheel velocities continuously modify the directional turn and motion of the robot. The robot is said to be proactive when it exercises a move behaviour towards a desired posture, and reactive when it also has to move in avoidance of an obstacle in its path.
4.3 Example Strategies Devising good strategies is important to winning a robot soccer game, especially when the opponent team’s ability is technically comparable in terms of the robot hardware and vision system used. Strategy can be implemented using ‘if-then’ rules or other knowledge representation formalisms, at different levels of the hierarchy. To drive home the idea of a strategy, we now present two related examples, one at the role level and the other at the action level.
4.3.1 Action-Level Strategy We consider an action-level strategy that enforces prioritized selection of actions defined for each role a robot can assume.
Fig. 4.3. Basis areas of manoeuvre for zone defence strategy (Small League MiroSoT Category). Labelled regions: team goal, opponent goal, left-wing zone, right-wing zone and common area.
In formulating such a strategy for Small League MiroSoT, the playground is divided into three basis areas of manoeuvre or zones for the keeper, left-wing and right-wing robots, as depicted in Fig. 4.3. The left-wing and right-wing areas share the common area. Each robot is assigned a role and a different zone as decided by the role assigner at the top-most level. The strategy obeys the following general rule: Whenever a selected action has finished or a significant event in the playground has occurred,
1. first select an available action for the team robot in whose assigned zone the ball is located; call it the leading robot;
2. then select the available actions for the other two robots that either coordinate or do not conflict with the selected action for the leading robot.
Consider the instance when the ball enters the keeper zone, and the goalkeeping robot is in the zone. In this case, the SweepBall action should be selected for the goalkeeping robot while the Block action or other actions are selected for the other two robots; in this way, the goalkeeping robot could act to push the ball away from the keeper zone, while the other team robots could either help with the defence by blocking the ball shot at goal or move away in cooperation so as not to hinder the action of the goalkeeping robot, thereby reducing the risk of conceding a goal.

4.3.2 Role-Level Strategy To reiterate, at the role level is a role assigner that determines a role and a zone in the playground for each robot according to the game situation and the role-level strategy. Here, we consider a role-level strategy (used by the role assigner) that addresses the problems of fault occurrence in a team robot and robot-blocking by opponent robots. It is designed to work with the example action-level strategy described in the preceding section. The strategy obeys the following general rule: Whenever a team robot malfunctions or gets blocked by an opponent,
1. re-draw the zone boundaries;
2. assign a suitable role and a re-defined zone to each of the remaining two robots.
Consider, for instance, a critical situation when the goalkeeping robot malfunctions due to weak batteries. In this case, one of the two remaining robots should be assigned the role of goalkeeping covering the keeper zone, while the other assumes the role of an attacker covering both the left-wing and right-wing as its re-defined zone.

Other related role-level strategies that can be incorporated include formation strategies that affect how the roles and variable zones are assigned to the team robots; for example, in a Middle League (i.e., 5-a-side) MiroSoT game, a ‘1-0-4’ formation strategy (a goalkeeper and four attackers) can be used against a weak opponent team; other formations are ‘1-2-2’ (a goalkeeper, two defenders and two attackers), ‘1-1-3’ (a goalkeeper, a defender and three attackers) and ‘1-3-1’ (a goalkeeper, three defenders and an attacker). These formations are depicted in Fig. 4.4. In general, the selection of role-level strategies is decided by the human manager.
In MiroSoT, this can be done by implementing a ‘strategy database’, from which appropriate strategies are selected and loaded prior to the game and during the half-time interval, depending on how the human manager reads the game.
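The role-level rule of Section 4.3.2 can be illustrated with a small sketch. Everything named here is hypothetical: the zone labels, the `reassign_roles` function, and in particular the tie-break that promotes the remaining robot nearest to the team goal when the goalkeeper fails, as well as the choice to let the surviving wing robot cover both wings when a wing robot fails. The text only requires that one remaining robot keeps goal and the other covers both wings when the goalkeeper malfunctions.

```python
from typing import Dict, Tuple

# robot id -> (role, zones covered); illustrative data layout only
Assignment = Dict[int, Tuple[str, Tuple[str, ...]]]

BOTH_WINGS = ("left-wing", "right-wing")


def reassign_roles(assignment: Assignment,
                   failed_id: int,
                   dist_to_own_goal: Dict[int, float]) -> Assignment:
    """Return roles and re-defined zones for the robots still in play."""
    remaining = [rid for rid in assignment if rid != failed_id]
    failed_role, _ = assignment[failed_id]
    new_assignment: Assignment = {}

    if failed_role == "goalkeeper":
        # Assumed tie-break: the robot closest to the team goal takes over.
        keeper = min(remaining, key=lambda rid: dist_to_own_goal[rid])
        for rid in remaining:
            if rid == keeper:
                new_assignment[rid] = ("goalkeeper", ("keeper",))
            else:
                new_assignment[rid] = ("attacker", BOTH_WINGS)
    else:
        # Assumed policy: the keeper stays, the other wing robot covers both wings.
        for rid in remaining:
            role, zones = assignment[rid]
            if role == "goalkeeper":
                new_assignment[rid] = (role, zones)
            else:
                new_assignment[rid] = (role, BOTH_WINGS)
    return new_assignment
```

In a full system such a function would be one entry of the ‘strategy database’, invoked by the role assigner whenever a malfunction or blocking event is detected.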
4.4 Design of Robot Soccer Actions Central to a good robot soccer system is a set of carefully designed actions that model the key aspects of game play, namely, save for the goalkeeper, and stop, pass, dribble and shoot for the other players. Of course, given that the MiroSoT robots move on wheels, ball passing and dribbling are done by ‘bumping’ against the ball, and saving by blocking it; these are behaviorally less skilful (or more naive) than what a human player does, or what a humanoid player might be able to do in the future. This section presents a set of basic actions organized into the base class and three other classes according to the roles, namely, attacking, defending and goalkeeping. What each action does is described, without specifying how it performs its function(s), for which there are possibly many ways. The purpose of this section is to give the reader conceptual ideas of the game-specific actions.

Fig. 4.4. Formations for Middle League MiroSoT: (a) ‘1-2-2’, (b) ‘1-1-3’, (c) ‘1-3-1’, (d) ‘1-0-4’

4.4.1 Base Class: Primitive As the word base suggests, the actions in this class are meant for every team robot, regardless of its role.
Stop and Wandering Actions. The Stop action is selected to halt a team robot.
Fig. 4.5. Wandering action
The Wandering action is selected for a team robot to move to and fro in parallel to the length of the playground (i.e., the x-axis) as depicted in Fig. 4.5, whenever the robot has no other appropriate action to perform; this happens especially when one other robot has been deemed more suitable to go for the ball.

SweepBall and Position To SweepBall Actions. The SweepBall action is selected for a team robot to attempt to take out a ball bumped into a corner of the playground. The general direction of ball sweeping at each of the four corners is depicted in Fig. 4.6. If the robot’s posture does not permit it to perform the action, then the Position To SweepBall action is selected first to put the robot in a new posture in anticipation of performing the SweepBall action subsequently.

Fig. 4.6. SweepBall action

4.4.2 Attacker Class: Shoot The actions in this class are defined for an attacking robot. These actions bring about the robot behaviours of ball shooting.

Shoot Action. This action is selected for a team robot to strike the ball through a given target point towards goal; the basic idea is to get the robot to move along a trajectory towards a desired point, with the ball in the trajectory path, as depicted in Fig. 4.7. The desired robot posture (i.e., the desired point facing an appropriate heading direction) would need to be determined first. This action is selected for the robot whenever it is deemed to be in a ‘shootable’ area. As Fig. 4.7 illustrates, a robot’s shootable area is marked off by the ball coming in front of the robot (facing it) and the target point, and a suitable distance between the robot and the target point. Note that the robot need not be directly facing the goal (or the target point). What is deemed a suitable distance between the robot and the target point is usually set offline, and is in general a subjective decision of the human manager.
Fig. 4.7. Shoot action (B: ball, TP: target point, DP: desired point)
Cannon Shoot Action. This action is similar to Shoot, but is selected for a team robot to strike the ball towards goal without a given target point. This action is selected whenever the robot is facing (a sufficiently wide gap in) the opponent goal, with the ball directly in-between, as depicted in Fig. 4.8.
Fig. 4.8. Cannon Shoot action
Position To Shoot Action. In attacking, the Shoot and Cannon Shoot actions are given the highest priorities, and so the conditions to select either one are always checked first. Whenever neither can be selected, the Position To Shoot action is selected for the team robot to move to a suitable posture in anticipation of performing one of the shoot actions subsequently. This is depicted in Fig. 4.9.
Fig. 4.9. Position To Shoot action
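The prioritized selection described for the attacker class can be sketched as an ordered rule list that is scanned from the highest priority downwards. The sketch is illustrative only: the predicate names in the situation dictionary are hypothetical, and the relative order of Shoot and Cannon Shoot is an assumption, since the text gives both the highest priority without ranking them against each other.

```python
from typing import Callable, Dict, List, Tuple

Situation = Dict[str, bool]
Rule = Tuple[str, Callable[[Situation], bool]]

# Highest priority first; the last rule is the unconditional fallback.
ATTACKER_RULES: List[Rule] = [
    ("Shoot", lambda s: s.get("in_shootable_area", False)),
    ("CannonShoot", lambda s: s.get("facing_goal_gap_with_ball_between", False)),
    ("PositionToShoot", lambda s: True),
]


def select_action(rules: List[Rule], situation: Situation) -> str:
    """Return the first action whose selection condition holds."""
    for name, condition in rules:
        if condition(situation):
            return name
    raise ValueError("rule list has no applicable action")
```

The same scanning function could serve the other roles by supplying a different rule list, which keeps the action-level strategy data-driven rather than hard-coded.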
4.4.3 Defender Class: Push The actions in this class are defined for a defending robot, and they bring about the robot behaviours of ball pushing.
Fig. 4.10. PushBall action (TR: target range)
PushBall Action. The PushBall action is an ‘inverse’ of the Shoot action, in the sense that while the latter is selected for a robot to move towards a desired point, striking the ball in its path to roll it towards a target point at the opponent goal, the former is selected for the robot to move towards a desired point, striking the ball to roll it away from the target lateral range (width of the goal mouth) at its own team goal, as depicted in Fig. 4.10. The desired robot posture (i.e., the desired point facing an appropriate heading direction) would need to be determined first. This action is selected for the robot whenever it is deemed to be in a ‘pushable’ area. As Fig. 4.10 illustrates, a robot’s pushable area is marked off by the ball coming in front of the robot (facing it) and the target range, and a suitable distance between the target range and the robot. Note that the target range (i.e., the team goal) need not be directly behind the robot. As for the Shoot action, what is deemed a suitable distance between the target range and the robot is usually set offline, and is in general a subjective decision of the human manager.

Position To PushBall Action. In defending, the PushBall action is given the highest priority, and so the conditions to select it are always checked first.
Fig. 4.11. Position To PushBall action (TR: target range, DP: desired point, B: ball)
Whenever it cannot be selected, the Position To PushBall action, an inverse of the Position To Shoot action, is selected for the team robot to move to a suitable posture in anticipation of performing the push action subsequently. This is depicted in Fig. 4.11.

Screen Out Ball Action. This action is selected for a team robot to move parallel to its own team goal line, outside the penalty box, whenever the ball is located close to the team goal. The purpose is to block the opponent robots or the ball, to prevent the ball from being struck either by an opponent robot or accidentally by itself towards its own team goal. Of course, the x-distance between the trajectory of this action and the goal line is an offline decision determined subjectively by the human manager.

4.4.4 Goalkeeper Class: Block The actions of this class are defined for the goalkeeping robot to bring about its behaviours of ball blocking, usually within the penalty box.

BlockBall Action. In goalkeeping, this action is of the highest priority, and is selected for the goalkeeping robot to move parallel to its own team goal line towards a desired point, whenever the ball is struck towards the team goal. The ball velocity (i.e., speed and direction of move) has to be estimated first from the images of previous and current ball positions over an image sampling time period. Assuming that this ball velocity is constant, the time required for the robot to move to the desired point to intercept the ball can be calculated, using which the desired robot velocity can be computed. The ball intercept point is at the intersection of the predicted ball trajectory and the dotted line next to the one along which the goalkeeping robot moves, as depicted in Fig. 4.12.

Fig. 4.12. BlockBall action (DP: desired point, BIP: ball intercept point, B: ball)

KDefaultPosition and KDefendGoal Actions. The KDefendGoal and KDefaultPosition actions are selected for the goalkeeping robot to hold its position against the attack of the opponent robots. KDefendGoal is selected for the robot to move parallel to its team goal line to a position that has the same y-coordinate as the predicted target point of the ball, whenever the x-distance between the team goal and the ball is in a predetermined range (for example, [30cm, 50cm]). KDefaultPosition is selected for the robot to move parallel to its team goal line to a predetermined position, whenever the x-distance between the team goal and the ball is less than the lower bound of a predetermined range for selecting the KDefendGoal action (for example, 30cm).

Keeper Attack and KNeedEscapeGoal Actions. The KNeedEscapeGoal action is selected for the goalkeeping robot to exit from inside its own goal whenever it enters it due to inexact visual information or pushing by an opponent robot.
The Keeper Attack action is selected for the goalkeeping robot to push the ball by itself whenever the ball has stayed close to the team goal for a sufficiently long time in which no other team robot could push it away towards the opponent’s half of the playground.
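The computation behind the BlockBall action (ball velocity estimate, intercept time, desired keeper velocity) can be sketched as follows. This is a simplified illustration under stated assumptions: the function name is hypothetical, the ball velocity is taken as constant, the goalkeeper moves only along y, and the intercept line `intercept_x` is treated as a given constant rather than derived from the geometry of Fig. 4.12.

```python
from typing import Optional, Tuple


def block_ball_command(ball_prev: Tuple[float, float],
                       ball_curr: Tuple[float, float],
                       dt: float,
                       keeper_y: float,
                       intercept_x: float) -> Optional[Tuple[float, float]]:
    """Return (intercept_y, keeper_velocity_y), or None if no intercept exists."""
    # Ball velocity estimated from two consecutive image samples.
    vx = (ball_curr[0] - ball_prev[0]) / dt
    vy = (ball_curr[1] - ball_prev[1]) / dt
    dx = intercept_x - ball_curr[0]
    # No intercept if the ball is not moving in x or is moving away from the line.
    if vx == 0.0 or dx / vx <= 0.0:
        return None
    t_hit = dx / vx                          # time until the ball reaches the line
    intercept_y = ball_curr[1] + vy * t_hit  # ball intercept point on the line
    keeper_vy = (intercept_y - keeper_y) / t_hit
    return intercept_y, keeper_vy
```

For instance, a ball moving from (100, 50) to (90, 52) in 0.5 s reaches the line x = 10 after 4 s at y = 68, so a keeper at y = 60 needs a velocity of 2 units per second along y. In practice the returned velocity would still have to be limited to what the robot can achieve.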
4.5 Control Basics This section presents the classical techniques that can be applied at the behaviour and execution levels. At the behaviour level, the reference (or setpoint) input is the desired posture of every team robot; in other words, this level implements robot posture control, i.e., the control purpose is to (try to) bring each robot to its referenced position and heading angle by computing and recomputing the desired wheel velocities of the robot. This is navigation control. It implements the ‘Intelligent Control’ block of Fig. 1.8. At the execution level, the reference inputs are the desired left-wheel and right-wheel velocities of every team robot; in other words, this level implements velocity control, i.e., the control purpose is to drive the motors of each robot accordingly to attain the desired wheel velocities. This is motion control. It implements the ‘Actuation’ block of Fig. 1.8.

4.5.1 PID Control The PID controller is the most common form of feedback in use today. The structure of a PID controller is simple. PID stands for
P : Proportional control,
I : Integral control,
D : Derivative control.
These three terms treat the current control error (P), past control errors (I), and predicted future control errors (D). To understand the operation of a PID feedback controller, the three terms should be considered separately. Proportional Control. Proportional control is a pure gain adjustment acting on the error signal to provide the driving input to the process known commonly as the plant. The P term in the PID controller is used to adjust the speed of the system. Integral Control. Integral control is implemented through the introduction of an integrator. Integral control is used to provide the required accuracy for the control system.
Fig. 4.13. Block diagram representation of a PID controller
Derivative Control. Derivative control is normally introduced to increase the damping in the system. But the derivative term also amplifies the existing noise, and this can cause problems including instability.

Continuous Function. Fig. 4.13 shows a common block diagram representation of a PID controller (in parallel form). If we now look at its general continuous time function, the three terms can be recognised as follows:

\[
u(t) = K_P \cdot e(t) + K_I \cdot \int_0^t e(\tau)\, d\tau + K_D \cdot \dot{e}(t),
\tag{4.1}
\]

where the error signal e(t) = r(t) − y(t), and
1. KP is the proportional (P) gain,
2. KI is the integral (I) gain given by \( K_I = K_P / T_i \), where Ti is the integral (or reset) time constant,
3. KD is the derivative (D) gain given by \( K_D = K_P \cdot T_d \), where Td is the derivative (or rate) time constant.
Proportional feedback control can lead to reduced errors to disturbances but still has a small steady-state error; it can also increase the speed of response but typically at the cost of a larger transient overshoot. If the controller has a term proportional to the integral of the error, the error in response to a step input can be eliminated, but there tends to be a further deterioration of the dynamic response. Finally, the addition of a term proportional to the error derivative can add damping to the dynamic response. These three terms combined form the classical PID controller.
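As a short worked step, substituting the gain definitions KI = KP/Ti and KD = KP · Td into the PID law of Eq. (4.1) gives the equivalent time-constant form, in which KP scales all three terms:

```latex
\[
u(t) = K_P \left[ e(t) + \frac{1}{T_i} \int_0^t e(\tau)\, d\tau + T_d\, \dot{e}(t) \right].
\]
```

This form makes explicit that increasing KP alone strengthens the integral and derivative actions as well, which is one reason the gain parameters interact during tuning.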
Digital Function. The PID control introduced is continuous and is built using analog electronics such as resistors, capacitors and operational amplifiers. However, a robot soccer system and most control systems today use digital computers (usually microprocessors or microcontrollers) with the necessary input/output hardware to implement the controllers. To implement in a digital computer, the PID control law described by Eq. (4.1) has to be converted to a digital form. One simple way to make a digital approximation of the real-time solution of differential equations is to use the following relationship:

\[
\dot{x} \approx \frac{x(k) - x(k-1)}{T_S},
\tag{4.2}
\]

where
• k is an integer and k ≥ 1,
• TS = tk − tk−1 (the sample interval in seconds),
• tk = kTS (for a constant sample interval),
• x(k) is the value of x at tk,
• x(k − 1) is the value of x at tk−1.
This approximation can be used in place of all derivatives in the controller differential equations to convert them to a set of difference equations that can be solved repetitively with time steps of length TS by a digital computer. Differentiating the continuous PID control law (i.e., Eq. (4.1)) with respect to time, we get

\[
\dot{u}(t) = K_P \cdot \dot{e}(t) + K_I \cdot e(t) + K_D \cdot \ddot{e}(t).
\tag{4.3}
\]

Thus, applying Eq. (4.2) once to the left-hand side and the P term, and twice to the D term of Eq. (4.3), and substituting KI = KP/Ti and KD = KP · Td, we get the digital form of the PID control law as the following difference equation:

\[
u(k) = u(k-1) + K_P \left[ \left(1 + \frac{T_S}{T_i} + \frac{T_d}{T_S}\right) e(k) - \left(1 + 2\,\frac{T_d}{T_S}\right) e(k-1) + \frac{T_d}{T_S}\, e(k-2) \right].
\tag{4.4}
\]

Tuning. The different gain parameters (KP, KI, KD) of a PID controller interact with one another. Therefore, to meet the design specification for a control system, these parameters must be adjusted or tuned. In general, a time domain specification is given by a desired system step response in terms of the rise time, settling time and overshoot. The step response refers to the system output signal variation over time in response to a unit step input. Rise time refers to the time needed for the system to reach the vicinity of its final value; settling time refers to the time needed to settle, i.e., for the transients to decay away; and overshoot refers to the maximum amount by which the system exceeds its final value, divided by its final value.

It is understood that, perhaps through the sharing of knowledge and experience among many MiroSoT participants, such controllers have often been tuned by trial and error, with reasonable success. However, it is often best to resolve this matter as systematically as possible. Systematic methods for automatic tuning are available, and we refer the reader to the textbook [29] and the journal issue [30] for an excellent exposition of these methods.
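The difference equation (4.4) translates almost directly into code. The class below is a minimal sketch, not a production controller: the class and attribute names are hypothetical, the initial values u(0) and the past errors are assumed to be zero, and output saturation and anti-windup (which a real motor controller would normally need) are deliberately left out.

```python
class IncrementalPID:
    """Digital PID in the incremental form of Eq. (4.4)."""

    def __init__(self, kp: float, ti: float, td: float, ts: float) -> None:
        self.kp = kp          # proportional gain K_P
        self.ti = ti          # integral (reset) time constant T_i, must be > 0
        self.td = td          # derivative (rate) time constant T_d
        self.ts = ts          # sample interval T_S in seconds, must be > 0
        self.u_prev = 0.0     # u(k-1), assumed zero at start
        self.e_prev1 = 0.0    # e(k-1)
        self.e_prev2 = 0.0    # e(k-2)

    def update(self, reference: float, measurement: float) -> float:
        """Compute u(k) from the current reference r(k) and output y(k)."""
        e = reference - measurement
        c0 = 1.0 + self.ts / self.ti + self.td / self.ts   # weight of e(k)
        c1 = 1.0 + 2.0 * self.td / self.ts                 # weight of e(k-1)
        c2 = self.td / self.ts                             # weight of e(k-2)
        u = self.u_prev + self.kp * (c0 * e - c1 * self.e_prev1 + c2 * self.e_prev2)
        # Shift the stored samples for the next time step.
        self.e_prev2 = self.e_prev1
        self.e_prev1 = e
        self.u_prev = u
        return u
```

With KP = 1, Ti = 1 s, Td = 0.1 s and TS = 0.1 s, a constant unit error produces the outputs 2.1, 1.2 and 1.3 over the first three steps: an initial derivative kick followed by the proportional level plus a slowly growing integral contribution.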
4.6 Unified Navigation Control As already implied, robot navigation is concerned with behaviorally moving a robot to a specific posture. One key issue is navigation with obstacle avoidance. Through visual sensing, the robot under control should move towards a specified posture while reactively avoiding any moving obstacle in its path. In robot soccer, this reduces to getting a robot to move past (and therefore avoid colliding with) any opponent robot towards a desired posture to kick the ball, as shown in Fig. 4.14. In our context, when we say a robot kicks the ball, we mean the robot shoots or pushes the ball by bumping against it.
Fig. 4.14. Robot soccer situation: A robot (in white) should kick the ball (round) while avoiding an opponent robot (in grey)
This section presents two independent techniques for real-time navigation control that can be applied at the behaviour level for robot soccer. Unlike conventional navigation methods, each considers the robot’s heading direction required at a target position, and computes, at every moment, the robot’s desired direction (and the velocity control input to drive it) in an environment of moving obstacles and constantly changing target positions. Such an environment renders inapplicable any method that requires the full navigation path to be determined for a set of fixed target and obstacle positions. Such a method is said to be unified as it enables the robot to attempt to unfold a navigation path, not known a priori, by continually determining its desired current heading angle and following it in anticipated projection towards a target position, instead of separately determining a complete (fixed) path first and then following it, as in a conventional navigation method. In so doing, a unified navigation method ensures high adaptability of the robot to rapid environmental change, and uses relatively less computing power. The method is useful for implementing key actions such as Shoot and PushBall effectively, such that in performing such actions, a soccer robot can, under control, negotiate past any opponent robot (a moving obstacle) and hit the ball (the moving target) at an appropriate heading angle. Prior to presenting the two unified methods, the basic control of a two-wheel robot is reviewed.

4.6.1 Control of a Two-Wheel Robot A Traditional Approach. Differential-drive mobile robots with non-slipping and pure rolling are considered. Recalling from Chapter 3, the velocity vector U = [ν ω]^T consists of a translational velocity ν defined at the centre of the robot, and a turning velocity ω defined with respect to the centre of the robot. The robot kinematics in terms of velocity vector U = [ν ω]^T and posture vector P = [x y θ]^T is expressed as follows:

\[
\dot{P} = \begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{bmatrix} = J(\theta)\, U.
\tag{4.5}
\]

The Jacobian matrix is

\[
J(\theta) = \begin{bmatrix} \cos\theta & 0 \\ \sin\theta & 0 \\ 0 & 1 \end{bmatrix}
\tag{4.6}
\]

and the control input vector is

\[
U = \begin{bmatrix} \nu \\ \omega \end{bmatrix}
= \begin{bmatrix} \dfrac{V_R + V_L}{2} \\[6pt] \dfrac{V_R - V_L}{L} \end{bmatrix}
= \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\[2pt] -\frac{1}{L} & \frac{1}{L} \end{bmatrix}
\begin{bmatrix} V_L \\ V_R \end{bmatrix},
\tag{4.7}
\]
Fig. 4.15. Robot modeling: (a) kinematics modeling; (b) angle error
where $V_L$ is the left-wheel velocity, $V_R$ is the right-wheel velocity and $L$ is the distance between the two wheels. The robot is driven to any desired posture by computing $V_L$ and $V_R$. For notational convenience, the robot's position $(x, y)$ is denoted by the symbol $p$, and the vector from point $p$ to an arbitrary point $g$ is denoted by $\overrightarrow{pg}$. In general, a unified navigation method consists of a direction generation step and a posture control step. In direction generation, the desired heading angle $\theta_d$ at the robot's current position $p$ (or $(x, y)$), as shown in Fig. 4.15(b), is determined. With $\theta$ denoting the heading angle of the robot, the angle error $\theta_e$ is
$$\theta_e = \theta_d - \theta. \tag{4.8}$$
It can be easily shown that
$$V_L = \nu - \frac{L}{2}\,\omega, \qquad V_R = \nu + \frac{L}{2}\,\omega. \tag{4.9}$$
Thus, in posture control, by keeping constant the desired velocity $\nu$ defined at the centre of the robot, and applying
$$\omega = K_P\,\theta_e + K_D\,\dot{\theta}_e,$$
where $K_P$ and $K_D$ are the proportional and the derivative gains for the robot's turning velocity $\omega$, respectively, the following PD control law can be used to move the robot:
$$V_L = \nu - \bar{K}_P\,\theta_e - \bar{K}_D\,\dot{\theta}_e, \qquad V_R = \nu + \bar{K}_P\,\theta_e + \bar{K}_D\,\dot{\theta}_e, \tag{4.10}$$
where $\bar{K}_P = \dfrac{L}{2}K_P$ and $\bar{K}_D = \dfrac{L}{2}K_D$.
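The PD law of Eq. (4.10) can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the argument names and the wrapping of the angle error onto (−π, π] are assumptions made here, not part of the original text.

```python
import math


def wrap_angle(angle):
    """Wrap an angle to the principal range (assumption: the text does not
    state how the angle error is wrapped)."""
    return math.atan2(math.sin(angle), math.cos(angle))


def pd_wheel_velocities(v, theta_d, theta, theta_e_dot, kp, kd, wheel_base):
    """Return (V_L, V_R) for a constant centre velocity v, following Eq. (4.10).

    kp, kd are the gains K_P, K_D on the turning velocity; wheel_base is L.
    """
    theta_e = wrap_angle(theta_d - theta)       # Eq. (4.8)
    omega = kp * theta_e + kd * theta_e_dot     # PD law on the turning velocity
    half = 0.5 * wheel_base * omega             # (L/2) * omega
    return v - half, v + half                   # Eq. (4.9)


if __name__ == "__main__":
    v_l, v_r = pd_wheel_velocities(1.0, 0.5, 0.0, 0.0, kp=2.0, kd=0.0, wheel_base=0.1)
    print(v_l, v_r)
```

Averaging the two returned wheel velocities recovers ν, and dividing their difference by L recovers ω, in agreement with Eq. (4.7).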
Now, a velocity input $[V_L, V_R]^T$ inherently generates a centrifugal force on a turning robot. A robot can slip or overturn if the centrifugal force generated exceeds the limit of the frictional forces between the wheels and their points of contact with the surface. To prevent such mishaps, the robot must satisfy the following kinematic constraints:
$$|V_L| \le V_m, \qquad |V_R| \le V_m, \tag{4.11}$$
$$|\nu\omega| \le R_m, \tag{4.12}$$
where Vm is the maximum speed for each wheel and Rm quantifies the maximum turn allowed. By restricting |νω|, the robot can be controlled with the centrifugal force not exceeding the frictional forces. Constraints (4.11) and (4.12) map out what is called the available velocity region in the (VL , VR ) space, and is shaded grey as shown in Fig 4.16. Any velocity input (VL , VR ) outside the grey region could cause the robot to slip or overturn. With velocity input P1 , a robot would move straight with maximum speed; with input P2 , it would rotate with maximum angular velocity without changing (the centre of) its position. A straightforward approach to prevent slipping or overturning is to restrict the desired wheel velocities, computed in accordance to the PD control law of Eq. (4.10), to within the square inside the grey region, as shown in Fig 4.16. But this unnecessarily limits the speed performance of the robot. Besides, when the robot is executing a turn at a relatively high speed, due to PD control not following (or tracking) the desired heading angle on time, subsequent ‘corrective’ control results in its rate of desired angle θd varying significantly with alternate sign changes, thus causing the robot to oscillate as it moves. The next section presents a new control law that extends the maximum speed limit of a robot to fully cover the available velocity region, without causing the robot to slip or overturn, or to oscillate.
Fig. 4.16. Available velocity region
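A New Approach. The angle error $\theta_e$ between the robot's current heading angle $\theta$ and the orientation $\theta_d$ of the field univector is given as follows:
$$\theta_e = \theta_d - \theta. \tag{4.13}$$
The derivative of $\theta_e$ is
$$\dot{\theta}_e = \dot{\theta}_d - \omega. \tag{4.14}$$
By Eq. (4.5), $\dot{\theta}_d$ can be expressed by
$$\dot{\theta}_d = \frac{\partial\theta_d}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial\theta_d}{\partial y}\frac{\partial y}{\partial t} = \frac{\partial\theta_d}{\partial x}\cos\theta\,\nu + \frac{\partial\theta_d}{\partial y}\sin\theta\,\nu = \phi_v(x, y, \theta)\,\nu, \tag{4.15}$$
where
$$\phi_v = \frac{\partial\theta_d}{\partial x}\cos\theta + \frac{\partial\theta_d}{\partial y}\sin\theta.$$
The following control input component $\omega$ is considered:
$$\omega = \phi_v\,\nu + K_W\,\mathrm{sgn}(\theta_e)\sqrt{|\theta_e|}, \tag{4.16}$$
where $K_W$ is a positive constant. From Eqs. (4.14), (4.15) and (4.16),
$$\dot{\theta}_e = -K_W\,\mathrm{sgn}(\theta_e)\sqrt{|\theta_e|}. \tag{4.17}$$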
Then $\theta_e$ becomes zero for all $t \ge T = 2\sqrt{|\theta_e(0)|}/K_W$ [31]. That is, the robot's heading direction converges to the univector direction in finite time, after which the angle error $\theta_e$ remains zero. To satisfy the constraints that prevent slipping and overturning, the other control input component $\nu$ should be bounded. From Eq. (4.9) and Constraint (4.11), the inequality is
$$\left|\nu \pm \frac{L}{2}\,\omega\right| \le V_m. \tag{4.18}$$
Substituting Eq. (4.16) into Eq. (4.18), the inequality becomes
$$\left|\nu \pm \frac{L}{2}\left(\phi_v\,\nu + K_W\,\mathrm{sgn}(\theta_e)\sqrt{|\theta_e|}\right)\right| \le \left(1 + \frac{L}{2}|\phi_v|\right)|\nu| + \frac{L}{2}K_W\sqrt{|\theta_e|} \le V_m. \tag{4.19}$$
Solving the last inequality of Eq. (4.19) for $|\nu|$, a sufficient condition satisfying Constraint (4.11) can be derived as
$$|\nu| \le \frac{2V_m - L K_W\sqrt{|\theta_e|}}{2 + L|\phi_v|}. \tag{4.20}$$
From Constraint (4.12), the following inequality is obtained similarly:
$$|\nu\omega| = \left|\nu\left(\phi_v\,\nu + K_W\,\mathrm{sgn}(\theta_e)\sqrt{|\theta_e|}\right)\right| \le |\phi_v||\nu|^2 + K_W\sqrt{|\theta_e|}\,|\nu| \le R_m. \tag{4.21}$$
A sufficient condition satisfying Constraint (4.12) is
$$|\nu| \le \frac{\sqrt{K_W^2|\theta_e| + 4R_m|\phi_v|} - K_W\sqrt{|\theta_e|}}{2|\phi_v|}. \tag{4.22}$$
If the robot moves with a velocity $\nu$ satisfying Conditions (4.20) and (4.22), it satisfies Constraints (4.11) and (4.12), i.e., it can move without slipping or overturning. Consequently, the control law is
$$\nu = \min(v_1, v_2, v_3), \qquad \omega = \phi_v\,\nu + K_W\,\mathrm{sgn}(\theta_e)\sqrt{|\theta_e|}, \tag{4.23}$$
where
$$v_1 = \frac{2V_m - L K_W\sqrt{|\theta_e|}}{2 + L|\phi_v|}, \qquad v_2 = \frac{\sqrt{K_W^2|\theta_e| + 4R_m|\phi_v|} - K_W\sqrt{|\theta_e|}}{2|\phi_v|}, \qquad v_3 = K_d\,\|p - g\|.$$
In Eq. (4.23), v3 is used for the translational velocity ν to slow down to zero near the destination g. In the following, two navigation methods are introduced, with emphasis on the generation step, i.e., on how the desired direction θd is determined.
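As a rough sketch of how the control law of Eq. (4.23) might be evaluated in code, the following Python function computes ν and ω from the current angle error and field gradient term. It assumes the square-root form of the heading term, $K_W\,\mathrm{sgn}(\theta_e)\sqrt{|\theta_e|}$; the function and argument names, the clamp of ν at zero and the handling of the limit $\phi_v \to 0$ are illustrative assumptions, not taken from the text.

```python
import math


def univector_control(theta_e, phi_v, dist_to_goal,
                      v_max, r_max, wheel_base, k_w, k_d):
    """Return (nu, omega) following Eq. (4.23).

    v_max is V_m, r_max is R_m, wheel_base is L, k_w is K_W, k_d is K_d.
    """
    root = math.sqrt(abs(theta_e))
    a = abs(phi_v)

    # v1: wheel-speed bound, Condition (4.20)
    v1 = (2.0 * v_max - wheel_base * k_w * root) / (2.0 + wheel_base * a)

    # v2: turning bound, Condition (4.22)
    if a > 1e-9:
        v2 = (math.sqrt(k_w ** 2 * abs(theta_e) + 4.0 * r_max * a)
              - k_w * root) / (2.0 * a)
    elif root > 0.0:
        # Assumption: limit of (4.22) as phi_v -> 0
        v2 = r_max / (k_w * root)
    else:
        v2 = math.inf

    # v3: slow down near the destination g
    v3 = k_d * dist_to_goal

    # Assumption: never command a negative translational velocity
    nu = max(0.0, min(v1, v2, v3))
    omega = phi_v * nu + k_w * math.copysign(root, theta_e)
    return nu, omega


if __name__ == "__main__":
    nu, omega = univector_control(0.25, 0.5, 10.0, v_max=1.5, r_max=2.0,
                                  wheel_base=0.08, k_w=2.0, k_d=1.0)
    print(nu, omega)
```

With these example values the turning bound $v_2$ is the active one, so the product |νω| sits at the limit $R_m$ while both wheel speeds stay below $V_m$.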
4.6.2 Univector Field Method

The potential field method is a unified method for real-time robot navigation control [32]. This section gives a brief overview of that method before presenting, in more detail, the univector field method, which modifies and significantly improves upon it.

The Vector Field Method. Fig. 4.18 shows a potential field. Graphically, it consists of dashed lines attached to equally spaced points. Each line is a field vector indicating a direction that points outward from the point to which it is attached. In the potential field navigation method, these lines indicate the directions in the vector field that a robot should follow under control. As depicted in Fig. 4.18, each field vector is generated in proportion to a resultant force, with an attractive force component from a target position and a repulsive force component from each obstacle to avoid. The component forces $F_t$ and $F_o^i$, exerted at an arbitrary point by the target position and by the i-th obstacle, respectively, are determined by the following equations:
$$\text{Attractive force: } F_t = (K_t\,d_t)\,n_t, \qquad \text{Repulsive force: } F_o^i = \left(\frac{K_o^i}{d_o^i}\right) n_o^i, \tag{4.24}$$
where
• $d_t$ is the distance between the point and the target position,
• $n_t$ is the unit vector in the direction from the point to the target position,
• $K_t$ is a coefficient parameter for the target position,
• $d_o^i$ is the distance between the point and the (centre of the) i-th obstacle,
• $n_o^i$ is the unit vector in the direction from the i-th obstacle to the point, and
• $K_o^i$ is a coefficient parameter for the i-th obstacle.
The resultant force $F$ can then be obtained by vector addition, as follows:
$$\text{Resultant force: } F = F_t + \sum_{\text{all } i} F_o^i. \tag{4.25}$$
Fig. 4.17 depicts an example of component forces at a point (where the centre of the robot is) due to a target position (to move to) and an obstacle. This method is simple and helps to control a robot in real time. However, it has been found that when the robot’s constant velocity cannot be maintained and the obstacle to avoid is too big, the robot under control is liable to break into oscillations and its heading direction cannot be guaranteed at the target point [33, 34].
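The force composition of Eqs. (4.24) and (4.25) can be illustrated with a short Python sketch. The function name, the use of a single repulsion coefficient for all obstacles and the decision to skip an obstacle whose centre coincides with the query point are simplifying assumptions made for this example.

```python
import math


def resultant_force(point, target, obstacles, k_t=1.0, k_o=1.0):
    """Return the resultant force (Fx, Fy) at `point`.

    obstacles is a list of obstacle centres (x, y). One coefficient k_o is
    used for every obstacle (assumption; the text allows one per obstacle).
    """
    px, py = point
    tx, ty = target

    # Attractive force: (K_t * d_t) * n_t equals K_t * (target - point)
    fx = k_t * (tx - px)
    fy = k_t * (ty - py)

    for ox, oy in obstacles:
        dx, dy = px - ox, py - oy          # direction from obstacle to point
        d = math.hypot(dx, dy)
        if d == 0.0:
            continue                       # undefined at the centre; skipped here
        # Repulsive force: (K_o / d) * n_o, with n_o = (dx, dy) / d
        fx += k_o * dx / (d * d)
        fy += k_o * dy / (d * d)
    return fx, fy


if __name__ == "__main__":
    print(resultant_force((0.0, 0.0), (2.0, 0.0), [(-1.0, 0.0)]))
```

The desired heading at the point is then the direction of the returned vector, for instance `math.atan2(fy, fx)`.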
Fig. 4.17. Component forces for generating a potential field
Fig. 4.18. A potential field
Univector Navigation Control: An Overview. Better behavioral control of a two-wheel robot is possible with a univector field, which is a modification of the potential vector field. In a univector field, the magnitude of each vector is unity at all (equally-spaced) positions, hence its name univector. It is thus a vector field containing only directional information. The univector field is composed of two univector subfields. One subfield is concerned with a robot going through a desired position at a desired heading angle. The other is concerned with the robot avoiding any obstacle. Combining these two fields yields the resultant univector field of the robot’s environment. Conceptually, the direction of a univector at each position in the resultant field is the desired heading angle θd of the robot at that position.
Robot control is required to steer the robot’s heading to converge in the direction of the univector computed at the (centre of the) robot’s position. It has been found that using the univector navigation method, a robot can navigate rapidly through a desired position at a desired heading angle without oscillating and taking unnecessarily longer paths. The next two sections discuss the two subfields in more detail. Univector Field For Attaining Target Posture. Fig. 4.19 shows a univector field, where the tiny circles with a small dash line attached to each indicates the direction. The tiny circle is meant to represent a position and the straight line attached to it represents a direction it is pointing in. A field univector at position p (or (x, y)) is defined as F (p) (or F (x, y)). For attaining a desired heading angle at a desired point (i.e., attaining a target posture), the field vector at the robot’s position p is generated by
$$F(p) = \angle\overrightarrow{pg} - n\phi, \tag{4.26}$$
where
$$\phi = \angle\overrightarrow{pr} - \angle\overrightarrow{pg},$$
$n$ is a positive constant to be determined, $g$ is the desired point and $r$ is a guidance point. The symbol $\angle$ denotes the angle of a vector mapped onto the range $(-\pi, \pi]$. The 'shape' of the vector field, and hence the turning motion of the 'vector-following' robot, changes according to the parameter $n$ and the length of the line segment $gr$. In implementing the Shoot action for a soccer robot, the desired point $g$ is the position of the ball (or a point directly in front of it), and the guidance point $r$ determines the robot's desired angle $\angle\overrightarrow{gr}$ at point $g$. The guidance point $r$ is selected close to the desired point $g$, but such that in moving towards point $r$, the robot goes through point $g$ and kicks the ball with sufficient force. The univector field method is based on Eq. (4.26). With this equation, the field vectors at all points can be obtained for the desired heading angle at point $g$, which is $\angle\overrightarrow{gr}$. Assume that the points $g$ and $r$ are fixed and that a robot is initially at position $p$ as indicated in Fig. 4.19; then, conceptually, the robot should follow a sequence of univectors such that, as it approaches the desired point $g$, the angle $\phi$ decreases to zero. Thereafter, the robot moves straight through the desired point $g$ towards point $r$.

Univector Field For Obstacle Avoidance. Fig. 4.20 shows the univector field for avoiding a circular obstacle. It is obtained as follows. Consider a univector at an arbitrary point $p$. If the vector, when extended, intersects the boundary of an obstacle, then change its direction to that of the line from point $p$ to the tangent point on the obstacle boundary.
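Returning to the target-posture field of Eq. (4.26), the following Python sketch evaluates the univector direction at a single point. It is only an illustration: the function names and the particular way of wrapping angles are assumptions introduced here.

```python
import math


def wrap(angle):
    """Map an angle onto (-pi, pi]."""
    wrapped = math.atan2(math.sin(angle), math.cos(angle))
    return math.pi if wrapped == -math.pi else wrapped


def univector_angle(p, g, r, n):
    """Direction F(p) of the univector at p for desired point g and
    guidance point r, following Eq. (4.26)."""
    ang_pg = math.atan2(g[1] - p[1], g[0] - p[0])   # angle of vector p -> g
    ang_pr = math.atan2(r[1] - p[1], r[0] - p[0])   # angle of vector p -> r
    phi = wrap(ang_pr - ang_pg)
    return wrap(ang_pg - n * phi)


if __name__ == "__main__":
    # Robot already on the line through g and r, behind g: head straight on.
    print(univector_angle((-1.0, 0.0), (0.0, 0.0), (1.0, 0.0), n=2.0))
```

When the robot lies on the extension of the line from r through g, the angle φ is zero and the univector simply points at g, which matches the straight-line final approach described above.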
Fig. 4.19. A univector field
Fig. 4.20. Univector field for obstacle avoidance by a point object
For instance, the modified direction of the univector at point p is the direction of F(p) shown in Fig. 4.20, whereas its initial direction was to the right. In an actual implementation, a uniform margin surrounding a circular obstacle must be set such that the univector at any point within the margin does not point towards the obstacle; in particular, the vector at any point on the circular boundary of the obstacle should be orthogonal to the boundary and point directly away from it. A modified univector field satisfying these requirements is shown in Fig. 4.21. Ro is the radius of the obstacle and M is the width of the boundary margin.
Fig. 4.21. Modified univector field for obstacle avoidance by a robot while moving towards a target point g
Note that the centre of a robot in univector field navigation would never enter the margin surrounding a stationary obstacle. But should the robot find its centre within the margin of a moving obstacle, owing, for instance, to its lower speed relative to the oncoming obstacle, it may be able to move away from the obstacle by following a sequence of univectors within the margin, as depicted in Fig. 4.21. In other words, the margin of width M provides an anti-collision buffer against the moving obstacle that it surrounds. Apart from the requirement that the margin width M be set wider than half of the robot's width L to account for the robot's size, the actual setting is left to the judgement of the human designer. The resultant univector field for real-time robot navigation can be obtained using optimization techniques. We defer to Section 5.6.4, in the next chapter, a learning approach that uses evolutionary programming to generate a sub-optimal univector field.

4.6.3 Limit-Cycle Method

The essence of this technique comes from the observation that, to avoid an obstacle while moving towards a target posture, a robot can negotiate around the obstacle either clockwise (CW) or counter-clockwise (CCW). The basic idea is to continually adjust and readjust the radius of the motion circle
around any detected obstacle and decide the direction of obstacle avoidance, considering also the target posture. This technique reactively generates a trajectory that the robot follows to avoid possibly multiple moving obstacles while navigating in real time towards a target posture, using the limit-cycle characteristics of a 2nd-order nonlinear function that models its motion. Formally, consider the following 2nd-order nonlinear system [35]:
$$\dot{x}_1 = x_2 + x_1(1 - x_1^2 - x_2^2), \qquad \dot{x}_2 = -x_1 + x_2(1 - x_1^2 - x_2^2), \tag{4.27}$$
and the Lyapunov function $V(x) = x_1^2 + x_2^2$. The derivative of $V(x)$ along the trajectories of the system is given by
$$\dot{V}(x) = 2x_1\dot{x}_1 + 2x_2\dot{x}_2 = 2x_1x_2 + 2x_1^2(1 - x_1^2 - x_2^2) - 2x_1x_2 + 2x_2^2(1 - x_1^2 - x_2^2) = 2V(x)(1 - V(x)). \tag{4.28}$$
The derivative of $V(x)$ is positive for $V(x) < 1$ and negative for $V(x) > 1$. Hence, on the level surface $V(x) = c_1$ with $0 < c_1 < 1$, all the trajectories move outward, while on the level surface $V(x) = c_2$ with $c_2 > 1$, all the trajectories move inward. This shows that the annular region $M = \{x \in R^2 \mid c_1 \le V(x) \le c_2\}$ is positively invariant in the sense that $x(0) \in M$ implies $x(t) \in M$ for all $t \ge 0$ [35]. It is also closed, bounded, and free of equilibrium points, since the origin $x = 0$ is the unique equilibrium point. Thus, by the Poincaré-Bendixson theorem, there is a periodic orbit in $M$. Since the above argument is valid for any $c_1 < 1$ and any $c_2 > 1$, $c_1$ and $c_2$ can be chosen arbitrarily close to 1 so that the set $M$ shrinks toward the unit circle. This shows that the unit circle is a periodic orbit, as shown in Fig. 4.22(a). This periodic orbit is called a limit-cycle. Fig. 4.22(a) shows the phase portrait of Eq. (4.27), with $x_1$ and $x_2$ plotted along the x-axis and the y-axis, respectively. The trajectories from all points $(x_1, x_2)$, including those inside the circle, move toward the unit circle clockwise, as already explained. The counter-clockwise field is derived by transforming Eq. (4.27) to
$$\dot{x}_1 = -x_2 + x_1(1 - x_1^2 - x_2^2), \qquad \dot{x}_2 = x_1 + x_2(1 - x_1^2 - x_2^2). \tag{4.29}$$
Then, all the trajectories move toward the unit circle counter-clockwise, as shown in Fig. 4.22(b). The general form of Eq. (4.27) is obtained by replacing 1 with $r^2$ as follows:
$$\dot{x}_1 = x_2 + x_1(r^2 - x_1^2 - x_2^2), \qquad \dot{x}_2 = -x_1 + x_2(r^2 - x_1^2 - x_2^2), \tag{4.30}$$
and the Lyapunov function is $V(x) = x_1^2 + x_2^2$. The derivative of $V(x)$ along the trajectories of the system is given by
$$\dot{V}(x) = 2x_1\dot{x}_1 + 2x_2\dot{x}_2 = 2x_1x_2 + 2x_1^2(r^2 - x_1^2 - x_2^2) - 2x_1x_2 + 2x_2^2(r^2 - x_1^2 - x_2^2) = 2V(x)(r^2 - V(x)). \tag{4.31}$$
[Fig. 4.22: phase portraits in the ($x_1$, $x_2$) plane, both axes from -3 to 3; (a) clockwise, (b) counter-clockwise]
The derivative of V (x) is positive for V (x) < r2 and negative for V (x) > r2 . The general form of the limit-cycle is thus derived, using which we can adjust the radius and the direction of the limit-cycle while maintaining motion stability. Importantly, it provides an easy programming basis for implementation, as would be elaborated in the next section. The limit-cycle method generates a local navigation plan by applying the limit-cycle characteristics exhibited in Eq. (4.31). The plan thus shows an efficient way by which the robot can avoid obstacles without having to move far away from them. The Limit-Cycle Local Navigation. Figure 4.23 depicts the limit-cycle method which can drive a robot towards the desired direction and avoid an obstacle. At this time, the direction, either clockwise or counter-clockwise, should be decided. Fig. 4.24 shows a situation in robot soccer where the rightmost robot needs to avoid three obstacle robots A, B and C in moving towards the target (ball).
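The convergence property of the general form in Eq. (4.30) can be checked numerically. The Python sketch below integrates the system with a plain forward-Euler step; the step size, the number of steps and the function names are assumptions chosen for illustration, and the counter-clockwise case simply flips the sign of the rotational terms as in Eq. (4.29).

```python
import math


def limit_cycle_step(x1, x2, r=1.0, clockwise=True, dt=0.001):
    """One forward-Euler step of Eq. (4.30) (or its counter-clockwise variant)."""
    s = 1.0 if clockwise else -1.0
    radial = r * r - x1 * x1 - x2 * x2     # positive inside the circle, negative outside
    dx1 = s * x2 + x1 * radial
    dx2 = -s * x1 + x2 * radial
    return x1 + dt * dx1, x2 + dt * dx2


def simulate(x1, x2, steps=20000, **kwargs):
    """Integrate from (x1, x2) for `steps` Euler steps and return the final state."""
    for _ in range(steps):
        x1, x2 = limit_cycle_step(x1, x2, **kwargs)
    return x1, x2


if __name__ == "__main__":
    for start in [(0.1, 0.0), (3.0, 0.0)]:
        end = simulate(*start, r=2.0)
        print(start, "-> radius", math.hypot(*end))
```

Starting either inside or outside the circle, the distance from the origin settles close to r, which is the behaviour that Eq. (4.31) predicts.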
Fig. 4.23. Navigation using the limit-cycle method
Before applying the limit-cycle to local navigation, some terminology required is defined in the following. • Rotational direction: It decides the turning direction taken to avoid an obstacle, counter-clockwise (CCW) or clockwise (CW).
Fig. 4.24. Multiple obstacle situation
• Variable obstacle ($O_v$): In general, the robot is assumed to be a point mass in a simulated situation. This may lead to collisions with actual obstacles in a real implementation. So, we define the variable obstacle, whose radius is decided by its position relative to the robot and by the sizes of the obstacle and the robot, as explained later. Here, for simplicity, the variable obstacle is assumed to be circular. The circle of $O_v$ is a limit-cycle, so that the robot follows the circular boundary of $O_v$.
• Variable radius ($r_v$): The radius of the variable obstacle. It varies with the size of the robot and the obstacle's position relative to the robot. If $r_v$ is used as the radius of a limit-cycle, the robot can navigate without colliding with the obstacle.
• Disturbing obstacle ($O_d$): Variable obstacles that are in the way between the robot and the target point. These obstacles are assigned consecutive numbers such that for any two such obstacles $O_{dx}$ and $O_{dy}$, $O_{dx}$ is nearer to the robot than $O_{dy}$ if $x < y$. The disturbing obstacle nearest to the robot is designated $O_{d1}$, the next one $O_{d2}$, etc.
• Non-disturbing obstacle ($O_n$): Variable obstacles that are not in the way between the robot and the target point. The non-disturbing obstacle nearest to the robot is designated $O_{n1}$, the next one $O_{n2}$, etc.
• Tangent points ($T_{n1}$, $T_{n2}$): Intersection points of the circle with the variable radius and its tangent lines through the target point. Note that there are two tangent points on each obstacle.

Now, the steps of the limit-cycle method (for local navigation) are as follows:
1. Draw a line $l$ from the robot to the target in the global coordinate frame OXY as follows:
$$ax + by + c = 0. \tag{4.32}$$
2. Treat variable obstacles as disturbing obstacles $O_{di}$ if the line $l$ crosses them; otherwise, treat them as non-disturbing obstacles $O_n$.
3. Move towards the target if there is no $O_d$.
Fig. 4.25. Decision of rotational direction
4. Referring to Fig. 4.25, calculate the distance $d$ from the centre of the nearest disturbing obstacle $O_d$ to the line $l$, using
$$d = \frac{aQ_x + bQ_y + c}{\sqrt{a^2 + b^2}}, \tag{4.33}$$
where $(Q_x, Q_y)$, $(G_x, G_y)$ and $(R_x, R_y)$ are the coordinates of the centre positions of the obstacle, the target and the robot, respectively. Eq. (4.30) is extended to fit the navigation plan by substituting $d$ and $r_v$. With $x_1$ and $x_2$ matched to $x$ and $y$ in the global coordinate frame OXY, calculate the desired direction of the robot at each position using
$$\dot{x} = \frac{d}{|d|}\,y + x(r_v^2 - x^2 - y^2), \qquad \dot{y} = -\frac{d}{|d|}\,x + y(r_v^2 - x^2 - y^2), \tag{4.34}$$
where $x$ and $y$ are values relative to the obstacle. In this equation, if $d$ is positive, the robot avoids the obstacle $O_d$ clockwise; if $d$ is negative, the avoidance takes place in a counter-clockwise direction. Calculate $r_v$ from the size of the robot and the relative position of the obstacle, using the following equation:
$$r_v = r_r + r_o + \delta, \tag{4.35}$$
where $r_r$ and $r_o$ are the radii of the robot and the obstacle, respectively. Here, $r_o = r$, where $r$ is the radius of the real obstacle, if the obstacle is a disturbing one; $r_o = 0$ if it is non-disturbing or virtual. $\delta$ is a safety margin for collision avoidance. As the robot moves, the line $l$ varies, so repeat Steps 2 to 4 until the destination is reached. Note that to obtain $r_v$ experimentally, we enclose the 2D image of the obstacle within an appropriate circle, whose radius is $r_o$. With $r_r$ and $\delta$ given and $r_o$ measured, $r_v$ is obtained using Eq. (4.35).

For example, suppose, as shown in Fig. 4.24, there are three obstacles between the robot and the target. The robot should move towards the target while avoiding these obstacles, which are marked A, B and C. First, a line $l$ is drawn from the robot to the target (Step 1). This line goes through the two obstacles B and C, so they are considered as $O_{d1}$ and $O_{d2}$, respectively, and obstacle A as $O_{n1}$ (Step 2). Using the direction of line $l$ through $O_{d1}$, the robot decides the direction in which it should avoid obstacle B. The counter-clockwise direction is chosen (Step 4), as shown in Fig. 4.26(a). The robot follows the chosen direction until it has avoided obstacle B. Once the robot passes obstacle B, the line $l$ ceases to go through obstacle B. Thus, obstacle B becomes $O_{n1}$, and obstacles C and A become $O_{d1}$ and $O_{n2}$, respectively (Steps 1 and 2). Applying the limit-cycle method again (to avoid obstacle C), the navigation path thus generated is as shown by the solid line in Fig. 4.26(b).

Extended Limit-Cycle Local Navigation. The robot in Fig. 4.27 avoids obstacle A counter-clockwise by the limit-cycle navigation method. Later, however, it moves clockwise beside obstacle B, and hence it gets stuck in a local minimum between obstacle A and obstacle B. To overcome this problem, the following rule is added to Step 4. The distance $d$ from the centre of the obstacle $O_d$ to the line $l$ is calculated as
$$d = \frac{aQ_x + bQ_y + c}{\sqrt{a^2 + b^2}}. \tag{4.36}$$
If two or more variable obstacles overlap, they are regarded as one obstacle, and a new central position of the $n$ overlapped obstacles is defined as
$$Q_x = \frac{1}{n}\sum_{k=1}^{n} Q_{xk}, \qquad Q_y = \frac{1}{n}\sum_{k=1}^{n} Q_{yk}, \tag{4.37}$$
where $Q_{xk}$ and $Q_{yk}$ are the x-y coordinates of the centre positions of the overlapped obstacles. With this $(Q_x, Q_y)$, a new distance $d$ for all overlapped $O_d$'s can be calculated. In the global coordinate frame OXY, the desired direction of the robot at each position is calculated from
$$\dot{x} = \frac{d}{|d|}\,y + x(r_v^2 - x^2 - y^2), \qquad \dot{y} = -\frac{d}{|d|}\,x + y(r_v^2 - x^2 - y^2), \tag{4.38}$$
where $x$ and $y$ are values relative to the obstacle, and $r_v$ is obtained from Eq. (4.35). For example, Fig. 4.28 shows three overlapped variable obstacles. First, the new $(Q_x, Q_y)$ is calculated by the modified Step 4. Substituting it in Eq. (4.36), $d$ is calculated. Obstacle B is the closest to the robot. So, with $r_v$ and $d$, a navigation plan that avoids the obstacles is provided by Eq. (4.38). Finally, the robot moves towards the target, avoiding the three obstacles, as shown in Fig. 4.28.

Fig. 4.26. Navigation example
Fig. 4.27. Local minima with two overlapped obstacles
Fig. 4.28. Extended navigation method
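The distance test and direction field of the limit-cycle method, together with the merged centre used for overlapped obstacles, can be sketched as follows. This is an illustrative Python sketch only. The sign convention chosen for the line coefficients a, b, c (which fixes the sign of d), the tie-break for d = 0 and all function names are assumptions; the text does not specify them.

```python
import math


def signed_distance(robot, target, obstacle):
    """Signed distance d from the obstacle centre to the line l through the
    robot and the target, as in Eq. (4.33). With the coefficients chosen here
    (an assumption), d is negative when the obstacle lies to the left of the
    direction of travel."""
    (rx, ry), (gx, gy), (qx, qy) = robot, target, obstacle
    a = gy - ry
    b = rx - gx
    c = -(a * rx + b * ry)
    return (a * qx + b * qy + c) / math.hypot(a, b)


def merged_centre(centres):
    """Mean centre of overlapped obstacles, as in Eq. (4.37)."""
    n = len(centres)
    return (sum(q[0] for q in centres) / n, sum(q[1] for q in centres) / n)


def desired_heading(robot, target, obstacle, r_v):
    """Heading angle given by the field of Eq. (4.34) at the robot position."""
    d = signed_distance(robot, target, obstacle)
    s = 1.0 if d >= 0.0 else -1.0          # d = 0 treated as clockwise (assumption)
    x = robot[0] - obstacle[0]             # coordinates relative to the obstacle
    y = robot[1] - obstacle[1]
    radial = r_v ** 2 - x * x - y * y
    dx = s * y + x * radial
    dy = -s * x + y * radial
    return math.atan2(dy, dx)


if __name__ == "__main__":
    centre = merged_centre([(5.0, 1.0), (5.0, 2.0), (6.0, 1.5)])
    print(signed_distance((0.0, 0.0), (10.0, 0.0), centre))
    print(desired_heading((5.0, -1.0), (10.0, -1.0), (5.0, 1.0), r_v=2.0))
```

In the second call the robot sits exactly on the circle of radius $r_v$ below the obstacle, so the radial term vanishes and the field is tangent to the circle, pointing along the direction of travel.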
Application to Robot Soccer. The previous section proposed the limit-cycle navigation method for avoiding obstacles while moving to a target. In robot soccer, a primary task of the robot is to kick the ball into the opponent goal. So, as shown in Fig. 4.29(a), when the robot reaches the ball, it has to position itself behind the ball in such a way that it faces the opponent goal area. In other words, the final position and direction must both be satisfied for kicking. To apply the limit-cycle navigation method to robot soccer, the following rule is added to Step 4 of the method. Two virtual variable obstacles are placed on either side of the ball; the modified target, whose heading is towards the centre of the goal, lies on the extended line from the opponent goal to the ball, and
$$d = \begin{cases} -1 \ (\text{CCW}) & \text{if a virtual variable obstacle is on the left of the ball,} \\ \phantom{-}1 \ (\text{CW}) & \text{if a virtual variable obstacle is on the right of the ball.} \end{cases} \tag{4.39}$$
Fig. 4.29. Robot soccer example: (a) the limit-cycle navigation method without virtual variable obstacles; (b) the limit-cycle navigation method with virtual variable obstacles
For example, a situation as in Fig. 4.29(a) is modified to a situation where it is assumed that two virtual variable obstacles are on either side of the ball as in Fig. 4.29(b). Thus, the modified target is on the extended line from the opponent goal to the ball and is adjacent to virtual variable obstacles
with the minimum variable radius $r_{v\min}$. In robot soccer, however, the robot moves as fast as 150 cm/s, so it is meaningless to calculate the minimum variable radius without considering the centrifugal velocity. The following equation gives the non-slip minimum radius of the robot:
$$r_{v\min} \le \frac{m v_c^2}{F_c}, \tag{4.40}$$
where m is the mass of the robot, vc is the centrifugal velocity and Fc is the frictional force of the robot. It should be noted that the upper limit of the minimum variable radius is constrained by the centrifugal velocity and the frictional force of the robot. Since the frictional force is fixed and measurable, the minimum virtual radius can be decided if the centrifugal velocity is known. In Fig. 4.29(b), On1 is on the left side of the ball, so d = −1 by the above rule, for Od3 , d = 1. Then, the robot moves using the limit-cycle navigation plan as shown in Fig. 4.29(b). Without modification, if there is no obstacle A in Fig. 4.29(b), d is negative by the former limit-cycle navigation method. So the robot may kick the ball to home side. Since the limit-cycle navigation method does not calculate all the trajectories in the current situation, but only calculates the next trajectory of the robot using the robot’s current relative positions of the target and any obstacle, this method generates the navigation plan incrementally and “adapts” to the dynamically changing environment. The limit-cycle navigation method, as already described, can adjust the direction and the safety distance for obstacle avoidance, and so is applicable to robot soccer.
Notes on Selected References The book [29] provides an excellent coverage of PID control. Aspects of digital PID and its implementation can be found in the books [29, 36]. A special journal issue on PID control [30] focusses on the design methods and future potential of PID control. For an introduction to robot navigation, refer to the textbook [37]. The univector method originated with the work of [38]. The limit-cycle navigation method originated with the work reported in [39]. Several other navigation methods have been reported for robot soccer, including [40] that proposes an optimal path generating navigation method using a combination of a geometric method and a fuzzy logic method optimized using evolutionary programming. The limitations of potential field navigation methods are discussed in [33, 34]. Earlier unified navigation methods include motor-schema [3], navigation templates [41, 42] and artificial potential functions [43].
5. How to Improve Intelligence? Use Soft Computing Techniques
5.1 Introduction The field of robot soccer provides numerous opportunities for the application of AI methods for game strategy development. As mentioned in the previous chapter, good strategies are needed to decide the roles and actions of team robots during the game. Chapter 4 has introduced a hybrid control architecture in which these strategies can be organized or integrated for proper management and control. In general, building a proper strategy is best guided by the intelligence aspects of search and evolution, knowledge representation and inference and learning and adaptation. In this chapter, these aspects of intelligence as needed by the DECIDE and ACT primitives and their importance are first discussed. The basics of some widely known soft-computing paradigms that make concrete (at least one of) these abstract aspects are then introduced. They include the formalisms of Petri nets, Q-learning, neural networks, evolutionary programming and fuzzy logic. Along which, the use of each paradigm for formulating strategies in robot soccer is motivated through simplified examples taken from previous FIRA Cup MiroSoT teams that demonstrate and emphasize its applicability in control, either at high-level (also called supervisory) or low-level. More specifically, for each paradigm, one or two examples are provided that address some key issues at specific hierarchical levels of the hybrid control architecture introduced in Chapter 4. By this, however, we do not imply that these paradigms cannot be applied at the other levels. The reader interested in the performance evaluations of the example techniques presented should consult the research papers referenced therein. As a note of caution though, the performance evaluation results are often inconclusive and are frequently based on limited empirical testing. As each paradigm is an elaborate field in itself, the reader interested in these paradigms in general should consult the many textbooks referenced.
J.-H. Kim, D.-H. Kim, Y.-J. Kim, K.-T. Seow: Soccer Robotics, STAR 11, pp. 141-204, 2004. Springer-Verlag Berlin Heidelberg 2004

5.2 Intelligence Basics

The central theme in soccer robotics is the concept of an intelligent agent. The notion of such an agent has been defined in Chapter 1. To build the
decision-making mechanism of an agent or multi-agent program, it is useful to understand the various aspects that guide in realizing the high-level features underlying agent intelligence, namely, autonomy, reactiveness, pro-activeness and communicativeness. One view held is that these features constitute the primary basis of intelligence, and can be combined in various ways to give rise to other agent features such as cooperation, robustness, fault-tolerance and reliability. The three mutually dependent aspects, also called intelligence basics, that guide in building such features into agents, are 1. search and evolution, 2. knowledge representation and inference, 3. learning and adaptation. 5.2.1 Search and Evolution Given an objective to attain, an agent often needs to decide what to do next by systematically considering the outcomes of various sequences of actions it might take. A state is a discrete representation of the relevant aspects of the agent’s working environment. In the space of states that includes states satisfying the objective, each action taken will lead the agent from one state to another. In general, with several immediate options of unknown values, the agent can decide what to do by first examining different possible sequences of actions that lead to states of known values, and then choosing the best sequence. The process of looking for such a sequence is called search. A search algorithm takes a problem as input and returns a solution in the form of an action sequence. It then uses the solution to guide its actions, doing whatever the solution recommends as the next action to take, and then removing the action from the sequence. Once the sequence has been executed, the objective is said to be achieved, and the agent will find or be assigned a new objective. Finding a solution is done by searching through the state space. 
The search procedure involves expanding a current state that is not an objective-satisfying state, by applying operators to the state to generate a new set of states, from which the agent needs to choose one. The essence of search then is in choosing one option and putting the others aside, in case the first choice does not lead to a solution. Choosing, testing and expanding continue until a solution is found, or there are no more states to be expanded. The choice of which state to expand first is determined by the search strategy, evaluated in terms of the following criteria.
• Completeness: Is the strategy guaranteed to find a solution when there is one?
• Time complexity: How long does it take to find a solution?
• Space complexity: How much memory does it require to perform the search?
• Optimality: Does the strategy find the highest-quality solution with respect to some objective function when there are several alternative solutions?
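The choose, test and expand loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: breadth-first order is just one possible search strategy, and the names `breadth_first_search`, `is_goal`, `successors` and the toy grid problem are hypothetical and do not come from the text.

```python
from collections import deque

def breadth_first_search(start, is_goal, successors):
    """Return a list of actions leading from start to a goal state, or None.

    successors(state) yields (action, next_state) pairs; all names here are
    hypothetical and only illustrate the search loop described in the text.
    """
    frontier = deque([(start, [])])    # states put aside for later expansion
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()        # choose one option
        if is_goal(state):                      # test it
            return plan
        for action, nxt in successors(state):   # expand it
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None  # no more states to expand

# Toy problem (an assumption): move on a 3x3 grid from (0, 0) to (2, 1).
def grid_successors(state):
    x, y = state
    moves = {"right": (x + 1, y), "left": (x - 1, y),
             "up": (x, y + 1), "down": (x, y - 1)}
    for action, (nx, ny) in moves.items():
        if 0 <= nx < 3 and 0 <= ny < 3:
            yield action, (nx, ny)

plan = breadth_first_search((0, 0), lambda s: s == (2, 1), grid_successors)
```

On a finite state space, breadth-first order is complete and returns a plan with the fewest actions, at the price of keeping every generated state in memory; other strategies trade these criteria differently.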
We refer the reader to the book [28] for the many search algorithms that use various basic strategies.

Evolution. In the literature, a class of stochastic search methods inspired by the process of natural evolution has emerged. These are called evolutionary algorithms (EAs), and are distinguished from classical search algorithms by the evolutionary paradigm wherein a population of candidate solutions undergoes iterative operations of variation and selection. There are three predominant EAs, namely, evolutionary programming, evolution strategies and genetic algorithms; more will be said about them in Section 5.6 in the next chapter. A drawback of any evolutionary algorithm is that a solution is ‘better’ only in comparison to other, presently known solutions; such an algorithm actually has no concept of an ‘optimal solution,’ or any way to test whether a solution is optimal. For this reason, evolutionary algorithms are best employed on problems where it is difficult or impossible to test for optimality. This also means that an evolutionary algorithm never knows for certain when to stop, aside from the length of time, or the number of iterations or candidate solutions, that it is initially allowed to explore.

5.2.2 Knowledge Representation and Inference

States and actions need to be appropriately represented as knowledge in order that an agent can maintain a relevant description of its environment as new sensory information (or percepts) arrives, and draw new inferences to decide a course of action to achieve its goal. The terms ‘inference’ and ‘reasoning’ are generally used to cover any process by which conclusions are reached. A knowledge representation (KR) is not a data structure; what makes it representational is that it carries meaning, called semantics, i.e., there is a correspondence between its constructs, called syntax, and the things it models in the external environment. That correspondence in turn carries with it some constraints.
While every representation must be implemented in the machine by some data structure, the representational property is in the correspondence to the things it models in the environment, and in the constraints this correspondence imposes. Some key schemes used for knowledge representation and reasoning (KRR) are
1. Procedural knowledge: Knowledge is encoded in functions/procedures.
2. Networks: A compromise between declarative and procedural schemes, knowledge is represented in a labeled, directed graph whose nodes represent concepts and entities, while its arcs represent relationships between these entities and concepts.
3. Frames: A network in which each node represents prototypical concepts and/or situations. Each node has several property slots whose values may be specified or inherited by default.
4. Logic: A way of declaratively representing and inferring knowledge.
5. Decision trees: Concepts are organized in the form of a tree.
6. Statistical knowledge: The use of certainty factors.
7. Rules: The use of production systems to encode condition-action rules.
8. Hybrid schemes: Any representation formalism employing a combination of KRR schemes.
5.2.3 Learning and Adaptation

Learning is perhaps the only way an agent can acquire what it needs to know in the absence of complete knowledge about the environment that the agent designer can build into the agent. Learning provides autonomy, and helps the agent improve its behaviour through diligent study of its own experience. Here, no explicit distinction is made between adaptation and learning; instead, it is assumed that adaptation is covered by learning, in that according to common usage, the term adaptation is only applied to those self-modifications that enable an agent to survive in a changed environment. There is a great variety in the possible forms of learning for an agent in a multi-agent environment, and there are several key criteria that may be applied in order to structure this variety. Two standard examples of such criteria, which are well known in the field of machine learning (ML), are the following:

1. The learning method or strategy used by a learning entity (a single agent or several agents). The following methods are usually distinguished.
• rote learning (i.e., direct implantation of knowledge and skills without requiring further inference or transformation from the learner);
• learning from instruction and by advice taking (i.e., operationalization, that is, transformation into an internal representation and integration with prior knowledge and skills, of new information such as an instruction or a piece of advice that is not directly executable by the learner);
• learning from examples and by practice (i.e., extraction and refinement of knowledge and skills like a general concept or a standardized pattern of motion from positive and negative examples or from practical experience);
• learning by analogy (i.e., solution-preserving transformation of knowledge and skills from a solved to a similar but unsolved problem);
• learning by discovery (i.e., gathering new knowledge and skills by making observations, conducting experiments, and generating and testing hypotheses or theories on the basis of the observational and experimental results).
A major difference between these methods lies in the amount of learning effort they require (increasing from top to bottom).

2. The learning feedback that is available to a learning entity and that indicates the performance level achieved so far. This criterion leads to the following usual distinction.
• supervised learning (i.e., the feedback specifies the desired activity of the learner and the objective of learning is to match this desired action as closely as possible);
• reinforcement learning (i.e., the feedback only specifies the utility of the actual activity of the learner and the objective is to maximize this utility);
• unsupervised learning (i.e., no explicit feedback is provided and the objective is to find out useful and desired activities on the basis of trial-and-error and self-organization processes).
In all three cases the learning feedback is assumed to be provided by the system environment or the agents themselves. This means that the environment or an agent providing feedback acts as a ‘teacher’ in the case of supervised learning and as a ‘critic’ in the case of reinforcement learning; in the case of unsupervised learning, the environment and the agents just act as passive ‘observers’.

It is important to see that different agents do not necessarily have to learn on the basis of the same learning method or the same type of learning feedback. Moreover, in the course of learning an agent may employ different learning methods and types of learning feedback. Both criteria directly or indirectly lead to the distinction between learning and teaching agents, and they show the close relationship between multiagent learning on the one hand and teaching and tutoring on the other. Examples of criteria other than these two standard ones, together with a brief description of their extreme values, are the following:

1. The purpose and goal of learning. This criterion allows one to distinguish between the following two extremes (and many gradations in between them).
• Learning that aims at an improvement with respect to one single agent, its skills and abilities.
• Learning that aims at an improvement with respect to the agents as a unit, their coherence and coordination.
This criterion could be refined with respect to the number and compatibility of the learning goals pursued by the agents. Generally, an agent may pursue several learning goals at the same time, and some of the learning goals pursued by the agents may be incompatible while others are complementary.

2. The decentralization of a learning process (where a learning process consists of all activities carried out by one or more agents in order to achieve a particular learning goal). This criterion concerns the degree of distribution and parallelism, and there are two obvious extremes:
• only one of the available agents is involved in the learning process, and the learning steps are neither distributed nor parallelized;
• all available agents are involved, and the learning steps are ‘maximally’ distributed and parallelized.
Of course, the degree of decentralization may vary for different learning processes.

3. An agent’s involvement in a learning process. With respect to the importance of involvement, one can identify the following two extremes:
• the involvement of the agent under consideration is not a necessary condition for achieving the pursued learning goal (e.g., because it can be replaced by another equivalent agent);
• the learning goal cannot be achieved without the involvement of exactly this agent.
Other aspects of involvement that could be applied in order to refine this criterion are its duration and intensity. It also has to be taken into consideration that an agent may be involved in several learning processes, because it may pursue several learning goals.

4. The agent-agent and agent-environment interaction required for realizing a learning process. Two obvious extremes are the following:
• learning requires only a minimal degree of interaction;
• learning would not be possible without extensive interaction.
This criterion could be further refined with respect to the frequency, persistence, level, pattern and type of interaction.

Many combinations of different values for these criteria are possible. For instance, one might think of a small group of agents that intensively interact (by discussing, negotiating, etc.) in order to understand why the overall system performance has decreased in the past, or of a large group of agents that loosely interact (by sometimes giving advice, sharing insights, etc.) in order to enhance the knowledge base of one of the group members. The above criteria characterize learning in multi-agent systems at the single-agent and the total-system level, and they define a large space of possible forms of multiagent learning.
Each point in this space represents a form of multiagent learning having its specific characteristics and its specific demands on the skills and abilities of the individual agents.
5.3 Petri Nets

Petri nets are a tool for the study of systems. Petri net theory allows a system to be modelled by a Petri net, a mathematical representation of the system. The model should encapsulate what the designer feels are the important aspects of the system to be developed.

5.3.1 Petri Net Structure and Graph

A Petri net is composed of four parts: a set of places P, a set of transitions T, an input function I, and an output function O. The input and output
functions relate transitions and places. The input function I is a mapping from a transition tj to a collection of places I(tj), known as the input places of the transition. The output function O maps a transition tj to a collection of places O(tj), known as the output places of the transition. Formally, the structure of a Petri net, defined by its places, transitions, input function, and output function, is given as follows:

Definition 5.3.1 (Petri Net Structure). A Petri net structure, C, is a four-tuple, C = (P, T, I, O), where
P = {p1, p2, ..., pn} is a finite set of places, n ≥ 1;
T = {t1, t2, ..., tm} is a finite set of transitions, m ≥ 1;
the set of places and the set of transitions are disjoint, i.e., P ∩ T = ∅;
I : T → P^∞ is the input function, a mapping from transitions to bags of places;
O : T → P^∞ is the output function, a mapping from transitions to bags of places.

The cardinality of the set P is n, and the cardinality of the set T is m. We denote an arbitrary element of P by pi, i = 1, ..., n, and an arbitrary element of T by tj, j = 1, ..., m. A place pi is an input place of a transition tj if pi ∈ I(tj); pi is an output place if pi ∈ O(tj). The inputs and outputs of a transition are collections of places called bags. A bag is a generalization of a set that allows multiple occurrences of an element. In other words, an element may be in a bag zero times (not in the bag), one time, two times, three times or any specified number of times. The use of bags, rather than sets, for the inputs and outputs of a transition allows a place to be a multiple input or a multiple output of a transition.

Corresponding to a place and a transition in a Petri net structure are a circle ◦ and a bar | in its graphical representation. For convenience, we simply call the circles places and the bars transitions. A Petri net graph is often useful in illustrating the concepts of Petri net theory.
Fig. 5.1 shows an example of a Petri net structure and its graphical representation. A line connecting a place and a transition is called an arc. Multiple lines connecting one place to one transition indicate that the place is a multiple input or output of the transition.

C = (P, T, I, O)
P = {p1, p2, p3, p4, p5, p6}
T = {t1, t2, t3, t4, t5}
I(t1) = {p1}                 O(t1) = {p2, p3}
I(t2) = {p3}                 O(t2) = {p3, p5, p5}
I(t3) = {p2, p3}             O(t3) = {p2, p4}
I(t4) = {p4, p5, p5, p5}     O(t4) = {p4}
I(t5) = {p2}                 O(t5) = {p6}
(a) Structure
(b) Graph
Fig. 5.1. A Petri net structure and its graph

5.3.2 Petri Net Markings

A marking µ is an assignment of tokens to the places of a Petri net. A token is a primitive concept for Petri nets, as places and transitions are. Tokens are assigned to, and can be thought to reside in, the places of a Petri net. The number of tokens and the places they reside in may change during the execution of a Petri net. The tokens are used to define the execution of a Petri net.

Definition 5.3.2 (Marking). A marking µ of a Petri net C = (P, T, I, O) is a function from the set of places P to the nonnegative integers N, µ : P → N. It can also be defined as an n-vector, n = |P|, such that µ = (µ(p1), µ(p2), ..., µ(pi), ..., µ(pn)). µ(pi) gives the number of tokens in place pi.

Definition 5.3.3 (Marked Petri Net). A marked Petri net M = (C, µ) is a Petri net structure C = (P, T, I, O) and a marking µ. It is sometimes written as M = (P, T, I, O, µ).

On a Petri net graph, a token is represented by a small dot • in a place. In cases where the number of tokens µ(p) assigned to place p is large, the convention is to write the number inside the place.
5.3.3 Rules for Petri Net Execution

The execution of a Petri net is controlled by the number and distribution of tokens in the Petri net. Tokens reside in the places and control the executions of the transitions of the net. A Petri net executes by firing transitions. A transition is said to fire by removing tokens from its input places and creating new tokens which are distributed to its output places. A transition can fire provided it is enabled. A transition is enabled if each of its input places has at least as many tokens in it as there are arcs from the place to the transition. Multiple tokens are needed for multiple input arcs. The tokens in the input places which enable a transition are its enabling tokens. For example, if the only inputs to transition t4 are places p1 and p2, i.e., input bag I(t4) = {p1, p2}, then t4 is enabled if p1 has at least one token and p2 has at least one token. For a transition t7 with input bag I(t7) = {p6, p6, p6}, place p6 must have at least three tokens to enable t7.

A transition fires by removing all of its enabling tokens from its input places and then depositing into each of its output places one token for each arc from the transition to the place. A transition t3 with I(t3) = {p2} and O(t3) = {p7, p13} is enabled whenever there is at least one token in place p2. Transition t3 fires by removing one token from place p2 and depositing one token in place p7 and one token in place p13 (its outputs). Extra tokens in place p2 are not affected by firing t3, although they may enable additional firings of t3. A transition t2 with I(t2) = {p21, p23} and O(t2) = {p23, p25, p25} fires by removing one token from p21 and one token from p23 and then depositing one token in p23 and two tokens in p25 (since p25 occurs twice in the output bag O(t2)). Note that firing a transition will in general change the marking µ of the Petri net to a new marking µ′.
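The enabling and firing rules can be sketched as follows. This is a minimal illustration, assuming that bags are stored as `collections.Counter` objects; the class name `PetriNet` and its methods are hypothetical. Transitions t2 and t7 use the input and output bags quoted above, except that the output bag of t7 is not given in the text and is left empty here as an assumption.

```python
from collections import Counter

class PetriNet:
    """Marked Petri net: I and O map each transition to a bag of places."""

    def __init__(self, inputs, outputs, marking):
        self.inputs = {t: Counter(bag) for t, bag in inputs.items()}
        self.outputs = {t: Counter(bag) for t, bag in outputs.items()}
        self.marking = Counter(marking)   # number of tokens per place

    def enabled(self, t):
        # Each input place needs at least as many tokens as arcs to t.
        return all(self.marking[p] >= n for p, n in self.inputs[t].items())

    def fire(self, t):
        if not self.enabled(t):
            raise ValueError(f"transition {t} is not enabled")
        for p, n in self.inputs[t].items():    # remove enabling tokens
            self.marking[p] -= n
        for p, n in self.outputs[t].items():   # deposit one token per arc
            self.marking[p] += n

net = PetriNet(
    inputs={"t2": ["p21", "p23"], "t7": ["p6", "p6", "p6"]},
    outputs={"t2": ["p23", "p25", "p25"], "t7": []},  # O(t7) is assumed empty
    marking={"p21": 1, "p23": 1, "p6": 2},
)
net.fire("t2")   # p21 loses its token, p23 keeps one, p25 receives two
```

With only two tokens in p6, `net.enabled("t7")` is false, which matches the requirement of three tokens stated above.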
Since only enabled transitions can fire, the number of tokens in each place always remains nonnegative when a transition is fired. Firing a transition can never try to remove a token which is not there. If there are not enough tokens in any input place of a transition, then the transition is not enabled and cannot fire. Transition firings can continue as long as there exists at least one enabled transition. If ever there is no enabled transition, the execution halts.

To summarize, the execution rules described above only specify when a transition is enabled and how tokens are distributed and re-distributed when an enabled transition fires. In building a system using Petri nets, in general, one needs to complete the Petri net design with a formulation of ‘liveness’ conditions that trigger the actual firing of each enabled transition; this is an important but application-dependent design issue.

5.3.4 Example 1: Role Level

This example is taken from [44]. It illustrates how a role-assignment supervisor is designed using Petri nets. With the role of a robot fixed as goalkeeper,
the supervisor, according to the game situation, assigns the role of attacking or defending to each of the other two robots. To read the game situation, the supervisor continually receives feedback information on the robots’ postures and the ball’s position.

Supervisor Model. The following places and transitions are defined for a Petri net model C^s of the supervisor:
Structure       : C^s = (P, T, I, O).
Places          : P = {P1, P2, P3, P4, P5, P6, P7, P8}.
Transitions     : T = {T1, T2, T3, T4, T5, T6}.
Input function  : I(T1) = {P1}, I(T2) = {P2}, I(T3) = {P1, P3}, I(T4) = {P4, P5}, I(T5) = {P7, P8}, I(T6) = {P2, P6}.
Output function : O(T1) = {P2}, O(T2) = {P1}, O(T3) = {P1, P4, P8}, O(T4) = {P3}, O(T5) = {P6}, O(T6) = {P2, P5, P7}.

The following places are defined to model the supervisor:
P1 : robot 1 defending, robot 2 attacking,
P2 : robot 1 attacking, robot 2 defending,
P3 : robot 1 attacking,
P4 : robot 1 defending,
P5 : both robot 1 and robot 2 defending,
P6 : robot 2 attacking,
P7 : robot 2 defending,
P8 : both robot 1 and robot 2 attacking.
The transitions are defined as follows:
T1 : robot 1 is in a good position to attack,
T2 : robot 2 is in a good position to attack,
T3 : robot 1 is presently defending, but it is in a good position to attack, and it takes the attacking role,
T4 : robot 1 is presently attacking, but it is in a good position to defend, and it takes the defending role,
T5 : robot 2 is presently defending, but it is in a good position to attack, and it takes the attacking role, and
T6 : robot 2 is presently attacking, but it is in a good position to defend, and it takes the defending role.

The graph of this Petri net supervisor is shown in Fig. 5.2. To start with, there are three tokens, one each in P1, P4 and P6, meaning that initially robot 1 is defending and robot 2 is attacking. In the following instant, if transition T1 is fired, the token in P1 moves to P2 (see Fig. 5.3(a)); then T6 is fired, and the tokens in P2 and P6 move to P2, P5 and P7 (see Fig. 5.3(b)). When T4 fires, the tokens in P4 and P5 move to P3 (see Fig. 5.3(c)).

Fig. 5.2. A Petri-net graph for role assignment supervision

Fig. 5.3. A Petri-net supervisor for role assignment: transition firings and token redistributions (panels (a), (b) and (c))

The team robots used are assumed to be good at moving straight, but not so good at turning. Thus, when a robot is close to the ball and oriented towards it, it can attack more effectively than the other robot, and should change its role to attacking if it is not already attacking. For the Petri net supervisor, this has to be formalized as ‘liveness’ conditions for transitions T1 and T2. To do so for transition T1, with i ∈ {1, 2}, let
1. di be the distance between robot i and the ball;
2. θi be the angle between the heading direction of robot i and the ball.
This is depicted in Fig. 5.4.

Fig. 5.4. Role selection: who should attack? (shown: opponent goal, robots 1 and 2, distances d1 and d2, angles θ1 and θ2)

Then, an example of the ‘liveness’ condition for transition T1, which is used for assigning the role of attacking or defending to the two team robots, is given as follows: Suppose robot 1 plays the defending role while robot 2 plays the attacking role. Then, if

d1 < 2d2, −45° < θ1 < 45° and |θ1| < |θ2|,    (5.1)

the roles of robot 1 and robot 2 are interchanged. Note that by convention, the angle θ2 is positive if in counter-clockwise direction, the heading directional line is ‘leading’ (relative to the directional line from the robot’s centre to the ball), and negative if it is ‘lagging’.
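Condition (5.1) and the firing sequence T1, T6, T4 described for the supervisor can be put together in a short sketch. The input and output bags are those listed for the supervisor model; the function names `should_swap`, `enabled` and `fire`, and the numerical sensor readings, are hypothetical and chosen only for illustration.

```python
from collections import Counter

def should_swap(d1, theta1_deg, d2, theta2_deg):
    """Condition (5.1): robot 1 (defending) should take over the attack."""
    return (d1 < 2 * d2
            and -45 < theta1_deg < 45
            and abs(theta1_deg) < abs(theta2_deg))

# Supervisor structure as listed in the text (bags written as lists).
I = {"T1": ["P1"], "T2": ["P2"], "T3": ["P1", "P3"],
     "T4": ["P4", "P5"], "T5": ["P7", "P8"], "T6": ["P2", "P6"]}
O = {"T1": ["P2"], "T2": ["P1"], "T3": ["P1", "P4", "P8"],
     "T4": ["P3"], "T5": ["P6"], "T6": ["P2", "P5", "P7"]}

def enabled(marking, t):
    need = Counter(I[t])
    return all(marking[p] >= n for p, n in need.items())

def fire(marking, t):
    assert enabled(marking, t), f"{t} is not enabled"
    new = Counter(marking)
    new.subtract(Counter(I[t]))   # remove the enabling tokens
    new.update(Counter(O[t]))     # deposit tokens in the output places
    return +new                   # unary plus drops places with zero tokens

marking = Counter({"P1": 1, "P4": 1, "P6": 1})   # initial marking
# Hypothetical vision readings: distances in pixels, angles in degrees.
if should_swap(d1=30, theta1_deg=10, d2=40, theta2_deg=-60):
    for t in ("T1", "T6", "T4"):   # firing order described in the text
        marking = fire(marking, t)
```

After the three firings the tokens sit in P2, P3 and P7, i.e., robot 1 is attacking and robot 2 is defending, which agrees with the place descriptions of the supervisor model.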
The condition for T2 is similar, but with the two robot IDs swapped in (5.1). All other transitions can fire immediately once enabled.

5.3.5 Example 2: Action Level

The example of Section 5.3.4 is extended to modelling what a robot should do in an assigned role of defending, goalkeeping or attacking. The overall DECIDE or role-action structure is as given in Fig. 5.5.

Fig. 5.5. A role-action structure for Petri net supervision and control (blocks: Supervisor, Role assignment, Attacking controller, Defending controller, Goalkeeping controller)
The example Petri net controllers for the defending and goalkeeping robots are given.

Defending Robot Controller. The principle to apply is simple and not unlike human soccer play: the robot assigned to defend should kick the ball away from its own team goal before any opponent robot can kick it. Assuming that the left half of the playground is the opponent side, it is reasonable that the defending robot must move to the right side of the ball as soon as possible, so as to get behind the ball. Four simple situations are assumed for the defending robot (controller):
(a) defending robot behind the ball,
(b) defending robot kicks the ball,
(c) self goal position, and
(d) defending robot in contact with ball and behind.
Fig. 5.6 depicts the four key situations. In situation (a), the defending robot is in a probable position to kick; in situation (b), it is kicking the ball; in situation (c), it is in front of the ball and facing its own team goal (self goal position), so it needs to be careful to avoid a self goal; and in situation (d), it is in contact with the ball.

[Fig. 5.6: four panels, one per situation, each showing the defending robot, the ball and the team goal.]

Defence Control. ‘Angle’ is used to refer to the angle between the heading direction of the defending robot and the ball. ‘Distance’ is used to refer to the distance between the defending robot and the ball in pixels. In what follows, the values for ‘Angle’ and ‘Distance’ are as fixed by the designer.

In situation (a), the defending robot controller should command the robot to move to the ball and kick it. In situation (b), if ‘Angle’ is above 45°, or ‘Distance’ is more than 25 pixels, the defending robot controller transits to situation (a). In situation (a), if the ball is on the right side of the defending robot (deemed a self goal position), the defending robot controller should transit to situation (c). In situation (b), if the defending robot fails to kick the ball, the controller transits to situation (c). In situation (c), the controller commands the defending robot to move sideways, and if it comes behind the ball without touching it, the controller transits to situation (a). In situation (c), if ‘Distance’ is below 15 pixels, the controller transits to situation (d), in which case it should command the robot to move away from the ball until ‘Distance’ is above 25 pixels, before transiting to situation (c).

Defence Control Model. The following places and transitions are defined for a Petri net model C^d of the defending robot controller. Note that the key situations described above are represented by places Pi, i ∈ {1, 2, 3, 4}.
Structure       : C^d = (P, T, I, O).
Places          : P = {P1, P2, P3, P4, P5, P6, P7, P8}.
Transitions     : T = {T1, T2, T3, T4, T5, T6, T7}.
Input function  : I(T1) = {P2, P5}, I(T2) = {P1, P6}, I(T3) = {P1, P7}, I(T4) = {P3, P5}, I(T5) = {P4, P7}, I(T6) = {P3, P8}, I(T7) = {P4, P7}.
Output function : O(T1) = {P1}, O(T2) = {P2}, O(T3) = {P3}, O(T4) = {P1}, O(T5) = {P3}, O(T6) = {P4}, O(T7) = {P3}.
The following places are defined to model the defending robot controller:
P1 : moves behind the ball,
P2 : kicks the ball,
P3 : tries to escape from self goal position,
P4 : in contact with the ball and behind, so it has to move far away from the ball,
P5 : not in a good position to kick,
P6 : in a good position to kick,
P7 : in self goal position,
P8 : in front of the ball (self goal position).

The transitions are defined as follows:
T1 : tries to kick the ball, though it is not in a good position to kick,
T2 : in front of the ball and at the following instant it is in a good position to kick,
T3 : in front of the ball, and moving to a self goal position,
T4 : in self goal position and escaping from that,
T5 : misses the ball, and is in front of the ball,
T6 : in self goal position, and then in contact with the ball and behind, and
T7 : away from the ball and behind, but still in a self goal position.

It should be clear that the purpose of such a controller is to keep track of and influence the situational occurrences as modelled by transitions, according to situations modelled by (tokenized) places that the defending robot finds itself in. With the above defined places and transitions, the Petri-net graph for the defending robot controller is arrived at, as shown in Fig. 5.7. The ‘liveness’ condition for each enabled transition can be easily formulated from the descriptions above.
Fig. 5.7. A Petri-net graph for defending robot control
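The situation changes described under Defence Control can also be read as a plain state-transition function, sketched below. This is not the Petri net itself, only a simplified illustration. The thresholds (45 degrees, 25 and 15 pixels) are the designer-fixed values quoted in the text; the function name `next_situation`, the boolean vision flags, the use of the absolute angle, the definition of a ‘good position to kick’ and the order in which conditions are checked are all assumptions.

```python
def next_situation(s, angle_deg, distance_px, ball_on_right, kick_failed,
                   behind_ball):
    """Return the next situation 'a'..'d' of the defending controller.

    The boolean inputs are hypothetical flags derived from vision feedback.
    """
    if s == "a":                         # behind the ball
        if ball_on_right:                # deemed a self goal position
            return "c"
        if abs(angle_deg) <= 45 and distance_px <= 25:
            return "b"                   # assumed 'good position to kick'
        return "a"
    if s == "b":                         # kicking the ball
        if kick_failed:
            return "c"
        if abs(angle_deg) > 45 or distance_px > 25:
            return "a"
        return "b"
    if s == "c":                         # self goal position, moving sideways
        if distance_px < 15:
            return "d"
        if behind_ball:
            return "a"
        return "c"
    if s == "d":                         # in contact: back off beyond 25 px
        return "c" if distance_px > 25 else "d"
    raise ValueError(f"unknown situation {s!r}")
```

For example, `next_situation("b", 50, 10, False, False, True)` returns `"a"`, because ‘Angle’ exceeds 45 degrees while the robot is in the kicking situation.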
Note that in the control execution of this Petri net controller, a token would be put in an auxiliary place Pi , i ∈ {5, 6, 7, 8} provided the visual feedback information asserts that the situation (or condition) the place characterizes (or is predicated on) becomes true. Such places are said to be controllable. Goalkeeping Robot Controller. The principle of goalkeeping also follows human soccer play: the goalkeeping robot should block or kick the ball away from its own team goal. In this example, the goalkeeping controller is designed to react to situations characterized according to the distance between the team goal and the ball. The three situations considered are (a) far distance, (b) medium distance, and (c) in goal area.
In the following, the distance values for these situations are as fixed by the designer. ‘Far distance’ means the distance between the goal and the ball is above 60 pixels, ‘medium distance’ means it is above 20 pixels but within 60 pixels, and ‘in goal area’ means, it is below 20 pixels. A subroutine predictor() predicts the ball direction using a linear equation. It is assumed that the ball always moves in a straight line. Using the past four positions and the current position of the ball, the predictor can derive an equation for the line of movement of the ball using a curve fitting algorithm. Fig. 5.8 illustrates the role of predictor().
Fig. 5.8. Predictor of target point of the ball (shown: past ball positions, current ball position, position ‘TARGET’, goal keeper).
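A possible form of predictor() is sketched below, assuming an ordinary least-squares line fit through the four past positions and the current position of the ball. The text only says that a curve fitting algorithm is used, so the choice of least squares is an assumption, as are the helper name `fit_line`, the parameter `goal_x` (the goal line is taken to be the vertical line x = goal_x in image coordinates) and the sample coordinates.

```python
def fit_line(points):
    """Least-squares fit of y = m*x + c through a list of (x, y) points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if denom == 0:
        # All x values are equal: the ball moves parallel to the goal line.
        raise ValueError("ball path is parallel to the goal line")
    m = (n * sxy - sx * sy) / denom
    c = (sy - m * sx) / n
    return m, c

def predictor(past_positions, current_position, goal_x):
    """Predict the point where the fitted ball path meets x = goal_x."""
    m, c = fit_line(list(past_positions) + [current_position])
    return goal_x, m * goal_x + c

# Hypothetical pixel coordinates: four past positions plus the current one.
past = [(10, 50), (20, 52), (30, 54), (40, 56)]
target = predictor(past, (50, 58), goal_x=150)
```

The sample points lie on the line y = 0.2x + 48, so the predicted point on the goal line has a y-coordinate of about 78 pixels. A ball that moves parallel to the goal line never reaches it, which is why that case is rejected explicitly.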
Goalkeeping Control. The goalkeeping robot would only move within the goal area, along a line parallel to the team goal line. The goalkeeping robot controller performs the following:
(a) It commands the goalkeeping robot to move to the centre of the goal for ‘kick off’ when the game commences/resumes.
(b) It commands the goalkeeping robot to guard the ‘TARGET’ position by moving to and staying in between the ball and the ‘TARGET’ position, called the ‘TARGET’-intercept position, when the ball is in the opponent’s half of the playground, i.e., at a far distance from the team goal. The ‘TARGET’ position is the point where the predicted path of the moving ball intersects the goal line. The predicted path is determined by predictor().
(c) It commands the goalkeeping robot to move to the ‘ASSUME’ position when the ball is in the team’s half of the playground but outside the goal area, i.e., at a medium distance from the team goal. The ‘ASSUME’ position refers to the point of intersection of the goalkeeping robot’s predetermined path (which is parallel to the goal line) and the straight line that passes through the current position of the ball and the centre of the goal (line).
(d) It commands the goalkeeping robot to (try to) kick the ball away from the team goal when the ball is in the goal area.

Goalkeeping Control Model. The following places and transitions are defined for a Petri net model C^g of the goalkeeping robot controller.
Structure       : C^g = (P, T, I, O).
Places          : P = {P1, P2, P3, P4, P5, P6, P7, P8, P9}.
Transitions     : T = {T1, T2, T3, T4, T5, T6, T7, T8}.
Input function  : I(T1) = {P1, P5, P6}, I(T2) = {P2, P6, P9}, I(T3) = {P1, P5, P7}, I(T4) = {P3, P7, P9}, I(T5) = {P1, P5, P8}, I(T6) = {P4, P8, P9}.
Output function : O(T1) = {P2}, O(T2) = {P1}, O(T3) = {P3}, O(T4) = {P1}, O(T5) = {P4}, O(T6) = {P1}, O(T7) = {P3}, O(T8) = {P4}.
The following places are defined to model the goalkeeping robot controller: P1 P2 P3 P4 P5 P6 P7 P8 P9
: goal keeper moving to the centre of the goal, : goal keeper moving to the ‘TARGET’-intercept position, : goal keeper moving to the ‘ASSUME’ position, : goal keeper moving toward the ball (it looks like as if it is going to kick the ball), : ball is coming to team goal area, : ball is far away, : ball is at medium distance, : ball is in goal area, : ball is moving away.
The transitions are defined as follows: T1 T2 T3 T4 T5 T6
: : : : : :
ball ball ball ball ball ball
is is is is is is
at far distance and coming towards the goal, at far distance and moving away, at medium distance and coming in, at medium distance and moving away, in goal area and coming in, in goal area and moving away,
5.4 Q-Learning: A Model-Free Reinforcement Learning Method
159
T7 : ball is at far distance, but coming to medium distance, and T8 : ball is at medium distance, but coming to goal area. With these defined places and transitions, the Petri-net graph for the goalkeeping robot controller is arrived at, as shown in Fig. 5.9. P5 T1
Fig. 5.9. A Petri-net graph for goalkeeping robot control
The places, P5 , P6 , P7 , P8 , and P9 are controllable places, meaning that a token would be put in such a place provided the visual feedback information asserts that the situation (or condition) the place characterizes (or is predicated on) becomes true. For example, two tokens are created, one each in P5 and P6 (see Fig. 5.10(a)), when the ball is approaching the team goal area from a far distance; this enables T1 , which when fired, redistributes the tokens to P2 accordingly (see Fig. 5.10(b)).
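The token game described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' controller: the names `INPUTS`, `OUTPUTS`, `enabled` and `fire` are hypothetical, I(T1) is read as {P1, P5, P6}, and T7 and T8 are left out because their input sets are not listed in the text.

```python
from collections import Counter

# Input and output functions of C_g for T1..T6, as listed in the text.
# T7 and T8 are omitted because their input sets are not given.
INPUTS = {
    "T1": ("P1", "P5", "P6"),  # assumed reading of I(T1)
    "T2": ("P2", "P6", "P9"),
    "T3": ("P1", "P5", "P7"),
    "T4": ("P3", "P7", "P9"),
    "T5": ("P1", "P5", "P8"),
    "T6": ("P4", "P8", "P9"),
}
OUTPUTS = {
    "T1": ("P2",), "T2": ("P1",), "T3": ("P3",),
    "T4": ("P1",), "T5": ("P4",), "T6": ("P1",),
}


def enabled(marking, transition):
    """A transition is enabled when every one of its input places holds a token."""
    return all(marking[place] >= 1 for place in INPUTS[transition])


def fire(marking, transition):
    """Consume one token per input place and produce one per output place."""
    if not enabled(marking, transition):
        raise ValueError(f"{transition} is not enabled")
    new_marking = Counter(marking)
    new_marking.subtract(INPUTS[transition])
    new_marking.update(OUTPUTS[transition])
    return +new_marking  # unary plus drops places left with zero tokens


# Goalkeeper at the goal centre (P1); vision reports the ball far away (P6)
# and approaching (P5), so T1 is enabled and firing it marks P2.
marking = Counter({"P1": 1, "P5": 1, "P6": 1})
after_t1 = fire(marking, "T1")
```

Representing the marking as a multiset keeps the firing rule to two operations, and a transition with a missing input token is rejected rather than silently ignored.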
Fig. 5.10. A Petri-net control for robot goalkeeping: a transition firing and token redistributions, (a) before and (b) after T1 fires

5.4 Q-Learning: A Model-Free Reinforcement Learning Method

Reinforcement learning is the problem faced by an agent that learns how to act through trial-and-error interactions with a dynamic environment. It can be seen as a way of programming agents by reward and punishment without the need to specify or model how the task is to be achieved.

5.4.1 Standard Reinforcement Learning

In the standard reinforcement-learning model, an agent is connected to its environment in the SENSE-DECIDE-ACT paradigm, abstracted as in Fig. 5.11. On each step of interaction, the agent receives as input i, some indication of the current state s of the environment; the agent then chooses an action a (see footnote 1) to generate as output. The action changes the state of the environment, and the value of this state transition is communicated to the agent through a scalar reinforcement (or reward) signal r. The agent’s decision-making mechanism should choose actions that tend to increase the long-run sum of values of the reinforcement signal. It can learn to do this over time by systematic trial and error, guided by a wide variety of algorithms [45]. We shall, however, concentrate only on Q-learning, a classic model-free algorithm for reinforcement learning.
Fig. 5.11. The standard reinforcement-learning model
Formally, the model consists of
• a discrete set of environment states S,
• a discrete set of agent actions A, and
• a set of scalar reinforcement signals, typically {0, 1} or a set of real numbers.

Fig. 5.11 also includes an input function, which determines how the agent views the environment state; we will assume that it is an identity function (that is, i = s, implying the agent perceives the exact state of the environment). The agent’s job is to find a policy π, mapping states to actions, that maximizes some long-run measure of reinforcement. Notice that after choosing an action, the agent is told the immediate reward and the subsequent state, but is not told which action would have been in its best long-term interests. To act optimally, the agent must actively gather useful experience about the possible system states, actions and rewards.

Footnote 1: Note that the term action is used here in a broader sense than that defined under the hierarchical control architecture in Chapter 4.

5.4.2 Q-Learning

Q-learning is one of the simplest and most promising reinforcement learning methods. It provides agents with the capability of learning to act optimally in Markovian domains by experiencing the consequences of actions without the need to build maps of the domains. Consider the following finite state, finite action Markov decision problem: at each discrete time step t, the agent observes the state $s_t \in S$ of the Markov process, chooses its action $a_t \in A(s_t)$, where $A(s_t)$ is the set of actions available at state $s_t$, receives a probabilistic reward $r_{t+1}$, whose mean value $R_{s_t}(a_t)$ depends only on the state and action, and the state of the environment changes probabilistically at t + 1 to state $s'$ according to the law:

$$\mathrm{Prob}[\, s_{t+1} = s' \mid s_t, a_t \,] = P_{s_t s'}[a_t]. \tag{5.2}$$
The task facing the agent is to determine an optimal policy, one that maximizes the cumulative discounted expected reward $R_{s_t}(a_t)$ for performing an action $a_t \in A(s_t)$ at every state $s_t \in S$, modelled by

$$R_{s_t}(a_t) = E\Big[ \sum_{j=0}^{\infty} \gamma^j r_{t+1+j} \Big], \tag{5.3}$$
where γ, 0 ≤ γ < 1, is a discount factor. By discounted reward, we mean that a reward received j + 1 time steps after time t is worth only $\gamma^j$ times as much as the reward $r_{t+1}$ received one step after t (for 0 < γ < 1). Under a policy π, the value V of state s is

$$V^{\pi}(s) \equiv R_s(\pi(s)) + \gamma \sum_{s'} P_{ss'}[\pi(s)]\, V^{\pi}(s') \tag{5.4}$$
because the agent expects to receive $R_s(\pi(s))$ immediately after performing the action which policy π recommends, and then moves to a state that is ‘worth’ $V^{\pi}(s')$, with probability $P_{ss'}[\pi(s)]$. The theory of dynamic programming assures us that there is at least one optimal stationary policy $\pi^*$ such that

$$V^*(s) \equiv V^{\pi^*}(s) = \max_{a} \Big\{ R_s(a) + \gamma \sum_{s'} P_{ss'}[a]\, V^{\pi^*}(s') \Big\}. \tag{5.5}$$
Dynamic programming provides a number of methods for calculating $V^*$ and the corresponding $\pi^*$, assuming that $R_s(a)$ and $P_{ss'}[a]$ are known. For a policy π, define Q values (or state-action values) as

$$Q^{\pi}(s, a) = R_s(a) + \gamma \sum_{s'} P_{ss'}[a]\, V^{\pi}(s'). \tag{5.6}$$
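When $R_s(a)$ and $P_{ss'}[a]$ are known, Eq. (5.5) can be solved by repeated substitution (value iteration), and Eq. (5.6) then gives the Q values. The following Python sketch does this for a hypothetical two-state, two-action problem; the function names, the toy rewards and transitions, and the stopping tolerance are all assumptions made for illustration.

```python
def q_values(R, P, V, gamma):
    """Eq. (5.6): Q[s][a] = R[s][a] + gamma * sum over s2 of P[s][a][s2] * V[s2]."""
    return [[R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
             for a in range(len(R[s]))]
            for s in range(len(R))]


def value_iteration(R, P, gamma, tol=1e-12):
    """Iterate the right-hand side of Eq. (5.5) until the values stop changing."""
    V = [0.0] * len(R)
    while True:
        V_new = [max(row) for row in q_values(R, P, V, gamma)]
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new
        V = V_new


# Hypothetical problem with deterministic transitions:
# in state 0, action 0 stays (reward 0) and action 1 moves to state 1 (reward 1);
# in state 1, action 0 stays (reward 2) and action 1 moves to state 0 (reward 0).
R = [[0.0, 1.0], [2.0, 0.0]]
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
gamma = 0.5

V_star = value_iteration(R, P, gamma)          # approximately [3, 4]
Q_star = q_values(R, P, V_star, gamma)         # approximately [[1.5, 3], [4, 1.5]]
policy = [max(range(len(q)), key=q.__getitem__) for q in Q_star]
```

Solving the toy problem by hand gives the same fixed point: $V^*(1) = 2/(1 - 0.5) = 4$ and $V^*(0) = 1 + 0.5 \cdot 4 = 3$, so the greedy policy moves to state 1 and stays there.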
In other words, the $Q^{\pi}(s, a)$ value is the expected discounted reward for executing an action a at state s and following policy π thereafter. The objective in Q-learning is to estimate the Q values for an optimal policy. For convenience, define these as $Q^*(s, a) \equiv Q^{\pi^*}(s, a)$, ∀s, a. It is straightforward to show that $V^*(s) = \max_a Q^*(s, a)$ and that if $a^*$ is an action at which the maximum is attained, then an optimal policy can be formed as $\pi^*(s) \equiv a^*$. Herein lies the utility of the Q values: if an agent can learn them, it can easily decide what it is optimal to do. Although there may be more than one optimal policy or $a^*$, the $Q^*$ values are unique.

In Q-learning, an agent learns through its experience, which consists of a sequence of distinct stages or episodes (also called ‘trials’). In the nth episode, n ≥ 1, the agent
• observes its current state $s_t$,
• selects and performs an action $a_t$,
• observes the subsequent state $s_{t+1}$,
• receives an immediate payoff $r_{t+1}$ and
• adjusts its $Q_{n-1}(s, a)$ values at state $s_{t+1}$, using a learning factor $\alpha_n$ and discount factor γ, according to

$$Q_n(s, a) = \begin{cases} (1 - \alpha_n)\, Q_{n-1}(s, a) + \alpha_n \big[ r_{t+1} + \gamma \max_{a_{t+1}} \{ Q_{n-1}(s_{t+1}, a_{t+1}) \} \big] & \text{if } s = s_t \text{ and } a = a_t, \\ Q_{n-1}(s, a) & \text{otherwise.} \end{cases} \tag{5.7}$$
Note that strictly speaking, the notations $s_t$ and $a_t$ should be replaced by $s_{t,n}$ and $a_{t,n}$, respectively, to more appropriately denote the state and action at time instant t of episode n, and $r_{t+1}$ should be replaced by $r_{t+1,n}$ to denote the reward at a next time instant t + 1 of episode n, when the agent has just entered a new state $s_{t+1,n}$ due to taking action $a_{t,n}$ at state $s_{t,n}$. The term $\max_{a_{t+1}} \{ Q_{n-1}(s_{t+1}, a_{t+1}) \}$ is the best the agent thinks it can do from state $s_{t+1}$. Of course, in the early stages of learning, the Q values may not accurately reflect the policy they implicitly define. The initial Q values, $Q_0(s, a)$, for all states s ∈ S and actions a ∈ A are either set to zero or assumed to be known.
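The update of Eq. (5.7) touches only the visited state-action pair, which makes a tabular implementation very short. The sketch below is one possible rendering under stated assumptions: the environment (`REWARD`, `NEXT`) is a hypothetical deterministic two-state problem, actions are chosen uniformly at random, and a constant learning factor is used instead of an episode-dependent $\alpha_n$.

```python
import random


def q_update(Q, s, a, r, s_next, alpha, gamma):
    """Apply Eq. (5.7) to the visited pair (s, a); all other entries stay unchanged."""
    best_next = max(Q[s_next])
    Q[s][a] = (1 - alpha) * Q[s][a] + alpha * (r + gamma * best_next)


# Hypothetical deterministic environment, indexed as [state][action].
REWARD = [[0.0, 1.0], [2.0, 0.0]]
NEXT = [[0, 1], [1, 0]]


def learn(steps=5000, alpha=0.5, gamma=0.5, seed=0):
    """Run Q-learning with purely random exploration, starting from Q_0 = 0."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0], [0.0, 0.0]]
    s = 0
    for _ in range(steps):
        a = rng.randrange(2)
        r, s_next = REWARD[s][a], NEXT[s][a]
        q_update(Q, s, a, r, s_next, alpha, gamma)
        s = s_next
    return Q


Q = learn()
greedy = [max(range(2), key=row.__getitem__) for row in Q]
```

For this toy problem the optimal values can be worked out by hand as $Q^*(0,\cdot) = (1.5, 3)$ and $Q^*(1,\cdot) = (4, 1.5)$, and the learned table approaches them without the agent ever being given the transition law.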
5.4.3 Example 1: Role Level

This example illustrates how a role-assignment supervisor is designed using Q-learning [44].

Problem Formulation. The states, actions and rewards are defined as follows:
1. The distance between the attacking robot and the ball is classified into 5 states (see Fig. 5.12): $r_{a0}, r_{a1}, r_{a2}, r_{a3}, r_{a4}$.
2. The angle between the forward direction of the attacking robot and the direction of the ball is classified into 7 states: $\theta_0, \theta_1, \theta_2, \theta_3, \theta_4, \theta_5, \theta_6$.

Fig. 5.12. States of the attacking robot (figure labels: distance rings $r_{a0}$ < 10 cm, $r_{a1}$ < 20 cm, $r_{a2}$ < 30 cm, $r_{a3}$ < 40 cm, $r_{a4}$ > 40 cm; angular sectors $\theta_0$ = 30°, $\theta_1$ = 30°, $\theta_2$ = 45°, $\theta_3$ = 90°, $\theta_4$ = 90°, $\theta_5$ = 45°, $\theta_6$ = 30°)
3. The distance between the defending robot and the ball is classified into 3 states (see Fig. 5.13): $r_{d0}, r_{d1}, r_{d2}$.
4. The position of the ball in the playground is classified into 9 states (see Fig. 5.14): $b_0, b_1, b_2, b_3, b_4, b_5, b_6, b_7, b_8$.

Fig. 5.13. States of the defending robot (figure labels: $r_{d0}$ < 20 cm, $r_{d1}$ < 40 cm, $r_{d2}$ > 40 cm)

Fig. 5.14. States of the ball (figure labels: regions $b_0$ to $b_8$ and the ball velocity $v_B$)

Assuming that the two team robots involved are technically identical, then 5 × 7 × 3 × 9 = 945 (composed) states ($s_m$, m = 1, 2, ..., 945) would need to be considered. Each state is denoted by $(r_{a i_a}, \theta_{j_a}, r_{d i_d}, b_{i_b})$, where a and d indicate the roles of attacking and defending, respectively, $0 \le i_a \le 4$, $0 \le j_a \le 6$, $0 \le i_d \le 2$ and $0 \le i_b \le 8$. In each state, the following two decision actions are considered.
• Action $a_0$: change of roles between the attacking robot and the defending robot.
• Action $a_1$: no change of roles between the attacking robot and the defending robot.

Fig. 5.15 shows one of the states. If the supervisor chooses the action of not changing roles, the attacking robot would keep moving behind the ball. The time τ taken by whichever of the two robots first reaches and kicks the ball is determined by the product of the image frame sampling time and the number of frame samples that capture the robot’s movement from its current position to the kick position. This is deemed the complete effect of the decision action taken that brings the transition from one state to another. The reward for taking an action $a_t$ at state $s_t$, recorded at a next state $s_{t+1}$, is then computed as inversely related to the time τ, given by

$$r_{t+1}(s_t, a_t) = \frac{\beta_1}{\tau + \beta_2}, \tag{5.8}$$

where $\beta_1$ and $\beta_2$ are constants.
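As a small illustration of Eq. (5.8), the reward can be computed directly from a frame count. The helper names `kick_time` and `reward`, and the numbers in the usage lines, are hypothetical; only the formula itself comes from the text.

```python
def kick_time(n_frames, sampling_period):
    """tau: number of image frames until the kick, times the frame sampling period (s)."""
    return n_frames * sampling_period


def reward(tau, beta1, beta2):
    """Eq. (5.8): the reward falls as the time to kick the ball grows."""
    if tau + beta2 <= 0:
        raise ValueError("tau + beta2 must be positive")
    return beta1 / (tau + beta2)


# Illustrative values only: a kick after 30 frames versus 20 frames at 33 ms per frame.
slow = reward(kick_time(30, 0.033), beta1=2.0, beta2=1.0)
fast = reward(kick_time(20, 0.033), beta1=2.0, beta2=1.0)
```

The constant $\beta_2$ keeps the reward bounded by $\beta_1 / \beta_2$ as τ approaches zero, so a very quick kick cannot produce an arbitrarily large reward.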
Fig. 5.15. A Q-learning state (ra3 , θ1 , rd2 , b5 ) of a role-level strategy for robot soccer
The Q-learning Process. To illustrate the offline process of learning whether the role-assignment supervisor should exchange roles of the two robots in a particular state, experiments, using state s617 = (ra3 , θ1 , rd2 , b5 ) as depicted in Fig. 5.15, were carried out [44]. The following parameters were used.
Learning factor: $\alpha_n = 1/n$
Discount factor: $\gamma = 0.3$
Constants: $\beta_1 = 2$, $\beta_2 = 1$
Image sampling period: $T_S$ = 33 ms
Initial Q-values for all s ∈ S: $Q_0(s, a_0) = 0.5$ (for role change), $Q_0(s, a_1) = 0.6$ (for no role change)
Initially, robot 1 was assigned an attacking role, and robot 2, a defending role. Consider taking the decision action a1 (no role change) at state $s_{617}$. Then, in the 1st episode, robot 1 kicked the ball; it took about 27 sampling instances (0.891 s) to reach the ball and kick it. The reward for taking decision a0 was

$$r_1(s_{617}, a_0) = \frac{2}{0.891 + 1} = 1.058.$$

Thus, applying Eq. (5.7), $Q_1(s_{617}, a_0) = 1.058 + (0.3 \times 0.6) = 1.238$. This concluded the first episode of learning when action a0 at state $s_{617}$ was taken. Repeating for 29 more episodes, $Q_{30}(s_{617}, a_0) = 1.616$. Consider taking the decision action a0 (role change) at state $s_{617}$. Then, in the 1st episode, robot 2 kicked the ball; it took 23 sampling instances (0.759 s) to reach the ball and kick it. The reward for taking decision a1 was

$$r_1(s_{617}, a_1) = \frac{2}{0.759 + 1} = 1.137.$$

Applying Eq. (5.7), $Q_1(s_{617}, a_1) = 1.317$. This concluded the first episode of learning when action a1 at state $s_{617}$ was taken. Repeating for 29 more episodes, $Q_{30}(s_{617}, a_1) = 1.024$. Based on these (not necessarily optimal) Q-learning results, since $Q_{30}(s_{617}, a_0) > Q_{30}(s_{617}, a_1)$, $\pi(s_{617}) = a_0$, that is, the supervisor would exchange the robots’ roles in state $s_{617}$.
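The first-episode figures above can be checked numerically. The sketch below assumes a mixed-radix numbering of the composed states, which is not stated in the text but does map $(r_{a3}, \theta_1, r_{d2}, b_5)$ to 617, and it assumes that the successor state still carries the initial Q values 0.5 and 0.6; the function names are hypothetical.

```python
N_THETA, N_RD, N_B = 7, 3, 9   # numbers of angle, defender-distance and ball-position classes


def state_index(i_a, j_a, i_d, i_b):
    """Assumed mixed-radix numbering of the state (r_a, theta, r_d, b)."""
    return ((i_a * N_THETA + j_a) * N_RD + i_d) * N_B + i_b


def first_episode(n_frames, ts=0.033, beta1=2.0, beta2=1.0, gamma=0.3, q0=(0.5, 0.6)):
    """Reward from Eq. (5.8) and Q_1 from Eq. (5.7) with alpha_1 = 1/1."""
    tau = n_frames * ts
    r = beta1 / (tau + beta2)
    q1 = r + gamma * max(q0)   # the (1 - alpha_1) * Q_0 term vanishes when alpha_1 = 1
    return r, q1


r_27, q_27 = first_episode(27)   # robot 1 kicks after 27 frames: about 1.058 and 1.238
r_23, q_23 = first_episode(23)   # robot 2 kicks after 23 frames: about 1.137 and 1.317
```

Because $\alpha_1 = 1$, the first update discards the initial value of the visited pair entirely; the initial Q values enter only through the $\gamma \max$ term.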
5.5 Neural Networks

Neural networks are techniques based on nonlinear mapping functions, where the key idea is to use functions of several variables. At first sight, these methods may seem quite complicated, but once we strip off the colourful language used, we will find that they are nothing but nonlinear functions. This section concentrates on the basic ideas of neural networks. Neural networks originated in attempts to make simple models for neural activity in the brain and attempts to make devices that could recognize and carry out simple learning tasks. A brief description that captures the essential ideas follows.

5.5.1 A Simple Neuron

A schematic diagram of a simple neuron is shown in Fig. 5.16. The system has many inputs and one output. If the output is y and the inputs are $u_0, u_1, u_2, \ldots, u_n$, the input-output relation is described by the following equation:

$$y = f(w_0 u_0 + w_1 u_1 + w_2 u_2 + \cdots + w_n u_n) = f\Big( \sum_{i=0}^{n} w_i u_i \Big), \tag{5.9}$$
Fig. 5.16. Schematic diagram of a simple neuron
where the numbers wi are called weights, w0 = 1 and u0 is the threshold. The function f is a so-called sigmoid function, an example of which is illustrated in Fig. 5.17.
Fig. 5.17. Two sigmoid functions, with λ1 = 0, λ2 = 1
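Such a function can be represented as follows:

$$f(x) = \frac{1}{\lambda_2} \left( \frac{e^{\sigma x} - e^{-\sigma x}}{e^{\sigma x} + e^{-\sigma x}} + \lambda_1 \right) \quad \text{for } x = \sum_{i=0}^{n} w_i u_i, \tag{5.10}$$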
where σ is a steepness parameter, and $\lambda_1$, $\lambda_2$, with $\lambda_1 \ge 0$ and $\lambda_2 > 0$, set the open range of f(x) as

$$f(x) \in \left( \frac{\lambda_1 - 1}{\lambda_2},\ \frac{\lambda_1 + 1}{\lambda_2} \right). \tag{5.11}$$
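Eqs. (5.9) to (5.11) translate almost directly into code, since the fraction of exponentials in Eq. (5.10) is the hyperbolic tangent of σx. The following Python sketch uses hypothetical function names and illustrative weights; it is meant only to make the formulas concrete.

```python
import math


def sigmoid(x, sigma=1.0, lam1=0.0, lam2=1.0):
    """Eq. (5.10); the fraction of exponentials equals tanh(sigma * x)."""
    return (math.tanh(sigma * x) + lam1) / lam2


def neuron(weights, inputs, **shape):
    """Eq. (5.9): y = f(sum of w_i * u_i), with index 0 carrying the threshold term."""
    x = sum(w * u for w, u in zip(weights, inputs))
    return sigmoid(x, **shape)


# Illustrative weights and inputs: x = 1.0*0.2 + 0.5*1.0 - 0.25*2.0 = 0.2
y = neuron([1.0, 0.5, -0.25], [0.2, 1.0, 2.0])

# With lam1 = 1 and lam2 = 2 the range of Eq. (5.11) becomes (0, 1), and the
# function coincides with the logistic curve 1 / (1 + exp(-2 * sigma * x)).
logistic_like = sigmoid(0.7, sigma=1.0, lam1=1.0, lam2=2.0)
```

Choosing $\lambda_1$ and $\lambda_2$ therefore only shifts and scales one underlying curve, which is why a single implementation covers both the symmetric range (−1, 1) and the range (0, 1).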
This model of a neuron is thus simply a non-linear function. Some special classes of functions can be approximated by Eq. (5.9).

5.5.2 Neural Network Structure
Fig. 5.18. A feedforward neural network
More complicated models can be obtained by connecting neurons together as shown in Fig. 5.18. This system is called a neural network or a neural net. The adjective feedforward is often added to indicate that the neurons are connected in a feedforward manner. There are also other types of neural networks. In the feedforward network, the input neurons are connected to a layer of neurons, the outputs of the neurons in the first layer are connected to the neurons in the second layer, etc., until we have the outputs. The intermediate layers in the net are called hidden layers. Each neuron is described by Eq. (5.9). The input-output relation of a neural net is thus a nonlinear static function. Conversely, we can consider a neural net as one way to construct a nonlinear function of several variables. The neural network representation implies that a nonlinear function of several variables is constructed from two components, namely, a single nonlinear function, the sigmoid function (5.10), which is a scalar function of one variable; and linear operations. It is thus a simple way to construct a nonlinearity from simple operations. One reason why neural networks are interesting is that practically all continuous functions can be approximated by neural networks having one hidden layer. It has been found practical to use more hidden layers because then, fewer weights can be used [29].
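The layered construction can be sketched as a loop that feeds each layer's outputs to the next. In this Python illustration the layer sizes (5 inputs, 3 hidden neurons, 2 outputs), the random weights and the function names are assumptions; only the 5 inputs and 2 outputs are suggested by Fig. 5.18, and the sigmoid is Eq. (5.10) with σ = 1, $\lambda_1 = 0$, $\lambda_2 = 1$.

```python
import math
import random


def layer_output(weights, thresholds, inputs):
    """Each neuron applies Eq. (5.9): a sigmoid of a weighted sum plus a threshold term."""
    return [math.tanh(th + sum(w * u for w, u in zip(row, inputs)))
            for row, th in zip(weights, thresholds)]


def forward(network, inputs):
    """Pass the signal through the layers in order; there are no feedback connections."""
    signal = list(inputs)
    for weights, thresholds in network:
        signal = layer_output(weights, thresholds, signal)
    return signal


def random_network(sizes, seed=0):
    """One (weights, thresholds) pair per layer, drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)
    return [([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
             [rng.uniform(-1, 1) for _ in range(n_out)])
            for n_in, n_out in zip(sizes, sizes[1:])]


net = random_network([5, 3, 2])                  # assumed sizes: 5 inputs, 3 hidden, 2 outputs
y = forward(net, [0.1, -0.4, 0.7, 0.0, 0.3])     # two outputs, each in (-1, 1)
```

The whole network is thus a composition of linear operations and one scalar nonlinearity, which is exactly the "nonlinear static function" view taken in the text.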
5.5.3 Learning

Notice that there are many parameters (weights) in a neural network. Assuming that there are n neurons in a layer, if all neurons are connected, $n^2$ parameters are then required to describe the connections between two layers. Another interesting property of a neural network is that there are so-called learning procedures: algorithms that make it possible to find parameters (weights) so that the function matches given input-output values. The parameters are typically obtained recursively by giving both an input value to the function and the desired output value. The weights are then adjusted so that the data is matched. A new input-output pair is then given and the parameters are adjusted again. The procedure is repeated until a good fit has been obtained for a reasonable data set. This procedure is called training a network. A popular method for training a feedforward network is called back propagation [46, Ch. 6]. For this reason, the feedforward net is sometimes called a back-propagation network.

5.5.4 Example 2: Action Level

This example illustrates how, using a feedforward neural network learning approach, an action-level module selects an action for a soccer robot based on the ‘level of activation of each of its actions’ due to the ‘disturbance’ of an opponent robot in a simplified one-a-side soccer match [47]. The general structure of the action selection mechanism (ASM), as shown in Fig. 5.19, is described in the following. The action-level module, called Intervention, is part of the ASM.
Fig. 5.19. Structure of the proposed ASM
The Action Selection Mechanism. Before illustrating the neural network approach, the function of each module in the ASM is first described as follows:

1. Action Set calculates the run-time parameters and determines the feasibility of actions. This module is characterized by the following:
• $I_r$: index set to indicate team agents.
• $I_o$: index set to indicate opponent agents.
• $A^j$: a set of actions given to agent j.
• $N_a^j$: the number of actions given to agent j.
• $I_a^j$: index set to indicate actions of agent j.
• $a_i^j$: action i given to agent j; $i \in I_a^j$.
From the above definitions, agent j has $I_a^j = \{1, 2, \ldots, N_a^j\}$ and $A^j = \{a_i^j \mid i \in I_a^j\}$, $j \in I_r$. As an example, consider agent 2 with four actions, $a_1^2$, $a_2^2$, $a_3^2$ and $a_4^2$. Then $A^2 = \{a_1^2, a_2^2, a_3^2, a_4^2\}$, or $A^2 = \{a_i^2 \mid i \in I_a^2\}$ with $I_a^2 = \{1, 2, 3, 4\}$.
Not all actions given to agent j are always executable or feasible. For example, if the ball is not found due to sensor noise, an action that requires information about the ball is not possible. The following are defined to assert the feasibility of an action:
• $f_i^j(k)$: feasibility of action i of agent j at time k, based on sensor values at that time instant; 0 or 1.
• $F_i^j(k)$: feasibility of action i of agent j at time k, determined by agent j considering $f_i^j(\cdot)$ values; 0 or 1.
• $df_i^j(k)$: the duration until time k, in which $f_i^j(k)$ holds the same value; nonnegative integer.
• $t\_df_i^j$: threshold of $df_i^j(k)$; nonnegative integer.
Specifically, agent j determines the feasibility $F_i^j(k)$, taking into account past sensor values as well as the current ones. First, $f_i^j(k)$ is determined by

$$f_i^j(k) = \begin{cases} 1 & \text{if action } i \text{ is feasible}, \\ 0 & \text{otherwise.} \end{cases}$$

Then, $F_i^j(k)$ is determined as follows:

$$F_i^j(k) = \begin{cases} f_i^j(k) & \text{if } df_i^j(k) > t\_df_i^j, \\ F_i^j(k-1) & \text{otherwise.} \end{cases}$$

2. Supervisor enforces that an agent takes a certain action or modifies the attributes of an action. The attributes of an action are the robot parameters such as speed, heading direction and obstacle-avoidance ‘enable’ setting. The following are needed to define the Supervisor module:
• $EN_i^j(k)$: enforcement of action i on agent j at time k; 0 or 1.
• $a_{i\_en}^j$: enforced action on agent j at time k; $i\_en \in I_a^j$, $a_{i\_en}^j \in A^j$.
• $AL_i^j(k)$: allowance for action i of agent j at time k; 0 or 1.
• $MO_i^j(k)$: modification for the attributes of action i selected at time k by Final Selection module; 0 or 1.
If action i is enforced (allowed, modified), $EN_i^j(k)$ ($AL_i^j(k)$, $MO_i^j(k)$) is 1. Otherwise, it is 0.

3. Internal Motive selects a suitable action for a game situation without considering the opponents. The following are needed to define the Internal Motive module:
• $A_{im}^j$: a set of actions given to Internal Motive module of agent j.
• $N_{im}^j$: the number of actions given to Internal Motive module of agent j.
• $I_{im}^j$: index set to indicate actions of $A_{im}^j$.
• $P_i^j$: the priority of action i of agent j; a natural number.
• $M_i^j(k)$: motivation level of action i of agent j at time k; nonnegative integer.
The higher the priority is, the larger the value of $P_i^j$. As an example, if $N_{im}^j = 4$, $P_i^j$ can be set to one of 1, 2, 3, and 4. From the above definitions, $A_{im}^j = \{a_i^j \mid i \in I_{im}^j\}$, $A_{im}^j \subset A^j$, and $I_{im}^j \subset I_a^j$. $M_i^j(k)$ and its maximum value are calculated as follows:

$$M_i^j(k) = P_i^j \cdot F_i^j(k) \cdot AL_i^j(k)$$

$$Max\_M^j(k) = \max_{i \in I_{im}^j} M_i^j(k).$$

Denote the action corresponding to $Max\_M^j(k)$ by $a_{i\_im}^j$. Then, $i\_im \in I_{im}^j$ and $a_{i\_im}^j \in A_{im}^j$. $a_{i\_im}^j$ is passed to the Final Selection module.

4. Intervention calculates the activation level of every action for each agent, due to the ‘disturbance’ by opponent robots. If the activation level due to the disturbance of opponent robots exceeds some threshold, it suppresses the Internal Motive module and selects one of its given actions for each agent. The following are needed to define the Intervention module:
• $A_{it}^j$: a set of actions given to Intervention module of agent j.
• $N_{it}^j$: the number of actions given to Intervention module of agent j.
• $I_{it}^j$: index set to indicate actions of $A_{it}^j$.
• $\phi_i^{jl}(k)$: the activation level of action i of agent j at time k, caused by opponent agents; $j \in I_r$, $l \in I_o$; real value between 0 and 1.
• $\theta_i^j$: threshold of $\phi_i^{jl}(k)$.
• $IT^j(k)$: a value representing whether the Intervention module interrupts the Internal Motive module or not; 0 or 1.
From the above, $A_{it}^j = \{a_i^j \mid i \in I_{it}^j\}$, $A_{it}^j \subset A^j$, and $I_{it}^j \subset I_a^j$. If the interrupt for Internal Motive module is necessary, $IT^j(k) = 1$. Otherwise, $IT^j(k) = 0$. The Intervention module interrupts the Internal Motive module only when $\phi_i^{jl}(k)$ is more than $\theta_i^j$, and action i is feasible and allowed by the Supervisor module. $IT^j(k)$ can be obtained through the following calculations involving several parameters:

$$\eta_i^{jl}(k) = u\big(\phi_i^{jl}(k) - \theta_i^j\big) \cdot F_i^j(k) \cdot AL_i^j(k)$$

$$Max\_\eta^j(k) = \max_{i \in I_{it}^j,\ l \in I_o} \eta_i^{jl}(k)$$

where

$$u(x) = \begin{cases} x & \text{if } x > 0, \\ 0 & \text{if } x \le 0, \end{cases}$$

$$IT^j(k) = \begin{cases} 1 & \text{if } Max\_\eta^j(k) > 0, \\ 0 & \text{if } Max\_\eta^j(k) = 0. \end{cases}$$
Denote the action corresponding to $Max\_\eta^j(k)$ by $a_{i\_it}^j$. Then, $i\_it \in I_{it}^j$ and $a_{i\_it}^j \in A_{it}^j$. $a_{i\_it}^j$ is transferred to the Final Selection module.

5. Final Selection takes into account the outputs of all the other modules and selects an action for each agent for the situation considered. The following are needed to define the Final Selection module:
• $a_{i\_fs}^j$: the action selected based on the outputs at time k of the Supervisor, Internal Motive and Intervention modules; $i\_fs \in I_a^j$, $a_{i\_fs}^j \in A^j$.
• $A\_FS^j(k)$: the final action selected by the Final Selection module (output of ASM); $A\_FS^j(k) \in A^j$.
• $da_i^j(k)$: the duration until time k, in which $a_i^j$ holds the same action; nonnegative integer.
• $t\_da_i^j$: threshold of $da_i^j(k)$; nonnegative integer.
$t\_da_i^j$ is a parameter to keep the persistence of actions, which is directly related to the stability of the overall system. In selecting action $a_{i\_fs}^j$, the priority among the modules of the ASM is set as follows:

Supervisor (highest) > Intervention > Internal Motive (lowest).

Fig. 5.20 shows how to select $a_{i\_fs}^j$ with this order of priority. $da_{i\_fs}^j(k)$ corresponding to $a_{i\_fs}^j$ is compared with $t\_da_{i\_fs}^j$, and $A\_FS^j(k)$ is finally determined as follows:

$$A\_FS^j(k) = \begin{cases} a_{i\_fs}^j & \text{if } da_{i\_fs}^j(k) > t\_da_{i\_fs}^j, \\ A\_FS^j(k-1) & \text{otherwise.} \end{cases}$$

Fig. 5.20. The process of selecting $a_{i\_fs}^j$
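The module equations above can be strung together in a short sketch. The Python below is an illustration under assumptions, not the ASM of [47]: the function names are hypothetical, the selection rule in `select` is a plain reading of the stated priority order (the detailed flow of Fig. 5.20 is not reproduced in the text), and the numbers in the usage lines are illustrative, with action indices 4 and 6 standing for the Intervention actions.

```python
def held(candidate, duration, threshold, previous):
    """Persistence rule used for F and A_FS: accept a new value only after it
    has lasted longer than the threshold; otherwise keep the previous one."""
    return candidate if duration > threshold else previous


def internal_motive(priority, feasible, allowed):
    """Return (action index, level) maximising M_i = P_i * F_i * AL_i."""
    levels = {i: priority[i] * feasible[i] * allowed[i] for i in priority}
    best = max(levels, key=levels.get)
    return best, levels[best]


def intervention(phi, theta, feasible, allowed):
    """Return (action index, IT) from activation levels phi[(i, l)]."""
    eta = {(i, l): max(level - theta[i], 0.0) * feasible[i] * allowed[i]
           for (i, l), level in phi.items()}   # max(x, 0) implements u(x)
    (best, _), max_eta = max(eta.items(), key=lambda item: item[1])
    return best, (1 if max_eta > 0 else 0)


def select(enforced, it_action, it_flag, im_action):
    """Assumed priority rule: Supervisor > Intervention > Internal Motive."""
    if enforced is not None:
        return enforced
    return it_action if it_flag == 1 else im_action


# Illustrative values: one opponent (l = 1), all actions feasible and allowed.
feasible = {i: 1 for i in range(1, 7)}
allowed = {i: 1 for i in range(1, 7)}
im_action, im_level = internal_motive({1: 4, 2: 3, 3: 2, 5: 1}, feasible, allowed)
it_action, it_flag = intervention({(4, 1): 0.9, (6, 1): 0.5}, {4: 0.8, 6: 0.8},
                                  feasible, allowed)
chosen = select(None, it_action, it_flag, im_action)   # Intervention overrides here
```

The same `held` helper serves both persistence rules because $F_i^j(k)$ and $A\_FS^j(k)$ share one pattern: a candidate replaces the previous output only once its duration exceeds the threshold.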
It is necessary for the Intervention module to determine $\phi_i^{jl}(k)$, a measure of the activation level due to the disturbance from opponent agent l, that it needs to select an action. But as it is not clear how an explicit model of the game can be obtained in terms of generally non-deterministic behaviour of opponent agents, approaches such as neural networks, fuzzy logic and Q-learning that can learn human judgements of what action is appropriate in various game situations are appropriate for the Intervention module.

Problem Formulation: One-A-Side MiroSoT Game. Consider the following ASM formulation for a one-a-side game, depicted in Fig. 5.21, where each team has only one robot player.
• $I_r = \{1\}$, $j \in I_r$,
• $I_o = \{1\}$, $l \in I_o$,
• $A^j$ = {Shoot, Position To Shoot, Sweep Ball, Stop, BlockBall},
• $I_a^j = \{1, 2, 3, 4, 5, 6\}$, $N_a^j = 6$, $i \in I_a^j$,
• $A_{im}^j$ = {Shoot, Position To Shoot, Stop},
• $N_{im}^j = 4$, $I_{im}^j = \{1, 2, 3, 5\}$,
• $P_1^j = 4$, $P_2^j = 3$, $P_3^j = 2$, $P_5^j = 1$,
• $A_{it}^j$ = {SweepBall, BlockBall},
• $N_{it}^j = 2$, $I_{it}^j = \{4, 6\}$,
• $t\_df_i^j = 1$, $\theta_i^j = 0.8$, $t\_da_i^j = 2$.
Fig. 5.21. A one-a-side MiroSoT game

In the feedforward neural network approach, the training data to collect are the situation-action pairs. The pairing is done using human judgement; more will be said of this later. The inputs $u_i$ to the neural network are situation variables that characterize a game situation at each time instant k. The 10 input variables used are as follows:
• the ball’s velocity; the opponent robot’s velocity;
• the four variables characterizing ball possession (depicted in Fig. 5.22), defined by
  – $\theta_{BR}$: angle between the team (or home) robot’s heading direction and its direction towards the ball,
  – $\theta_{BO}$: angle between the opponent robot’s heading direction and its direction towards the ball,
  – $D_{BR}$: distance between the team robot and the ball,
  – $D_{BO}$: distance between the opponent robot and the ball;
• the two variables representing the risk level of conceding a goal (depicted in Fig. 5.23), defined by
  – $D_{BRG}$: distance between the ball and the team (or home) goal,
  – $D_{IRG}$: distance between the centre of the goal and the intersection point of the team goal line and the line passing through the opponent robot and the ball;
• and the two variables representing the team robot’s winning score against the opponent robot (depicted in Fig. 5.23), defined by
  – $D_{BOG}$: distance between the ball and the opponent goal,
  – $D_{IOG}$: distance between the centre of the goal and the intersection point of the opponent goal line and the line passing through the team robot and the ball.
Fig. 5.22. Four situation variables characterizing ball possession
Fig. 5.23. Four situation variables representing the team (or home) robot’s winning score against the opponent robot and the risk level of conceding a goal
The sigmoid function (see footnote 2) used as the activation function for $\phi_i^{jl}(k)$ is

$$f(x) = \frac{1}{1 + e^{-ax}} \in (0, 1).$$

Footnote 2: Note that this sigmoid function can be obtained by setting $\lambda_1 = 1$, $\lambda_2 = 2$, and $a = 2\sigma$ in Eq. (5.10).
In other words, the output of each simple neuron in the net represents the activation level of one of the two actions in Ajit , where SweepBall is an action done to kick away the ball and BlockBall is done to block the ball and avoid conceding a goal. Thus, the feedforward neural network to be set up for the Intervention module has 10 inputs ui and 2 outputs yj . In [47], 2 hidden layers were used, with the first and second hidden layers consisting of 12 nodes and 6 nodes, respectively. The complete net built was a 10 input, 2 output, 2 hidden layer, fully-connected, feedforward neural network. The error back propagation algorithm [46], one of the supervised learning methods, was used to train the network depicted in Fig. 5.24.
Fig. 5.24. Structure of the feedforward neural network
Neural Network Training. As mentioned earlier, the training data to collect are the situation-action pairs; each training data pair consists of an input vector describing a situation and an action-output vector in which the activation value of the most desired action is set to 1 and those of the other actions are set to 0. The situation data is computed from raw data collected through a real robot soccer game. The exercise game is played between an agent with the ASM excluding the Intervention module, and an opponent agent which may have some kind of ASM or other control algorithms. The game raw data, such as the coordinate position and heading angle of each robot, and the coordinate position of the ball, are stored. Then, the human manager observes the replayed game on a computer 2-D graphics display, and assesses the situations where the activation level of an action by the team robot in response to the disturbance of the opponent agent is deemed to be very high. The best among the actions given to the Intervention module will be identified and the corresponding raw data for the situation will be stored. Note that if the replay were displayed with detailed three-dimensional graphics, instead of two-dimensional animation, the human manager would make better judgments on the situation-desired action pairing. The situation variables, which are the inputs to the neural net, can be readily calculated using the raw data for each situation. Using the training data obtained this way, the back-propagation algorithm is used to train the neural network. The trained net is then applied to the recorded game to check for training effectiveness. Once the performance is deemed to be within the desired levels, the neural network-based Intervention module can be deployed in a one-a-side soccer game, but under the set-up conditions in which the training data pairs are obtained.
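The training loop described above can be sketched with a plain-Python back-propagation routine for a 10-12-6-2 network with logistic units. Everything beyond the layer sizes is an assumption made for illustration: the squared-error criterion, the learning rate, the weight initialisation, the helper names, and above all the synthetic situation-action pairs, which stand in for the human-labelled game data that the text describes.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_net(sizes, rng):
    """One (weights, biases) pair per layer; weights[j][i] links input i to neuron j."""
    return [([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(net, u):
    """Return the activations of every layer, starting with the input itself."""
    acts = [list(u)]
    for W, b in net:
        acts.append([logistic(bj + sum(w * x for w, x in zip(row, acts[-1])))
                     for row, bj in zip(W, b)])
    return acts


def train_step(net, u, target, rate):
    """One back-propagation update for the squared error of a single pair."""
    acts = forward(net, u)
    delta = [(y - t) * y * (1 - y) for y, t in zip(acts[-1], target)]
    for layer in range(len(net) - 1, -1, -1):
        W, b = net[layer]
        inp = acts[layer]
        below = None
        if layer > 0:  # propagate the error before the weights are changed
            below = [sum(W[j][i] * delta[j] for j in range(len(W))) * inp[i] * (1 - inp[i])
                     for i in range(len(inp))]
        for j, row in enumerate(W):
            for i in range(len(row)):
                row[i] -= rate * delta[j] * inp[i]
            b[j] -= rate * delta[j]
        delta = below


def total_error(net, data):
    return sum(0.5 * sum((y - t) ** 2 for y, t in zip(forward(net, u)[-1], t_vec))
               for u, t_vec in data)


rng = random.Random(1)
net = init_net([10, 12, 6, 2], rng)
# Synthetic stand-in for situation-action pairs: one-hot target chosen by a simple rule.
data = []
for _ in range(8):
    u = [rng.random() for _ in range(10)]
    data.append((u, [1.0, 0.0] if u[0] > 0.5 else [0.0, 1.0]))

error_before = total_error(net, data)
for _ in range(300):
    for u, t_vec in data:
        train_step(net, u, t_vec, rate=0.5)
error_after = total_error(net, data)
```

Comparing `error_before` with `error_after` mirrors, in miniature, the check of training effectiveness mentioned in the text, where the trained net is applied back to the recorded game.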
5.6 Evolutionary Programming

Evolutionary Programming (EP), originally conceived by Lawrence J. Fogel in 1960, is a stochastic optimization strategy similar to genetic algorithms (GAs). GAs arose from a desire to model the biological processes of natural selection and population genetics, with the original aim of designing autonomous learning and decision-making systems. Other analogous algorithms that have also been proposed in the literature include evolution strategies (ES). Together, EP, GAs and ES have been classified under the umbrella group of evolutionary algorithms (EAs). EP can be better understood in relation to GAs. For this, an overview of GAs is first presented.

GAs are global, parallel, search and optimisation methods, founded on Darwinian principles. They work with a population of potential solutions to a problem as follows:
1. Each individual within the population represents a particular solution to the problem, generally expressed in some form of genetic code. The population is evolved, over generations, to produce better solutions to the problem.
2. Each individual within the population is assigned a fitness value, which expresses how good the solution is at solving the problem. The fitness value probabilistically determines how successful the individual will be at propagating its genes (its code) to subsequent generations. Better solutions are assigned higher values of fitness than worse performing solutions.
3. Evolution is performed using a set of stochastic genetic operators, which manipulate the genetic code. Most GAs include operators that select individuals for reproduction, produce new individuals based on those selected, and determine the composition of the population at the subsequent generation. Crossover and mutation are two well-known operators:
• The crossover operator involves the exchange of genetic material between chromosomes (parents), in order to create new chromosomes (offspring). • The mutation operator, in its simplest form, makes small, random, changes to a chromosome. 4. Once the new generation has been constructed, the processes that result in the subsequent generation of the population are begun once more. GAs explore and exploit the search space to find good solutions to the problem. It is possible for a GA to support several dissimilar, but equally good, solutions to a problem, due to its use of a population. However, despite the simple concepts involved, GAs can become quite complicated. Many variations have been proposed since the first GA was introduced. Rigorous mathematical analysis of a GA is difficult and is still incomplete. EP is similar to GAs, but instead, places emphasis on the behavioral linkage between parents and their offspring, rather than seeking to emulate specific genetic operators as observed in nature. The behavioral linkage can be obtained by using a zero mean Gaussian mutation. EP is similar to evolution strategies (ES), although the two approaches were developed independently. Like both ES and GAs, EP is a useful method of optimization when other techniques such as gradient descent or direct, analytical discovery are not possible. Combinatorial and real-valued function optimization in which the optimization surface or fitness landscape is ‘rugged’, possessing many locally optimal solutions, are well suited for evolutionary programming. 5.6.1 The EP Process For EP, there is an underlying assumption that a fitness landscape can be characterized in terms of variables, and that there is an optimum solution (or multiple such optima) in terms of those variables. For example, if one were trying to find the shortest path in a Traveling Salesman Problem, each individual (solution candidate) would be a path. 
The length of the path could be expressed as a number, which would serve as the individual’s fitness. The fitness landscape for this problem could be characterized as a hypersurface proportional to the path lengths in a space of possible paths. The goal would be to find the globally shortest path in that space, or more practically, to find very short tours in finite time.

The basic EP method involves three steps, with steps 2 and 3 repeated until an iteration threshold is exceeded or an adequate solution is obtained:

1. Choose an initial population of individuals (trial solutions) at random. The number of individuals in a population strongly affects the speed of optimization, but no definite answers are available as to how many individuals are appropriate (other than > 1) and how many are just wasteful.
5. How to Improve Intelligence?
2. Each individual is replicated into a new population. Each of these offspring is mutated according to a distribution of mutation types, ranging from minor to extreme with a continuum of mutation types in between. The severity of mutation is judged on the basis of the functional change imposed on the individuals.
3. Each offspring is assessed by computing its fitness. Typically, a stochastic competition is held to determine which individuals are retained for the population.

It should be pointed out that EP typically does not use crossover as a genetic operator. The pseudocode for algorithm EP is given in Fig. 5.25.
// Begin EP
// start with an initial time
t := 0;
// initialize a random population of individuals
initpopulation P(t);
// evaluate fitness of all initial individuals of population
evaluate P(t);
// test for termination criterion (time, fitness, etc.)
while not done do {
    // perturb the whole population stochastically
    P'(t) := mutate P(t);
    // evaluate its new fitness
    evaluate P'(t);
    // stochastically select the survivors from actual fitness
    P(t+1) := survive P(t), P'(t);
    // increase the time counter (also called generation counter)
    t := t + 1;
} // end while
// End EP

Fig. 5.25. Pseudocode of algorithm EP
As an example, the following is an EP program that attempts to minimize the function $f(x_1, x_2) = x_1^2 + x_2^2$.
Among several mutation methods, a Gaussian random number generator is used for this example; each individual consists of an instance of $(x_1, x_2)$, associated with a corresponding instance of standard deviations $(\eta_1, \eta_2)$.

1. Begin EP
2. t := 0;
3. initpopulation $(x_i, \eta_i)$, $\forall i \in \{1, \dots, \mu\}$, where
   • $x_i = (x_{i1}, x_{i2})$, and $x_{ij}$ is the j-th parameter, $j \in \{1, 2\}$, of the i-th individual,
   • $\eta_i = (\eta_{i1}, \eta_{i2})$, and $\eta_{ij}$ is the standard deviation of the j-th parameter of the i-th individual for Gaussian mutations, and
   • $\mu$ is the population size;
4. evaluate P(t) for each individual $(x_i, \eta_i)$, $\forall i \in \{1, \dots, \mu\}$;
5. while not done do {
   • P'(t) := mutate P(t);
     In this step, each parent $(x_i, \eta_i)$, $i = 1, \dots, \mu$, creates a single offspring $(x'_i, \eta'_i)$ as follows:
     $$\eta'_{ij} = \eta_{ij} \exp\big(\tau' N(0,1) + \tau N_j(0,1)\big),$$
     $$x'_{ij} = x_{ij} + \eta'_{ij} N_j(0,1),$$
     where
     – $x_{ij}$, $x'_{ij}$, $\eta_{ij}$ and $\eta'_{ij}$ denote the j-th parameter of the vectors $x_i$, $x'_i$, $\eta_i$ and $\eta'_i$, respectively;
     – $N(0,1)$ denotes a normally distributed one-dimensional random number with mean 0 and standard deviation 1, and $N_j(0,1)$ indicates that a new random number is drawn for each value of j.
     The factors $\tau$ and $\tau'$ are commonly set to $\big(\sqrt{2\sqrt{n}}\big)^{-1}$ and $\big(\sqrt{2n}\big)^{-1}$, where n is the number of parameters (here n = 2).
   • evaluate P'(t) for each individual $(x'_i, \eta'_i)$, $\forall i \in \{1, \dots, \mu\}$;
   • P(t + 1) := survive P(t), P'(t);
     In this step, the following are carried out:
     a) Pairwise comparisons over the union of parents $(x_i, \eta_i)$ and offspring $(x'_i, \eta'_i)$, $\forall i \in \{1, \dots, \mu\}$, done as follows: for each individual,
        – q opponents are chosen uniformly at random from all the parents and offspring;
        – if the individual is at least as good as the opponent (for this minimization problem, its value of f is no larger than the opponent’s), it receives a ‘win’.
     b) Selection of the $\mu$ individuals, out of $(x_i, \eta_i)$ and $(x'_i, \eta'_i)$, $\forall i \in \{1, \dots, \mu\}$, that have the most wins, to be parents of the next generation.
   • t := t + 1;
   } // end while
6. End EP
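The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' program: the function names, the population size, the number of opponents q, the initial range of x, the initial standard deviations of 1.0, and the tie-break by objective value when individuals have equal wins are all choices made here for the example. The position update uses the newly mutated standard deviation.

```python
import math
import random


def sphere(x):
    """Objective to minimize: f(x1, x2) = x1^2 + x2^2."""
    return sum(v * v for v in x)


def evolve(mu=20, q=5, generations=100, n=2, seed=0):
    """Self-adaptive EP with Gaussian mutation and q-opponent competition."""
    rng = random.Random(seed)
    tau = 1.0 / math.sqrt(2.0 * math.sqrt(n))
    tau_prime = 1.0 / math.sqrt(2.0 * n)
    # Each individual is a pair (x, eta); the ranges below are assumptions.
    pop = [([rng.uniform(-5.0, 5.0) for _ in range(n)], [1.0] * n)
           for _ in range(mu)]
    for _ in range(generations):
        offspring = []
        for x, eta in pop:
            g = rng.gauss(0.0, 1.0)  # one N(0,1) draw shared by all parameters
            new_eta = [e * math.exp(tau_prime * g + tau * rng.gauss(0.0, 1.0))
                       for e in eta]
            new_x = [xj + ej * rng.gauss(0.0, 1.0)
                     for xj, ej in zip(x, new_eta)]
            offspring.append((new_x, new_eta))
        union = pop + offspring
        cost = [sphere(x) for x, _ in union]
        wins = []
        for c in cost:
            opponents = [rng.randrange(len(union)) for _ in range(q)]
            # A 'win' means being at least as good (cost no larger).
            wins.append(sum(1 for o in opponents if c <= cost[o]))
        # Most wins first; equal wins are ordered by cost (an assumption).
        ranked = sorted(range(len(union)), key=lambda i: (-wins[i], cost[i]))
        pop = [union[i] for i in ranked[:mu]]
    return min(pop, key=lambda ind: sphere(ind[0]))
```

Calling `evolve()` returns the best `(x, eta)` pair found. Because the individual with the lowest cost always collects q wins, this particular sketch never loses its best solution between generations.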
5.6.2 EP and GAs

There are two important ways in which EP differs from GAs, although the two approaches are gradually merging into a unified algorithm.

Representation: There is no constraint on the representation of individuals in a population. The typical GA approach involves encoding the problem solutions as a string of representative tokens, the genome. In EP, the representation follows from the problem. A neural network can be represented in the same manner as it is implemented, for example, because the mutation operation does not demand a linear encoding. (In this case, for a fixed topology, real-valued weights could be coded directly as their real values, and mutation operates by perturbing a weight vector with a zero-mean multivariate Gaussian perturbation. For variable topologies, the architecture is also perturbed, often using Poisson-distributed additions and deletions.)

Genetic Operators: While crossover and mutation operators are needed in GAs, the mutation operation in EP simply changes aspects of the individual according to a statistical distribution which retains the behavioral linkage between parents and their offspring. Further, the severity of mutations is often reduced as the global optimum is approached. There is a certain tautology here: if the global optimum is not already known, how can the spread of the mutation operation be damped as the solutions approach it? Several techniques have been proposed and implemented which address this difficulty. The most widely studied one is the meta-evolutionary technique, in which the variance of the mutation distribution is itself subject to mutation by a fixed-variance mutation operator and evolves along with the individual.

EP uses stochastic competition while GAs typically use the roulette wheel method for selection; the stochastic characteristics of the two selection methods are similar.
5.6.3 EP and ES The first communication between the evolutionary programming and evolution strategies groups occurred in early 1992, just prior to the first annual EP conference. Despite their independent development over 30 years, they share many similarities. When implemented to solve real-valued function optimization problems, both typically operate on the real values themselves (rather than any coding of the real values as is often done in GAs). Multivariate zero mean Gaussian mutations are applied to each parent in a population and a selection mechanism is applied to determine which individuals to remove (i.e., ‘cull’) from the population. The similarities extend to the use of self-adaptive methods for determining the appropriate mutations to use – methods in which each parent carries not only a potential solution to the problem at hand, but also information on how it will distribute new trials
(offspring). Most of the theoretical results on convergence (both asymptotic and speed) developed for ES or EP also apply directly to the other. The main differences between ES and EP are:

1. Selection: EP typically uses stochastic selection via a competition. Each individual in the population faces competition against a preselected number of opponents and receives a ‘win’ if it is at least as good as its opponent in each encounter. The number of wins is counted. Selection then eliminates those individuals with the fewest wins. In contrast, ES typically uses deterministic selection in which the worst individuals are purged from the population based directly on their fitness evaluation.
2. Recombination: EP is an abstraction of evolution at the level of reproductive populations (i.e., species) based on phenotypic representation, and thus no recombination mechanisms are typically used because recombination does not occur between species (by definition: see Mayr’s biological species concept [48, p. 318]). In contrast, ES is an abstraction of evolution at the level of individual behavior. When self-adaptive information is incorporated, this is purely genetic information (as opposed to phenotypic) and thus some forms of recombination are reasonable, and many forms of recombination have been implemented within ES. The effectiveness of such operators depends on the problem at hand.

5.6.4 Example: Behaviour Level

This example illustrates how EP can be used to train univector fields for robot navigation [38]. The idea of generating univector fields for robot navigation has already been introduced in Section 4.6.2 of the previous chapter. To exploit the univector field F to achieve higher performance in robot control, the field has to be optimized. For this purpose, a grid net and a function approximator are developed. To start with, a grid of size b × a is located within the workspace³ as shown in Fig. 5.26(a).
The shape and density of the grid net can be varied in accordance with the application and the desired accuracy. A node is a point of intersection of the grid lines; a denser grid implies a larger number of nodes. $p_{i,j}$ is the (coordinate) position of node $(i, j)$ and $F_{i,j}$ represents the field vector at $p_{i,j}$. The set of angles of the univectors $F_{i,j}$ forms a $b \times a$ matrix, which is defined as the univector field matrix $\Phi$ as follows:

$$\Phi = \{\psi_{i,j} \mid 1 \le i \le b,\ 1 \le j \le a\}, \qquad (5.12)$$
where $\psi_{i,j}$ is the angle of vector $F_{i,j}$. To determine the field vector at an arbitrary position p, an interpolation operation is adopted. First, the four neighbouring nodes at positions $p_{i,j}$, $p_{i+1,j}$, $p_{i,j+1}$, and $p_{i+1,j+1}$ surrounding the point p are found. Then, as shown in Fig. 5.26(b), the distances $d_a$, $d_b$, $d_c$, and $d_d$ between these four nodes and p are computed. The interpolated field vector $F(p)$ and its angle $\psi(p)$ at p are calculated as follows:

$$F(p) = \bar{F} / \|\bar{F}\|, \qquad \psi(p) = \angle F(p), \qquad (5.13)$$

with

$$\bar{F} = \frac{(d_b d_c d_d)\, F_{i,j} + (d_a d_c d_d)\, F_{i,j+1} + (d_a d_b d_d)\, F_{i+1,j} + (d_a d_b d_c)\, F_{i+1,j+1}}{d_b d_c d_d + d_a d_c d_d + d_a d_b d_d + d_a d_b d_c},$$

where $\|\bar{F}\|$ is the magnitude of $\bar{F}$ and

$$d_a = \|p - p_{i,j}\|, \quad d_b = \|p - p_{i,j+1}\|, \quad d_c = \|p - p_{i+1,j}\|, \quad d_d = \|p - p_{i+1,j+1}\|.$$

Each node is thus weighted by the product of the distances from p to the other three nodes, so that the nearest node dominates.

Fig. 5.26. Grid net of the function approximator: (a) grid; (b) interpolation

³ In robot soccer, the playground is the workspace.
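The interpolation of Eq. (5.13) can be sketched as below. This is an illustrative reading rather than the authors' code: the function name and the argument layout (a list of four corner positions paired with their univector angles) are assumptions, and each corner is weighted by the product of the distances from p to the other three corners, so that F(p) tends to the field vector of a node as p approaches it. The common denominator and the normalization by the magnitude are omitted because neither changes the angle.

```python
import math


def interpolate_angle(p, corners):
    """Angle of the interpolated unit vector F(p).

    p       -- (x, y) query position
    corners -- four ((x, y), angle) pairs for the nodes surrounding p
    """
    d = [math.dist(p, pos) for pos, _ in corners]
    fx = fy = 0.0
    for k, (_, angle) in enumerate(corners):
        # Weight = product of the distances to the other three corners.
        w = math.prod(d[m] for m in range(4) if m != k)
        fx += w * math.cos(angle)
        fy += w * math.sin(angle)
    # Dividing by the sum of weights and by the magnitude leaves the
    # direction unchanged, so the angle can be taken directly.
    return math.atan2(fy, fx)
```

For instance, with corners at (0, 0), (0, 1), (1, 0) and (1, 1), querying exactly at (1, 0) returns the angle stored for that corner.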
Fi,j at any node (i, j) is known beforehand. Thus F (p) can be evaluated. F (p) represents an intermediate vector for Fi,j , Fi,j+1 , Fi+1,j , and Fi+1,j+1 vectors. As p approaches pi,j , F (p) converges to Fi,j . Thus, by setting the elements of the matrix {Fi,j |1 ≤ i ≤ b, 1 ≤ j ≤ a} to each of the node values, all the vectors in the field F can be fully determined by interpolation using Eq. (5.13). For example, consider the following 3 × 3 univector field matrix: π π π 2
Φ= π 3 4π
2 3 2π π 4
4
0 .
(5.14)
π 6
Fig. 5.27(a) shows the field vectors represented by the matrix $\Phi$ and Fig. 5.27(b) is the univector field calculated using Eq. (5.13). The optimization of the field vectors $F_{i,j}$ is discussed next.

Fig. 5.27. Simple example of grid net: (a) field vectors at the nodes; (b) interpolated univector field

Evolutionary Programming for the Grid Net. To optimize the univector field, the univector field matrix $\Phi$ is used as the data structure for an individual (i.e., a trial solution) of the population. In this example, the evaluation function is based on the elapsed time, the heading angle error, the positioning error, the distance from the obstacle, and the maximal angular acceleration. These criteria are merged to form an evaluation function $f(t_s, \theta_c, p, \dot{\omega})$ for each followed path as follows:

$$f(t_s, \theta_c, p, \dot{\omega}) = k_t t_s + k_d\, |\theta_c(t_s) - \theta_d| + f_t(p) + f_o(p) + f_a(\dot{\omega}), \qquad (5.15)$$
where $t_s$ is the elapsed time, $\theta_c(t_s)$ is the robot’s heading angle at that time, and $\theta_d$ is the desired heading angle at point g. This evaluation function is used for optimizing the univector field matrix $\{F_{i,j} \mid 1 \le i \le b,\ 1 \le j \le a\}$. The first term in the evaluation function ensures quick reachability of the desired point g, and the second term forces the robot to converge to the desired heading angle $\theta_d$ at point g. The third term $f_t(p)$ makes the robot move to the desired point g:

$$f_t(p) = \begin{cases} 0 & \text{if arrived at point } g \text{ within allowable error bound}, \\ T_p + \min_{t \in [0, t_s]} |p(t) - g| & \text{otherwise}, \end{cases} \qquad (5.16)$$

where $p(t)$ is the position of the robot centre at time t, point g is the desired position, and $T_p$ is a penalty value that is added when the robot does not arrive at point g. If the robot does not reach the desired position as indicated in (5.16), the smallest distance from the robot centre to the desired point, $\min_{t \in [0, t_s]} |p(t) - g|$, and $T_p$ are used to obtain $f_t(p)$ as a penalty.

The fourth term $f_o(p)$ prevents the robot from colliding with an obstacle and assumes the following values:

$$f_o(p) = \begin{cases} 0 & \text{if no obstacle collision}, \\ B_p + \max_{t \in \Omega} |p(t) - p_b| & \text{otherwise}, \end{cases} \qquad (5.17)$$

where $B_p$ is a penalty parameter, $\Omega \subset [0, t_s]$ refers to the time interval during which the robot is within an obstacle boundary, and $p_b$ is the closest point on the obstacle boundary from the robot centre. When the robot collides with an obstacle, the function $f_o(p)$ is calculated by projecting the robot trajectory nearest to the obstacle centre. The shortest distance of such a point from the periphery of the obstacle is used for getting the value of the function $f_o(p)$.

The last term $f_a(\dot{\omega})$ prevents the robot angular acceleration, $\dot{\omega}$, from exceeding its limit $\alpha_{max}$:

$$f_a(\dot{\omega}) = \begin{cases} 0 & \text{when } \dot{\omega} \text{ is within the limit } \alpha_{max}, \\ A_p + \max_{t \in [0, t_s]} |\dot{\omega}(t) - \alpha_{max}| & \text{otherwise}. \end{cases} \qquad (5.18)$$

In the computer simulation [38], the scaling factors $k_t$ and $k_d$ are taken as 1 and 5, respectively.
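To make the structure of Eqs. (5.15) to (5.18) concrete, the evaluation of one simulated run can be sketched as below. The function name and its arguments are hypothetical: the sketch assumes that a simulator has already reduced the run to a few summary numbers (elapsed time, final heading error, closest approach to g, and the worst obstacle and acceleration violations, with None meaning that no violation occurred). The default gains and penalty values are those reported for the simulation in [38].

```python
def evaluate_run(ts, heading_error, arrived, min_goal_dist=0.0,
                 max_penetration=None, max_accel_excess=None,
                 kt=1.0, kd=5.0, Tp=500.0, Bp=100.0, Ap=50.0):
    """Evaluation value f of one simulated path (smaller is better).

    ts               -- elapsed time
    heading_error    -- theta_c(ts) - theta_d at the end of the run
    arrived          -- True if g was reached within the error bound
    min_goal_dist    -- closest distance to g over the run, used if not arrived
    max_penetration  -- largest |p(t) - p_b| while inside an obstacle, or None
    max_accel_excess -- largest |omega_dot(t) - alpha_max| if the limit was
                        exceeded, or None
    """
    f_t = 0.0 if arrived else Tp + min_goal_dist
    f_o = 0.0 if max_penetration is None else Bp + max_penetration
    f_a = 0.0 if max_accel_excess is None else Ap + max_accel_excess
    return kt * ts + kd * abs(heading_error) + f_t + f_o + f_a
```

A run that satisfies all three constraints is scored only by its duration and final heading error, which matches the role of the first two terms as fine tuning.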
The penalty values $T_p$, $B_p$, and $A_p$ are taken as 500 cm, 100 cm, and 50 rad/s², respectively. The value of $T_p$ is set to be greater than the sum of the other two penalty values $B_p$ and $A_p$ in the evaluation function. The terms $f_t(p)$, $f_o(p)$, and $f_a(\dot{\omega})$ enforce the constraints, which ensure that the robot reaches the desired position without collision and moves without ripples. The remaining terms $k_t t_s$ and $k_d\, |\theta_c(t_s) - \theta_d|$ of (5.15) are used as fine tuning for short navigation time and the desired heading angle at point g. Once the univectors are optimized properly, the values of $f_t(p)$, $f_o(p)$, and $f_a(\dot{\omega})$ are all zero.

The penalty parameters $T_p$, $B_p$, and $A_p$ determine how the optimization progresses. At the first stage of optimization, the individuals that fail to drive the robot to the desired position are weeded out because $T_p$ is the strongest penalty. But if all the individuals violate this constraint, the term added to $T_p$ in (5.16), the minimum distance to g, becomes meaningful: through this term the individuals are inclined to approach the desired position. The terms added to $B_p$ and $A_p$ in Eqs. (5.17) and (5.18) are used in a similar manner. The penalty values $T_p$, $B_p$, and $A_p$ can be varied to suit the designer’s intentions.

Table 5.1. Algorithm EP for offline training of univector fields

1. Initialization
   a) t ←− 0
   b) Initialize population P_0
2. While (not termination condition) do
   a) t ←− t + 1
   b) q ←− 0
   c) While (not q = n_p) do
      i. q ←− q + 1
      ii. Mutate q-th univector field matrix
      iii. S(q) ←− 0
      iv. k ←− 0
      v. While (not k = n_s) do
         A. k ←− k + 1
         B. Simulate the robot navigation
         C. Calculate evaluation function f
         D. S(q) ←− S(q) + f
   d) Select the best n_p candidates using S() for population P_{t+1}.
3. End
Algorithm EP using the evaluation function in Eq. (5.15) is summarized in Table 5.1. In Table 5.1, $n_p$ is the number of individuals in a population and $n_s$ is the number of simulations per individual. The cumulative evaluation value $S(q)$ for the q-th individual is stored in $S()$, which is used to select the best $n_p$ individuals. Different termination conditions can be used. In this example, the optimization is terminated when the total number of generations exceeds a predefined limit. By this algorithm, a sub-optimal univector field matrix can be obtained. For mutation, the following self-adaptive Gaussian mutation [49], which is widely applied in optimization problems, is used:

$$\sigma'_{ij} = \sigma_{ij} \exp\big(\tau' G(0,1) + \tau G_{ij}(0,1)\big), \qquad m_{ij} = \sigma'_{ij}\, G_{ij}(0,1), \qquad \tau = \frac{1}{\sqrt{2\sqrt{K_v}}}, \qquad \tau' = \frac{1}{\sqrt{2 K_v}}, \qquad (5.19)$$

where $K_v$ is the number of variables in each individual of the population. $G(0,1)$ is a normally distributed random variable whose mean and variance are 0 and 1, respectively. The global factor $\exp(\tau' G(0,1))$ allows an overall change in the mutability and guarantees the preservation of all degrees of freedom, whereas $\exp(\tau G_{ij}(0,1))$ allows individual changes with a mean step size of $\sigma_{ij}$. The univector field matrix is updated using the following:

$$\psi(p_{ij}) \leftarrow \psi(p_{ij}) + k\, m_{i,j} + \frac{1-k}{4}\,\big(m_{i-1,j} + m_{i,j-1} + m_{i+1,j} + m_{i,j+1}\big), \qquad (5.20)$$
where 0 ≤ k < 1. The smoothing coefficient k suppresses the ripples in the fields. For details on constrained optimization by evolutionary programming, the reader is referred to [50].

Real-time Univector Field Navigation for Robot Soccer. Note that the optimization of the univector field is done separately for the two subfields (of the same scale for the same workspace). In optimizing the subfield for a target posture, the target point g and guidance point r are fixed; in particular, point g is fixed at (0, 0), and point r is fixed at $(x_r, 0)$. The length $|x_r|$ of the line gr depends on which action it implements. In training the subfield for obstacle avoidance, an obstacle of a known size is considered, with the centre of the obstacle (called the obstacle point here) positioned at (0, 0). In robot soccer, a robot is treated as (an object placed within the boundary of) a circular obstacle.

In deploying these optimized subfields in a real-time MiroSoT game, the desired heading angle $\theta_d$ of a team robot at an arbitrary point p, i.e., $\theta_d = \angle F(p)$, can then be determined by (some translational and rotational displacements of) these two subfields. Conceptually, such a transformation should be done in such a way that

1. the target point g and guidance point r of the subfield for a target posture coincide respectively with the actual target point and guidance point in the playground;
2. the obstacle point (and there are 5 of them for Small League MiroSoT) of each duplicate subfield for obstacle avoidance coincides with the actual centre of the obstacle in the playground.

All the 6 subfield univectors whose positions coincide at the actual robot’s position p are then mathematically ‘combined’ to yield the univector $F(p)$, and hence the desired heading angle $\angle F(p)$, as discussed in Section 4.6.2 and depicted in Fig. 4.21.
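Returning to the offline training step, the self-adaptive mutation of Eq. (5.19) and the smoothed update of Eq. (5.20) can be sketched together as follows. This is a hedged illustration: the function name, the nested-list representation of the matrices, and in particular the treatment of neighbours that fall outside the grid (they contribute zero here) are assumptions, since the text does not say how boundary nodes are handled.

```python
import math
import random


def mutate(psi, sigma, k=0.5, rng=random):
    """Return (new_psi, new_sigma) for one univector field matrix.

    psi   -- matrix of node angles (list of rows)
    sigma -- matrix of per-node mutation step sizes, same shape
    k     -- smoothing coefficient, 0 <= k < 1
    """
    rows, cols = len(psi), len(psi[0])
    kv = rows * cols  # number of variables in the individual
    tau = 1.0 / math.sqrt(2.0 * math.sqrt(kv))
    tau_prime = 1.0 / math.sqrt(2.0 * kv)
    g = rng.gauss(0.0, 1.0)  # global draw shared by all nodes
    new_sigma = [[s * math.exp(tau_prime * g + tau * rng.gauss(0.0, 1.0))
                  for s in row] for row in sigma]
    m = [[s * rng.gauss(0.0, 1.0) for s in row] for row in new_sigma]

    def at(i, j):
        # Assumption: neighbours outside the grid contribute nothing.
        return m[i][j] if 0 <= i < rows and 0 <= j < cols else 0.0

    new_psi = [[psi[i][j] + k * m[i][j]
                + (1.0 - k) / 4.0 * (at(i - 1, j) + at(i, j - 1)
                                     + at(i + 1, j) + at(i, j + 1))
                for j in range(cols)] for i in range(rows)]
    return new_psi, new_sigma
```

Spreading a fraction 1 - k of each perturbation over the four neighbours keeps adjacent angles correlated, which is how the update suppresses ripples in the field.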
5.7 Fuzzy Logic and Control Fuzzy control is a control paradigm that has received a lot of attention recently. In this section we will give a brief description of the key ideas. We will start with fuzzy logic, which has inspired the development.
5.7.1 Fuzzy Logic

Ordinary Boolean logic deals with quantities that are either true or false. Fuzzy logic is an attempt to develop a method for logical reasoning that is less sharp. This is achieved by introducing linguistic variables and associating them with membership functions, which take values between 0 and 1. In fuzzy control, the logical connectives ‘and’, ‘or’ and ‘not’ are operators on linguistic variables. These operations can be expressed in terms of operations on the membership functions of values of the linguistic variables. Consider two linguistic values, A and B, of a linguistic variable x, with the respective membership functions $\mu_A(x)$ and $\mu_B(x)$. The logical operations are defined by the following operations on the membership functions:

$$\mu_{A \text{ and } B}(x) = \min(\mu_A(x), \mu_B(x)),$$
$$\mu_{A \text{ or } B}(x) = \max(\mu_A(x), \mu_B(x)), \qquad (5.21)$$
$$\mu_{\text{not } A}(x) = 1 - \mu_A(x).$$

A linguistic value A whose membership function is zero everywhere except for one particular measured value $x_0$ of the linguistic variable x is called a crisp value. Formally, it is characterized by the following membership function:

$$\mu_A(x) = \begin{cases} 1 & \text{if } x = x_0, \\ 0 & \text{if } x \ne x_0. \end{cases} \qquad (5.22)$$
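The operations of Eq. (5.21) translate directly into code. The sketch below is illustrative only: the helper names and the two triangular membership functions `low` and `medium` are made up for the example and are not taken from the text.

```python
def triangle(a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu


def fuzzy_and(mu_a, mu_b):
    """'and' as the pointwise minimum of two membership functions."""
    return lambda x: min(mu_a(x), mu_b(x))


def fuzzy_or(mu_a, mu_b):
    """'or' as the pointwise maximum of two membership functions."""
    return lambda x: max(mu_a(x), mu_b(x))


def fuzzy_not(mu_a):
    """'not' as the complement of a membership function."""
    return lambda x: 1.0 - mu_a(x)


# Two hypothetical, overlapping linguistic values of one variable.
low = triangle(-10.0, 0.0, 10.0)
medium = triangle(0.0, 10.0, 20.0)
```

With these shapes, `fuzzy_and(low, medium)` is non-zero only on the overlap between 0 and 10, while `fuzzy_or(low, medium)` follows whichever of the two functions is larger.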
Assume for example that we want to reason about temperature. For this purpose we introduce the linguistic values cold, moderate, and hot, and we associate them with the membership functions shown in Fig. 5.28. The membership functions for the combined values ‘cold and moderate’ and ‘cold or moderate’ are also shown in the figure.

Fig. 5.28. Illustration of membership functions for fuzzy values. The upper diagram shows the membership functions of cold, moderate, and hot. The middle diagram shows the membership function for cold and moderate; the lower diagram shows the membership function for cold or moderate. (Horizontal axes run from -10 to 40; vertical axes from 0 to 1.)

5.7.2 A Fuzzy Controller

A block diagram of a fuzzy PD controller is shown in Fig. 5.29. The measured values of the linguistic variables, the control error e and the time derivative (or rate of change) of the error ce, are converted to so-called ‘linguistic values’ in a process called ‘fuzzification.’ This procedure converts continuous values (of the linguistic variables) to a collection of linguistic values. The number of linguistic values is typically quite small. Examples of linguistic values are: Negative Big (NB), Negative Medium (NM), Negative Small (NS), Negative Zero (NE), Zero (ZE), Positive Zero (PO), Positive Small (PS), Positive Medium (PM) and Positive Big (PB). The control strategy is expressed in terms of a function that maps linguistic variables to linguistic variables. This function is defined in terms of a set of if-then rules.

Fig. 5.29. A fuzzy PD controller (the fuzzifier receives e and ce; the inference engine uses the fuzzy rule base; the defuzzifier outputs u)

As an illustration, we give the rules for a PD controller where the error e and its derivative ce are each characterized by three linguistic values (N, Z, P) and the control input u is characterized by five linguistic values (NB, NM, ZE, PM, PB).

Rule 1: If e is N and ce is P, then u is ZE.
Rule 2: If e is N and ce is Z, then u is NM.
Rule 3: If e is N and ce is N, then u is NB.
Rule 4: If e is Z and ce is P, then u is PM.
Rule 5: If e is Z and ce is Z, then u is ZE.
Rule 6: If e is Z and ce is N, then u is NM.
Rule 7: If e is P and ce is P, then u is PB.
Rule 8: If e is P and ce is Z, then u is PM.
Rule 9: If e is P and ce is N, then u is ZE.
These rules can also be expressed in table form; see Table 5.2.

Table 5.2. Representation of the fuzzy PD controller as a table (entries are u)

  e \ ce    P     Z     N
  N         ZE    NM    NB
  Z         PM    ZE    NM
  P         PB    PM    ZE
The membership functions representing the linguistic values normally overlap (see Fig. 5.28). Due to this, several rules contribute to the control input u. The inferred output of each rule is aggregated. The aggregated output is represented by a (fuzzy) linguistic set. The linguistic set representing the control input is then mapped into a real number by an operation called ‘defuzzification.’ More details are given in the following. 5.7.3 Fuzzy Inference Many different shapes of membership functions can be used. In fuzzy control it is common practice to use overlapping triangular shapes like the ones shown in Fig. 5.28 for both the error and its derivative, and the control input.
Typically, only a few membership functions are involved in the inferencing of each rule for the measured variables. Fuzzy logic is used only to a moderate extent in fuzzy control. A key issue is to interpret logic expressions of the type that appears in the description of the fuzzy controller. Some special methods are used in fuzzy control. To describe these, we assume that $\mu_A$, $\mu_B$, and $\mu_C$ are the membership functions associated with the linguistic values A, B, and C. Furthermore, let x and y represent measurements. If the values $x_0$ and $y_0$ are measured, they are considered as crisp values. The fuzzy statement

If x is A and y is B

is then interpreted as the crisp (‘specific’) value

$$Z^0 = \min(\mu_A(x_0), \mu_B(y_0)), \qquad (5.23)$$

where ‘and’ is realized by the minimum operation on the membership functions. $\min(w_i, w_j)$ is called a T-norm operator and its arguments, $w_i$ and $w_j$, are called firing strengths. Instead of the min(.) operator, any other T-norm operator for implementing ‘and’, such as the algebraic product, bounded product or drastic product, can be used. The linguistic variable u defined by

If x is A and y is B then u is C

is interpreted as a linguistic set $C'$ with the membership function

$$\mu_{C'}(u) = \min(Z^0, \mu_C(u)). \qquad (5.24)$$
If there are several rules, as in the description of the PD controller, each rule is evaluated individually. The results obtained for each rule are aggregated using the ‘or’ operator. This corresponds to taking the maximum of the membership functions obtained for each individual rule. Similarly, instead of the maximum operator max(.), called a T-conorm operator, any other T-conorm operator for implementing ‘or’, such as the algebraic sum, bounded sum or drastic sum, can be used.

Fig. 5.30 is a graphical illustration for the case of the first two rules of the PD controller. The figure shows how the so-called qualified (induced) consequent membership function corresponding to each rule is constructed, and how the overall output membership function representing the control input is obtained by taking the maximum of the membership functions obtained from all rules.

The inference procedure described is called ‘min-max.’ This refers to the operations on the membership functions. Other inference procedures are also used in fuzzy control. The ‘and’ operation is sometimes represented by taking the product of two membership functions and the ‘or’ operation by taking an algebraic sum. Combinations of the schemes are also used. In this way, it is possible to obtain ‘product-max’ and ‘min-sum’ inference.
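A minimal sketch of min-max inference for the nine PD rules listed in Section 5.7.2 is given below. The rule table is taken from the text; everything else is assumed for illustration: the function names, the normalised universes of roughly [-1, 1] for e, ce and u, and the positions of the triangular membership functions.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical membership functions on normalised universes.
IN_SETS = {"N": (-2.0, -1.0, 0.0), "Z": (-1.0, 0.0, 1.0), "P": (0.0, 1.0, 2.0)}
OUT_SETS = {"NB": (-1.5, -1.0, -0.5), "NM": (-1.0, -0.5, 0.0),
            "ZE": (-0.5, 0.0, 0.5), "PM": (0.0, 0.5, 1.0),
            "PB": (0.5, 1.0, 1.5)}

# (e, ce) -> u, following Rules 1 to 9 of the fuzzy PD controller.
RULES = {("N", "P"): "ZE", ("N", "Z"): "NM", ("N", "N"): "NB",
         ("Z", "P"): "PM", ("Z", "Z"): "ZE", ("Z", "N"): "NM",
         ("P", "P"): "PB", ("P", "Z"): "PM", ("P", "N"): "ZE"}


def output_membership(e, ce, u):
    """Aggregated output membership at u for crisp inputs e and ce."""
    result = 0.0
    for (e_label, ce_label), u_label in RULES.items():
        # 'and' of the two premises: minimum (firing strength of the rule).
        w = min(tri(e, *IN_SETS[e_label]), tri(ce, *IN_SETS[ce_label]))
        # Clip the consequent at w, then aggregate with 'or': maximum.
        result = max(result, min(w, tri(u, *OUT_SETS[u_label])))
    return result
```

For e = -0.5 and ce = 0, two rules fire with strength 0.5 (Rules 2 and 5), so the aggregated output is the union of the NM and ZE sets, each clipped at 0.5.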
Fig. 5.30. Illustration of fuzzy inference with two rules using the min-max rule. (Upper panel: Rule 1, If e is N and ce is P, then u is ZE, with firing strengths w1 and w2 combined by min. Lower panel: Rule 2, If e is N and ce is Z, then u is NM, with firing strengths w3 and w4 combined by min.)
5.7.4 Defuzzification

Fuzzy inference results in a control input expressed as a linguistic set and defined by its membership function. To apply a control input to the real system, we must have a real value. Thus, the linguistic set defining the control input must be converted to a real number through the operation of ‘defuzzification.’ This can be done in several different ways. Consider an overall output linguistic set C with the membership function $\mu_C(u)$. Defuzzification by the ‘centroid of area’ method gives the value

$$u_0 = \frac{\int u\, \mu_C(u)\, du}{\int \mu_C(u)\, du}. \qquad (5.25)$$

Defuzzification by the ‘bisector of area’ method gives a real value $u_0$ that satisfies

$$\int_{-\infty}^{u_0} \mu_C(u)\, du = \int_{u_0}^{\infty} \mu_C(u)\, du. \qquad (5.26)$$
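Both defuzzification methods can be approximated numerically when the output membership function is sampled on a uniform grid. The sketch below assumes such a sampling; the function names are hypothetical, and the integrals of Eqs. (5.25) and (5.26) are replaced by plain sums over the samples.

```python
def centroid(us, mus):
    """Centroid of area for samples mus of the membership at points us."""
    area = sum(mus)
    if area == 0.0:
        raise ValueError("membership function is zero everywhere")
    return sum(u * m for u, m in zip(us, mus)) / area


def bisector(us, mus):
    """First sample point at which the accumulated area reaches one half."""
    area = sum(mus)
    if area == 0.0:
        raise ValueError("membership function is zero everywhere")
    running = 0.0
    for u, m in zip(us, mus):
        running += m
        if running >= area / 2.0:
            return u
    return us[-1]
```

For a symmetric output set the two methods agree; they differ when the aggregated set is skewed, in which case the centroid is pulled further towards the long tail than the bisector.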
5.7.5 Example: Behaviour Level This example implements a navigation controller (at the behaviour level) for a Shoot action using fuzzy logic control [40].
Fig. 5.31. Shooting from the left side when the line connecting the ball to the opponent goal is on the right (the allowed and not-allowed approaches are marked)
In the example, the following two constraints are considered:

• Constraint 1: The robot should approach the ball from the side opposite to that with the line connecting the ball and the opponent goal; this is depicted in Fig. 5.31. In other words, it should always approach the ball so as to ‘bump’ it towards the opponent goal.
• Constraint 2: The robot should avoid obstacles in the playground that are not very close to the ball.

The relative posture of a robot is characterized by three variables (ρ, ϕ, θ), where the polar coordinate (ρ, ϕ) is the robot’s position relative to the ball’s, as depicted in Fig. 5.32. These variables are needed to implement the shooting action in view of the abovementioned constraints.

Fig. 5.32. Variables for relative posture characterization

Overall Fuzzy Navigation Controller. The overall fuzzy navigation control structure is shown in Fig. 5.33. It consists of two sub-controllers organized in a two-level hierarchy. The higher-level sub-controller is a fuzzy path-planner; it generates a desired global path connecting the robot’s current position to the ball, without violating the two constraints. The lower-level sub-controller is a fuzzy posture-controller; it outputs the robot’s left and right wheel velocities to follow the desired path from the robot’s current posture.

Fuzzy Planner. The fuzzy planner generates a global path that meets the constraints by calculating the robot’s desired heading angle θd at each relative position (ρ, ϕ). It comprises two blocks: the destination block generates a path which leads to the destination (the ball) and satisfies Constraint 1; the obstacle block compensates θd for obstacle avoidance so as to satisfy Constraint 2.

Destination Block. This block determines the desired heading angle at each robot position relative to the ball, (ρ, ϕ). Fig. 5.34 shows the basic idea of constructing a path. A desired path is represented by a line extending to an arc; to move along a directional path, a robot’s desired heading angle θd at an arbitrary point on the path is the angle that the tangent at the point (in the direction of motion) makes with the X-axis. In Fig. 5.34, the turning radius Rmin is set to 5 cm, considering the size of the ball and that of the MiroSoT robot. The input variable membership functions are depicted in Fig. 5.35. The output θd has singleton values obtained at sampled positions as shown in Fig. 5.36; this figure shows the upper-half plane. Since the lower-half and upper-half planes are symmetrical about the X-axis, it suffices to consider the upper-half plane.
Fig. 5.33. Overall fuzzy control structure for the Shoot action
The input, output and rules for the destination block are defined as follows:

1. Input space (ρ, ϕ), relative position of the robot to the ball: ρ ∈ [0 cm, 60 cm], ϕ ∈ [0 deg., 180 deg.].
2. Output space (θd), desired heading angle: θd ∈ [-180 deg., 180 deg.] (indicated by arrows → in Fig. 5.36).
3. Rules: 49 rules are obtained for θd at sampled positions as shown in Fig. 5.36, with ϕ and ρ each characterized by seven linguistic values (NB, NM, NS, ZE, PS, PM, PB). Since the input space is uniformly divided, the rules are ‘sampled’ at the centre of each input region. The resultant rules for the destination block are represented in Table 5.3.
Fig. 5.34. Desired output

Table 5.3. Rules for the destination block (entries are θd; rows are ρ, columns are ϕ)

  ρ \ ϕ     NB       NM       NS       ZE       PS       PM      PB
  NB      -270.0   -240.0   -200.0   -170.0   -140.0   -20.0    0.0
  NM      -216.9   -201.5   -187.9   -180.0   -135.0   -30.0    0.0
  NS      -202.6   -180.0   -155.5   -126.9    -80.0   -34.2    0.0
  ZE      -196.3   -171.0   -143.6   -120.6    -76.9   -35.9    0.0
  PS      -192.7   -166.1   -137.6   -114.3    -71.8   -35.5    0.0
  PM      -190.4   -163.0   -134.0   -110.7    -69.0   -40.2    0.0
  PB      -188.8   -161.0   -131.6   -108.4    -67.4   -45.4    0.0
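How a rule table with singleton outputs yields a heading angle can be sketched as follows. The numbers are those of Table 5.3, read with rows as ρ and columns as ϕ. The rest is assumed for illustration and is not confirmed by the text: the seven triangular membership functions are taken to be evenly spaced with peaks from one end of each input range to the other, ‘and’ is the minimum, and the output is the firing-strength-weighted average of the singletons. Angles are averaged arithmetically, without wrap-around handling.

```python
# Singleton outputs theta_d; rows: rho = NB..PB, columns: phi = NB..PB.
THETA_D = [
    [-270.0, -240.0, -200.0, -170.0, -140.0, -20.0, 0.0],
    [-216.9, -201.5, -187.9, -180.0, -135.0, -30.0, 0.0],
    [-202.6, -180.0, -155.5, -126.9, -80.0, -34.2, 0.0],
    [-196.3, -171.0, -143.6, -120.6, -76.9, -35.9, 0.0],
    [-192.7, -166.1, -137.6, -114.3, -71.8, -35.5, 0.0],
    [-190.4, -163.0, -134.0, -110.7, -69.0, -40.2, 0.0],
    [-188.8, -161.0, -131.6, -108.4, -67.4, -45.4, 0.0],
]


def memberships(x, lo, hi, n=7):
    """Degrees of membership of x in n evenly spaced triangular sets.

    Assumption: peaks at lo, lo + step, ..., hi, each overlapping its
    neighbours so that the degrees always sum to one.
    """
    step = (hi - lo) / (n - 1)
    x = min(max(x, lo), hi)  # clamp to the input range
    return [max(0.0, 1.0 - abs(x - (lo + i * step)) / step) for i in range(n)]


def desired_heading(rho, phi):
    """Weighted average of the singleton outputs of all firing rules."""
    m_rho = memberships(rho, 0.0, 60.0)
    m_phi = memberships(phi, 0.0, 180.0)
    num = den = 0.0
    for i, w_rho in enumerate(m_rho):
        for j, w_phi in enumerate(m_phi):
            w = min(w_rho, w_phi)  # firing strength of rule (i, j)
            num += w * THETA_D[i][j]
            den += w
    return num / den
```

At a position that coincides with the peaks of one ρ set and one ϕ set, only a single rule fires and the table entry is returned unchanged; between peaks, at most four rules contribute.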
Obstacle Block. This block modifies θd using an offset angle θf if there is any obstacle nearby. Four variables, namely, velocity Vr, direction Dr, distance dr, and position Pr (positive if the obstacle is in front, negative otherwise), all relative to the robot, are utilized to obtain θf in the presence of obstacles. These relative quantities are necessary to obtain the escape radius Rs so as to avoid any obstacle that is either stationary or moving, as shown in Fig. 5.37. The input variable membership functions are depicted in Fig. 5.38 and Fig. 5.39.
Fig. 5.35. Membership functions (NB, NM, NS, ZE, PS, PM, PB; vertical axis: membership value): (a) for ρ (horizontal axis: rho); (b) for ϕ (horizontal axis: phi)
Fig. 5.36. θd sampled at each input region in the vicinity of the ball at (0, 0)

Fig. 5.37. Obstacle avoidance scheme

The input, output and rules for the obstacle block are defined as follows:

1. Input space (Vr, Dr, dr, Pr), the velocity, direction, distance and position of an obstacle relative to the robot: Vr ∈ [-0.5, 1.5], Dr ∈ [0 deg., 180 deg.], dr ∈ [0 cm, 90 cm], Pr ∈ [-0.5, 1.5].
2. Output space θf, the offset angle to be added to θd to produce θd′ = θd + θf: θf ∈ [-180 deg., 180 deg.].
3. Rules: Fig. 5.40 shows the fuzzy logic control (FLC) for the obstacle block. As shown there, Vr and Dr are needed to obtain Rs, while Pr and dr are used to obtain the proportional gain wo, which is multiplied with θs to produce θf. θs is calculated with the relation

$$\theta_s = \tan^{-1}\!\left(\frac{R_s}{d_o}\right). \qquad (5.27)$$
Nine rules are obtained for Rs, with Vr and Dr each characterized by three linguistic values (NB, ZE, PB). Twelve rules are obtained for wo, with Pr characterized by three linguistic values (NE, ZE, PO) and dr characterized by four linguistic values (ZE, PS, PM, PB). The resultant rules for the obstacle block are listed in Table 5.4.
Fig. 5.38. Membership functions: (a) for Vr (relative velocity); (b) for Dr (relative direction). Both panels show the three sets NB, ZE, PB against the membership value.

Table 5.4. Rules for the obstacle block, FLC1 and FLC2

FLC1: Rs (rows: Dr; columns: Vr)

Dr \ Vr    NB    ZE    PB
NB         20    20    20
ZE         20    25    35
PB         20    30    40

FLC2: wo (rows: dr; columns: Pr)

dr \ Pr    NE     ZE     PO
ZE         0.8    1.0    1.0
PS         0.7    1.0    1.0
PM         0.6    0.9    1.0
PB         0.0    0.0    0.0
Fig. 5.39. Membership functions: (a) for dr (sets ZE, PS, PM, PB); (b) for Pr (sets NE, ZE, PO)
Fuzzy Posture Controller. In the overall structure of Fig. 5.33, the fuzzy posture controller block receives θd from the fuzzy planner block and the ρ part of the robot posture information from the vision processing system. It then generates the left-wheel and right-wheel velocities that make θ follow θd at non-zero linear speed before ρ diminishes. The posture controller is thus only concerned with directing the robot’s heading angle θ to follow θd at positive linear velocity. For this conventional problem of mobile robotics, the following heuristics are incorporated:
Fig. 5.40. FLC for obstacle block (FLC1 maps Vr and Dr to Rs, from which θs is obtained through tan⁻¹; FLC2 maps Pr and dr to wo; the output is θf = wo θs)
• If ρ big → VL, VR big.
• If |θe| = |θd − θ| big → |VL − VR| big.

The input variable membership functions are depicted in Fig. 5.41. The input, output, and rules for the posture controller block are defined as follows:

1. Input space (ρ, θe), posture error of the robot to the ball and the path: ρ ∈ [0 cm, 60 cm], θe ∈ [-120 deg., 120 deg.].
2. Output space (VL, VR), desired left-wheel and right-wheel velocities: VL, VR ∈ [-54 cm/s, 153 cm/s].
3. Rules: According to the above heuristics, 49 rules are acquired each for the left-wheel and right-wheel velocities. Table 5.5 is the rule table for the right-wheel speed VR, with ρ and θe each characterized by seven linguistic values (NB, NM, NS, ZE, PS, PM, PB). The left-wheel speed is symmetrical about the X-axis (i.e., ϕ = 0). In the table, one unit corresponds to 1.534 cm/s for the MiroSoT robot.

Table 5.5. Rules for right wheel VR (rows: ρ; columns: ϕ)

ρ \ ϕ    NB    NM    NS    ZE    PS    PM    PB
NB      -35   -25   -15    30    15    25    35
NM      -27     8    15    30    40    51    63
NS      -27     8    22    50    44    51    63
ZE       -3    18    35    60    65    61    67
PS       -3    31    57    90    82    68    67
PM       -3    31    67   100    92    68    67
PB       -3    42    67   100    92    77    67
You will have noticed that all the rule tables contain real-valued entries instead of linguistic values for the respective control variables, namely θd, Rs, wo, and VR. In the path planner tables for θd, Rs, and wo, the entries are crisp (defuzzified) values that are used directly as control input, without resorting to a defuzzification step such as Eq. (5.25). In the posture control table for VR, the singleton values have been determined in a heuristic and empirical manner; they need to be fine-tuned for better control performance. EP can be applied to tune the fuzzy posture controller, and we refer the reader to [40] for details.

Fig. 5.41. Membership functions: (a) for ρ (distance); (b) for θe (theta error). Both panels show the seven sets NB, NM, NS, ZE, PS, PM, PB against the membership value.
Notes on Selected References

On search algorithms, knowledge representation and learning from an agent’s perspective, the textbook [28] is a good source. To use Petri net theory for systems modelling, the book [51] should be consulted.

The book [52] is a good introduction to reinforcement learning. The paper [45] surveys, from a computer-science perspective, the field of reinforcement learning prior to 1996. On Q-learning, the work of Watkins and Dayan [53] is recommended in addition to the book [52]. As demonstrated in Section 5.4.3, the state space for Q-learning is quite large for a Small League MiroSoT team, and is set to increase with the number of robots. To extend Q-learning to a Middle League MiroSoT or NaroSoT team, the state explosion problem must be mitigated; the paper [54] presents a modular Q-learning approach that attempts to address this problem for multi-agent cooperation at the action level for a NaroSoT (5-robot) team.

There are a number of good textbooks on neural networks; the book [46] is a suitable reference for beginners.

The book [55] is the landmark publication for EP applications, although many other papers appeared earlier in the literature. In the book, finite state automata were evolved to predict symbol strings generated from Markov processes and non-stationary time series. Such evolutionary prediction was motivated by a recognition that prediction is a keystone to intelligent behavior (defined in terms of adaptive behavior, in that the intelligent organism must anticipate events in order to adapt behavior in light of an objective to achieve). Recent references on evolutionary programming include the book [56]. The book [57] is a self-contained volume of research papers covering both introductory material and selected advanced topics on the theory and application of evolutionary computation.

For an introductory course on fuzzy logic, refer to the textbook [58]. For a treatment of fuzzy control from a control-engineering perspective, the book [59] is recommended. A number of defuzzification methods more flexible than the ones introduced in Section 5.7.4 can be found in [60, 61, 62].
6. Robot Soccer System: Software Components and Programming
6.1 Introduction

The software complexity of a real-time robot soccer system calls for a structured development methodology and framework. In this chapter, we describe a host software model for the development of a robot soccer system. This model emphasizes modularity of design. An overview of the programming framework for robot soccer is then presented, in which a number of the robot soccer concepts described in earlier chapters are illustrated through example ‘C’ programs. These programs implement the key functions of a command-based robot soccer system for MiroSoT. An overview of the functions, categorized into basic and applied skills, is given below.

1. Basic skills. For a specified soccer robot,
   a) Velocity() sets its left-wheel and right-wheel velocity data;
   b) Angle() sets its desired left-wheel and right-wheel velocity data towards achieving a specified turning angle;
   c) Position() sets its left-wheel and right-wheel velocity data to move towards a specified position; and
   d) Shoot(), similar to Position(), additionally sets the desired angle at which the robot should arrive at the specified position (where the ball is). The purpose is to direct the robot to hit the ball in the intended direction.
2. Applied skills. These are more advanced functions that consider various strategic game situations. For a specified robot,
   a) Kick() implements a strategic process of ball kicking by pushing, as described by a state machine;
   b) Goalie() implements a strategic process of goalkeeping, as described by IF-THEN rules; and
   c) AvoidBound() implements an auxiliary strategy to prevent the robot from getting ‘stuck’ at the side-wall. The strategy is also described by IF-THEN rules.
J.-H. Kim, D.-H. Kim, Y.-J. Kim, K.-T. Seow: Soccer Robotics, STAR 11, pp. 205-256, 2004 Springer-Verlag Berlin Heidelberg 2004
Several game and robot navigation strategies are also suggested. For the overall game, a simple zone-defence strategy is explained and its ‘C’ program code is given. The univector field and limit-cycle navigation methods, studied in Section 4.6, are good alternative strategies for implementing the functions Kick() and Position(). The ‘C’ code of the essential component functions for implementing each method is also given and explained. To build a solid MiroSoT team, these example and generic codes could be modified or expanded to incorporate other ideas and techniques, many of which have been introduced in earlier chapters.
6.2 MiroSoT Host Software Model

6.2.1 Modular Software

The desired characteristics of a host software system for a MiroSoT team include fault tolerance and ease of development. Fault tolerance is desired to accommodate time-critical activities and ensure that the system remains stable even if deadlines are sometimes missed. To achieve these characteristics, a modular approach to building the host computer software is recommended. Modularity allows the task of software development to be broken along functional boundaries and assigned to different members of the programming team. It provides for ease of development and maintenance in that the inputs and outputs of each module may be specified, inspected, and debugged. It allows for the eventual migration of some parts of the software to the robots themselves without re-programming from scratch. Finally, it even allows modules to be restarted or replaced during a match without requiring a full reset.

6.2.2 Modular Design Rules

The robot soccer system is multi-tasking and asynchronous, so there is a potential for deadlocks, which must be avoided entirely. Thus, for a modular software design to be deadlock-free, a module should adhere to the following rules:

1. It should not have any loop in the module data-dependency path.
2. It only implements a single task with a single data dependency. For example, the vision module should only depend on the video frame grabber card and not also on the status of the communication port, so the communication driver and video interface must be in separate modules.
3. It should be independent of other modules to the greatest extent possible, in order to minimize dependencies and allow independent specification, implementation, and verification.
Based on these general rules, a modular software model is presented as shown in Fig. 6.1. It shows all the (software) modules comprising the MiroSoT host software, with the directional lines denoting message passing.
Fig. 6.1. Host software model for a MiroSoT team. Modules: user interface, vision, calibration, strategy, image data recorder, real-time monitor display, message transmission, game data recorder. Line types in the legend: data, user commands, critical data.
The key system modules are the vision, strategy and message transmission modules; they implement the functionalities of SENSE, DECIDE and ACT:Control, and ‘interface’ between ACT:Control and ACT:Actuation, respectively. As this is a real-time system, data is continually being generated. A module implementation should therefore adhere to the following rules necessary to achieve fault tolerance in real time:

1. It should be able to act only on the most current set of data and not waste time processing old data. This implies that the system as a whole should not have any queuing.
2. It should be able to function even if the module ‘listening’ to (i.e., receiving) its output becomes inoperative. In other words, sending its output data to an inoperative or nonexistent module will not cause it to stall, a feature called non-blocking write.
3. It does not re-transmit and must communicate with other modules using only atomic messages (i.e., messages that cannot be fragmented).
4. It needs to be able to tolerate the failure of a module that it receives its data set from, and has code to handle such events. For example, the message transmission module monitoring the command data from the strategy module would need to tell the team robots to halt if the strategy module fails to send the command data within an allotted time, to prevent these robots from crashing into the side walls.
5. It should perform the same function regardless of the number of listening robots, in order to ensure consistent output from running in simulation to running in a real environment.

Note that an exception to Rule 1 is the implementation of predictor(), an important subroutine of the strategy module. This is because the subroutine requires not only the current but also the last four positions of the ball, in order to predict the next ball position (see Fig. 5.8).
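Rules 1, 2 and 4 above can be illustrated with a one-slot mailbox: the writer always overwrites the single slot (no queue, no waiting for a reader), and the reader rejects data that is older than a timeout so that it can fall back to a safe action such as halting the robots. The sketch below is only an illustration under stated assumptions: the type `Mailbox`, the functions `mailbox_write` and `mailbox_read_fresh`, the 10-byte message size and the millisecond timestamps are all hypothetical, and a real multi-tasking host would additionally need the slot access to be atomic.

```c
#include <assert.h>
#include <stdbool.h>

/* One-slot mailbox: a new message overwrites the old one (no queue). */
typedef struct {
    unsigned char data[10];   /* one message, e.g. one command frame */
    unsigned long stamp_ms;   /* time at which the message was written */
    bool valid;               /* false until the first write */
} Mailbox;

/* Non-blocking write: never waits for, or depends on, a reader. */
void mailbox_write(Mailbox *mb, const unsigned char msg[10], unsigned long now_ms)
{
    assert(mb != 0 && msg != 0);
    for (int i = 0; i < 10; i++) mb->data[i] = msg[i];
    mb->stamp_ms = now_ms;
    mb->valid = true;
}

/* Copies the newest message into out and returns true if it is fresh.
 * Returns false if nothing was written yet or the writer has been silent
 * for longer than timeout_ms; the caller should then take a safe action. */
bool mailbox_read_fresh(const Mailbox *mb, unsigned char out[10],
                        unsigned long now_ms, unsigned long timeout_ms)
{
    assert(mb != 0 && out != 0);
    if (!mb->valid || now_ms - mb->stamp_ms > timeout_ms) return false;
    for (int i = 0; i < 10; i++) out[i] = mb->data[i];
    return true;
}
```

In this arrangement a message transmission module would call mailbox_read_fresh() once per cycle and send a halt command whenever it returns false, which is one possible reading of the example given in Rule 4.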
6.3 Programming Framework: An Overview

The vision, strategy and message transmission modules of the host software model of Fig. 6.1 are implemented as Find_Object(), My_Strategy(), and Send_Command(), respectively, in the program structure depicted in Fig. 6.2.

Fig. 6.2. Overall program structure (blocks: OnReady; Find_Object, My_Strategy, Send_Command)
My_Strategy() is developed to cover all the aspects (or modes) of the MiroSoT game, as shown in Fig. 6.3. The program code fragment for selecting a game mode is given as follows:

    void My_Strategy()
    {
        switch(m_nGameMode) {
        case GAME_STARTKICK:
            Kick_Off();        // Strategy for kick-off
            break;
        case GAME_PENALTY:
            Penalty_Kick();    // Strategy for penalty-kick
            break;
        . . .
        case GAME_NORMAL:
            Normal_Game();     // Strategy for game
            break;
        }
    }

Fig. 6.3. My_Strategy() (blocks: Kick_Off, Penalty_Kick, Free_Kick, Free_Ball, Goal_Kick, Normal_Game)

The game mode is selected via the system GUI, and the program invokes the corresponding strategy via the switch variable m_nGameMode. The key robot soccer functions/procedures needed for each mode of the game are listed in Table 6.1. They are grouped into basic and applied skill functions. The next two sections describe how they are implemented as program codes for the experimental robot soccer system; in this system, only the proportional (P) control law is used to compute the desired wheel velocities, and closed-loop (actuation) control is then implemented to achieve the commanded or desired wheel velocities in each team robot. The P gain values used in all the program codes are empirical values. The microprocessor-based hardware and firmware for a soccer robot as introduced in Chapter 2 are assumed.

The following are definitions of some programming constants, variables and arrays needed to understand the program code following the description of each robot soccer function:

1. Constants
   HOME1   : team robot ID 0.
   HOME2   : team robot ID 1.
   HGOALIE : team robot ID 2.
   M_PI    : π (3.14).
Table 6.1. Program functions for robot soccer system (MiroSoT category)

Mode                        Functions
Kick_Off()                  Velocity(), Position(), Goalie().
Penalty_Kick()              Kick(), Goalie().
Free_Kick()                 Position(), Goalie().
Goal_Kick()                 Kick(), Position().
Normal_Game() : Attack()    Position(), Kick().
              : Defend()    Position(), Kick().
              : Goalie()    Position(), Angle(), Velocity().
2. Variables
   whichrobot : robot ID.
   d_e        : error in distance between the robot’s current position (x, y) and the desired position (xd, yd).
   theta_d    : desired angle θd (in degrees).
   theta_e    : error in angle θe.
   vL         : desired (PWM-based) velocity data HL for the left wheel of the robot.
   vR         : desired (PWM-based) velocity data HR for the right wheel of the robot.
   Recall that data H ∈ [0, 255] is a PWM integer data (see Section 2.3.5, page 63).
3. Arrays and functions
   • PositionOfBall[0]: the x-coordinate value of the ball’s position.
   • PositionOfBall[1]: the y-coordinate value of the ball’s position.
   • AngleOfHomeRobot[whichrobot]: the heading angle θ (in degrees) of the robot with ID whichrobot.
   • PositionOfHomeRobot[whichrobot][0]: the x-coordinate value of the position of the robot with ID whichrobot.
   • PositionOfHomeRobot[whichrobot][1]: the y-coordinate value of the position of the robot with ID whichrobot.
   • atan2(arg1,arg2): tan⁻¹(arg1/arg2) (in radians).
6.4 Basic Skill Functions

6.4.1 Velocity()

To actuate the motors, PWM (Pulse Width Modulation) is used (see Section A.1, from page 273 onwards). As explained in Section 2.3.5 (from page 63), the velocity data H is PWM data sent by the host computer and converted to the actual PWM data W by the receiving robot for motor actuation. The required PWM data H for a desired wheel (rotational) velocity ω̄G can be computed using Eq. (2.39) for W and H = (1/4) W.

For a specified robot, the Velocity() function sets the H data vL and vR, given the respective ‘normalized’ velocity data vl and vr defined by

$$v_l = \frac{1}{2} A \, \frac{\bar{\omega}_{GL}}{\omega_{GL}|_{\max}}, \qquad v_r = \frac{1}{2} A \, \frac{\bar{\omega}_{GR}}{\omega_{GR}|_{\max}}, \qquad (6.1)$$

where ωGL and ωGR denote the rotational velocities of the left and right wheels, respectively: the overhead bar denotes an average and |max a maximum constant. Equivalently, we get

$$v_l = \frac{1}{2} A \, \frac{V_L}{V_L|_{\max}}, \qquad v_r = \frac{1}{2} A \, \frac{V_R}{V_R|_{\max}}, \qquad (6.2)$$

where VL|max and VR|max are constants denoting the maximum linear velocities of the left and right wheels, respectively. Motors of the same driving capacity are used, thus VL|max = VR|max = Vmax. Hence, the ‘PWM-version’ of Eq. (4.9) is

$$v_l = K_\nu \cdot \nu - K_\omega \cdot \omega, \qquad v_r = K_\nu \cdot \nu + K_\omega \cdot \omega, \qquad (6.3)$$

where

$$K_\nu = \frac{1}{2 V_{\max}} \cdot A \quad \text{and} \quad K_\omega = K_\nu \cdot \frac{L}{2}.$$
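A short numerical sketch may help to see how Eq. (6.3) turns a pair (ν, ω) into the normalized wheel data. The function name `wheel_cmd` and the struct `WheelCmd` are hypothetical, L is assumed to be the distance between the two wheels as in Eq. (4.9), and the numbers used in the note below are purely illustrative, not the parameters of the actual robot.

```c
#include <assert.h>

typedef struct { double vl, vr; } WheelCmd;

/* Eq. (6.3): normalized wheel data from the translational velocity nu and
 * the turning velocity omega (rad/s). a is the scale A, v_max the maximum
 * linear wheel velocity and wheel_base the distance L between the wheels
 * (nu, v_max and wheel_base share the same length unit). */
WheelCmd wheel_cmd(double nu, double omega, double a, double v_max,
                   double wheel_base)
{
    assert(v_max > 0.0);
    double k_nu = a / (2.0 * v_max);           /* K_nu = A / (2 Vmax) */
    double k_omega = k_nu * wheel_base / 2.0;  /* K_omega = K_nu * L / 2 */
    WheelCmd c = { k_nu * nu - k_omega * omega,
                   k_nu * nu + k_omega * omega };
    return c;
}
```

For instance, with the illustrative values A = 256, Vmax = 128 cm/s and L = 8 cm, the gains are Kν = 1 and Kω = 4; pure translation at ν = 50 cm/s then gives vl = vr = 50, while pure rotation at ω = 2 rad/s gives vl = −8 and vr = 8.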
As we will show later, the inputs vl and vr to Velocity() are generated by selecting and applying a control law for the translational velocity ν and/or turning velocity ω of the (centre of the) robot. Referring to Eq. (2.39), suppose we set A = (PR2) + 1 = 256, and for the left (L) and right (R) motors used, i ∈ {L, R},

$$z_{di} = 9 \ \text{(dead zone)}, \qquad \left| \frac{\bar{\omega}_{Gi}}{\omega_{Gi}|_{\max}} \right| \le 0.83 \ \text{(due to saturation)}.$$

Then applying Eq. (6.1),

$$v_l = 128 \, \frac{\bar{\omega}_{GL}}{\omega_{GL}|_{\max}} \in [-106, 106], \qquad v_r = 128 \, \frac{\bar{\omega}_{GR}}{\omega_{GR}|_{\max}} \in [-106, 106].$$

From Eq. (2.39) for the actual PWM data W and H = (1/4) W for vL and vR, the program code for Velocity() follows:

    void Velocity(int whichrobot, int vl, int vr)
    {
        // Max limits for backward (-ve) and forward (+ve) directions
        // in backward direction
        if( vl < -106 ) vl = -106;   // For left wheel
        if( vr < -106 ) vr = -106;   // For right wheel
        // in forward direction
        if( vl > 106 ) vl = 106;     // For left wheel
        if( vr > 106 ) vr = 106;     // For right wheel

        // Use ASCII code extensions 128-255
        // (representation for non-printable characters)
        if( vl >= 0 ) vL = 137 + vl; // For left wheel
        else          vL = 119 + vl;
        if( vr >= 0 ) vR = 137 + vr; // For right wheel
        else          vR = 119 + vr;

        // To avoid 0xAA (alternating byte)
        // used in communication protocol
        if(vR == 0xAA) vR = 0xAC;
        if(vL == 0xAA) vL = 0xAC;

        switch(whichrobot){                       // Specify which robot
        case HOME1:                               // When whichrobot = 0
            command[4] = (unsigned char)(vL);     // Left-wheel velocity
            command[5] = (unsigned char)(vR);     // Right-wheel velocity
            break;
        case HOME2:                               // When whichrobot = 1
            command[6] = (unsigned char)(vL);     // Left-wheel velocity
            command[7] = (unsigned char)(vR);     // Right-wheel velocity
            break;
        case HGOALIE:                             // When whichrobot = 2
            command[8] = (unsigned char)(vL);     // Left-wheel velocity
            command[9] = (unsigned char)(vR);     // Right-wheel velocity
            break;
        }
    }

In the code, the global array command[] holds 10 one-byte elements. The experimental host program uses this array to send data by IR communication to the team robots, in the message format shown in Fig. 2.35. The elements command[i], for i = 4, 5, ..., 9, store the respective velocity data H for the team robots. The other elements, for i = 0, ..., 3, store information in accordance with the IR communication protocol. The communication function used is the Send_Command() function given below:

    void Send_Command()
    {
        // Sends data in command array of size 10 bytes
        m_pComm.WriteCommBlock((LPSTR)command, 10);
    }

Note that the byte 0xAA is not included in the command[] array; the transmitter automatically adds these bytes during the actual transmission. Before Send_Command() can be used, a serial port must be selected; details of port selection are, however, omitted.

Velocity() is the most fundamental function. The next two basic skills are formulated in terms of this function.

6.4.2 Angle()

This function sets the desired velocity data vL and vR for a specified robot towards achieving a desired turning angle theta_d. Referring to Fig. 6.4, the x-distance and y-distance between the position (x, y) of the robot and the desired point are |dx| and |dy|, respectively; the desired angle theta_d is thus given by

$$\texttt{theta\_d} = \tan^{-1}\!\left(\frac{dy}{dx}\right).$$
Fig. 6.4. Angle or turning control (labels: desired point, dx, dy, θd, θh, θe = θd − θh)
For a specified robot, Angle() first uses a control law to set the ‘normalized’ velocity data vl and vr, given a desired angle theta_d, and then uses Velocity() to generate the velocity data vL and vR. The specified robot that receives the data will rotate its two wheels accordingly; for example, the robot depicted in Fig. 6.4 would rotate its left wheel forward and right wheel backward to face the desired direction. Consider proportional (P) control for ω. Then

$$\omega = K_{Pa} \cdot \theta_e, \qquad (6.4)$$

where KPa is the proportional gain. Because Angle() is concerned with turning motion only, ν = 0. Substituting Eq. (6.4) and ν = 0 into Eq. (6.3), we get the ‘PWM-version’ of the P control law to move the robot:

$$v_l = -k_a \cdot \theta_e, \qquad v_r = k_a \cdot \theta_e, \qquad (6.5)$$

where ka = Kω · KPa. The robot’s heading angle theta is obtained in real time from the array AngleOfHomeRobot[]. Using this array and Eq. (6.5), the program code for Angle() follows:

    #include <stdlib.h>   // abs()

    void Angle(int whichrobot, int theta_d)
    {
        // declare variables theta_e, vl, vr
        int theta_e, vl, vr;

        // calculate theta_e = theta_d - theta
        theta_e = theta_d - AngleOfHomeRobot[whichrobot];

        // keep theta_e within (-180, 180]
        while(theta_e > 180)   theta_e -= 360;
        while(theta_e <= -180) theta_e += 360;

        // switches heading directions of the robot
        if(theta_e < -90)     theta_e += 180;
        else if(theta_e > 90) theta_e -= 180;

        if(abs(theta_e) > 50) {             // |theta_e| > 50
            // Calculate normalized PWM data
            vl = (int)(-9./90.*(double)theta_e);   // for left wheel
            vr = (int)(9./90.*(double)theta_e);    // for right wheel
        }
        else if(abs(theta_e) > 20) {        // 20 < |theta_e| <= 50
            vl = (int)(-10./90.*(double)theta_e);
            vr = (int)(10./90.*(double)theta_e);
        }
        else {                              // 0 <= |theta_e| <= 20
            vl = (int)(-11./90.*(double)theta_e);
            vr = (int)(11./90.*(double)theta_e);
        }

        Velocity(whichrobot, vl, vr);       // Call Velocity function
    }

As shown in the above code, three ranges of angle error are handled, namely, greater than 50°, between 20° and 50°, and between 0° and 20°. This enables a different proportional gain ka to be applied in each range. It has been found that if the smaller proportional gain used for bigger angle errors is also used for smaller angle errors, the velocity data values sent to the robot are smaller than the required values. When called periodically with the same theta_d, the above program Angle() directs the specified robot to turn to the desired angle.

6.4.3 Position()

This function sets the velocity data vL and vR for a specified robot to move towards a desired position (x_d, y_d). For a specified robot, Position() first sets the ‘normalized’ velocity data vl and vr, given a desired position (x_d, y_d) to reach, and then uses Velocity() to generate the velocity data vL and vR.
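The angle-error handling in the Angle() code of Section 6.4.2, namely wrapping the error into (−180°, 180°] and then switching the heading direction when the error exceeds 90° in magnitude, can be isolated into two small helpers. The sketch below is one way to do so; the names `wrap_deg` and `fold_deg` and the `reversed` flag are hypothetical and do not appear in the experimental program.

```c
#include <assert.h>

/* Wrap an angle error in degrees into (-180, 180]. */
int wrap_deg(int e)
{
    while (e > 180)   e -= 360;
    while (e <= -180) e += 360;
    return e;
}

/* Fold an angle error into [-90, 90]. *reversed is set to 1 when the robot
 * should treat its rear as its front (i.e. drive backwards), and to 0
 * otherwise. */
int fold_deg(int e, int *reversed)
{
    assert(reversed != 0);
    e = wrap_deg(e);
    *reversed = 0;
    if (e < -90)     { e += 180; *reversed = 1; }
    else if (e > 90) { e -= 180; *reversed = 1; }
    return e;
}
```

Because the robot is symmetric front to back, an error of 135° is handled as an error of −45° with the heading reversed, so the robot never has to turn by more than 90°.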
Fig. 6.5. Position control (labels: desired point (xd, yd), robot position (x, y), dx, dy, θd, θh, θe)
Referring to Fig. 6.5, to reach the desired point (xd, yd) from the robot’s online position (x, y), the errors in distance de and angle θe to correct are

$$d_e = \sqrt{dx^2 + dy^2}, \qquad \theta_e = \theta_d - \theta.$$

Consider proportional (P) control for ν and ω. Then

$$\nu = K_{Pl} \cdot d_e, \qquad \omega = K_{Pa} \cdot \theta_e, \qquad (6.6)$$

where KPl and KPa are the proportional gains for ν and ω, respectively. Substituting Eq. (6.6) into Eq. (6.3), we get the ‘PWM-version’ of the P control law to move the robot:

$$v_l = k_l \cdot d_e - k_a \cdot \theta_e, \qquad v_r = k_l \cdot d_e + k_a \cdot \theta_e, \qquad (6.7)$$

where kl = Kν · KPl and ka = Kω · KPa. The x-coordinate and y-coordinate of the robot’s position are obtained from PositionOfHomeRobot[][0] and PositionOfHomeRobot[][1], respectively. Using these arrays, AngleOfHomeRobot[] and Eq. (6.7), the program code for Position() follows:

    #include <math.h>     // sqrt(), atan2()

    void Position(int whichrobot, double x_d, double y_d)
    {
        // declare the variables
        int theta_d = 0, theta_e = 0, vl, vr;
        double dx, dy, d_e, kl = 1.0, ka = 0.1;

        dx = x_d - PositionOfHomeRobot[whichrobot][0];
        dy = y_d - PositionOfHomeRobot[whichrobot][1];

        // calculate the distance error
        d_e = sqrt(dx*dx+dy*dy);

        // set kl and ka according to distance error
        if(d_e > 100)     { kl = 1.0; ka = 0.12;  }
        else if(d_e > 50) { kl = 1.2; ka = 0.125; }
        else if(d_e > 30) { kl = 1.4; ka = 0.13;  }
        else if(d_e > 20) { kl = 2.0; ka = 0.14;  }
        else if(d_e > 10) { kl = 3.0; ka = 0.16;  }
        else              { kl = 5.0; ka = 0.18;  }

        // calculate the desired angle
        if(dx==0 && dy==0) theta_d = 90;   // direction undefined at the target
        else theta_d = (int)(180/M_PI*atan2((double)(dy),(double)(dx)));

        // calculate theta_e = theta_d - theta
        theta_e = theta_d - AngleOfHomeRobot[whichrobot];

        // keep theta_e within (-180,180]
        // (must be done after theta_e has been calculated)
        while(theta_e > 180)   theta_e -= 360;
        while(theta_e <= -180) theta_e += 360;

        // switch heading direction of the robot
        if(theta_e < -90) {
            theta_e += 180;
            d_e = -d_e;
        }
        else if(theta_e > 90) {
            theta_e -= 180;
            d_e = -d_e;
        }

        // calculate normalized PWM values
        vl = (int)(kl*d_e - ka*(double)theta_e);   // for left wheel
        vr = (int)(kl*d_e + ka*(double)theta_e);   // for right wheel

        Velocity(whichrobot, vl, vr);   // Call the Velocity function
    }

When called periodically with the same (x_d, y_d), the above program Position() directs the specified robot to move to the desired position.
However, the code does not address two practical problems. First, as the specified robot approaches the desired point, it slows down and eventually stops there. If the desired point is the position of the ball, the robot stops when it reaches (or is close to) the ball; in other words, it cannot kick the ball. Second, as depicted in Fig. 6.6, when the angle error theta_e is at +90° or −90°, the robot experiences a swinging motion as it continually switches its heading between the conventional forward and backward directions. To elaborate, Position(), called when θe = +90°, will direct the robot to move forward (to the right) as shown in Fig. 6.6(a); but in so doing, the angle error θe increases marginally above 90° by the second call to Position(). In this case, Position() will switch the robot’s heading to the conventional backward direction, in the opposite direction, as indicated by the thick ‘Moving direction’ arrow. The robot then moves in the new heading direction (to the left) as shown in Fig. 6.6(b), causing the angle error θe to decrease marginally below −90° by the third call to Position(). This condition leads Position() to switch the robot’s heading back to the conventional forward direction, as indicated by the thick ‘Moving direction’ arrow. The cyclic pattern continues, resulting in an oscillatory motion of the robot. Assuming a stationary ball, this motion can continue for several cycles before the robot reaches the ball.

Fig. 6.6. Problem of oscillation about θe = ±90° with Position(): (a) when θe is marginally above 90°; (b) when θe is marginally below −90°
To overcome the first problem, apply proportional (P) control to ω only, i.e.,

$$v_l = V_c - k_a \cdot \theta_e, \qquad v_r = V_c + k_a \cdot \theta_e, \qquad (6.8)$$

but with Vc = Kν · ν set by the following exponential function:

$$V_c = v_o \left( \frac{1}{1 + e^{-\varepsilon_1 d_e}} - \varepsilon_2 \right), \qquad (6.9)$$

where vo, ε1 and ε2 are positive constants. A graph of Eq. (6.9) is shown in Fig. 6.7. The purpose of Eq. (6.9) is to keep the robot’s velocity ν at a certain positive limit when it reaches the desired position.

Fig. 6.7. Graph of Vc against de

To overcome the second problem (the oscillation problem), the code for Position() needs to be revised to switch the robot’s heading direction to the opposite direction only if the angle error θe is greater than (90° + β⁺) or less than −(90° + β⁻). In addition, under the condition θe ∈ [−(90° + β⁻), −(90° − β⁻)] or θe ∈ [(90° − β⁺), (90° + β⁺)], it needs to set the velocity data vl and vr for robot turning only, in order to exit this condition swiftly. Incorporating the aforementioned considerations with vo = 70, ε1 = 3 and ε2 = 0.3 for Eq. (6.9) and β⁺ = β⁻ = 5°, the program code for Position() is revised as follows:
    #include <math.h>     // sqrt(), atan2(), exp()
    #include <stdlib.h>   // abs()

    void Position(int whichrobot, double x_d, double y_d)
    {
        // declare the variables
        int theta_d = 0, theta_e = 0, vl, vr, vo = 70;
        double dx, dy, d_e, ka = 10.0/90.0;

        // calculate the distance error
        dx = x_d - PositionOfHomeRobot[whichrobot][0];
        dy = y_d - PositionOfHomeRobot[whichrobot][1];
        d_e = sqrt(dx*dx+dy*dy);

        // calculate the desired angle
        if(dx==0 && dy==0) theta_d = 90;
        else theta_d = (int)(180/M_PI*atan2((double)(dy),(double)(dx)));

        // calculate theta_e = theta_d - theta
        theta_e = theta_d - AngleOfHomeRobot[whichrobot];

        // keep theta_e within (-180,180]
        while(theta_e > 180)   theta_e -= 360;
        while(theta_e <= -180) theta_e += 360;

        // set the ka according to distance error
        if(d_e > 100)     ka = 17./90.;
        else if(d_e > 50) ka = 19./90.;
        else if(d_e > 30) ka = 21./90.;
        else if(d_e > 20) ka = 23./90.;
        else              ka = 25./90.;

        if(theta_e > 95 || theta_e < -95) {
            // switch robot's heading direction
            theta_e += 180;
            if(theta_e > 180) theta_e -= 360;
            if(theta_e < -80) theta_e = -80;

            // manage an exception
            if(d_e < 5.0 && abs(theta_e) < 40) ka = 0.1;

            // calculate vr and vl of robot
            vr = (int)(-vo*(1.0/(1.0+exp(-3.0*d_e))-0.3)+ka*theta_e);
            vl = (int)(-vo*(1.0/(1.0+exp(-3.0*d_e))-0.3)-ka*theta_e);
        }
        else if(theta_e < 85 && theta_e > -85) {
            // manage an exception
            if(d_e < 5.0 && abs(theta_e) < 40) ka = 0.1;

            // calculate vr and vl of robot
            vr = (int)(vo*(1.0/(1.0+exp(-3.0*d_e))-0.3)+ka*theta_e);
            vl = (int)(vo*(1.0/(1.0+exp(-3.0*d_e))-0.3)-ka*theta_e);
        }
        else {
            // if magnitude of angle error is within [90-5,90+5]
            // calculate vr and vl of robot (turning only)
            vr = (int)(+0.17*theta_e);
            vl = (int)(-0.17*theta_e);
        }

        Velocity(whichrobot, vl, vr);   // call the Velocity function
    }
With the revised code for Position(), the robot can continue moving once it has reached the desired position, rather than slowing down to a halt as in the previous code. Governed by Eq. (6.9), when the robot is far away from the desired position (say, the position of the ball), the velocity ν is set higher, to drive the robot more quickly towards the ball; it is set gradually lower as the robot approaches the ball, down to a specified limit of 0.2vo (for the selected parameter values vo = 70, ε1 = 3 and ε2 = 0.3 in Eq. (6.9)). This improves the accuracy with which the robot hits the ball, while the robot still arrives at a positive lowest-limit velocity. Although the program code has been tested to work reasonably well, no claim is made that it is the best. By adjusting the parameters in Eq. (6.9) for the Position() function, the human designer can increase the robot’s velocity ν when the robot is further from the ball, and decrease it more sharply as the robot reaches the ball. Additionally, if the robot is at an ideal position for shooting, the designer can set the lowest velocity limit higher for a stronger shot at goal. In general, the designer can always improve the program and experiment with different value settings for the parameters used.

6.4.4 Shoot()

When called periodically with the same (x_d, y_d), the Position() function directs the specified robot to move to the desired position. Notice, however, that the function does not consider the desired heading angle at the desired position (x_d, y_d). This consideration is important for the Shoot() function, which is also concerned with the desired angle at which the robot should come into contact with the ball at (x_d, y_d). The purpose is to direct the robot to shoot the ball in the intended direction.
Fig. 6.8. Different paths of the robot for shooting the ball in different desired directions (labels: Robot, Ball, Path 1, Path 2, Desired direction 1, Desired direction 2, angles θ1 and θ2)
To elaborate, consider Fig. 6.8. To shoot the ball in desired direction 1, the robot has to negotiate Path 1 and reach the ball’s position at a heading angle of (180° − θ1), but to do so in desired direction 2, it has to negotiate Path 2 and reach the ball’s position at a heading angle of −(180° − θ2). Referring to Fig. 6.9, given the position of the ball (xb, yb) (the desired position for the robot to pass through) and the position of the target point (xt, yt) (that the ball should pass through), the Shoot() function needs to compute the desired angle θd at its online posture $[x\ y\ \theta]^T$, using

\[ \theta_d = 2\phi_2 - \phi_1, \tag{6.10} \]

where

\[ \phi_1 = \tan^{-1}\frac{dy_1}{dx_1}, \qquad \phi_2 = \tan^{-1}\frac{dy_2}{dx_2} \]

and

\[ dx_1 = x_t - x_b, \quad dy_1 = y_t - y_b, \quad dx_2 = x_b - x, \quad dy_2 = y_b - y. \]

Note that φ1 ≤ 0 provided φ2 ≥ 0.
Fig. 6.9. Geometric relationships for calculating the robot’s desired heading angle θd at its online position (x, y) (labels: Robot, Ball, Target Point, Opponent goal, Desired heading direction; angles θ, θd, θe, φ1, φ2; distances dx1, dy1, dx2, dy2; axes X, Y)
The angle error θe at the robot’s online position (x, y) is θe = θd − θ. Keeping the translational velocity ν constant, and applying proportional (P) control for ω, we have

\[ \omega = K_{Pa} \cdot \theta_e, \tag{6.11} \]

where KPa is the proportional gain. Substituting Eq. (6.11) into Eq. (6.3), we get the ‘PWM-version’ of the P control law to move the robot as follows:

\[ v_l = V_c - k_a \cdot \theta_e, \qquad v_r = V_c + k_a \cdot \theta_e, \tag{6.12} \]

where ka = Kω · KPa, and Vc = kν · ν can be governed by Eq. (6.9) or a constant. For a specified robot, given the ball’s position (x_b, y_b) and a target point (x_t, y_t), and applying Eq. (6.12), the program code for Shoot() follows:
    void Shoot(int whichrobot, double x_b, double y_b, double x_t, double y_t)
    {
        // declare the variables
        double dx1, dy1, dx2, dy2, Vc = 50.0, ka = 0.22;
        int phi1, phi2, theta_d, vl, vr, theta_e;

        dx1 = x_t - x_b; // calculate dx1
        dy1 = y_t - y_b; // calculate dy1
        // calculate phi1
        if(dx1 == 0 && dy1 == 0) phi1 = 90;
        else phi1 = (int)(180/M_PI*atan2((double)(dy1), (double)(dx1)));
        // calculate dx2
        dx2 = x_b - PositionOfHomeRobot[whichrobot][0];
        // calculate dy2
        dy2 = y_b - PositionOfHomeRobot[whichrobot][1];
        // calculate phi2
        if(dx2 == 0 && dy2 == 0) phi2 = 90;
        else phi2 = (int)(180/M_PI*atan2((double)(dy2), (double)(dx2)));
        // calculate desired angle
        theta_d = 2*phi2 - phi1;
        // calculate the angle error
        theta_e = theta_d - AngleOfHomeRobot[whichrobot];
        // keep theta_e within (-180,180]
        while(theta_e > 180) theta_e -= 360;
        while(theta_e <= -180) theta_e += 360;
        // switch heading direction of robot
        if(theta_e < -90){
            theta_e += 180;
            Vc = -Vc;
        }
        else if(theta_e > 90) {
            theta_e -= 180;
            Vc = -Vc;
        }
        // set the ka according to theta_e
        if(abs(theta_e) > 50) ka = 0.16;
        else if(abs(theta_e) > 40) ka = 0.18;
        else if(abs(theta_e) > 30) ka = 0.2;
        else if(abs(theta_e) > 20) ka = 0.22;
        else ka = 0.24;
        // calculate normalized PWM data
        vl = (int)(Vc - ka*(double)theta_e); // for left wheel
        vr = (int)(Vc + ka*(double)theta_e); // for right wheel
        Velocity(whichrobot, vl, vr); // call the Velocity function
    }

When called periodically with the same position inputs, the above program Shoot() should direct a specified robot to bump against the ball at the desired heading angle. However, the code does not address the technical problem of robot chattering. Ideally, Shoot() should direct the specified robot at high velocity to negotiate any bend smoothly, and then approach the ball along a straight line. But due to ‘overshoot’ in negotiating the bend at high speed, there is subsequently a continual angle-error correction in the robot’s heading as it approaches the ball, resulting in a chattering motion trajectory as depicted in Fig. 6.10. To elaborate, when the fast-moving robot ‘cuts’ the straight line from below, Shoot() would set the robot’s desired heading angle θd to negative. But by the next call to Shoot(), the fast-moving robot would have ‘cut’ the line from above, so Shoot() would set the robot’s desired heading angle θd to positive. This cyclic pattern continues rapidly, resulting in the robot chattering as it approaches the ball.
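The angle handling inside Shoot() can be looked at in isolation. The sketch below, with the hypothetical helper names wrap_deg() and fold_heading_error(), mirrors the two steps used in the listing: wrapping the angle error into (−180°, 180°], and then folding it into [−90°, 90°] while reversing the sign of Vc, so that the robot drives backwards rather than turning through more than a quarter turn. It is an illustration under these assumptions, not a replacement for the book's code.

```c
/* Wrap an angle in degrees into (-180, 180]. */
int wrap_deg(int a)
{
    while (a > 180)   a -= 360;
    while (a <= -180) a += 360;
    return a;
}

/* Fold the angle error into [-90, 90]. When folding is needed, *vc is
 * negated, i.e. the robot is asked to drive in reverse instead of
 * turning around. Returns the folded error in degrees. */
int fold_heading_error(int theta_e, double *vc)
{
    theta_e = wrap_deg(theta_e);
    if (theta_e < -90) {
        theta_e += 180;
        *vc = -*vc;
    } else if (theta_e > 90) {
        theta_e -= 180;
        *vc = -*vc;
    }
    return theta_e;
}
```

For instance, an error of 170° with Vc = 50 becomes an error of −10° with Vc = −50. Because the sign of the folded error flips each time the robot crosses the approach line, this small residual error is exactly the quantity that alternates in the chattering pattern described above.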
Fig. 6.10. Problem of chattering with Shoot() (labels: Robot, Ball)
Other techniques, such as the two methods detailed in Section 4.6, are suitable for implementing the Shoot() function. In particular, we highlight a modification of the univector navigation method presented in Section 4.6.2 for addressing the chattering problem, as follows. Shown in Fig. 6.11 is a modified univector field for attaining a target posture (i.e., a desired heading angle for the robot at the position of the ball). The area altered has all unit vectors at 0 degrees (i.e., pointing horizontally to the right, where the ball is). This differs from the original field, shown in Fig. 4.19, which has all unit vectors in the same area pointing at either positive or negative angles, constituting a vector flow that converges to the line of horizontal unit vectors passing through point g (the ball’s position in Fig. 6.11) and point r. Thus, in the original field, chattering can occur when the robot turns into the said area at high speed. But any unit vector that the robot ‘latches’ onto upon entering the modified area can direct the robot straight towards the ball, eliminating the chattering problem. In this modified univector field navigation approach, the setting for the width ∆ of the modified area should, ideally, not exceed the robot’s width L; this is so that the robot can hit the ball with the intended impact direction, which goes through the centre of the ball.
Fig. 6.11. A modified univector field for solving the problem of chattering (figure labels: 0.5∆, 0.5∆)
6.5 Applied Skill Functions

The following applied robot soccer functions are formulated in terms of the basic skill functions studied in the preceding section, via a structure. For Kick(), the structure is realized by a state machine. For Goalie() and AvoidBound(), the structure is realized by IF-THEN rules.

6.5.1 Kick()

This function uses Position() and Angle() to implement the process of ball kicking by a specified robot. This function is an alternative to the Shoot() function. In the following, the pseudocode for Kick() is first given. We then explain how this function can strategically direct the specified robot to kick the ball towards the opponent goal, taking the borders (or side-walls) of the playground into consideration.

Overall Pseudocode.

    void Kick(int whichrobot)
    {
        static int flag;

        // kicking direction
        Set the kicking direction to the goal;
        Near the playground border, change the direction to avoid collision;

        switch(flag){
        case 0: // state S0
            Go behind the ball (i.e., into ‘shootable area’) by using Position();
            If the robot is near the specified shooting position,
                switch to state S1 by setting flag = 1;
            break;
        case 1: // state S1
            Turn to face the ball by using Angle();
            If the robot’s direction is towards the ball,
                switch to state S2 by setting flag = 2;
            break;
        case 2: // state S2
            Kick the ball by using Position();
            If the robot is out of the shootable area,
                switch to state S0 by setting flag = 0;
            break;
        }
    }
The structure of the Kick() pseudocode is defined by the program structure of Fig. 6.12(a), which can be naturally described by a state machine, represented as a (self-explanatory) state graph in Fig. 6.12(b).
(a) Program structure: a switch(flag) statement with one case per state; in state Si the program does action Ai and, if condition Ci holds, sets flag to the next state.
(b) State machine: S0 → S1 when C0 holds, S1 → S2 when C1 holds, S2 → S0 when C2 holds.

State                      Action                       Condition
S0: Far from the ball      A0: Move behind the ball     C0: When robot is near the specified shooting position behind the ball
S1: Behind the ball        A1: Turn to face the ball    C1: When robot’s direction is towards the ball
S2: Kicking the ball       A2: Kick the ball            C2: When robot is out of shootable area

Fig. 6.12. A strategy for the Kick() function
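The state graph of Fig. 6.12 can be written down as a pure transition function, which is convenient for checking the switching logic without a robot or vision system. In the sketch below, the type KickState, its enumerator names and the function kick_next_state() are hypothetical; the conditions C0, C1 and C2 are assumed to have been evaluated elsewhere and are passed in as truth values.

```c
/* States of the Kick() strategy (Fig. 6.12); names are illustrative. */
typedef enum { S0_FAR = 0, S1_BEHIND = 1, S2_KICKING = 2 } KickState;

/* Return the next state, given the current state and the truth values
 * of conditions C0, C1 and C2 for this control cycle. */
KickState kick_next_state(KickState s, int c0, int c1, int c2)
{
    switch (s) {
    case S0_FAR:     return c0 ? S1_BEHIND  : S0_FAR;
    case S1_BEHIND:  return c1 ? S2_KICKING : S1_BEHIND;
    case S2_KICKING: return c2 ? S0_FAR     : S2_KICKING;
    }
    return S0_FAR; /* defensive default for an invalid state value */
}
```

Only the condition belonging to the current state matters: in state S0, for example, a true C2 has no effect. This matches the pseudocode, where each case tests a single condition before its break.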
We now present the code, with explanations, for the various sections of the Kick() program.

Set Kicking Direction. In this code section, ideally, the function should direct the specified robot to kick the ball towards the opponent goal, but to prevent the robot from colliding with any playground border or side-wall, or miskicking towards its own team goal, the kicking direction has to be planned according to the position of the ball. The plan considered for this example program is shown in Fig. 6.13, where the direction of ball kick has to be aimed towards the goal if the ball is in subarea A, but otherwise modified as depicted in the figure.
Fig. 6.13. Mapping playground subareas to desired directions of ball kick (labels: subareas A, B, B′, C, C′, D, D′; Team goal; Opponent goal; axes X, Y)
The code for this section follows:

    // calculate the angle of the vector from the ball to
    // the center of the opponent goal
    theta_d = 180/M_PI*atan2(130./2.-PositionOfBall[1], 150-PositionOfBall[0]);
    // if the ball is in area C or C’
    if(PositionOfBall[0]
    // if the ball is in area B or B’
    else if (PositionOfBall[1]

\[ \mathrm{pos}[0] = x_b - dx, \qquad \mathrm{pos}[1] = y_b - dy, \tag{6.13} \]

where dx = Dh cos θd and dy = Dh sin θd. The constant Dh is a distance that needs to be set by the designer to determine what he/she deems as a good shooting position (pos[0], pos[1]) for the robot to move to. To prepare the same robot for an appropriate action in the next call to Kick(), the function checks if the robot is already near the desired point (pos[0], pos[1]); and if so, switches to state S1. Referring to Fig. 6.15, the robot is deemed to be near the desired point if the C0-condition given by

\[ \sqrt{dx' \times dx' + dy' \times dy'} < D_c \tag{6.14} \]

is true, where dx′ and dy′ are the coordinate differences between the desired point and the robot’s position, and the constant Dc is a distance to be set by the designer. The code for this section follows:

    case 0:
    // get the position behind the ball
    pos[0] = PositionOfBall[0] - D_h*cos(theta_d*M_PI/180.);
    pos[1] = PositionOfBall[1] - D_h*sin(theta_d*M_PI/180.);
Fig. 6.15 (figure labels: Dc, dx′, dy′)
    // go to the position behind the ball
    Position(whichrobot, pos[0], pos[1]);
    // calculate the distance
    tmp = sqrt( (pos[0]-PositionOfHomeRobot[whichrobot][0])
               *(pos[0]-PositionOfHomeRobot[whichrobot][0])
              + (pos[1]-PositionOfHomeRobot[whichrobot][1])
               *(pos[1]-PositionOfHomeRobot[whichrobot][1]) );
    // check the state change condition
    if (tmp

\[ |\theta_e| < A_d \tag{6.15} \]

is true, i.e., the magnitude of the angle error θe between the robot’s heading angle and the desired angle θd at which the robot would directly face the ball is less than some angle constant Ad set by the designer. The code for this section follows:

    case 1:
    // rotate to the goal
    Angle(whichrobot, (int) theta_d);
    // calculate the angle error theta_e
    tmp = theta_d - AngleOfHomeRobot[whichrobot];
    // keep theta_e (held in variable tmp)
    // within (-180,180]
    while(tmp>180) tmp -= 360;
    while(tmp<=-180) tmp += 360;
    // switch heading direction of the robot
    if(tmp>90) tmp -= 180;
    else if (tmp<-90) tmp += 180;
Fig. 6.16. In state S1: to switch to state S2 if the angle error is less than a specified value Ad. Note that in this figure, the illustrated C1-condition ‘|θe| < Ad’ is false. (labels: Robot, Ball, θe, Ad)
    // check the state change condition
    if (fabs(tmp)

\[ \mathrm{pos}[0] = x_b + dx, \qquad \mathrm{pos}[1] = y_b + dy, \tag{6.16} \]

where dx = Dv cos θd and dy = Dv sin θd. The constant Dv is a distance to be set by the designer; it provides the flexibility of adjusting the robot’s velocity upon arriving at the ball’s position. It helps to determine what the designer deems as an appropriate position (pos[0], pos[1]) for the robot to reach in moving through the ball’s position at a certain positive velocity, thereby kicking the ball.

Fig. 6.17. State S2: the desired point (pos[0], pos[1]) the specified robot should reach in order to move through the ball’s position at a certain positive velocity.

To prepare the same robot for an appropriate action in the next call to Kick(), the function checks if the robot is already out of the shootable area; and if so, switches to state S0. Referring to Fig. 6.18, the robot is deemed to be out of the shootable area if the C2-condition given by

\[ \sqrt{(x_b - x)^2 + (y_b - y)^2} > D_f \quad \text{or} \quad |\theta_d - \angle\overrightarrow{bc}| < 90^\circ \tag{6.17} \]

is true, i.e., the distance between the robot at position (x, y) (denoted as point c) and the ball at position (xb, yb) (denoted as point b) is longer than some constant Df set by the designer, or the ball is behind the robot, i.e., $|\theta_d - \angle\overrightarrow{bc}| < 90^\circ$, when $\angle\overrightarrow{bc}$ and $(\theta_d - \angle\overrightarrow{bc})$ are kept in the range (−180°, 180°]. The code for this section follows:

    case 2:
    // get the position in front of the ball
    pos[0] = PositionOfBall[0] + D_v*cos(theta_d*M_PI/180.);
    pos[1] = PositionOfBall[1] + D_v*sin(theta_d*M_PI/180.);
    // kick the ball
    Position(whichrobot, pos[0], pos[1]);
Fig. 6.18. In state S2: to switch to state S0 if the robot is at more than a distance of Df from the ball, or the ball is behind the robot. (labels: θd, (xb, yb), Df)
    // calculate the distance between the ball and the robot
    tmp = sqrt( (PositionOfBall[0]-PositionOfHomeRobot[whichrobot][0])
               *(PositionOfBall[0]-PositionOfHomeRobot[whichrobot][0])
              + (PositionOfBall[1]-PositionOfHomeRobot[whichrobot][1])
               *(PositionOfBall[1]-PositionOfHomeRobot[whichrobot][1]) );
    // calculate the angle between the desired angle (towards goal)
    // and the vector from the ball’s position to the robot’s
    tmp2 = theta_d - 180./M_PI*atan2(
        (double)(PositionOfHomeRobot[whichrobot][1]-PositionOfBall[1]),
        (double)(PositionOfHomeRobot[whichrobot][0]-PositionOfBall[0]) );
    // keep angle difference (held in variable tmp2) within (-180,180]
    while (tmp2>180.) tmp2 -= 360.;
    while (tmp2<=-180.) tmp2 += 360.;
    // check the state change condition
    if(tmp>D_f || fabs(tmp2)<90.)
        flag = 0; // switch to state S0
    break;

Tuning for Team Performance and Program Extension. In general, the constants Db, Dh, Dc, Ad, Dv, and Df need to be ‘tuned’ and set by the designer for robot team performance. As a final remark, the Kick() example program described herein applies under the NormalGame mode. It has to be extended to enable a specified robot to take a goal kick, penalty kick and ‘free’ ball.
6.5.2 Goalie() This function uses Position() and Angle() to direct the goalkeeping robot to block the ball.
Fig. 6.19. Areas of surveillance for the Goalie() function (labels: Team goal, Goalie, Ball; Near, Middle and Far areas)
The following strategy is suggested for the Goalie() function:

1. Always direct the goalkeeping robot to move along a line parallel to the goal line.
2. Divide the playground into three areas of surveillance, namely, far-distance area, middle-distance area and near-distance area, as depicted in Fig. 6.19. Direct the goalkeeping robot to move to position (estimate_x, estimate_y) computed according to the area the moving ball is in, as follows; when the ball is
   a) in far-distance area, direct the goalkeeping robot to move to position (G_OFFSET, 65.0 − ∆y) (see Fig. 6.20);
   b) in middle-distance area, direct the goalkeeping robot to move to the y-coordinate of the ball, i.e., position (G_OFFSET, PositionOfBall[1]) (see Fig. 6.21);
   c) in near-distance area, direct the goalkeeping robot to move to the position of the ball, i.e., (PositionOfBall[0], PositionOfBall[1]) (see Fig. 6.22), but stay within the goal area bounded as depicted in Fig. 6.23.
3. Perform exception handling to direct the goalkeeping robot to turn away from any side-wall it comes into contact with. The basic problem is that when the robot comes into contact with a side-wall, directing it to turn immediately away towards a desired position (usually the ball’s position) is not possible when the wall is obstructing ‘on the spot’ turning of the robot, or we say the robot is ‘stuck at the wall’. This happens when the (smaller) angle between the line along the robot’s heading direction and the normal to the side-wall it comes into contact with is within a certain (maximum) angle bound Ab set. In any such situation, the robot would need to be turned away from the wall first at a suitable non-zero translational velocity. Exception handling is done to prevent the robot from getting stuck at the wall in various situations, by applying the P control law of Eq. (6.12) with appropriate values for Vc and ka. Five exception-situations depicted in Fig. 6.24 are considered for the goalkeeping robot. In the figure and program code that follows later, angle bound Ab is denoted by G_ANGLE_BOUND. To elaborate, consider exception-situations 2 and 3 in Fig. 6.24. Set angle bound Ab = 60°. In the latter situation, the goalkeeping robot is facing rightward at a heading angle of between −60° and 60°, and should be directed to make a left turn towards the goalie area. In the former situation, the robot is heading leftward at a heading angle of between 120° and 240°, and should be directed to make a reverse left turn towards the goalie area. In the next section, a similar problem for the other team robots will be addressed, using an AvoidBound() function.

Fig. 6.20. Desired position (estimate_x, estimate_y) of goalkeeping robot when the ball is in far-distance area, with ∆y = (dy/dx) · ∆x, where dy = PositionOfBall[1] − 130/2 and dx = PositionOfBall[0] − 15.

Fig. 6.21. Desired position (estimate_x, estimate_y) of goalkeeping robot when the ball is in middle-distance area

Fig. 6.22. Desired position (estimate_x, estimate_y) of goalkeeping robot when the ball is in near-distance area

Fig. 6.23. The two Y-coordinate boundaries of the goalkeeping robot

The following program code for Goalie() implements the suggested strategy:

    #define G_OFFSET      6.0  // offset value (in cm) of x-coordinate for goalie’s path
    #define G_ANGLE_BOUND 60   // maximum angle (in deg.) for turning away from side-wall
    #define G_BOUND       8.0  // maximum distance set for exception handling
    #define G_LENGTH      60   // length of goal area

    void Goalie(int whichrobot)
    {
        double estimate_x, estimate_y, dx, dy; // declare variables
        int theta_d, theta_e, dir=1, m_theta;

        G-BLOCK1; // calculate the desired position for goalie
        G-BLOCK2; // do posture control for goalie
        G-BLOCK3; // do exception routine for goalie
Fig. 6.24. Numbered exception-situations and the desired turning directions for the goalkeeping robot (labels: situations 2, 3, 4 and 5; G_BOUND; G_ANGLE_BOUND)
    }

The ‘labels’ G-BLOCK1, G-BLOCK2 and G-BLOCK3 correspond to the respective steps of the strategy, and are to be replaced by the respective codes given below.

Program code for G-BLOCK1

    if (PositionOfBall[0]>70.0){ // far-distance case
        estimate_x = G_OFFSET;
        estimate_y = 130.0/2.0 - ( (130.0/2.0 - PositionOfBall[1])
                                   *G_OFFSET/(PositionOfBall[0]-15.0) );
    }
    else if (PositionOfBall[0]>25.0){ // middle-distance case
        estimate_x = G_OFFSET;
        estimate_y = PositionOfBall[1];
    }
    else { // near-distance case
        estimate_x = PositionOfBall[0];
        estimate_y = PositionOfBall[1];
    }
    // Y-boundary constraint: upper and lower
    if (estimate_y>(65.0+G_LENGTH/2.0)) estimate_y=65.0+G_LENGTH/2.0;
    if (estimate_y<(65.0-G_LENGTH/2.0)) estimate_y=65.0-G_LENGTH/2.0;
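The target computation of G-BLOCK1 can be exercised off-line by wrapping it in a function that takes the ball position as arguments. The sketch below does this under the assumption that the constants have the values given above; the function name goalie_target() and the output-pointer interface are hypothetical and not part of the book's framework.

```c
#define G_OFFSET 6.0   /* x-coordinate (cm) of the goalie's path      */
#define G_LENGTH 60.0  /* length (cm) of the goal area along the Y axis */

/* Compute the goalie's desired position (*ex, *ey) from the ball
 * position (bx, by), following the three-area rule of G-BLOCK1. */
void goalie_target(double bx, double by, double *ex, double *ey)
{
    if (bx > 70.0) {            /* far-distance case */
        *ex = G_OFFSET;
        *ey = 65.0 - (65.0 - by) * G_OFFSET / (bx - 15.0);
    } else if (bx > 25.0) {     /* middle-distance case */
        *ex = G_OFFSET;
        *ey = by;
    } else {                    /* near-distance case */
        *ex = bx;
        *ey = by;
    }
    /* Y-boundary constraint: upper and lower */
    if (*ey > 65.0 + G_LENGTH / 2.0) *ey = 65.0 + G_LENGTH / 2.0;
    if (*ey < 65.0 - G_LENGTH / 2.0) *ey = 65.0 - G_LENGTH / 2.0;
}
```

For a ball at (115, 115) in the far-distance area, the sketch gives the target (6, 68): the goalie shifts only 3 cm towards the ball's side. For a ball at (50, 120) in the middle-distance area, the ball's y-coordinate is clipped to the upper boundary, giving (6, 95).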
Program code for G-BLOCK2

    dx = estimate_x - PositionOfHomeRobot[whichrobot][0];
    dy = estimate_y - PositionOfHomeRobot[whichrobot][1];
    // function for goalie motion
    Position(whichrobot, estimate_x, estimate_y);
    if (dx==0 && dy==0) theta_d = 90;
    else theta_d = (int)(180/M_PI*atan2((double)(dy),(double)(dx)));
    while(theta_d > 180) theta_d -= 360, dir = -1;
    while(theta_d <= -180) theta_d += 360, dir = -1;
    theta_e = theta_d - AngleOfHomeRobot[whichrobot];
    while(theta_e > 180) theta_e -= 360;
    while(theta_e <= -180) theta_e += 360;
    // if distance error between goalie and desired point
    // is less than specified tolerance of 2cm, robot is
    // assumed to have reached it and so Angle() is called
    // to align its heading parallel to the goal line,
    // the assumed heading in the next call to Goalie()
    if( (dx*dx+dy*dy < 2.0*2.0) )
        Angle(whichrobot, dir*90);
Program code for G-BLOCK3 while(AngleOfHomeRobot[whichrobot] >180) AngleOfHomeRobot[whichrobot] -= 360; while(AngleOfHomeRobot[whichrobot] <-180) AngleOfHomeRobot[whichrobot] += 360; // exception handling for goalie if(PositionOfHomeRobot[whichrobot][0] < 0.0) // exception-situation 1 Position(whichrobot, G_OFFSET, 65.0); else { m_theta = abs(AngleOfHomeRobot[whichrobot]); if(PositionOfHomeRobot[whichrobot][1]<65.0){ if(m_theta>180 - G_ANGLE_BOUND && PositionOfHomeRobot[whichrobot][0]<0.0+G_BOUND) // exception-situation 2 Velocity(whichrobot, -12, -1); else if(m_theta180-G_ANGLE_BOUND && PositionOfHomeRobot[whichrobot][0]<0.0+G_BOUND) // exception-situation 4 Velocity(whichrobot, -1, -12); else if (m_theta
6.5.3 AvoidBound()

This function is similar to the exception handling routine that implements Step 3 of the suggested strategy for Goalie(), in that it prevents the specified team robot from getting ‘stuck’ at the side-wall of the playground, by directing the robot to turn away from the wall at a suitable non-zero translational velocity. But it considers different exception-situations depicted in Fig. 6.25. It handles these situations by applying the P control law of Eq. (6.12) with appropriate values for Vc and ka as set by the designer. In the figure and program code that follows later, angle bound Ab is denoted by ANGLE_BOUND. To elaborate, consider exception-situation B1 in Fig. 6.25. Set angle bound Ab = 60°. In this situation, the robot is heading towards the bottom wall at a heading angle of between −150° and −30°, and should be directed to make a reverse turn, away from the wall.
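The P control law of Eq. (6.12), as it is applied in these exception-situations, can be sketched as a small helper that converts Vc, ka and the angle error into the two wheel commands. The function name p_control_wheels() is hypothetical; the truncation to int follows the casts used in the book's listings.

```c
/* Wheel commands from the P control law of Eq. (6.12):
 *   vl = Vc - ka * theta_e,  vr = Vc + ka * theta_e.
 * A negative vc makes the robot turn while reversing. */
void p_control_wheels(double vc, double ka, int theta_e, int *vl, int *vr)
{
    *vl = (int)(vc - ka * (double)theta_e); /* left wheel  */
    *vr = (int)(vc + ka * (double)theta_e); /* right wheel */
}
```

For example, Vc = 50, ka = 0.25 and θe = 20° give vl = 45 and vr = 55, so the right wheel runs faster and the robot turns left while moving forward. Choosing a negative Vc, as is done for a robot facing a wall, produces the reverse turn described above.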
Fig. 6.25. Labelled exception-situations characterized by specified bounds (DISTANCE_BOUND, ANGLE_BOUND) and wall locations, namely top and bottom walls, left and right walls (labels: T1, T2, B1, B2, L1 to L4, R1 to R4)
The program code for AvoidBound() follows:

    #define ANGLE_BOUND 60
    #define DISTANCE_BOUND 9

    void AvoidBound(int whichrobot)
    {
        A-BLOCK1; // for top and bottom walls
        A-BLOCK2; // for left wall
        A-BLOCK3; // for right wall
    }

The ‘labels’ A-BLOCK1, A-BLOCK2 and A-BLOCK3 are to be replaced by the respective codes given below.

Program code for A-BLOCK1

    #define ANGLE_BOUND 60    // in deg.
    #define DISTANCE_BOUND 9  // in cm
    int theta, m_theta, theta_e, theta_d, vl, vr;
    double dx, dy; // declare variables

    dx = PositionOfBall[0]-PositionOfHomeRobot[whichrobot][0];
    dy = PositionOfBall[1]-PositionOfHomeRobot[whichrobot][1];
    if(dx==0 && dy==0) theta_d = 90;
    else theta_d = (int)(180/M_PI*atan2((double)(dy),(double)(dx)));
    theta = AngleOfHomeRobot[whichrobot];
    theta_e = theta_d - AngleOfHomeRobot[whichrobot];
    while (theta>180) theta -= 360;
    while (theta<-180) theta += 360;
    while (theta_e>180) theta_e -= 360;
    while (theta_e<-180) theta_e += 360;
    // for top and bottom walls
    if (theta>-90-ANGLE_BOUND && theta<-90+ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][1]>130.0-DISTANCE_BOUND) {
        // top and bottom: case T1
        vr = (int)(14+19.0/90.0*theta_e);
        vl = (int)(14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
    else if (theta>90-ANGLE_BOUND && theta<90+ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][1]>130.0-DISTANCE_BOUND) {
        // top and bottom: case T2
        vr = (int)(-14+19.0/90.0*theta_e);
        vl = (int)(-14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
    else if (theta>-90-ANGLE_BOUND && theta<-90+ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][1]<0.0+DISTANCE_BOUND) {
        // top and bottom: case B1
        vr = (int)(-14+19.0/90.0*theta_e);
        vl = (int)(-14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
    else if (theta>90-ANGLE_BOUND && theta<90+ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][1]<0.0+DISTANCE_BOUND) {
        // top and bottom: case B2
        vr = (int)(14+19.0/90.0*theta_e);
        vl = (int)(14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
Program code for A-BLOCK2

    // for left side-wall
    m_theta = abs(theta);
    if (m_theta < ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][0]<0.0+DISTANCE_BOUND
        && PositionOfHomeRobot[whichrobot][1]>85.0) {
        // left wall: case L1
        vr = (int)(14+19.0/90.0*theta_e);
        vl = (int)(14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
    else if (m_theta>180-ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][0]<0.0+DISTANCE_BOUND
        && PositionOfHomeRobot[whichrobot][1]>85.0) {
        // left wall: case L2
        vr = (int)(-14+19.0/90.0*theta_e);
        vl = (int)(-14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
    else if (m_theta
    else if (m_theta>180-ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][0]<0.0+DISTANCE_BOUND
        && PositionOfHomeRobot[whichrobot][1]<45.0) {
        // left wall: case L4
        vr = (int)(-14+19.0/90.0*theta_e);
        vl = (int)(-14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
Program code for A-BLOCK3

    // right side-wall
    if (m_theta<ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][0]>150.0-DISTANCE_BOUND
        && PositionOfHomeRobot[whichrobot][1]>85.0) {
        // right wall: case R1
        vr = (int)(-14+19.0/90.0*theta_e);
        vl = (int)(-14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
    else if (m_theta>180-ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][0]>150.0-DISTANCE_BOUND
        && PositionOfHomeRobot[whichrobot][1]>85.0) {
        // right wall: case R2
        vr = (int)(14+19.0/90.0*theta_e);
        vl = (int)(14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
    else if (m_theta<ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][0]>150.0-DISTANCE_BOUND
        && PositionOfHomeRobot[whichrobot][1]<45.0) {
        // right wall: case R3
        vr = (int)(-14+19.0/90.0*theta_e);
        vl = (int)(-14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
    else if (m_theta>180-ANGLE_BOUND
        && PositionOfHomeRobot[whichrobot][0]>150.0-DISTANCE_BOUND
        && PositionOfHomeRobot[whichrobot][1]<45.0) {
        // right wall: case R4
        vr = (int)(14+19.0/90.0*theta_e);
        vl = (int)(14-19.0/90.0*theta_e);
        Velocity(whichrobot, vl, vr);
    }
Use of AvoidBound(). The example program given above for AvoidBound() has been found to work well as a function call inserted as the last statement in the example program code for the Position() function (see Section 6.4.3). With reference to the four-level hierarchy of the architecture introduced in Section 4.2, implemented at the behaviour level are the following functions: Velocity(), Angle(), AvoidBound(), Position(), Kick(), and Goalie(). The last three functions are the key robot soccer actions of the framework, available for selection at the action level, while the rest can be considered as auxiliary. Because of the way these key actions are defined and implemented, assigning to a robot the role of attacking, defending, or goalkeeping implies selecting, respectively, the action of Kick(), Position(), or Goalie() for the robot, and conversely. In other words, for this framework, the mechanisms at the role level and action level are indistinguishable and thus, these two levels can be unified as the role-action level.
6.6 Game Strategy Development In a game of robot soccer, the objective of a robot soccer team is to score goals against an opponent team. To achieve this, a high-level game strategy continually revolves around the following two questions: 1. what action should the goalkeeping robot take? 2. which other robot should defend or attack, and with what action? 6.6.1 Zone-Defence Strategy In this section, we provide a simple zone-defence game strategy code for NormalGame(), with the team robots assigned their respective areas of manoeuvre, as depicted in Fig. 6.26. The robot, robot3, will perform goalkeeping by blocking or pushing away the ball according to the behaviour-level strategy implemented in Goalie(). Fixing robot3’s role, what remains is the role-action level strategy for the other two team robots, formulated as follows: Either robot1 or robot2 will kick the ball that enters its assigned area, while the other will position itself behind the ball so as to be in a potentially good location to kick the ball in the next possible moment. Kicking and positioning are executed according to the behaviour-level strategies implemented for Kick() and Position(), respectively.
Fig. 6.26. Team robots’ assigned areas according to the zone-defence strategy
The program for NormalGame() follows:

    void NormalGame(int robot1, int robot2, int robot3)
    {
        if(PositionOfBall[…]>… || PositionOfBall[…]<…)
            Stop_AllRobots(); // Goal scored, stop all robots
        else{
            Goalie(robot3);
            if(PositionOfBall[1]>…){ // y_b > … cm
                // robot… kicks the ball
                Kick(robot…);
                // robot… moves behind the ball
                Position(robot…, …, PositionOfBall[…]);
            }
            else if(PositionOfBall[1]<…){ // y_b < … cm
                // robot… kicks the ball
                Kick(robot…);
                // robot… moves behind the ball
                Position(robot…, …, PositionOfBall[…]);
            }
        }
    }
In the NormalGame() program, a call to Stop_AllRobots() halts all the team robots. By the setting of A = 256 and zd = 9 (the motor’s dead zone), as used in the program code for Velocity() (see Section 6.4.1), halting a robot requires the PWM velocity data H for its left wheel (vL) and right wheel (vR) to be set to any integer in the range [128 − 9, 128 + 9]. The program code for Stop_AllRobots(), which uses PWM data H = 127 for halting the robots, is given below. It uses the global array command[] defined in Section 6.4.1 for sending data to the robots.

    void Stop_AllRobots()
    {
        command[0] = 0xFF; // first 3 bytes of FFH
        command[1] = 0xFF; // indicate start
        command[2] = 0xFF; // of message
        command[3] = 0x0F; // team ID = 0x0F
        command[4] = 0x7F; // robot1’s left PWM data
        command[5] = 0x7F; // robot1’s right PWM data
        command[6] = 0x7F; // robot2’s left PWM data
        command[7] = 0x7F; // robot2’s right PWM data
        command[8] = 0x7F; // robot3’s (goalie’s) left PWM data
        command[9] = 0x7F; // robot3’s right PWM data
    }
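The relationship between the halting value 0x7F and the dead zone can be checked with a short sketch. It assumes, as stated above, a mid-scale PWM value of 128 (A/2 with A = 256) and a dead zone of zd = 9; the macro and function names are hypothetical, and the packet is filled into a caller-supplied buffer rather than the global command[] array.

```c
#define PWM_CENTER 128 /* assumed mid-scale PWM value, A/2 with A = 256 */
#define DEAD_ZONE  9   /* motor dead zone z_d                           */

/* Return 1 if PWM data h lies in [128 - 9, 128 + 9], i.e. halts a wheel. */
int pwm_is_halt(int h)
{
    return h >= PWM_CENTER - DEAD_ZONE && h <= PWM_CENTER + DEAD_ZONE;
}

/* Fill a 10-byte message that halts all three robots, using the same
 * byte layout as Stop_AllRobots(). */
void fill_stop_packet(unsigned char cmd[10])
{
    int i;
    cmd[0] = cmd[1] = cmd[2] = 0xFF; /* start of message */
    cmd[3] = 0x0F;                   /* team ID          */
    for (i = 4; i < 10; i++)
        cmd[i] = 0x7F;               /* PWM data H = 127 for every wheel */
}
```

Since 0x7F = 127 lies within [119, 137], every wheel byte of the packet falls inside the dead zone, which is why the robots stop even though 127 is not the exact centre value.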
6.6.2 Univector Field Navigation
The two methods, univector field and limit-cycle navigation, introduced in Section 4.6, provide a different set of behaviour-level strategies for implementing the key functions, Kick() and Position(). To recall, the essence of these unified navigation methods is that they enable a soccer robot to negotiate past any opponent robot (a moving obstacle) and to assume an appropriate heading angle when it hits the ball (the moving target), as per the implementation for Kick(), or when it reaches a game-strategic position (a dynamically changing target point), as per the implementation for Position().
The generic ‘C’ codes of the two essential program functions for implementing the strategy of univector field navigation for a robot are presented in the following.
1. N_Posture()
Towards attaining the robot’s target posture, this function uses Eq. (4.26) to generate a subfield univector F(p) and returns its direction, where p is the robot’s current position (x,y). (x,y,theta) and (bx,by,btheta) denote the robot’s current posture and the (desired) target posture at position g, respectively, as depicted in Fig. 4.19. With these denotations, the program code for N_Posture() follows:

#include <math.h>

#ifndef PI
#define PI 3.14159265358979323846
#endif

#define D 2.
#define N 8.

double N_Posture(double x, double y, double theta,
                 double bx, double by, double btheta)
{
    double theta_d;
    double rx, ry;
    double r1, r2, r3, phi, dist;

    rx = bx + D*cos(btheta);
    ry = by + D*sin(btheta);
    r1 = atan2(by-y, bx-x);
    r2 = atan2(ry-y, rx-x);
    r3 = atan2(ry-by, rx-bx);
    dist = sqrt((bx-x)*(bx-x)+(by-y)*(by-y));
    phi = r3-r1;
    // keep phi within [-PI, PI]
    while( phi > PI ) phi -= 2.*PI;
    while( phi <-PI ) phi += 2.*PI;
    theta_d = r1 - N*phi;
    // keep theta_d within [-PI, PI]
    while( theta_d > PI ) theta_d -= 2.*PI;
    while( theta_d <-PI ) theta_d += 2.*PI;
    return theta_d;
}

2. N_Obstacle()
This function modifies the desired heading angle theta_d and returns it as the direction of F(p), using the subfield univector generated for avoiding a circular obstacle, as depicted in Fig. 4.21. (x,y) and (ox,oy) denote the robot’s current position p and the position of an obstacle’s centre, respectively. Ro and M are the radius of the obstacle and the boundary margin, as shown in Fig. 4.21. With these denotations, the program code for N_Obstacle() follows:

#define Ro 10   // radius of circular obstacle
#define M  4    // region of modification at the
                // border of obstacle surface
double N_Obstacle(double x, double y, double ox, double oy,
                  double ro, double m, double theta_d)
{
    double dist, length, angle, diff_angle;
    double tmp_x, tmp_y;
    // ro, m:  obstacle radius and boundary margin; the caller passes
    //         the constants Ro and M (the parameters are named
    //         differently so that they do not clash with the macros)
    // dist:   distance between the robot position (x,y)
    //         and the obstacle position (ox,oy)
    // length: the length between the line from (x,y)
    //         in direction to the univector field
    //         direction and (ox,oy)
    // angle:  the angle of the vector from (x,y) to (ox,oy)
    dist = sqrt((ox-x)*(ox-x)+(y-oy)*(y-oy));
    length = fabs((ox-x)*sin(theta_d)+(y-oy)*cos(theta_d));
    angle = atan2(oy-y, ox-x);
    diff_angle = theta_d - angle;
    // keep diff_angle within [-PI, PI]
    while( diff_angle > PI ) diff_angle -= 2.*PI;
    while( diff_angle <-PI ) diff_angle += 2.*PI;

    // check whether the line from (x,y) in direction
    // to the univector field direction
    // passes the near region of obstacle (ro+m)
    if( length < ro+m && fabs( diff_angle ) < PI/2. ){
        if( dist <= ro ){
            // modify theta_d to the outer direction from the
            // obstacle's centre
            theta_d = angle-PI;
        }
        else if( dist <= ro+m ){
            // modify theta_d to avoid it with clockwise direction
            if( diff_angle > 0. ){
                // make smooth transition near the obstacle boundary
                tmp_x = ((dist-ro)*cos(angle-1.5*PI)
                        +(ro+m-dist)*cos(angle-PI))/m;
                tmp_y = ((dist-ro)*sin(angle-1.5*PI)
                        +(ro+m-dist)*sin(angle-PI))/m;
                theta_d = atan2(tmp_y, tmp_x);
            }
            // modify theta_d to avoid it with counter clockwise
            // direction
            else{
                // make smooth transition near the obstacle boundary
                tmp_x = ((dist-ro)*cos(angle-0.5*PI)
                        +(ro+m-dist)*cos(angle-PI))/m;
                tmp_y = ((dist-ro)*sin(angle-0.5*PI)
                        +(ro+m-dist)*sin(angle-PI))/m;
                theta_d = atan2(tmp_y, tmp_x);
            }
        }
        else{
            // modify theta_d to avoid it with clockwise
            // direction
            if( diff_angle > 0. ){
                theta_d = fabs(atan((ro+m)
                          /sqrt(dist*dist-(ro+m)*(ro+m))))+angle;
            }
            // modify theta_d to avoid it with counter clockwise
            // direction
            else{
                theta_d = -fabs(atan((ro+m)
                          /sqrt(dist*dist-(ro+m)*(ro+m))))+angle;
            }
        }
    }
    return theta_d;
}

Using the above two functions, function Uvect(), which returns the desired heading angle (in radians) of the robot at position p, can be written for a specified target position g in the presence of several circular obstacles. In robot soccer, a robot can be treated as (an object placed within the boundary of) a circular obstacle. There are perhaps several ways to program it; one such way is reported in [63]. Programming this function is left as an exercise for the reader.
We now present the program code for Control(), which reads the robot’s current posture (x,y,q), uses Uvect() and implements Eq. (4.23) to compute and output the control input [v, w]^T.

#define L   7.5    // distance between the left and right
                   // wheels (cm)
#define V_M 100.   // maximal wheel speed (cm/s)
#define R_M 300.   // maximal rotation (rad cm/s)
#define K_W 5.     // feedback coefficient
void Control(double x, double y, double q, double *v, double *w)
{
    double v1, v2;              // v1 and v2 in the controller
    double dl = 0.000001;       // small distance to approximate phi_v
    double theta_d, theta_f;    // the robot's desired heading direction
                                // and its direction in front of the
                                // robot's center
    double phi_v, a_phi_v;      // phi_v and the absolute value of phi_v
    double theta_e, a_theta_e;  // angle error and its absolute value

    // approximate phi_v
    theta_d = Uvect(x, y);
    theta_f = Uvect(x+dl*cos(q), y+dl*sin(q));
    phi_v = theta_f - theta_d;
    while( phi_v > PI ) phi_v -= 2.*PI;
    while( phi_v <-PI ) phi_v += 2.*PI;
    phi_v = phi_v/dl;
    a_phi_v = fabs(phi_v);

    // calculate theta_e
    theta_e = theta_d - q;
    while( theta_e > PI ) theta_e -= 2.*PI;
    while( theta_e <-PI ) theta_e += 2.*PI;
    a_theta_e = fabs(theta_e);

    // calculate v
    v1 = ( 2.*V_M - L*K_W*sqrt(a_theta_e) )/( 2. + L*a_phi_v );
    if( a_phi_v > 0 ) {
        v2 = ( sqrt(K_W*K_W*a_theta_e + 4*R_M*a_phi_v)
               - K_W*sqrt(a_theta_e) )/(2*a_phi_v);
    }
    else{
        v2 = V_M;
    }
    // v is the smaller of the two velocity limits
    *v = (v1 < v2) ? v1 : v2;

    // calculate w
    if( theta_e > 0. ){
        *w = *v * phi_v + K_W*sqrt(a_theta_e);
    }
    else{
        *w = *v * phi_v - K_W*sqrt(a_theta_e);
    }
}

6.6.3 Limit-Cycle Navigation
The generic ‘C’ codes of the two essential program functions for implementing the strategy of limit-cycle navigation for a robot are presented in the following.
1. TurnClockwise()
Towards attaining the robot’s target posture, this function uses Eq. (4.27) to generate and return a clockwise field, where (x,y) denotes the position of the centre of the turning circle and r denotes the radius of the circle. With these denotations, the program code for TurnClockwise() follows:
double TurnClockwise(double x, double y, double r,
                     double bx, double by)
{
    double dx, dy, ddx, ddy;
    double theta_d;

    // dx and dy are the relative distances between the robot's
    // current position and the centre position of the
    // turning circle
    dx = bx - x;
    dy = by - y;

    // r is the radius of the turning circle, so the limit cycle
    // is the circle dx*dx + dy*dy = r*r
    ddx =  dy + dx * (r * r - dx * dx - dy * dy);
    ddy = -dx + dy * (r * r - dx * dx - dy * dy);

    // calculate the desired angle
    theta_d = atan2(ddy, ddx);
    return theta_d;
}

2. TurnCounterClockwise()
Towards attaining the robot’s target posture, this function uses Eq. (4.29) to generate and return a counterclockwise field, where (x,y) denotes the position of the centre of the turning circle and r denotes the radius of the circle. With these denotations, the program code for TurnCounterClockwise() follows:

double TurnCounterClockwise(double x, double y, double r,
                            double bx, double by)
{
    double dx, dy, ddx, ddy;
    double theta_d;

    // dx and dy are the relative distances between the robot's
    // current position and the centre position of the
    // turning circle
    dx = bx - x;
    dy = by - y;

    // r is the radius of the turning circle, so the limit cycle
    // is the circle dx*dx + dy*dy = r*r
    ddx = -dy + dx * (r * r - dx * dx - dy * dy);
    ddy =  dx + dy * (r * r - dx * dx - dy * dy);

    // calculate the desired angle
    theta_d = atan2(ddy, ddx);
    return theta_d;
}

Using the above two functions, function Limitcycle(), which returns the desired heading angle (in radians) of the robot at its current position (x,y), can be written for a specified target position in the presence of several obstacles. Programming this function is left as an exercise for the reader. To compute and output the control input [v, w]^T, the program code for Control() that implements Eq. (4.23) can be used by replacing Uvect() with Limitcycle().
Notes on Selected References
The host software model is adapted from the work of other researchers [19]. For beginners in ‘C’ programming, the textbook [64] is a good starting point.
7. Simulated Robot Soccer
7.1 Introduction This chapter complements the real-system programming framework presented in the previous chapter with a computer-simulated system programming framework. The simulation framework enables the development of robot soccer systems for the FIRA Simulated-Robot Soccer Tournament (SimuroSoT), which essentially is MiroSoT played in a computer simulator. Without the need for robot and vision hardware, the problems of sensing and acting are reduced to non-issues, and it becomes possible to focus on game strategy development for the two bigger leagues in SimuroSoT, namely the middle league (5-a-side) and the large or full league (11-a-side). This opens up the challenges of using advanced AI techniques to develop complex game strategies. This chapter describes the core simulator system for Large League SimuroSoT in terms of its architecture, system requirements and general operating procedure. It then presents the essentials for programming a game strategy. To illustrate the programming framework, example program fragments that implement some strategies and related actions are provided. These example codes could be modified or expanded to incorporate other new ideas and techniques.
7.2 Client-Server Architecture
Under the SimuroSoT programming framework, the robot-soccer simulator consists of a network of three computers; namely, one server and two clients, connected as shown in Fig. 7.1. The server simulates the field dynamics, i.e., the motion of the robots and the ball in a virtual playground, with a monitor screen that displays the game situation. Each of the two clients, representing a competing team, receives simulated dynamic information on the coordinate positions and directions of motion of the robots and the ball from the server, executes its game strategy, and passes back control information for each of its team robots to the server computer, which updates its monitor display accordingly. Fig. 7.2 shows the internal architecture of the simulator.
J.-H. Kim, D.-H. Kim, Y.-J. Kim, K.-T. Seow: Soccer Robotics, STAR 11, pp. 257-271, 2004. Springer-Verlag Berlin Heidelberg 2004
Fig. 7.1. Client-server platform for SimuroSoT
Fig. 7.2. Internal architecture of client-server based simulator for robot soccer programming
7.2.1 Server Side
The server consists of five modules, as shown in Fig. 7.2. The function of each module is described in the following:
• Network communication module
  1. It receives the updated information of the ‘virtual’ field from the display module, and sends it to each client. The field information includes the coordinate positions and heading directions of the robots and the ball.
  2. It receives control information (namely, the desired wheel velocities for each team robot) from each client and sends it to the kinematics module.
• Kinematics module
  1. It computes the latest field information according to some kinematics models and the latest control inputs from the clients, and sends this information to the collision test module.
  2. The kinematics models employed are given in Section 7.3.
• Collision test module
  1. Based on the field information received from the kinematics module, it checks for any collision
     - between robots,
     - between a robot and a side-wall,
     - between a robot and the ball,
     - between the ball and a side-wall.
  2. According to any collision it detects, it modifies the current positions and heading directions of the affected robots and/or the ball, as initially computed by the kinematics module, before sending the field information to the referee module.
• Referee module
  1. Based on the latest (and possibly modified) field information received from the collision test module, it calls for a foul if needed, in accordance with the game rules. Whenever it calls for a foul, it automatically places the robots and the ball in pre-defined positions, and the robots in pre-defined heading directions, all in accordance with the ‘set-position play’ awarded (penalty-kick, free-kick, free-ball or goal-kick) against the committed foul; it sends this re-initialized field information to the kinematics module.
  2. The rules of Middle League (5-a-side game) and Large League (11-a-side game) SimuroSoT are available on website http://www.fira.net.
• Display module
  Based on the ‘finalized’ field information received from the referee module, it updates the graphic animation of the virtual field on the monitor screen.
7.2.2 Client Side
The client consists of two modules, as shown in Fig. 7.2. The function of each module is described in the following:
• Network communication module
  1. It receives the latest field information from the server and sends it to the strategy module.
  2. It receives the latest control information from the strategy module, and sends it to the server.
• Strategy module
  1. Based on the field information (or game situation) received, it strategically decides and generates the latest control information for its team robots.
  2. This is the only user-defined module of the simulator; it requires client programming (in Visual C++) to build user-defined action-functions such as Position(), based on a system-defined function Velocity(), as well as the game strategy using these basic skill functions. More on this is discussed in Section 7.5.
7.3 Kinematics Models
7.3.1 For the Ball
The simulator applies the following model to generate the straight-line motion of the ball:
\[
\begin{aligned}
\nu_b(k) &= \nu_b(k-1) + a_f\,\Delta t,\\
x_b(k) &= x_b(k-1) + \nu_b(k-1)\,\Delta t\,\cos\{\theta_b(k-1)\},\\
y_b(k) &= y_b(k-1) + \nu_b(k-1)\,\Delta t\,\sin\{\theta_b(k-1)\},
\end{aligned}
\tag{7.1}
\]
where
• $a_f$ is the acceleration caused by friction,
• $k$ is an integer and $k \geq 1$,
• $\nu_b(k)$ and $\nu_b(k-1)$ refer to the velocities of the ball at time $t_k$ and $t_{k-1}$, respectively,
• $(x_b(k), y_b(k))$ and $(x_b(k-1), y_b(k-1))$ refer to the coordinate positions of the ball at time $t_k$ and $t_{k-1}$, respectively,
• $\theta_b(k-1)$ refers to the ‘heading’ angle of the ball at time $t_{k-1}$, and
• time interval $\Delta t = t_k - t_{k-1}$.
7.3.2 For the Robot
The simulator applies the following model to generate the motion of each robot:
\[
\nu = R\omega, \qquad \omega = \left| \frac{V_M}{R + \frac{L}{2}} \right|
\tag{7.2}
\]
and
\[
\begin{aligned}
x(k) &= x(k-1) + R \sin\Delta\theta \,\cos\!\left(\theta(k-1) + \frac{\Delta\theta}{2}\right),\\
y(k) &= y(k-1) + R \sin\Delta\theta \,\sin\!\left(\theta(k-1) + \frac{\Delta\theta}{2}\right),
\end{aligned}
\tag{7.3}
\]
where
• $\nu$, $\omega$ and $L$ are the robot’s translational velocity (at its centre), turning velocity and physical width, respectively,
• $V_L$ and $V_R$ are the robot’s left-wheel and right-wheel velocities, respectively, for which $V_M = \max\{|V_L|, |V_R|\}$ and $V_m = \min\{|V_L|, |V_R|\}$,
• $R$ is the turning radius of the robot, given by
\[
R = \frac{L(V_M + V_m)}{2(V_M - V_m)},
\]
• $k$ is an integer and $k \geq 1$,
• $\omega(k-1)$ refers to the robot’s turning velocity $\omega$ at time $t_{k-1}$, and time interval $\Delta t = t_k - t_{k-1}$, such that $\Delta\theta = \omega(k-1)\,\Delta t$,
• $\theta(k)$ and $\theta(k-1)$ refer to the heading angles of the robot at time $t_k$ and $t_{k-1}$, respectively, such that $\theta(k) = \theta(k-1) + \Delta\theta$,
• $(x(k), y(k))$ and $(x(k-1), y(k-1))$ refer to the coordinate positions of the robot at time $t_k$ and $t_{k-1}$, respectively.
7.4 How To Run the Simulator
The simulator package for Large League SimuroSoT (in Visual C++) is available for download from website http://www.fira.net.
7.4.1 System Requirements
The minimum system requirements to run the package are as follows:
1. Pentium III 800 MHz CPU (Ethernet-enabled),
2. 256 Mbytes of RAM,
3. Graphics accelerator with 32 Mbytes of RAM,
4. Monitor screen resolution of 1024 × 768,
5. Operating system: Windows 98, Windows 2000 (preferred),
6. DirectX 7.0,
7. 10 Mbytes of free hard disk space.
Fig. 7.3. SimuroSoT simulation display
7.4.2 Server Program
Fig. 7.3 shows a screen display of the simulation environment when the SimuroSoT server program is started. The team that connects to the server first is the left-team. The server program functions normally only when both teams are connected. The server program provides a clock, score record, team name, control commands and robots. The control commands are displayed at the top of the simulation display screen. Control commands can be selected via the keyboard; an activated command is displayed in red. To execute the selected command, press the ‘Enter’ key on the keyboard. When two teams are connected, their team names are displayed. Following the Kick-off command, the game starts. To pause the game, use the Break command. In this case, the clock is paused; following the Start command the game resumes. If the Stop command is executed, the clock is reset to 0:00. To replay, use the Repeat command.
In the server program, there is an auto-referee which checks for fouls automatically in accordance with the game rules. Referee displays the current refereeing decision; for example, if the decision is a penalty-kick, then Referee displays Penalty-Kick and the game is paused automatically. To resume the game, use the Start command. The following are useful command keys (from the keyboard):
Key                     : Function
‘p’, ‘l’, ‘;’, ‘'’      : direction keys for positioning the ball or robots,
‘b’                     : select the ball,
‘t’                     : select the team,
‘s’                     : select the robot in order.
For example, to move the 3rd robot in the first team, select the team by pressing key ‘t’, then select the robot by pressing key ‘s’ twice (now the 3rd robot in the first team is selected). Then use the direction keys (‘p’, ‘l’, ‘;’, ‘'’) to reposition the robot as desired.
7.4.3 Client Program
Fig. 7.4 shows a client interface.
Fig. 7.4. SimuroSoT client interface
Connection/disconnection icons are placed at the upper left-hand corner of the client interface. Click on the connection icon (pointed at by a bold arrow), and a window such as that shown in Fig. 7.5 pops up. In this pop-up window, type the team name, server IP and port number. The port number to use is 6000. Note that the server and client programs need not run on different computers; the simulated game can be played on a single computer.
Fig. 7.5. SimuroSoT client connection window
7.5 Client Programming
In this section, we explain the basic client program structure and client simulator programming. The client program uses the information about the ball and robots from the server. The programming environment is Microsoft Visual C++ 6.0.
7.5.1 Basic Program Structure
The client program for simulating the motion of the team robots has to be developed entirely within the system-defined class CStrategySystem. When the game starts, the strategy begins at the system-defined C++ method (or function) Action() of the class CStrategySystem. The following shows an example of a program structure for the function Action(); it simply calls the user-defined function Think(). Function Think() selects among several NormalGame() variants, but the latest version of the simulator provides a system-defined variable m_nStrategy which is either 0 or 1. By the structure of function Think(), the user’s game strategy should therefore be written in the user-defined function NormalGame().

void CStrategySystem::Action()
{
    Think();
}

void CStrategySystem::Think()
{
    switch(m_nStrategy)
    {
    case 0:
        NormalGame();
        break;
    case 1:
        NormalGame();
        break;
    case 2:
        NormalGame3();
        break;
    case 3:
        NormalGame4();
        break;
    case 4:
        NormalGame5();
        break;
    }
}
7.5.2 Attack Direction
In developing a robot soccer strategy, the programmer would assume that the soccer team is the right-team, i.e., it attacks from right to left, but with the origin of the Cartesian (image) axes (horizontal x-axis pointing left-to-right, vertical y-axis pointing top-to-bottom) at the upper left-hand corner of the monitor screen. The coordinate conversion if it is the left-team, i.e., it plays from left to right during a game, is automatically handled by the server program.
Note that in the SimuroSoT programming framework, the origin of the Cartesian axes for the playground and the team goal are on opposite sides, with the former at the upper left-hand corner while the latter is on the right-hand side; this convention is different from that used in the MiroSoT programming framework, in which the team goal and the origin are usually on the same side.
7.5.3 System-Defined Variables and Constants
The field information, i.e., the coordinate positions and heading directions of the robots and the ball, as well as the size of the playground, are, for programming purposes, obtained via system-defined variables given below:
Posture of Robot.
1. For an opponent robot j, j ∈ [1, 11]:
   a) opponent.positionj.x,
   b) opponent.positionj.y.
   Note that opponent robot 11 is the goalie.
2. For a team (or home) robot i, i ∈ [1, 10]:
   a) homei.angle,
   b) homei.position.x,
   c) homei.position.y.
3. For the team goalie:
   a) hgoalie.angle,
   b) hgoalie.position.x,
   c) hgoalie.position.y.
In the case of opponent robots, the heading angle information is not given. The centre of a robot is defined as its position, and is given in real pixel values. The pixel range of the x-coordinate is [0, 1030] and that of the y-coordinate is [0, 818]. The heading angle has a range of (−180°, 180°] and it increases clockwise, a convention which is opposite to that used in the MiroSoT programming framework.
Position of Ball. The coordinate position (xb, yb) of the ball is given by the following variables:
1. ball.position.x,
2. ball.position.y.
Size of Playground. The boundaries of the field make it convenient to determine an absolute position. In programming, the constants boundRect.left and boundRect.right are x-boundary values; the constants boundRect.top and boundRect.bottom are y-boundary values. The following inequalities relate these system-defined constants: boundRect.left < boundRect.right and boundRect.top < boundRect.bottom.
7.5.4 Velocity() and Position() Functions
As in a MiroSoT robot soccer system, Velocity() and Position() are two important basic skill functions for moving robots around.
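Before writing such skill functions, the conventions of Section 7.5.3 (heading in degrees, increasing clockwise; positions in pixels within the field boundaries) can be wrapped in small helpers. The following is a minimal sketch under stated assumptions: heading_to_ccw_rad(), FieldRect and clamp_to_field() are hypothetical names, FieldRect merely stands in for the four boundRect constants, and the conversion assumes that the opposite convention means radians increasing counter-clockwise.

```c
#define PI 3.14159265358979323846

/* Converts a heading given in degrees, increasing clockwise, into
   radians increasing counter-clockwise (assumed target convention). */
double heading_to_ccw_rad(double angle_deg)
{
    return -angle_deg * PI / 180.0;
}

/* Hypothetical stand-in for boundRect.left/top/right/bottom. */
typedef struct { int left, top, right, bottom; } FieldRect;

/* Clamps a pixel position so that it lies inside the field boundaries. */
void clamp_to_field(const FieldRect *f, int *x, int *y)
{
    if (*x < f->left)   *x = f->left;
    if (*x > f->right)  *x = f->right;
    if (*y < f->top)    *y = f->top;
    if (*y > f->bottom) *y = f->bottom;
}
```

A target point computed by a strategy could be passed through clamp_to_field() before being handed to a positioning function, so that a robot is never sent outside the playground.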
Velocity(). This is the only system-defined basic skill function, given by Velocity(int whichrobot, int vl, int vr). It simulates ‘driving’ the left and right wheels of a specified robot to attain the respective input velocities vl and vr. The velocity inputs are ‘normalized’ PWM velocity data lying in the range [−128, 128], where a negative value ‘drives’ the wheel backward, a zero value ‘drives’ it to a halt and a positive value ‘drives’ it forward.
Position(). This is a user-defined basic skill function, given by Position(int whichrobot, CPoint point). As in a MiroSoT system, it is implemented to make a specified robot move towards a specified point. The following is an example program of this function:

void CStrategySystem::Position(int whichrobot, CPoint point)
{
    Robot2 *robot;   // Home robot is defined by Robot2 type
    double distance_e;
    int dx, dy, desired_angle, theta_e, vl, vr;

    switch(whichrobot){
    // whichrobot is the number of the robot; HOME1 is matched to 1
    case HOME1:
        robot = &home1;
        break;
    case HOME2:
        robot = &home2;
        break;
    case HOME3:
        robot = &home3;
        break;
        .
        .
        .
    case HOME10:
        robot = &home10;
        break;
    case HGOALIE:
        robot = &hgoalie;
        break;
    }
    // Position() is available for robots 1-10 and goalie now.

    // distance between robot and required position
    dx = point.x - robot->position.x;
    dy = point.y - robot->position.y;
    distance_e = sqrt(1.0*dx*dx+1.0*dy*dy);

    // angle between robot and required angle
    if(dx == 0 && dy == 0)
        desired_angle = 90;
    else
        desired_angle = (int)(180.0/M_PI
                        *atan2((double)(dy), (double)(dx)));

    theta_e = desired_angle - robot->angle;
    // keep theta_e within (-180,180]
    while(theta_e > 180) theta_e -= 360;
    while(theta_e < -180) theta_e += 360;

    // switch heading direction of robot
    if(theta_e < -90){
        theta_e += 180;
        distance_e = -distance_e;
    }
    else if(theta_e > 90) {
        theta_e -= 180;
        distance_e = -distance_e;
    }

    // "PWM" velocity calculation using P control law
    vl = (int)(5.*(100.0/1000.0*distance_e+40.0/90.0*theta_e));
    vr = (int)(5.*(100.0/1000.0*distance_e-40.0/90.0*theta_e));

    Velocity(whichrobot, vl, vr);   // call Velocity function
                                    // to "drive" the wheels
}
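The P control law at the end of Position() can be examined in isolation. The sketch below uses hypothetical names (clamp_pwm() and p_control()) and reproduces the two velocity formulas with the same gains; the clamping to [−128, 128] is an added assumption, motivated by the input range stated for Velocity(), and is not part of the example above.

```c
/* Clamps v into the normalized PWM range [-128, 128]
   (assumption: out-of-range commands should be saturated). */
static int clamp_pwm(int v)
{
    if (v > 128)  return 128;
    if (v < -128) return -128;
    return v;
}

/* Computes left and right "PWM" velocities from the distance error
   (pixels) and the angle error (degrees), using the same gains as
   the Position() example. */
void p_control(double distance_e, double theta_e, int *vl, int *vr)
{
    *vl = clamp_pwm((int)(5.0 * (100.0 / 1000.0 * distance_e
                                 + 40.0 / 90.0 * theta_e)));
    *vr = clamp_pwm((int)(5.0 * (100.0 / 1000.0 * distance_e
                                 - 40.0 / 90.0 * theta_e)));
}
```

With a pure distance error both wheels receive the same command, so the robot drives straight; with a pure angle error the commands have opposite signs, so the robot turns on the spot. Large distance errors saturate at 128 under the clamping assumption.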
7.5.5 Example Game Strategy Programs: NormalGame()
We provide two program fragments as examples of how to write strategy programs.
Example 1. This example shows how the Position() function given above, the robots’ position information and the boundary values of the playground are used, as follows:

void CStrategySystem::NormalGame()
{
    CPoint target;
    int dx,dy;

    // if ball's x-position is greater than the x-coordinate
    // in the middle of the field
    if(ball.position.x > (boundRect.left+boundRect.right)/2)
    {
        // get x-position of the middle of the field
        target.x = (boundRect.left + boundRect.right)/2;
        // get y-position of the middle of the field
        target.y = (boundRect.top + boundRect.bottom)/2;

        // get distance between home1 robot and middle of the field
        dx = home1.position.x - target.x;
        dy = home1.position.y - target.y;

        if(dx*dx+dy*dy < 400)   // distance is less than 20
            // position the robot in the middle of the field
            Position(HOME1,CPoint(target.x,target.y));
        else
            // position the robot to follow the ball
            Position(HOME1,CPoint(ball.position.x, ball.position.y));
    }
    .
    .
    .
}

Example 2. This example implements a fairly effective ‘defence and attack by region’ strategy. The essence of the strategy is this: if the ball is in a location which is within the base region of a robot, the robot would have to go and
pass the ball to some other home robot; otherwise, it would return to its base position, as follows:

void CStrategySystem::NormalGame()
{
    CPoint pos1,pos2,pos3,pos4,pos5,pos6,pos7,pos8,pos9,pos10;
    int dx,dy;

    // define the base position of each robot
    pos1.x = boundRect.right - 3*(boundRect.right - boundRect.left)/8;
    pos1.y = boundRect.bottom - (boundRect.bottom - boundRect.top)/3;
    pos2.x = pos1.x;
    pos2.y = (boundRect.top + boundRect.bottom)/2;
    pos3.x = pos1.x;
    pos3.y = boundRect.bottom - 2*(boundRect.bottom - boundRect.top)/3;
    pos4.x = boundRect.right - 2*(boundRect.right - boundRect.left)/8;
    pos4.y = pos1.y;
    pos5.x = pos4.x;
    pos5.y = pos2.y;
    pos6.x = pos4.x;
    pos6.y = pos3.y;
    pos7.x = boundRect.right - (boundRect.right - boundRect.left)/8;
    pos7.y = boundRect.bottom - (boundRect.bottom - boundRect.top)/8;
    pos8.x = pos7.x;
    pos8.y = boundRect.bottom - 3*(boundRect.bottom - boundRect.top)/8;
    pos9.x = pos7.x;
    pos9.y = boundRect.bottom - 5*(boundRect.bottom - boundRect.top)/8;
    pos10.x = pos7.x;
    pos10.y = boundRect.bottom - 7*(boundRect.bottom - boundRect.top)/8;

    // for robot 1
    // compute distance between home robot 1's
    // base position and the ball's
    dx = ball.position.x - pos1.x;
    dy = ball.position.y - pos1.y;

    // if ball is "far" from its base position
    if(dx*dx+dy*dy > 800)
        // return to base position
        Position(HOME1,CPoint(pos1.x,pos1.y));
    else
        // pass, if ball is "near" its base position
        Pass(HOME1);

    // for robot 2
    // similar as for home robot 1
    dx = ball.position.x - pos2.x;
    dy = ball.position.y - pos2.y;

    if(dx*dx+dy*dy > 400)
        Position(HOME2,CPoint(pos2.x,pos2.y));
    else
        Pass(HOME2);

    // skip for robots 3-9 (can be implemented similarly)

    // for robot 10
    dx = ball.position.x - pos10.x;
    dy = ball.position.y - pos10.y;

    if(dx*dx+dy*dy > 400)
        Position(HOME10,CPoint(pos10.x,pos10.y));
    else
        Pass(HOME10);

    // call the Goalie function
    Goalie(HGOALIE);
}

In the program code above, the Pass() and Goalie() functions must be implemented with sound strategies. This is left as an exercise for the interested reader.
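Since the same ‘near or far’ test is repeated for every robot in Example 2, it could be factored into a helper and applied in a loop. The following is a sketch only, written in plain C with hypothetical names: Point2 stands in for CPoint, and ball_in_region() and decide_by_region() do not exist in the simulator package. The radius of 20 pixels corresponds to the squared threshold of 400 used above.

```c
typedef struct { int x, y; } Point2;   /* hypothetical stand-in for CPoint */

/* Returns 1 if the ball lies within `radius` pixels of the base position. */
int ball_in_region(Point2 ball, Point2 base, int radius)
{
    long dx = ball.x - base.x;
    long dy = ball.y - base.y;
    return dx * dx + dy * dy <= (long)radius * radius;
}

/* For each of the n robots, sets should_pass[i] to 1 if the ball is in
   the robot's base region (the robot should pass) and to 0 otherwise
   (the robot should return to its base position). */
void decide_by_region(Point2 ball, const Point2 base[], int n,
                      int radius, int should_pass[])
{
    int i;
    for (i = 0; i < n; i++)
        should_pass[i] = ball_in_region(ball, base[i], radius);
}
```

In the strategy itself, the resulting flags would select between a call to Pass() and a call to Position() for each robot.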
Notes on Selected References
The material in this chapter is written based on the simulator package [65] developed by Bing-Rong Hong’s research team at the Harbin Institute of Technology, P. R. China. The simulator package is available for download from website http://www.fira.net. The textbook [66] is a good reference for C++ programming.
A. Programming the PIC16C73/73A Microcontroller
This appendix contains two sections that detail the use of the Capture/Compare/PWM (CCP) and Universal Synchronous Asynchronous Receiver Transmitter (USART) modules of a PIC16C73/73A microcontroller for, respectively, motion control and communication by a command-based soccer robot (see Fig. 2.8). This appendix is to be read in consultation with the PIC16C7X manual, available for download from website http://www.microchip.com.
A.1 On-chip PWM Programming for Robot Motion Control To move, a soccer robot needs to generate control signals to turn its motors (and hence its wheels). The on-chip PWM (pulse width modulation) of a PIC16C73/73A microcontroller can provide this signal. In PWM, a digital signal of fixed frequency is generated, and can be used for driving the motors by connecting up the CCP pins (pin 12 and 13) as shown in Fig. A.1.
[Figure: block diagram showing the wireless receiver module (RX), power supply, PIC16C73/73A microcontroller (CCP1, CCP2), PWM signals, motor drivers, motors, gear boxes and wheel]
Fig. A.1. Using CCP in PWM mode for motor driving and USART for data reception
To set the CCPx pins, x = 1 or 2, to PWM mode, clear the TRISC<2> bit. The general notation REG<i>, 0 ≤ i ≤ 7, refers to the i-th bit of an internal register named REG.
The frequency $F_{PWM}$ of PWM signals output at each CCPx pin is $1/(\text{PWM period})$. The PWM period is specified by setting the register PR2. The formula for the PWM period in terms of the value (PR2) set in register PR2 is
\[
\text{PWM period } T_{PWM} = [(\text{PR2}) + 1] \cdot 4 \cdot T_{OSC} \cdot (\text{TMR2 prescale value}),
\tag{A.1}
\]
where
• $T_{OSC}$ is the time period of the oscillator (system clock generator), and
• TMR2 prescale value is the prescale value for Timer2.
The PWM duty cycle is defined as $T_{ON}/T_{PWM}$, i.e., the proportion of time $T_{ON}$ within the PWM period $T_{PWM}$ that the PWM signal is at logic 1 (or high). For CCPx, it can be specified by setting the internal register CCPRxL and the CCPxCON<5:4> bits, where the general notation REG<i:j> refers to the i-th and j-th bits in an internal register named REG. This forms a 10-bit value denoted by CCPRxL:CCPxCON<5:4>, where CCPRxL contains the eight most significant bits (MSbs) and CCPxCON<5:4> contains the two least significant bits (LSbs). The following equation is used to calculate the duty cycle of the PWM signal output at CCPx, x = 1 or 2, but conventionally in terms of the time $T_{ON}$, as the PWM period $T_{PWM}$ is known and set:
\[
\text{PWM ‘duty cycle’ } T_{ON} = (\text{CCPRxL:CCPxCON<5:4>})_2 \cdot T_{OSC} \cdot (\text{TMR2 prescale value}).
\tag{A.2}
\]
The maximum PWM resolution $R_{max}$ (in bits) (for setting the PWM duty cycle) at a given PWM frequency $F_{PWM}$ is
\[
R_{max} = \frac{\log\left(\dfrac{F_{OSC}}{F_{PWM}}\right)}{\log 2} \ \text{bits},
\tag{A.3}
\]
where $F_{OSC}$, equal to $1/T_{OSC}$, is the frequency of the oscillator.
To illustrate, consider an example with $F_{OSC}$ = 20 MHz and TMR2 prescale = 1. If the desired PWM frequency $F_{PWM}$ = 78.125 kHz, then
• the value (PR2) required is calculated using Eq. (A.1) as follows:
\[
\begin{aligned}
\frac{1}{78.125\,\text{kHz}} &= [(\text{PR2}) + 1] \cdot 4 \cdot \frac{1}{20\,\text{MHz}} \cdot 1\\
12.8\,\mu\text{s} &= [(\text{PR2}) + 1] \cdot 4 \cdot 50\,\text{ns} \cdot 1\\
(\text{PR2}) &= 63;
\end{aligned}
\]
• the maximum PWM resolution $R_{max}$ is calculated using Eq. (A.3) as follows:
\[
\begin{aligned}
\frac{1}{78.125\,\text{kHz}} &= 2^{R_{max}} \cdot \frac{1}{20\,\text{MHz}}\\
12.8\,\mu\text{s} &= 2^{R_{max}} \cdot 50\,\text{ns}\\
256 &= 2^{R_{max}}\\
R_{max} &= 8 \ \text{bits}.
\end{aligned}
\]
This means that with a clock at 20 MHz, the duty cycle of a PWM signal at 78.125 kHz can have a resolution of at most 8 bits, i.e., 0 ≤ (CCPRxL:CCPxCON<5:4>)_2 ≤ 255. So, any such value greater than 255 will result in a 100% duty cycle.
To summarize, the following steps configure the microcontroller for PWM operation.
1. Set the TMR2 prescale value and enable Timer2 by setting the T2CON<2> bit.
2. Configure CCPx as an output pin by clearing the TRISC<2> bit.
3. Set (to fix) the desired PWM period by setting register PR2 with a value obtained using Eq. (A.1).
4. Set (to initialize or change) the desired PWM duty cycle by setting CCPRxL:CCPxCON<5:4> with a value obtained using Eq. (A.2).
The following is a program code fragment that implements these steps.

/*************** Program Example ********************************/
void main()
{
    unsigned long pwm1, pwm2;   // Declare the variables
    pwm1 = 0x01fc;              // Set initial value of
    pwm2 = 0x01fc;              // duty cycle
    ...
    ...
    /* Set TMR2 Prescale = 1 (for Timer2) */
    T2CON.T2CKPS0 = 0;
    T2CON.T2CKPS1 = 0;
    T2CON.TMR2ON = 1;           // Enable Timer2

    /* Clock freq = 11.0592MHz */
    /* To set desired PWM signal period = 1/10.8kHz */
    OpenPWM1(0x0ff);            // Set PWM Period
    OpenPWM2(0x0ff);
    ...
    SetDCPWM1(pwm1);            // Initialize PWM Duty Cycle
    SetDCPWM2(pwm2);
    ...
    ...
}

/*************** Function **************************************/
void OpenPWM1(unsigned char period)
{
    CCPR1L = 0x00;      // Power on reset (POR) for duty
    CCP1CON = 0x0F;     // cycle of PWM signal at CCP1
    PR2 = period;       // Set period
    TRISC.CCP1 = 0;     // TRISC<2>. Configure CCP1 pin as output
}

void SetDCPWM1(unsigned long duty_cycle)
{
    CCPR1L = duty_cycle >> 2;                 // Place 8 MSbs in CCPR1L
    duty_cycle = (duty_cycle << 4) & 0x30;    // Move 2 LSbs to bits 5:4
    CCP1CON = (CCP1CON & 0xCF) | duty_cycle;  // Replace CCP1CON<5:4>,
                                              // keep the other bits
}

/* Functions OpenPWM2 and SetDCPWM2 are written similarly. */
How varying the amplified PWM signals can drive the DC motors at different speeds has already been explained in Section 2.3.4.
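The PR2 and resolution calculations of Eqs. (A.1) and (A.3), together with the splitting of a 10-bit duty value into CCPRxL and CCPxCON<5:4>, can be checked on a host computer before the values are written into the microcontroller program. The following is only a minimal sketch in standard C: the function names are hypothetical and are not part of the robot program, and Eq. (A.1) is rearranged in terms of frequencies so that integer arithmetic suffices.

```c
#include <assert.h>
#include <stdint.h>

/* Eq. (A.1) rearranged: PR2 = Fosc / (4 * prescale * Fpwm) - 1.
 * Frequencies are in Hz. Integer division truncates, so the result is
 * exact only when the frequencies divide evenly (as in the examples). */
uint32_t pwm_pr2(uint32_t fosc_hz, uint32_t fpwm_hz, uint32_t prescale)
{
    assert(fpwm_hz > 0u && prescale > 0u);
    return fosc_hz / (4u * prescale * fpwm_hz) - 1u;
}

/* Eq. (A.3): 2^Rmax = Fosc / Fpwm, hence Rmax = floor(log2(Fosc / Fpwm)). */
unsigned pwm_max_resolution_bits(uint32_t fosc_hz, uint32_t fpwm_hz)
{
    uint32_t ratio;
    unsigned bits = 0u;

    assert(fpwm_hz > 0u);
    ratio = fosc_hz / fpwm_hz;
    while (ratio > 1u) {        /* count how often the ratio can be halved */
        ratio >>= 1;
        ++bits;
    }
    return bits;
}

/* Split a 10-bit duty value: 8 MSbs go to CCPRxL, and the 2 LSbs are
 * returned already shifted into bit positions <5:4> for CCPxCON. */
void pwm_split_duty(uint16_t duty, uint8_t *ccprl, uint8_t *ccpcon_54)
{
    *ccprl = (uint8_t)((duty >> 2) & 0xFFu);
    *ccpcon_54 = (uint8_t)((duty << 4) & 0x30u);
}
```

With a 20MHz clock and a 78.125kHz PWM signal, pwm_pr2 gives 63 and pwm_max_resolution_bits gives 8, in agreement with the worked example; with the 11.0592MHz clock and 10.8kHz signal of the program fragment, pwm_pr2 gives 255 (0xff).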
A.2 On-chip USART Programming for Robot Communication

A soccer robot needs to receive data from the host computer via asynchronous communication. In particular, it needs to receive the desired wheel velocity data and generate the appropriate PWM signals for motor actuation. The on-chip USART of a PIC16C73/73A microcontroller provides this means of data communication. The desired communication baud rate is selected by setting the register SPBRG with a value X, 0 ≤ X ≤ 255, and the BRGH bit of the Transmit Status and Control register TXSTA, accordingly as follows:

BRGH              Baud Rate
0 (Low Speed)     FOSC / (64 (X + 1))
1 (High Speed)    FOSC / (16 (X + 1))
Using the formulae tabulated above, the value of X can be computed easily for a desired baud rate. However, since X has to be set as an integer value in register SPBRG, the actual baud rate selected may deviate from the desired. It is a good practice to verify that the actual baud rate set is acceptable. To illustrate, consider the following example with a clock frequency FOSC of 16MHz and a desired baud rate of 9600 bps (bits per second). With BRGH = 0, we have

$$9600 = \frac{16\,\mathrm{MHz}}{64(X + 1)} \quad\Rightarrow\quad X = 25.042 \approx 25.$$

The actual baud rate set is

$$\frac{16\,\mathrm{MHz}}{64(25 + 1)} = 9615.$$

Hence the deviation or error is

$$\frac{\text{Actual Baud Rate} - \text{Desired Baud Rate}}{\text{Desired Baud Rate}} = \frac{9615 - 9600}{9600} = 0.16\%.$$

In this example, the setting for the desired baud rate is acceptable because the error is relatively small.

Next, we study how the microcontroller is software configured to transmit or receive data. The microcontroller chip supports two methods of asynchronous communication, namely by polling and by interrupt. Here, only polling is discussed.

• To configure for polling, set the bits of the respective registers as follows:
PIE1<5>     USART Receive Interrupt Enable     RCIE = 0
PIE1<4>     USART Transmit Interrupt Enable    TXIE = 0
TXSTA<4>    USART Mode Select                  SYNC = 0
RCSTA<7>    Serial Port Enable                 SPEN = 1
• To select a desired baud rate, set the register SPBRG with an appropriate value X, and the corresponding High Baud Rate Select bit BRGH of the Transmit Status and Control register TXSTA.

• Under polling,
  – to transmit 8-bit data, set the bits of the Transmit Status and Control register TXSTA as follows:

    TXSTA<6>    9-bit Transmit Enable    TX9 = 0,
    TXSTA<5>    Transmit Enable          TXEN = 1;

  – to receive 8-bit data, set the bits of the Receive Status and Control register RCSTA as follows:

    RCSTA<6>    9-bit Receive Enable          RX9 = 0,
    RCSTA<4>    Continuous Receive Enable     CREN = 1.
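The selection of X from the baud rate formulae tabulated earlier, and the check of the resulting error, can be sketched in standard C on a host computer as follows. This is an illustrative sketch only: the function names are hypothetical, rounding X to the nearest integer is an assumption about how X is chosen, and the check that X lies within 0 to 255 is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Divider from the baud rate table: 64 when BRGH = 0, 16 when BRGH = 1. */
uint32_t brg_divider(int brgh)
{
    return brgh ? 16u : 64u;
}

/* Nearest integer X such that baud ~= Fosc / (divider * (X + 1)). */
uint32_t spbrg_for_baud(uint32_t fosc_hz, uint32_t baud, int brgh)
{
    uint32_t d;

    assert(baud > 0u);
    d = brg_divider(brgh) * baud;
    /* Round Fosc / d to the nearest integer n, then X = n - 1. */
    return (fosc_hz + d / 2u) / d - 1u;
}

/* Baud rate actually produced by a given integer X. */
double actual_baud(uint32_t fosc_hz, uint32_t x, int brgh)
{
    return (double)fosc_hz / ((double)brg_divider(brgh) * (double)(x + 1u));
}

/* Deviation from the desired baud rate, in percent. */
double baud_error_percent(double actual, double desired)
{
    return 100.0 * (actual - desired) / desired;
}
```

For the 16MHz, 9600 bps example, spbrg_for_baud returns 25 and the error evaluates to about 0.16%. For a clock of 11.0592MHz and 19200 bps with BRGH = 0, it returns 8 with no error, since 11.0592MHz / (64 · 9) is exactly 19200.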
To receive data only as in the soccer robot considered, we only need to connect up the RX pin (pin 10) as shown in Fig. A.1. Shown below is a program code fragment that configures the microcontroller for sending and continuously receiving data via polling at a baud rate of 19200 bps under a clock frequency of 11.0592MHz, using X = 8 with BRGH = 0.

/*************** Program Example ********************************/
// USART Config Bit Definitions
#define RX_INT_ON           0b11111111
#define RX_INT_OFF          0b01111111
#define TX_INT_ON           0b11111111
#define TX_INT_OFF          0b10111111
#define USART_ASYNCH_MODE   0b11011111
#define USART_SYNCH_MODE    0b11111111
#define USART_SYNC_MASTER   0b11111111
#define USART_SYNC_SLAVE    0b11101111
#define USART_TX_RX_MODE    0b11111111
#define USART_RX_ONLY_MODE  0b11110111
#define USART_HI_SPEED      0b11111111
#define USART_LO_SPEED      0b11111011
#define USART_NINE_BIT      0b11111111
#define USART_EIGHT_BIT     0b11111101
#define USART_CONT_RX       0b11111111
#define USART_SINGLE_RX     0b11111110
...
main()
{
    unsigned char config;       /* Declare the variable */

    config = RX_INT_OFF & TX_INT_OFF & USART_ASYNCH_MODE &
             USART_TX_RX_MODE & USART_LO_SPEED &
             USART_EIGHT_BIT & USART_CONT_RX;
                                /* Set config value to 0b00011001 */

    OpenUSART(config, 8);       // Clk : 11.0592MHz, Baud rate : 19200bps
    ...
}
/*************** Function **************************************/
void OpenUSART(bits config, unsigned char baud)
{
    /* A general function that sets according to the config bits */

    /* RCSTA : SPEN RX9 SREN CREN - FERR OERR RX9D */
    RCSTA = 0;                  // POR (Power on reset) values
    /* TXSTA : CSRC TX9 TXEN SYNC - BRGH TRMT TX9D */
    TXSTA = 0;

    PIE1.RCIE = 0;              // Initialize USART Receive Interrupt Enable
    PIE1.TXIE = 0;              // Initialize USART Transmit Interrupt Enable
    SPBRG = baud;               // Set Baud Rate
    baud = RCREG;               // POR (Clear RCIF INT)

    if (config.0)
        RCSTA.CREN = 1;         // CONTINUOUS RX
    else
        RCSTA.SREN = 1;         // SINGLE RX

    if (config.1) {             // NINTH BIT
        TXSTA.TX9 = 1;
        RCSTA.RX9 = 1;
    }

    if (config.2)
        TXSTA.BRGH = 1;         // HIGH SPEED

    if (config.3) {             // TX & RX
        TXSTA.TXEN = 1;
        TRISC.TX = 0;
    }
    TRISC.RX = 1;

    if (config.4)
        TXSTA.CSRC = 1;         // SYNC MASTER MODE

    if (config.5)
        TXSTA.SYNC = 1;         // SYNC MODE

    if (config.6)
        PIE1.TXIE = 1;          // TX INT

    if (config.7)
        PIE1.RCIE = 1;          // RX INT

    RCSTA.SPEN = 1;             // SERIAL PORT
}
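How the AND-ed masks produce the config value 0b00011001, and which bit OpenUSART tests for each option, can be verified with a small host-side sketch in standard C. The masks are copied from the definitions above but written in hexadecimal, since binary literals are a compiler extension in older C standards; the function names are hypothetical and serve only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Masks copied from the USART config bit definitions, in hexadecimal. */
enum {
    RX_INT_OFF        = 0x7F,   /* 0b01111111: clears bit 7 (RX INT)     */
    TX_INT_OFF        = 0xBF,   /* 0b10111111: clears bit 6 (TX INT)     */
    USART_ASYNCH_MODE = 0xDF,   /* 0b11011111: clears bit 5 (SYNC MODE)  */
    USART_TX_RX_MODE  = 0xFF,   /* 0b11111111: leaves bit 3 set (TX, RX) */
    USART_LO_SPEED    = 0xFB,   /* 0b11111011: clears bit 2 (HIGH SPEED) */
    USART_EIGHT_BIT   = 0xFD,   /* 0b11111101: clears bit 1 (NINTH BIT)  */
    USART_CONT_RX     = 0xFF    /* 0b11111111: leaves bit 0 set          */
};

/* The config byte built in main() of the program example. */
uint8_t usart_polling_config(void)
{
    return (uint8_t)(RX_INT_OFF & TX_INT_OFF & USART_ASYNCH_MODE &
                     USART_TX_RX_MODE & USART_LO_SPEED &
                     USART_EIGHT_BIT & USART_CONT_RX);
}

/* Value (0 or 1) of one bit of the config byte, as tested by config.N. */
int config_bit(uint8_t config, unsigned bit)
{
    assert(bit < 8u);
    return (int)((config >> bit) & 1u);
}
```

Here usart_polling_config() evaluates to 0x19 (0b00011001): bits 0 and 3 are set (continuous receive, transmit and receive), bit 4 remains set because no SYNC_SLAVE mask is applied, and bits 1, 2, 5, 6 and 7 are cleared.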
To fully understand the program code fragment, the reader is advised to consult the PIC16C7X manual.

In wireless communication, be it radio frequency (RF) or infra-red (IR), data is usually transported on a particular high-frequency carrier signal. This entails modulating or encoding the data signals for transmission at the transmitter, and demodulating or decoding the signals detected at the receiver. To differentiate team ownership of the data received during a robot soccer game, a simple solution is to have each team communicate at a different carrier frequency. We refer the reader to Section 2.3.5 for a presentation of the basic methods, input/output (I/O) hardware and protocols used for both IR and RF communication.

The experimental (command-based) robot soccer program (available on the website http://www.fira.net) uses IR for data communication. It is designed for the game set-up in which both teams use what is called the base band method, which does not require a carrier signal for data transmission; this method has been described in Section 2.3.5. Without a distinct carrier signal to carry and identify each team's data, team communication is feasible only if the two robot teams share the same IR communication module, so that the two teams do not transmit data to their robots concurrently; it then becomes necessary to check the robot team ID, which has to be formatted with the serial data sent. This means that a team needs to be assigned a team identification number (ID) upon power-on. The two logic levels 0 and 1 input at PORTB<7> can be used to uniquely assign two team IDs. Each team uses a different level, set by turning on (logic 0 set) or off (logic 1 set) the switch at the RB7 pin, connected as shown in Fig. A.2.
Fig. A.2. Switch for assigning Team ID (the RB7 pin of the PIC16C73/73A microcontroller is at logic 1 and is connected through the switch to GND)
To initialize the team ID upon power-on, PORTB is software programmed for internal weak pull-up on input PORTB<7> (pin 28), set by OPTION.RBPU = 0. Using internal weak pull-up eliminates the need for additional resistors. An example robot program code fragment follows that assigns the team ID based on the switch setting at PORTB<7>.

/*************** Program Example *******************************/
#define SW4     PORTB.7         // Define Switch SW4 to be PORTB<7>
#define TEAM_A  0x0f            // Define Team ID
#define TEAM_B  0xf0

TRISB = 0b11110000;             // Set Port B<7:4> as inputs,
                                // Set Port B<3:0> as outputs
OPTION.RBPU = 0;                // Set the internal pull-up

if(SW4 == 0)
    team = TEAM_A;              // SW4 is ON, hence at logic 0
                                // i.e., connected to ground.
else
    team = TEAM_B;              // SW4 is OFF, hence at logic 1
                                // i.e., not connected to ground.
...
...
Each of the three robots has a unique robot identification number (ID), 0, 1 or 2, assigned similarly using switch (binary) settings at the two inputs PORTB<5:4>. A full microcontroller program that incorporates all the program code fragments given above, for a team robot of the experimental system, is available on the website http://www.fira.net.
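The decoding of the team ID and robot ID from the switch inputs can be sketched as plain C functions that operate on a byte read from PORTB. This is a hedged illustration rather than the code of the full program: the function names are hypothetical, and it is an assumption that the two bits PORTB<5:4> are taken directly as the binary robot ID, since the exact switch-to-ID mapping is not given here. The function ids_match likewise assumes, hypothetically, that each received message carries a team ID and a robot ID to compare against.

```c
#include <assert.h>
#include <stdint.h>

#define TEAM_A 0x0f
#define TEAM_B 0xf0

/* RB7 reads 0 when the switch is ON (grounded) and 1 when it is OFF
 * (pulled up internally), exactly as in the program fragment. */
uint8_t team_id_from_portb(uint8_t portb)
{
    return (portb & 0x80u) ? TEAM_B : TEAM_A;
}

/* ASSUMPTION: PORTB<5:4> is read directly as the binary robot ID.
 * Only 0, 1 and 2 are valid robot IDs; the pattern 3 is rejected (-1). */
int robot_id_from_portb(uint8_t portb)
{
    int id = (portb >> 4) & 0x03;
    return (id <= 2) ? id : -1;
}

/* HYPOTHETICAL check: accept a message only if both IDs carried with
 * it match the IDs assigned to this robot at power-on. */
int ids_match(uint8_t my_team, int my_robot, uint8_t rx_team, int rx_robot)
{
    assert(my_robot >= 0 && my_robot <= 2);
    return my_team == rx_team && my_robot == rx_robot;
}
```

For instance, a PORTB reading of 0x10 decodes to TEAM_A and robot ID 1, while 0xA0 decodes to TEAM_B and robot ID 2.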
B. Reference Manual for an Experimental MiroSoT System
This appendix contains information for an experimental command-based MiroSoT system, organized into two parts. The first part details how the experimental system user interface can be used to set up and initialize the various parameters for vision processing during the game. The second part documents the library of routines implemented based mainly on the material in this book. Enquiries about the hardware system and FIRA-compliant micro-robots used for this experimental system can be made with Yujin Robotics (see footnote 1). The source code for the system (vision-based) user interface program and software library may be downloaded from the FIRA website http://www.fira.net. For the interested reader, it is hoped that the information provided herein will shorten the learning curve in the practical (software) design and development of a MiroSoT robot soccer system.
B.1 Vision System: Set-up and Initialization

Before set-up and initialization, Visual C++ needs to be installed.

B.1.1 Build Program Executable Code

1. Find the directory named Vision\Source\.
2. Copy this directory onto your hard disk. Go to the directory.
3. Double click on the Visual C++ project file named vision.dsw. This opens the vision program project. If Visual C++ starts properly, the window in Fig. B.1 will appear on the screen.
4. Click Rebuild All. This builds the executable file for the system.

Note that prior to this, the colour setting of the Windows display has to be set to 32 bits, and the resolution of the screen set to more than 1024 × 768.
B.1.2 Run the Program

Upon executing the program, the MiroSoT system user interface is displayed, as shown in Fig. B.2. On the interface are a number of buttons for invoking the vision support functions. To begin, do the following:

1. Click the button [Input Image Tuning] (see Fig. B.3).
Fig. B.3. Colour-setting buttons
2. Wait for a few seconds.
3. Click the button [Start Grab].

B.1.3 Set Camera Image

To set the camera image, do the following:

1. Select [Input Image Tuning] from the main menu. This pops up the image setting submenu.
2. Click the [Default] button (see Fig. B.4). This puts the camera settings at default values (128).
Fig. B.4. Default colour setting
3. Use the slide bar or type the values for Hue, Saturation, Brightness and Contrast (see Fig. B.5).
Fig. B.6 shows a default camera image. In order to acquire the optimum screen display from the camera, adjust the above values. The screen display is optimum if it shows the image in the colours perceived by the human eye. Adjust Hue, Saturation, Brightness and Contrast accordingly, since the fluorescent lights or halogen lamps in the surroundings are not likely to illuminate every corner of the playground evenly. The following are some general guidelines for normal situations.

a) Decrease Brightness to clearly display the patch colour of the ground. Fig. B.7 shows an image acquired by adjusting Brightness.
b) Adjust Hue and Saturation to obtain a clear screen display of the team ID colours (yellow and blue). Note that depending on the Hue value, yellow may become red or purple. Fig. B.8 shows an image acquired by adjusting Hue.

Footnote 1: Company website is at http://www.yujinrobot.com/soccer/soccer-e.htm.
Fig. B.7. Adjusting Brightness
Fig. B.8. Adjusting Hue
c) Adjust Saturation to acquire a spreading or clear edge of the colour patch. Note that it is better to acquire a clear image. Fig. B.9 shows an image acquired by adjusting Saturation.
Fig. B.9. Adjusting Saturation
d) Adjust Contrast and Brightness to acquire a clear display of the yellow patch. Note that in most cases, simply adjusting Brightness will provide a clear colour. It is recommended to adjust the screen to be a bit darker.

4. Click the [Update] button once a satisfactory image is acquired. This refreshes/updates the camera image.

Note that clicking the [Default] button returns the settings to default values. To set the colour values again as necessary, it is more efficient to click the [Default] button than to type 128 for each value.

After finishing with the colour settings, save the values. Without saving, all the settings will have default or meaningless values when the program restarts. In other words, to reuse the settings, make sure to save them.

B.1.4 Set Ball Colour

Following the image setting for the camera, set the ball colour as follows:

1. Click [Set Colour].
2. Click on the ball image in the setting window. A three-times enlarged view appears on the upper right-hand side of the screen, as shown in Fig. B.10.
Fig. B.10. Setting ball colour
3. Draw a small square inside the (enlarged image of the) ball. To draw, start by clicking the left mouse button at a point, drag the mouse, and end by clicking the left mouse button again. Make sure that the square covers as wide an area of the ball image as possible that has the same (uniform) colour. For example, if the edge of the ball is light orange, and the remaining part is dark orange, then the darker part should be selected in order to find the ball easily. When selecting the area, the part that is selected with the mouse will be displayed with the designated colour on the screen, and the remaining parts will be in black, as shown in Fig. B.11.
4. Select [Ball Colour] in [Colour Set] and adjust the YUV values (see Fig. B.11). This is done to optimize the ball colour. As a guide, decrease the minimum values and increase the maximum values gradually to acquire the optimum ball colour. In the optimum condition, only the ball colour is displayed against an otherwise blank screen, as in Fig. B.12. In Fig. B.12, the Min and Max entries for Y, U and V indicate the lowest and highest values, respectively. In general, RGB is normally
used to represent colour. When the colour is set with RGB, however, the same colour yields different values in a dark area and in a bright area. For example, red becomes light red in the bright area, while it becomes dark red in the dark area. Both may be perceived as red by the human eye, but the colour values that the computer calculates are not the same. Therefore, it is very important to designate whether it is light, dark or medium red. To calculate the exact colour, it is advisable to use YUV for the colour settings of the experimental soccer robot system. Y Min and Y Max refer to the minimum brightness and the maximum brightness, respectively: a pixel is accepted only if its brightness level is higher than Y Min and lower than Y Max. As it is hardly the case that the ground would be lit evenly at every corner, you should set Y Min low and Y Max high to acquire uniform colour. U Min, U Max, V Min and V Max express the minimum and the maximum values for the colour. These values need to be set with minimum difference between them so as to acquire the desired colour.
5. Click [OK]. Fig. B.12 shows the colour of the ball set in this manner.

B.1.5 Set Robot Team ID Colour

After selecting the ball colour, select the robot team ID colour as follows:

1. Click [Set Colour].
2. Click on a robot on the real-image screen as shown in Fig. B.13. This will display a three-times enlarged view of the robot on the upper right-hand side of the screen.
3. Select a square inside the robot team ID colour (yellow or blue) patch on the enlarged view with the left mouse button. This will pop up the [YUV Ranges of Colours] dialog box, as shown in Fig. B.14.
4. Select [Team Colour] in [Colour Select], and adjust the YUV values to set the team ID colour in the same way that the ball colour was set.
5. Gradually adjust Y Min lower and Y Max higher. If the green part of the screen remains the same without noise as the Y value is being changed, then change the U value and the V value, sequentially, to set the required colour. Because each colour has its own YUV range, simply decreasing Min and increasing Max will not lead to the optimum colour selection.
6. Click [OK] once a satisfactory result is obtained. Fig. B.14 shows the colour of the team ID set in this manner.
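The role of the YUV Min and Max settings can be illustrated with a short sketch in standard C that converts an RGB pixel to YUV and tests it against a colour range. The conversion coefficients are an assumption (the common full-range BT.601 form); the text does not state which conversion the vision program uses, and the type names, function names and range values below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t y, u, v; } Yuv;
typedef struct { uint8_t y_min, y_max, u_min, u_max, v_min, v_max; } YuvRange;

/* Round to the nearest integer and clamp into 0..255. */
uint8_t clamp_u8(double x)
{
    if (x < 0.0)   return 0;
    if (x > 255.0) return 255;
    return (uint8_t)(x + 0.5);
}

/* ASSUMPTION: full-range BT.601 RGB-to-YUV conversion. */
Yuv rgb_to_yuv(uint8_t r, uint8_t g, uint8_t b)
{
    Yuv p;
    p.y = clamp_u8( 0.299    * r + 0.587    * g + 0.114    * b);
    p.u = clamp_u8(-0.168736 * r - 0.331264 * g + 0.5      * b + 128.0);
    p.v = clamp_u8( 0.5      * r - 0.418688 * g - 0.081312 * b + 128.0);
    return p;
}

/* A pixel belongs to the colour if Y, U and V all lie within range. */
int yuv_in_range(Yuv p, const YuvRange *range)
{
    assert(range != 0);
    return p.y >= range->y_min && p.y <= range->y_max &&
           p.u >= range->u_min && p.u <= range->u_max &&
           p.v >= range->v_min && p.v <= range->v_max;
}
```

A wide Y range with narrow U and V ranges, for example {60, 240, 20, 70, 180, 230}, accepts an orange pixel such as RGB (255, 128, 0) over a broad span of brightness, while rejecting the green of the playground because its V value falls outside the range.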
B.1.6 Set Robot ID Colour

After setting the team ID colour, set the robot ID colours in the same manner that the colours of the ball and the robot team ID were set, as follows:

1. Select the colour for robot ID1 in [Colour Select]. Fig. B.15 shows the colour range when selecting a robot ID colour.
Fig. B.15. Setting robot ID colour
2. Click [OK] once a satisfactory result is obtained.

B.1.7 Set Playground Boundary

After setting the ball colour, the robot team ID colour and the robot ID colours, set the playground boundary as follows:

1. Select the [Set Boundary] button or the [Set Ground Area] button.
2. Set the boundary in the [Boundary of Field] dialog box (see Fig. B.16). Prior to setting, apply the colour patches to the robots, and then place the robots on the edge of the ground. Although the robots are placed inside, on screen they seem to be out of the boundary. Therefore, the boundary should be set a little wider. On changing the range of Top, Left, Bottom and Right, the blue square changes accordingly. Take special care to set the boundary size correctly. Note that the boundary does not include the goal area. It includes just the front
Fig. B.16. Setting boundary of playground
line of the goal area. If the area of the boundary is not accurate, the robot position would be miscalculated.

B.1.8 Set Robot Size

After setting the boundary, set the robot size as follows:

1. Click [Set Robot Size].
2. Click on a robot on the real-image screen as shown in Fig. B.17. This will display a three-times enlarged view of the robot on the upper right-hand side of the screen.
3. Select a square that contains the robot. This will pop up a dialog box with the robot size readings, as shown in Fig. B.17. Note that when the ground is displayed in full screen size, the robot size is about 29 × 29. This size may vary, depending on the colour patch and zoom status of the camera.
4. Click [OK] once a satisfactory result is obtained.

B.1.9 Set Pixel Size

Fig. B.18 shows the screen display for clicking the [Set Pixel Size] button. On the [Adjust UB & LB of Pixel] dialog box, the minimum and maximum pixel value for each colour (ball colour, team ID colour and robot ID colour) can be set. When selecting a robot or the ball, the upper left-hand
Fig. B.17. Setting robot size
side of the screen shows ‘number of pixel = XXX’ with the numeric figure XXX in green. This value indicates the pixel size to refer to when the pixel size is set. LB indicates the minimum value, and if the pixel size (for the ball or a robot colour) is smaller than that, the program cannot recognize the object. When the program fails to distinguish between the ball and a patch with a similar colour, adjust this value. Note however that in most cases, there is no need to adjust these values. (This function is needed only for advanced colour setting.)

B.1.10 Save Vision Settings

After finishing with the settings, click [Save Vision File] to save the settings. Fig. B.19 shows the result of saving.

B.1.11 Open Vision Settings File

To import the previous settings saved in a file, click the [Open Vision File] button, and the window as shown in Fig. B.20 would pop up.

B.1.12 Set Auto Colours

This is similar to colour setting with the [Set Colour] button, as follows:

1. Select [Auto Set Colour] from the menu.
2. To set the team ID colour, do the following:
a) Select the robot position on the window.
Fig. B.20. Importing settings with [Open Vision File]
Fig. B.21. Zooming in after clicking on Auto Set Colour button
b) Click the left mouse button on the team ID colour (yellow or blue) in the enlarged view on the right-hand side of the ground screen (see Fig. B.21).
c) Click the mouse on the desired colour in the same enlarged view. The selected part would be seen changing to green. On the same dialog box as in [Set colour], only the team ID colour is displayed on the dark screen.
Fig. B.22. Setting the team ID colour with auto colour setting
d) Check the team ID colour and then click [OK] (see Fig. B.22).
3. To set a robot ID colour, do the following:
a) Set the ID1 colour to green in the enlarged view on the right-hand side of the ground screen (see Fig. B.17).
b) Click the left mouse button on green. The team ID colour would be observed changing into green. In the same dialog box as in [Set colour], only the team ID colour is displayed on the dark screen.
c) Set the ID on the ground screen.
d) Click the [Ok] button once the robot ID colour has changed to green.

With the [Auto Colour Set] button, the program sets the robot team ID colour and robot ID colour automatically.

1. Select the [Enable] checkbox on [Auto Set Mode].
2. Click the [ID1 Robot] option button for Robot1.
3. Click the [Auto Set Mode] button. Upon this, the screen would become dark and Home1 (Robot1) would start moving. Every time the robot pauses, the enlarged view would be displayed. On this screen, the program automatically updates the colour.

To change the colour of Home2 (Robot2) into sky blue (say), do the following:

1. Click the [Auto Set Colour] button.
2. Click the left mouse button on the violet patch on the ground screen.
3. Click the ID2 (Robot2) on the enlarged screen. Fig. B.25 shows the result.
Fig. B.25. Robot colour 2
4. Select ID2 Colour (Robot2 colour).
5. Select Home2 or Robot2 and click [Auto Set colour]. Upon this, Robot2 would start moving and the program would update the colour.
6. Click [OK].

The colour for Home3 or Robot3 can be set in the same manner.

B.1.13 Change Colour

Fig. B.26 shows the menu for changing colours. To change colour, click on [Change Colour]; a dialog box would be displayed and the screen would turn black (see Fig. B.27).
Change Colour is used to check the colour setting or to change the colour value. The Change Colour function can help acquire a fine colour image with minor changes.
B.2 System Library

The library is organized into three categories, as follows:

1. the SENSE category, for vision system initialization and processing;
2. the communication category, for communication between the host computer and team robots; and
3. the DECIDE-and-ACT category, for game strategy programming.

B.2.1 The SENSE Category

Routines for Display Function

Refer to Fig. B.28 for a flow-chart of the function.
Fig. B.28. Display function. Flow-chart steps:
(i) Call a HOOK function when one frame of data is received from the camera: MyVisionFrame()
(ii) Copy camera image data into display memory: Cvisionview::DisplayDD(HBUF hBuf)
(iii) Display the boundary of the playground stored in the display memory: CvisionView::DisplayOverlay()
1. MyVisionFrame
Syntax : void MyVisionFrame (void *data);
Return Value : NULL
Description : When a message occurs indicating that the frame grabber memory storage is full, the system calls the hook function registered for that event, namely MyVisionFrame().
Example : digHookFunction (m_hDig, MV_HOOK_FRAME, MyVisionFrame, this);
See Also : MV_HOOK_FRAME, MV_HOOK_FIELD_EVEN, MV_HOOK_FIELD_ODD

2. DisplayDD
Syntax : void CvisionView::DisplayDD(HBUF hBuf);
Return Value : NULL
Description : This function displays the real-time image data stored in display memory; it first receives the image data from the camera via the frame grabber (in HBUF format) and stores it into the display memory.

3. DisplayOverlay
Syntax : void CvisionView::DisplayOverlay();
Return Value : NULL
Description : This function draws an outline of the playground on the display screen, using playground data stored in computer memory in order to prevent display flickering.
Fig. B.29. Set-colour function. Flow-chart steps:
(i) Select an image area: CvisionView::OnLButtonDown()
(ii) Display the image data of the selected area: CvisionView::Zoom()
(iii) Set up a desired range of colours in the zoom-in area: CvisionView::OnLButtonDown(), CvisionView::OnMouseMove()
(iv) Abstract YUV values over the colour range: Cvisionview::MinMaxOfRGB()
(v) Display image with updates through a timer: CSettingColor::OnTimer()
(vi) Set up the range of colours: CSettingColor::OnHScroll()
(vii) Update the values of selected colors: CSettingColor::On……Color()
(viii) Bring the YUV values selected on SetColorDlg: Cvisionview::SetColorUpdate()
(ix) Save the values for vision processing: CVisionView::UpdateLUTcolor()
Routines for Set-Colour Function

Refer to Fig. B.29 for a flow-chart of the function.

1. OnLButtonDown
Syntax : void CvisionView::OnLButtonDown (UINT nFlags, CPoint point)
Return Value : NULL
Description : This function implements a Window Event, and is used for setting up the position of each robot in SetColor and its size. It facilitates saving or setting up of a selected point.

2. Zoom
Syntax : void CvisionView:: Zoom(CPoint point);
Return Value : NULL
Description : This function saves the image data of a selected area into a buffer, and then reads from the buffer to display it 3-times enlarged on the screen. It facilitates the initialization of colours for the ball and robot ID patches.

3. OnMouseMove
Syntax : void CvisionView:: OnMouseMove(CPoint point);
Return Value : NULL
Description : This function draws a square box on the screen, as the mouse moves, to select an area. It continually updates the colours within the selected area.

4. MinMaxOfRGB
Syntax : void CvisionView:: MinMaxOfRGB ();
Return Value : NULL
Description : This function acquires the image data from a selected area, converts it to YUV format, and sets the YUV Min & Max values.
5. OnTimer
Syntax : void CSettingColor::OnTimer();
Return Value : NULL
Description : This function displays the data stored in the image buffer. It displays a currently selected range, and updates/refreshes the screen display once per 1,200 milliseconds.

6. OnHScroll
Syntax : void CSettingColor::OnHScroll (UINT nSBCode, UINT nPos, CScrollBar* pScrollBar);
Return Value : NULL
Description : This function updates the values of the YUV Min and Max thresholds when the corresponding scroll bar on the screen is moved.

7. UpdateLUTcolor
Syntax : CVisionView::UpdateLUTcolor();
Return Value : NULL
Description : This function updates the YUV threshold values in the look-up table as RGB values.
See Also : void UpdateLUTAllColor();
Routines for Vision Function

Refer to Fig. B.30 for a flow-chart of the function.

1. OnWholeFind
Syntax : CVisionView::OnWholeFind();
Return Value : NULL
Description : This function finds the coordinate positions of the ball and robots on the (whole area of the) playground.
Fig. B.30. Vision function. Flow-chart boxes (their order is not recoverable from the extracted figure):
• Search for a position on playground: CvisionView::OnWholeFind()
• Do labelling of a selected color in the playground: GlobalLabelingPartialImage()
• Set up size / position of the ball: wholeFindBall()
• Search the color of opponent robot on playground: SearchFindOpponentRobot()
• Search the opponent team’s designated colour in the whole area: SearchFindOpponentRobot()
• Save size filter/position of an opponent robot: FindOpponentRobot()
• Localize sections found on playground: DetermineOffsetXY
• Find the positions of the home robots & the ball: FindObject()
• Do labelling in a selected area: LabelingPartialImage()
• Calculate size filter/position of home robot: FindHomeRobot()
2. GlobalLabelingPartialImage
Syntax : CVisionView::GlobalLabelingPartialImage (long OffsetX, long OffsetY, long ImageSizeX, long ImageSizeY, short object);
Return Value : NULL
Description : This function labels all objects with designated colours on the playground, and computes the coordinate values of their centre positions. It then saves the number of labels and all centre position values. It is used when searching for the positions of objects such as BALL, HOME1, HOME2, HGOALIE, and OPPONENT in the whole area. Note that the accuracy of the results it produces is not high, because it has to inspect the whole playground.
Example :
LabelingWholeImage(BALL);
wholeFindBall(0, 0);
if(bFlagBallFound == TRUE) /* if position of ball is found */
{
    PositionOfBall[0]*=5;
    PositionOfBall[1]*=5;
}
See Also : LabelingPartialImage();

3. WholeFind....
Syntax : CVisionView::WholeFind.... ();
Return Value : NULL
Description : To be used after executing GlobalLabelingPartialImage, this function searches for the positions of the ball or robots over the whole playground area.
Example :
size = PixelSizeOfComponent[BALL_COLOR][whichcomponent];
if(size > maxsize && size > 5 && size < UpperBoundOfBallSize/5)
{
    maxsize = size;
    candidate = whichcomponent;
}
Note that size is the number of pixels gathered in a size-filtered cluster (component).
See Also : FindBall(), FindOpponentRobot().

4. SearchFindOpponentRobot
Syntax : CVisionView::SearchFindOpponentRobot ();
Return Value : NULL
Description : This function finds the positions of all opponent robots on the playground. It is to be used after checking for the presence of any opponent robots via detecting their designated colour in some anticipated area of the playground.
Example :
if(OpponentColorLabeling(Est_OPPColorPositionX[index_bigteam],
                         Est_OPPColorPositionY[index_bigteam], 40, 40)==TRUE)
{
    BOOL findFlag=FALSE;
    Size=0xffff;
    for(int RobotID=0;RobotID<3;RobotID++)
    {
        dx=Est_OPPColorPositionX[index_bigteam] - PositionOfOpponent[RobotID][0];
        dy=Est_OPPColorPositionY[index_bigteam] - PositionOfOpponent[RobotID][1];
        distance=(int)sqrt(dx*dx+dy*dy);
        /* Keep the nearest opponent robot that has not yet been found */
        if((bFlagOpponentFound[RobotID] == FALSE) && Size > distance)
        {
            findFlag=TRUE;
            RobotCnt=RobotID;
            Size=distance;
        }
    }
}

5. OpponentColorLabeling
Syntax : CVisionView::OpponentColorLabeling(long OffsetX, long OffsetY, long ImageSizeX, long ImageSizeY);
• OffsetX, OffsetY : size of pixel labelling
• ImageSizeX, ImageSizeY : size of screen displaying
Return Value : NULL
Description : This function takes a position estimated from the opponent robot positions in the whole area, and finds whether the opponent robots are in that position.

6. FindOpponentRobot
Syntax : CVisionView::FindOpponent(short whichRobot, short OffX, short OffY);
• (OffX, OffY) : starting point for labelling on the screen.
• WhichRobot : ID number of a specified opponent robot.
Return Value : NULL
Description : After size-filtering, this function assigns a number to the group of pixels that is larger than the preset filter size, and saves the position. See Also : FindBall(), FindOpponentRobot(). 7. FindObjects Syntax : CVisionView::FindObjects Return Value : NULL Description : This function calculates the color tones based on the locations of all objects on the playground, and obtains the exact position and heading angle of each object. 8. DetermineOffsetXY Syntax : CVisionView::DetermineOffsetXY(short rx, short ry, short size, long *offx, long *offy); • (rx, ry) : coordinate point of the area for labelling • size : size of the area for labelling • *offx : data of point X which was converted as a point in the position of the area in the field • *offy : data of point Y which was converted as a point in the position of the area in the field Return Value : NULL Description : This function keeps the area for labelling within the selected area of the playground. 9. LabelingPartialImage Syntax : CVisionView::LabelingPartialImage(long OffsetX, long OffsetY, long ImageSizeX, long ImageSizeY, short object); • NumberOfComponent[COLORID] : number of components • Component[COLORID][component][pixel][0/1] : component of label not size-filtered • PixelSizeOfComponent[COLORID][component] : size of component • CenterXOfComponent[COLORID][component], CenterYOfComponent[][]: center point of component • SumOfX[COLORID][component], SumOfY[][] : centre point of pixel
B.2 System Library
• (OffsetX, OffsetY) : initial coordinate point for labelling
• ImageSizeX : maximum length, along the X-axis, of the image to be searched
• ImageSizeY : maximum length, along the Y-axis, of the image to be searched
• BALL, HOME1, HOME2, HGOALIE, OPPONENT_C : objects
Return Value : NULL
Description : This function performs (component) labelling and saves information about the component labels and centre point for the selected object in the following arrays.
    // Component Identification Number
    NumberOfComponent[COLORID]
    // Component of label not size-filtered
    Component[COLORID][component][pixel][0/1]
    // Size of component
    PixelSizeOfComponent[COLORID][component]
    // Centre point of component
    CenterXOfComponent[COLORID][component]
    CenterYOfComponent[][]
    // Centre point of pixel
    SumOfX[COLORID][component]
    SumOfY[][]
10. FindHomeRobot
Syntax : CVisionView::FindHomeRobot(short whichRobot, short OffX, short OffY);
• whichRobot : HOME1, HOME2, HGOALIE
• OffX, OffY : positions of each point started labelling
• PositionOfHomeRobot[][] : position of the robot/(logical coordinate system)
• AngleOfHomeRobot[] : angle of the robot/(logical coordinate system)
Return Value : NULL
Description : By first locating both the team ID color and robot ID color, this function determines the position and heading angle of the specified robot.
Example :
    DetermineOffsetXY(PositionOfHomeRobot[HOME1][0], PositionOfHomeRobot[HOME1][1],
                      PARTIAL_IMAGE_SIZE, &OffsetX, &OffsetY);
    LabelingPartialImage(OffsetX, OffsetY, PARTIAL_IMAGE_SIZE, PARTIAL_IMAGE_SIZE, HOME1);
    FindHomeRobot(HOME1, (short)OffsetX, (short)OffsetY);
See Also : wholeFindHomeRobot.
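The description of FindHomeRobot says that the position and heading angle follow from the team ID colour and the robot ID colour. One plausible way to do this is sketched below. It is not the library's implementation: the types Point2D and Posture and the function postureFromPatches are hypothetical names, and the sketch assumes that the robot ID patch lies ahead of the team patch along the heading direction and that the midpoint of the two patch centres is the robot position.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types; they are not part of the MiroSoT library.
struct Point2D { double x; double y; };
struct Posture { double x; double y; double angleDeg; };

constexpr double kPi = 3.14159265358979323846;

// Estimate a robot posture from the centres of its two colour patches.
// Assumption: the ID patch is in front of the team patch, so the vector
// from the team patch centre to the ID patch centre gives the heading.
inline Posture postureFromPatches(Point2D teamCentre, Point2D idCentre) {
    const double dx = idCentre.x - teamCentre.x;
    const double dy = idCentre.y - teamCentre.y;
    assert(dx != 0.0 || dy != 0.0);  // coincident centres give no direction

    Posture p;
    p.x = 0.5 * (teamCentre.x + idCentre.x);        // midpoint as position
    p.y = 0.5 * (teamCentre.y + idCentre.y);
    p.angleDeg = std::atan2(dy, dx) * 180.0 / kPi;  // heading in degrees
    return p;
}
```

In such a scheme the two centres would come from the component arrays filled by LabelingPartialImage (for example CenterXOfComponent and CenterYOfComponent), and the result would be stored in PositionOfHomeRobot and AngleOfHomeRobot.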
B.2.2 The Communication Category
Refer to Fig. B.31 for a flow-chart of the function.
1. SetComPort
Syntax : CComm::SetComPort(int port, DWORD rate, BYTE bytesize, BYTE stop, BYTE parity);
• int port : port number
• DWORD rate : baud rate
• BYTE bytesize : capacity of data to be transmitted
• BYTE stop : set up Stop Bit
• BYTE parity : set up parity bit
Return Value : NULL
Description : This function sets up the communication port via object ComPort.
Example :
    m_pComm.SetComPort(1,19200,8,0,0);
    // Port no. 1, BAUD RATE 19200 bps, 8-bit data,
    // No Stop bit, No Parity bit
2. CreateCommInfo
Syntax : CComm::CreateCommInfo();
Return Value : NULL
Fig. B.31. Communication function (flow chart, in order: set up Comport, SetComPort(); initialize Comport, CreateCommInfo(); open Comport, OpenComPort(); send data to Comport, WriteCommBlock(); close connection with Comport, CloseConnection())
Description : This function implements the TransmitEvent that occurs in ComPort. In setting up ComPort, make sure that the 3 functions are called in order, as in the example.
Example :
    m_pComm.SetComPort(1,19200,8,0,0);
    m_pComm.CreateCommInfo();
    m_pComm.OpenComPort();
3. OpenComPort
Syntax : CComm::OpenComPort();
Return Value : NULL
Description : This function opens ComPort.
4. WriteCommBlock
Syntax : CComm::WriteCommBlock(LPSTR lpByte, DWORD dwBytesToWrite);
• LPSTR lpByte : address of data to transmit
• DWORD dwBytesToWrite : size of the whole array of data to transmit
Return Value : NULL
Description : This function transmits data to ComPort. Writing to ComPort works in the same way as saving data in a file.
Example :
    void Send_Command()
    {
        m_pComm.WriteCommBlock((LPSTR)command, 8);
        // The packet format is 0x00, Data1, Data2,
        // Data3, Data4, Data5, Data6, 0x00, so the size of the
        // whole array is 8.
    }
5. CloseConnection
Syntax : CComm::CloseConnection();
Return Value : NULL
Description : This function closes ComPort.
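The packet layout described in the WriteCommBlock example (a leading 0x00, six data bytes, a trailing 0x00) can be sketched as follows. This is only an illustration under stated assumptions: Packet, kPacketSize and buildPacket are hypothetical names, and the six data bytes are treated as opaque values because their meaning is not given in this section.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names; the manual only states the 8-byte layout
// 0x00, Data1, ..., Data6, 0x00.
constexpr std::size_t kPacketSize = 8;
using Packet = std::array<std::uint8_t, kPacketSize>;

// Frame six opaque data bytes with a leading and a trailing 0x00.
inline Packet buildPacket(const std::array<std::uint8_t, 6>& data) {
    Packet packet{};  // zero-initialised, so both framing bytes are 0x00
    for (std::size_t i = 0; i < data.size(); ++i) {
        packet[i + 1] = data[i];  // Data1..Data6 occupy bytes 1..6
    }
    assert(packet.front() == 0x00 && packet.back() == 0x00);
    return packet;
}
```

A buffer built this way could then be handed to WriteCommBlock with a byte count of 8, as in the example above; the cast to LPSTR would be needed exactly as it is for command.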
B.2.3 The DECIDE-and-ACT Category
1. Velocity
Syntax : void Velocity(int whichrobot, int vl, int vr);
Return Value : NULL
Description : This function converts and sets the required left-wheel and right-wheel PWM velocity data for a specified robot. See Section 6.4.1.
2. Position
Syntax : void Position(int whichrobot, double x, double y);
Return Value : NULL
Description : This function moves a specified robot to a specified position. See Section 6.4.3.
3. Angle
Syntax : void Angle(int whichrobot, int desired_angle);
Return Value : NULL
Description : This function rotates a specified robot to a specified heading angle. See Section 6.4.2.
4. Angle2
Syntax : void Angle2(int whichrobot, double x, double y);
Return Value :
Description : A variant of Angle, this function rotates a specified robot to face directly towards a desired point instead.
5. AngleOfPosition
Syntax : void AngleOfPosition(int whichrobot, double x, double y);
• int whichrobot : HOME1, HOME2, HGOALIE
• double x : X-coordinate of desired point
• double y : Y-coordinate of desired point
Return Value : NULL
Description : A variant of Angle, this function rotates the goalkeeping robot to a specified point (x,y).
6. DefencePosition
Syntax : bool DefencePosition(int whichrobot, double x, double y);
• int whichrobot : HOME1, HOME2, HGOALIE
• double x : X-coordinate value of specified target position (x,y)
• double y : Y-coordinate value of specified target position (x,y)
Return Value :
• TRUE, if the robot arrived at the selected position
• FALSE, if the robot did not arrive at the selected position
Description : A variant of Position, this function performs positioning for a specified robot and, in addition, returns a boolean value to indicate whether or not the robot arrived at the specified position.
7. GoaliePosition
Syntax : bool GoaliePosition(int whichrobot, double x, double y);
• int whichrobot : HOME1, HOME2, HGOALIE
• double x : X-coordinate of a specified target position
• double y : Y-coordinate of a specified target position
Return Value :
• TRUE, if the goalkeeping robot arrived at the specified position
• FALSE, if the goalkeeping robot did not arrive at the specified position
Description : Another variant of Position, this function performs positioning for the goalkeeping robot and, in addition, returns a boolean value to indicate whether or not it arrived at the specified position.
8. Goalie
Syntax : void Goalie(int whichrobot);
Return Value : NULL
Description : This function implements the strategy of goalkeeping presented in Section 6.5.2.
9. Attack
Syntax : void Attack(int whichrobot1, int whichrobot2);
Return Value : NULL
Description : This function implements a two-robot attack strategy via zone-defence. See Section 6.6.1.
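Two of the ideas in the entries above lend themselves to a short sketch: the heading needed to face a point (as Angle2 and AngleOfPosition require) and the arrival test reported by DefencePosition and GoaliePosition. The code below is a minimal illustration, not the library's code; headingToPoint, hasArrived and the tolerance parameter are hypothetical, and the arrival criterion (Euclidean distance within a tolerance) is an assumption.

```cpp
#include <cassert>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Heading, in degrees, that points from the robot at (rx, ry)
// towards the desired point (tx, ty).
inline double headingToPoint(double rx, double ry, double tx, double ty) {
    return std::atan2(ty - ry, tx - rx) * 180.0 / kPi;
}

// Assumed arrival criterion: the robot counts as arrived when it is
// within `tolerance` of the target. Squared distances avoid a sqrt call.
inline bool hasArrived(double rx, double ry, double tx, double ty,
                       double tolerance) {
    assert(tolerance >= 0.0);
    const double dx = tx - rx;
    const double dy = ty - ry;
    return dx * dx + dy * dy <= tolerance * tolerance;
}
```

Under these assumptions, a function such as Angle2 could pass the result of headingToPoint (rounded to an integer) to Angle, and a positioning function could return the value of hasArrived after issuing its motion command.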
10. AutoPosition
Syntax : void AutoPosition();
Return Value : NULL
Description : This function makes each robot move to a pre-programmed position for the FreeBall and KickOff modes. It is invoked via the system user interface.
11. Game (also called My_Strategy in Section 6.3)
Syntax : void Game(void);
Return Value : NULL
Description : This function 'links' all the strategies for the various game modes for selection via a flag, m_nGameMode. A game mode can be selected, and the corresponding strategy executed, via the system user interface. If the user selects a specific mode from the user interface, each robot will move based on a pre-programmed motion or role. A game mode can also be selected automatically by a user-programmed meta-strategy; this enables the strategies to be automatically selected according to the game situations considered by the meta-strategy. See Section 4.3 for some example strategies.
Example : The example code given below is for the Penalty Kick mode.
    case PENALTY_KICK:
        dx1 = PositionOfBall[0]-PositionOfHomeRobot[HOME1][0];
        dy1 = PositionOfBall[1]-PositionOfHomeRobot[HOME1][1];
        if(dx1*dx1+dy1*dy1 < 8.0*8.0)
        {
            // if the distance between the ball and
            // the robot is less than 8.
            AKick(HOME1);       // select Robot no. 1 to attack
            Goalie(HGOALIE);
        }
        else if(dx1*dx1+dy1*dy1 > 8.7*8.7 && dx1*dx1+dy1*dy1 < 10.0*10.0)
        {
            // if distance from the ball to a robot is between
            // 8.7 and 10.
            AKick(HOME1);       // select Robot no. 1 to attack
            Goalie(HGOALIE);
        }
        else
        {
            MODE = DEFENSE;     // if the ball goes to the opponent
                                // as the distance was far.
            InitRole();         // start attacking as usual
            Goalie(HGOALIE);
        }
12. AvoidBound
Syntax : void AvoidBound(int whichrobot);
Return Value : NULL
Description : This function gets a specified robot to come off the side-wall of the playground. See Section 6.5.3.
13. AKick
Syntax : void AKick(int whichrobot);
Return Value : NULL
Description : A kick function used for attacking, it implements the univector field navigation method (see Section 4.6 and Section 6.6.2) that gets a specified robot to attempt to run through the dynamically changing ball position.
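The distance test in the Penalty Kick example of Game can be isolated as a small pure function, which makes the thresholds easier to check. The sketch below is illustrative only: PenaltyAction and penaltyKickAction are hypothetical names, and the thresholds 8.0, 8.7 and 10.0 are copied from the example, including the band between 8.0 and 8.7 in which the example falls through to the defensive branch.

```cpp
#include <cassert>

// Hypothetical result type; the manual's example calls AKick() and
// Goalie() directly instead of returning a value.
enum class PenaltyAction { Attack, Defend };

// Decide the penalty-kick action from the ball and robot positions.
// Squared distances are compared so that no square root is needed.
inline PenaltyAction penaltyKickAction(double ballX, double ballY,
                                       double robotX, double robotY) {
    const double dx = ballX - robotX;
    const double dy = ballY - robotY;
    const double d2 = dx * dx + dy * dy;
    assert(d2 == d2);  // reject NaN coordinates

    const bool veryClose = d2 < 8.0 * 8.0;
    const bool inBand = d2 > 8.7 * 8.7 && d2 < 10.0 * 10.0;
    return (veryClose || inBand) ? PenaltyAction::Attack
                                 : PenaltyAction::Defend;
}
```

With a function of this kind, the case PENALTY_KICK branch would call AKick(HOME1) when the result is Attack and switch to the defensive mode otherwise, with Goalie(HGOALIE) called in both cases as in the example.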
1. StopRobot
Syntax : void StopRobot(void);
Return Value : NULL
Description : This function halts all robots.
2. WhichRobotStop
Syntax : void WhichRobotStop(int whichrobot);
Return Value : NULL
Description : This function halts a specified robot.
3. Send_Command
Syntax : void Send_Command(void);
Return Value :
Description : This function transmits data saved as a command to ComPort.
References
1. Y. Uny Cao, Alex S. Fukunaga, and Andrew Kahng, “Cooperative mobile robotics: Antecedents and directions,” Autonomous Robots, vol. 4, no. 1, pp. 7–27, March 1997. 2. Robin R. Murphy, An Introduction to AI Robotics, MIT Press, Cambridge, MA, USA, November 2000, A Bradford Book. Intelligent Robotics and Autonomous Agent series. 3. Ronald C. Arkin, Behaviour-Based Robotics, MIT Press, Cambridge, MA, USA, May 1998, Intelligent Robotics and Autonomous Agent series. 4. Sung-Wook Park, Jung-Han Kim, Eun-Hee Kim, and Jun-Ho Oh, “Development of a multi-agent system for robot soccer game,” in Proceedings of the IEEE International Conference on Robotics and Automation, April 1997, pp. 626–631 (vol. 1). 5. Arvin Agah and Kazuo Tanie, “Robots playing to win: Evolutionary soccer strategies,” in Proceedings of the IEEE International Conference on Robotics and Automation, April 1997, pp. 632–637 (vol. 1). 6. Jong-Hwan Kim, Hyun-Sik Shim, Myung-Jin Jung, Heung-Soo Kim, In-Hwan Choi, and J.O. Kim, “A cooperative multi-agent system and its real time application to robot soccer,” in Proceedings of the IEEE International Conference on Robotics and Automation, April 1997, pp. 638–643 (vol. 1). 7. Hyung-Suck Cho, Kyung-Hoon Kim, Kuk-Won Ko, Joo-Gon Kim, and Su-Ho Lee, “The development of a micro robot system for robot soccer game,” in Proceedings of the IEEE International Conference on Robotics and Automation, April 1997, pp. 644–649 (vol. 1). 8. Jong-Hwan Kim, Ed., Robotics and Autonomous Systems (Special Issue on First MIROSOT’96), Vol. 21, No. 2, Elsevier Science Publishers, Amsterdam, The Netherlands, September 1997. 9. Jong-Hwan Kim, Ed., Intelligent Automation and Soft Computing (Special Issue on Soccer Robotics : MIROSOT’97), Vol. 6, No. 1, AUTOSOFT Press, Albuquerque, New Mexico, 2000. 10. 
Jong-Hwan Kim and Prahlad Vadakkepat, “Multi-agent systems: A survey from the robot-soccer perspective,” Intelligent Automation and Soft Computing (Special Issue on Soccer Robotics : MIROSOT’97), vol. 6, no. 1, pp. 3–18, 2000. 11. Jong-Hwan Kim, Robot Soccer System, Taeyoung-sa Publisher, Taejon, Korea, 2000, In Korean. 12. Jong-Hwan Kim et al, Robot Soccer Engineering, KAIST Press, Taejon, Korea, 2002, In Korean. 13. Jeffrey Johnson, “Robot football : New frontiers in control and complexity theory,” in Proceedings of The 12th International Conference on Systems Engineering, Conventry, U.K., September 1997. 14. Myke Predko, Programming and Customizing PICmicro Microcontrollers, McGraw Hill, New York. Second Edition, 2000.
322
References
15. Myke Predko, PICmicro Microcontroller Pocket Reference, McGraw Hill, New York, 2002. 16. A. K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, Inc., Englewood Cliffs, New Jersey, 1989. 17. Michael K. Sahota, Alan K. Mackworth, Rod A. Barman, and Stewart J. Kingdon, “Real-time control of soccer-playing robots using off-board vision : The Dynamite Testbed,” in Proceedings of IEEE International Conference on Systems, Man and Cybernetics, October 1995, pp. 3690–3693. 18. Randy Sargent, Billy Bailey, Carl Witty, and Anne Wright, “Dynamic object capture using fast vision tracking,” AI Magazine, vol. 18, no. 1, pp. 65–72, Spring 1997. 19. Karl E. Nelson, Jeffrey W. Collins, Michael A. Soderstrand, and Tien C. Hsia, “Vision systems and software design for soccer playing mobile robots,” Intelligent Automation and Soft Computing (Special Issue on Soccer Robotics : MIROSOT’97), vol. 6, no. 1, pp. 19–32, 2000. 20. Randy Sargent, Billy Bailey, Carl Witty, and Anne Wright, “The importance of fast vision in winning the first micro-robot world cup soccer tournament,” Robotics and Autonomous Systems (Special Issue on First Micro-Robot World Cup Soccer Tournament, MIROSOT’96), vol. 21, no. 2, pp. 139–147, September 1997. 21. Ramesh Jain, Rangachar Kasturi, and Brian G. Schunck, Machine Vision, McGraw-Hill Inc., 1995, McGraw-Hill Series in Computer Science. 22. W. K. Pratt, Digital Image Processing, Wiley, New York, Second Edition, 1991. 23. J. D. Foley, A. van Dam, S. K. Feiner, and J. F. Hughes, Computer Graphics: Principles and Practice, Addison Wesley, Second Edition, 1990. 24. D. F. Rogers and J. A. Adams, Mathematical Elements for Computer Graphics, McGraw Hill, New York, 1976. 25. R. O. Duda and P. Hart, Pattern Recognition and Scene Analysis, Wiley, New York, 1973. 26. E. Rich, Artificial Intelligence, McGraw Hill, New York, 1983. 27. P. Winston, Artificial Intelligence, Addison Wesley, Reading, Massachusetts, Third Edition, 1992. 28. Stuart J. 
Russell and Peter Norvig, Artificial Intelligence : A Modern Approach, Prentice Hall, Inc., Upper Saddle River, New Jersey, 1995. ◦
om and Tore H¨ agglund, PID Controllers : Theory, Design and 29. Karl Johan Astr¨ Tuning, Instrument Society of America. Second Edition, 1995. 30. Alf Isaksson and Tore H¨ agglund, Eds., IEE Proceedings on Control Theory and Applications (Special Issue on PID Control), Vol. 149, No. 1, The IEE Press, U.K, January 2002. 31. J. Gulder and V. I. Utkin, “Stabilization of non-holonomic mobile robots using lyapunov functions for navigation and sliding mode control,” in Proceedings of the IEEE International Conference on Decision and Control, 1994, pp. 2967– 2972. 32. J. Borenstein and Y. Koren, “Real-time obstacle avoidance for fast mobile robots,” IEEE Transactions on Systems, Man and Cybernetics, vol. 19, no. 5, pp. 1179 –1187, 1989. 33. J. Borenstein and Y. Koren, “Potential field methods and their inherent limitations for mobile robot navigation,” in Proceedings of the IEEE International Conference on Robotics and Automation, April 1991, pp. 1398–1404 (vol. 2). 34. J. Borenstein and Y. Koren, “The vector field histogram - fast obstacle avoidance for mobile robots,” IEEE Transactions on Robotics and Automation, vol. 7, no. 3, pp. 278 –288, 1991.
References
323
35. Hassan K. Khalil, Nonlinear Systems, chapter 7, pp. 289–312, Prentice Hall, Inc., Englewood Cliffs, New Jersey. 2nd Edition, 1996. 36. Gene F. Franklin, J. David Powell, and Michael L. Workman, Digital Control of Dynamic Systems, Addison Wesley Longman, Inc., Reading, Massachusetts, Third Edition, 1998. 37. Phillip John McKerrow, Introduction to Robotics, Addison-Wesley, New York, 1991. 38. Yong-Jae Kim, Jong-Hwan Kim, and Dong-Soo Kwon, “Evolutionary programming-based uni-vector field navigation method for fast mobile robots,” IEEE Transactions on Systems Man, and Cybernetics, Part B, vol. 31, no. 3, pp. 450–458, 2001. 39. Dong-Han Kim and Jong-Hwan Kim, “A real-time limit-cycle navigation method for fast mobile robots and its application to robot soccer,” Robotics and Autonomous Systems, 2002, In print. 40. Moon-Su Lee, Myung-Jin Jung, and Jong-Hwan Kim, “Evolutionary programming-based fuzzy logic path planner and follower for mobile robots,” in Proceedings of The Congress on Evolutionary Computation, San Diego, USA, July 2000, pp. 139–144 (Vol. 1). 41. Marc G. Slack, “Navigation templates: Mediating qualitative guidance and quantitative control in mobile robots,” IEEE Transactions on Systems Man, and Cybernetics, Part B, vol. 23, no. 2, pp. 452–466, 1993. 42. Erann Gat, “Navigation templates: Enhancements, extensions and experiments,” in Proceedings of the IEEE International Conference on Robotics and Automation, Atlanta, GA, USA, May 1993, pp. 541–547 (Vol. 1). 43. E. Rimon, “Exact robot navigation using artificial potential functions,” IEEE Transactions on Robotics and Automation, vol. 8, no. 5, pp. 501 – 518, 1992. 44. Dong-Han Kim, Yong-Jae Kim, Kwang-Choon Kim, Jong-Hwan Kim, and Prahlad Vadakkepat, “Vector field based path planning and petri net based role selection mechanism with Q-Learning for the soccer robot system,” Intelligent Automation and Soft Computing (Special Issue on Soccer Robotics : MIROSOT’97), vol. 6, no. 1, pp. 75–87, 2000. 45. 
Leslie Pack Kaelbling, Mitchael L. Littman, and Andrew W. Moore, “Reinforcement learning : A survey,” Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996. 46. Laurene V. Fausett, Fundamentals of Neural Networks : Architectures, Algorithms and Applications, Prentice Hall, Inc., Englewood Cliffs, New Jersey., 1994. 47. Myung-Jin Jung Heung-Soo Kim, Hyun-Sik Shim and Jong-Hwan Kim, “Action selection mechanism for soccer robot,” in Proceedings of the IEEE international Symposium on Computational Intelligence in Robotics and Automation, Monterey, USA, July 1997, pp. 390 – 395. 48. Ernst Mayr, Toward a New Philosophy of Biology: Observations of an Evolutionist, Belknap Press, Cambridge, MA, USA, 1988. 49. Thomas B¨ ack and Hans Paul Schwefel, “An overview of evolution algorithms for parameter optimization,” Evolutionary Computation, vol. 1, no. 1, pp. 1–23, 1993. 50. Jong-Hwan Kim and H. Myung, “Evolutionary programming techniques for constrained optimization problems,” IEEE Transactions on Evolutionary Computation, vol. 1, no. 2, pp. 129–140, July 1997. 51. James Lyle Peterson, Petri Net Theory and the Modelling of Systems, Prentice Hall, Inc., Englewood Cliffs, New Jersey, 1981. 52. Richard S. Sutton and Andrew G. Barto, Reinforcement Learning : An Introduction, MIT Press, Cambridge, Massachusetts, 1998.
324
References
53. Christopher J. C. H. Watkins and Peter Dayan, “Q-learning,” Machine Learning, vol. 8, no. 3/4, pp. 279–292, 1992. 54. Kui-Hong Park, Yong-Jae Kim, and Jong-Hwan Kim, “Modular Q-learning based multi-agent cooperation for robot soccer,” Robotics and Autonomous Systems, vol. 35, no. 2, pp. 109–122, 2001. 55. Lawrence J. Fogel, A. J. Owens, and M. J. Walsh, Artificial Intelligence Through Simulated Evolution, John Wiley and Sons, 1966. 56. Zbigniew Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Springer Verlag, Berlin, Heidelberg. Second, Extended Edition, 1994. 57. Xin Yao, Ed., Evolutionary Computation: Theory and Applications, World Scientific, Singapore, November 1999. 58. Hung T. Nguyen and Elbert A. Walker, A First Course in Fuzzy Logic, Chapman & Hall, CRC Press, Boca Raton, Florida. 2nd Edition, 1999. 59. Kevin M. Passino and Stephen Yurkovich, Fuzzy Control, Addison Wesley Longman, Menlo Park, California, 1998. 60. N. Pfluger, J. Yen, and R. Langari, “A defuzzification strategy for a fuzzy logic controller employing prohibitive information in command formulation,” in Proceedings of the IEEE International Conference on Fuzzy Systems, March 1992, pp. 717–723. 61. T. A. Runkler and M. Glesner, “Defuzzification and ranking in the context of membership value semantics,” in European Congress on Fuzzy and Intelligent Technologies, September 1994. 62. R. R. Yager and D. P. Filev, “Slide: A simple adaptive defuzzification method,” IEEE Transactions on Fuzzy Systems, vol. 1, no. 1, pp. 69–78, February 1993. 63. Yong-Jae Kim, Univector Field Navigation, Ph.D. thesis, Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology, Korea, February 2003. 64. H. M. Deitel and P. J. Deitel, C: How to Program, Prentice Hall, Upper Saddle River, New Jersey, USA. Third Edition, 2001. 65. 
Bing Rong Hong, Quan Sheng Gao, and Hai Tao Chu, “Robot soccer simulation: A competition platform based on multi-agents,” International Journal of Harbin Institute of Technology (HIT), vol. 8, no. 3, pp. 203–206, September 2001. 66. H. M. Deitel and P. J. Deitel, C++ : How to Program, Prentice Hall, Upper Saddle River, New Jersey, USA. Fourth Edition, 2003.