Springer Tracts in Advanced Robotics Volume 41 Editors: Bruno Siciliano · Oussama Khatib · Frans Groen
Michael John Milford
Robot Navigation from Nature Simultaneous Localisation, Mapping, and Path Planning Based on Hippocampal Models
Professor Bruno Siciliano, Dipartimento di Informatica e Sistemistica, Università di Napoli Federico II, Via Claudio 21, 80125 Napoli, Italy, E-mail: [email protected]
Professor Oussama Khatib, Robotics Laboratory, Department of Computer Science, Stanford University, Stanford, CA 94305-9010, USA, E-mail: [email protected]
Professor Frans Groen, Department of Computer Science, Universiteit van Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands, E-mail: [email protected]
Author: Dr. Michael John Milford, The University of Queensland, School of Information Technology and Electrical Engineering (ITEE), Axon Building, St Lucia, Queensland 4072, Australia, E-mail: [email protected]
ISBN 978-3-540-77519-5
e-ISBN 978-3-540-77520-1
Springer Tracts in Advanced Robotics
ISSN 1610-7438
Library of Congress Control Number: 2007942163
© 2008 Springer-Verlag Berlin Heidelberg
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable for prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting by the authors and Scientific Publishing Services Pvt. Ltd.

Printed on acid-free paper.

springer.com
Editorial Advisory Board
Herman Bruyninckx, KU Leuven, Belgium
Raja Chatila, LAAS, France
Henrik Christensen, Georgia Institute of Technology, USA
Peter Corke, CSIRO, Australia
Paolo Dario, Scuola Superiore Sant’Anna Pisa, Italy
Rüdiger Dillmann, Universität Karlsruhe, Germany
Ken Goldberg, UC Berkeley, USA
John Hollerbach, University of Utah, USA
Makoto Kaneko, Osaka University, Japan
Lydia Kavraki, Rice University, USA
Sukhan Lee, Sungkyunkwan University, Korea
Tim Salcudean, University of British Columbia, Canada
Sebastian Thrun, Stanford University, USA
Yangsheng Xu, Chinese University of Hong Kong, PRC
Shin’ichi Yuta, Tsukuba University, Japan
STAR (Springer Tracts in Advanced Robotics) has been promoted under the auspices of EURON (European Robotics Research Network).
Foreword
At the dawn of the new millennium, robotics is undergoing a major transformation in scope and dimension. From a largely dominant industrial focus, robotics is rapidly expanding into the challenges of unstructured environments. Interacting with, assisting, serving, and exploring with humans, the emerging robots will increasingly touch people and their lives.

The goal of the new series of Springer Tracts in Advanced Robotics (STAR) is to bring, in a timely fashion, the latest advances and developments in robotics on the basis of their significance and quality. It is our hope that the wider dissemination of research developments will stimulate more exchanges and collaborations among the research community and contribute to further advancement of this rapidly growing field.

The monograph written by Michael Milford is yet another volume in the series devoted to one of the hottest research topics of the last few years, namely Simultaneous Localization and Map Building (SLAM). The contents expand the author’s doctoral dissertation and describe the development of a robot mapping and navigation system inspired by models of the neural mechanisms underlying spatial navigation in the rodent hippocampus. One unique merit of the book lies in its truly interdisciplinary flavour, addressing a link between biology and artificial robotic systems that has been open for many years. To the best of my knowledge, this is the most thorough attempt to marry biological navigation in rodents and similar animals with robotic navigation and SLAM.

A very fine addition to our STAR series!
Naples, Italy September 2007
Bruno Siciliano STAR Editor
Preface
This book describes the development of a robot mapping and navigation system inspired by models of the neural mechanisms underlying spatial navigation in the rodent hippocampus. Computational models of animal navigation systems have traditionally had limited performance when implemented on robots. The aim of the work was to determine the extent to which hippocampal models can be used to provide a robot with functional mapping and navigation capabilities in real world environments. The focus of the research was on achieving practical robot performance, rather than maintaining biological plausibility.

The mapping and navigation problem is a broad one, and to completely solve it a robot must possess several key abilities. These include the ability to explore an unknown environment, perform Simultaneous Localisation And Mapping (SLAM), plan and execute routes to goals, and adapt to environment changes. The most successful conventional solutions are based on core probabilistic algorithms and use a variety of map representations ranging from metric to topological. These probabilistic methods have good SLAM performance, but few have been successfully integrated into a complete solution to the entire mapping and navigation problem.

In contrast, biological systems are generally poorly understood, and computational models emulating them have had very limited application to robot platforms in challenging environments. However, many animals solve the entire mapping and navigation problem without the precise sensors and high resolution maps that are typical of robotic methods. Models of the mapping and navigation process in relatively well understood animals such as rodents offer much potential for practical improvement.

The book describes a series of studies which implemented computational models inspired by the mapping and navigation processes in the rodent hippocampus. The initial study was based on conventional state-of-the-art models of place and head direction cells.
However, as the research moved further towards the creation of a functional robot navigation system, these biological models were extended and adapted to provide better robot navigation performance. The aim was to create an integrated solution to not just the mapping problem, but also the related challenges of exploration, goal navigation, and adaptation. The end result of this process was a system now known as the RatSLAM system, and an algorithm known as experience mapping.

At each stage of the research the system was tested experimentally, with the environments becoming larger and more challenging as its capability improved. The combined system displayed successful, although sometimes suboptimal, performance in a range of real robot tests lasting up to two hours. In the final experiments a RatSLAM-equipped robot, acting in a completely autonomous manner, explored an unknown environment while performing SLAM, then navigated to goal locations while adapting to simple lasting environment changes.

Robot Navigation from Nature is intended to demonstrate the validity of an interdisciplinary approach combining biological navigation in rodents and other animals with robot navigation and SLAM. The majority of the work was performed during my doctoral studies, with this book being an adapted version of my PhD thesis. Through writing and publishing this book, I hope to spur further research in this direction. I believe it may offer a path to robots in the future that move around their environments in a truly intelligent manner.

As with any research, I am indebted to the many researchers before me, on whose work I have based much of the work presented in this book. The references throughout the book indicate the main groups and people from whom I have drawn inspiration. I also wish to acknowledge the funding from the Australian Research Council and the Australian government, without which I would never have been able to perform this research; my PhD supervisor Dr. Gordon Wyeth, who was the driving force behind the establishment of this research project; and my colleague David Prasser, who worked with me on the project and created the vision systems.

With regard to the development and production of this book, I thank Dr. Thomas Ditzinger, Engineering Editor at Springer Verlag, for helping me through the book production process, as well as Dr. Peter Corke from the CSIRO, Prof. Siciliano from the University of Naples, and the anonymous reviewer, who provided much useful feedback.
Queensland Brain Institute & School of ITEE The University of Queensland August 2007
Michael John Milford
Table of Contents
List of Figures ................................................ XV
List of Abbreviations .......................................... XIX

1. Introduction ................................................ 1
   1.1 Mobile Robots ........................................... 1
   1.2 Simultaneous Localisation and Mapping ................... 3
   1.3 Exploration, Goal Navigation and Adapting to Change ..... 5
   1.4 Practical Performance from a Biological Model ........... 6
   1.5 Book Outline ............................................ 6
2. Mapping and Navigation ...................................... 9
   2.1 The Mapping and Navigation Problem ...................... 10
      2.1.1 Localisation and Mapping ........................... 10
      2.1.2 SLAM: The Chicken and the Egg Problem .............. 11
      2.1.3 Dealing with Uncertainty ........................... 11
      2.1.4 Exploring Unknown Environments ..................... 12
      2.1.5 Navigating to Goals ................................ 12
      2.1.6 Learning and Coping with Change .................... 13
3. Robotic Mapping Methods ..................................... 15
   3.1 Probabilistic Mapping Algorithms ........................ 15
      3.1.1 Kalman Filter Methods .............................. 15
      3.1.2 Expectation Maximisation Methods ................... 17
      3.1.3 Particle Filter Methods ............................ 18
   3.2 Topological Mapping Methods ............................. 21
   3.3 Exploration, Navigation, and Dealing with Change ........ 22
      3.3.1 Exploration ........................................ 23
      3.3.2 Navigating to Goals ................................ 25
      3.3.3 Dealing with Dynamic Environments .................. 26
   3.4 Discussion .............................................. 28
4. Biological Navigation Systems ............................... 29
   4.1 Rodents and the Cognitive Map ........................... 29
      4.1.1 Head Direction and Place Cells ..................... 31
      4.1.2 Exploration, Navigation, and Dealing with Change ... 33
   4.2 Other Animals and Insects ............................... 34
      4.2.1 Bees ............................................... 34
      4.2.2 Ants ............................................... 36
      4.2.3 Primates ........................................... 36
      4.2.4 Humans ............................................. 38
   4.3 Discussion .............................................. 39
5. Emulating Nature: Models of Hippocampus ..................... 41
   5.1 Head Direction and Place Cells – State of the Art ....... 41
      5.1.1 Attractor Networks ................................. 41
      5.1.2 Path Integration ................................... 42
      5.1.3 Head Direction Correction Using Allothetic Information ... 44
      5.1.4 Place Cells – State of the Art ..................... 45
      5.1.5 Place Cells Through Allothetic Cues ................ 45
      5.1.6 Place Cells Through Ideothetic Information ......... 47
      5.1.7 Navigation ......................................... 51
   5.2 Discussion .............................................. 53
6. Robotic or Bio-inspired: A Comparison ....................... 55
   6.1 Robustness Versus Accuracy .............................. 55
   6.2 Map Friendliness Versus Map Usability ................... 56
   6.3 Sensory Differences ..................................... 57
   6.4 Capability in Real World Environments ................... 58
   6.5 One Solution ............................................ 59
7. Pilot Study of a Hippocampal Model .......................... 61
   7.1 Robot and Environment ................................... 61
   7.2 Complete Model Structure ................................ 63
   7.3 A Model of Spatial Orientation .......................... 64
      7.3.1 Representing Orientation ........................... 64
      7.3.2 Learning Allothetic Cues ........................... 66
      7.3.3 Re-localisation Using Allothetic Cues .............. 67
      7.3.4 Internal Dynamics .................................. 68
      7.3.5 Path Integration Using Ideothetic Information ...... 69
   7.4 Model Performance ....................................... 70
      7.4.1 Experiment 1: Path Integration Calibration ......... 70
      7.4.2 Experiment 2: Localisation and Mapping in 1D ....... 71
   7.5 A Model of Spatial Location ............................. 74
      7.5.1 Representing Location .............................. 74
      7.5.2 Learning Allothetic Cues ........................... 75
      7.5.3 Re-localisation Using Allothetic Cues .............. 75
      7.5.4 Internal Dynamics .................................. 76
      7.5.5 Path Integration Using Ideothetic Information ...... 78
   7.6 Model Performance ....................................... 79
      7.6.1 Experiment 3: Localisation and Mapping in 2D ....... 79
   7.7 Discussion and Summary .................................. 81
      7.7.1 Comparison to Biological Systems ................... 83
      7.7.2 Comparison to Other Models ......................... 84
      7.7.3 Conclusion ......................................... 86
8. RatSLAM: An Extended Hippocampal Model ...................... 87
   8.1 A Model of Spatial Pose ................................. 87
      8.1.1 Complete Model Structure ........................... 87
      8.1.2 Biological Evidence for Pose Cells ................. 87
      8.1.3 Representing Pose .................................. 89
      8.1.4 Internal Dynamics .................................. 90
      8.1.5 Learning Visual Scenes ............................. 91
      8.1.6 Re-localising Using Familiar Visual Scenes ......... 92
      8.1.7 Intuitive Path Integration ......................... 93
   8.2 Generation of Local View ................................ 94
      8.2.1 Sum of Absolute Differences Module ................. 94
      8.2.2 Image Histograms ................................... 96
   8.3 Visualising SLAM in a Hippocampal Model ................. 99
   8.4 SLAM in Indoor and Outdoor Environments ................. 101
      8.4.1 Experiment 4: SLAM with Artificial Landmarks ....... 101
      8.4.2 Experiment 5: SLAM in a Loop Environment ........... 104
      8.4.3 Experiment 6: SLAM in an Office Building ........... 107
      8.4.4 Experiment 7: SLAM in Outdoor Environments ......... 112
      8.4.5 Path Integration only Performance .................. 113
      8.4.6 SLAM Results ....................................... 113
   8.5 Summary and Discussion .................................. 115
      8.5.1 RatSLAM Requirements ............................... 115
      8.5.2 The Nature of RatSLAM Representations .............. 115
9. Goal Memory: A Pilot Study .................................. 117
   9.1 Enabling Goal Recall Using RatSLAM ...................... 117
   9.2 Learning ................................................ 118
   9.3 Recall .................................................. 119
      9.3.1 Experiment 8: Small Environment Goal Recall ........ 120
      9.3.2 Goal Recall Results ................................ 121
      9.3.3 Experiment 9: Large Environment Goal Recall ........ 124
      9.3.4 Goal Recall Results ................................ 125
   9.4 Summary and Discussion .................................. 126
      9.4.1 Creating Maps Suited to Goal Recall ................ 127
10. Extending RatSLAM: The Experience Mapping Algorithm ........ 129
   10.1 A Map Made of Experiences .............................. 129
   10.2 Linking Experiences: Spatially, Temporally, Behaviourally ... 131
   10.3 Map Correction ......................................... 133
   10.4 Map Adaptation and Long Term Maintenance ............... 134
   10.5 Indoor Experience Mapping Results ...................... 136
      10.5.1 Experiment 10: Large Pose Cell Representation ..... 137
      10.5.2 Experiment 11: Small Pose Cell Representation ..... 139
   10.6 Experiment 12: Outdoor Experience Mapping .............. 140
   10.7 Summary and Discussion ................................. 142
11. Exploration, Goal Recall, and Adapting to Change ........... 145
   11.1 Exploring Efficiently .................................. 145
      11.1.1 Experimental Evaluation ........................... 147
      11.1.2 Discussion ........................................ 147
   11.2 Recalling Routes Using a Temporal Map .................. 149
      11.2.1 Temporal Map Creation ............................. 149
      11.2.2 Route Planning .................................... 150
      11.2.3 Behaviour Arbitration ............................. 151
      11.2.4 Route Loss Recovery ............................... 153
   11.3 SLAM and Navigation in a Static Environment ............ 153
      11.3.1 Experiment 13: Goal Recall with Minor Pose Collisions ... 154
      11.3.2 Experiment 14: Goal Recall with Major Pose Collisions ... 156
      11.3.3 Discussion ........................................ 156
   11.4 Adapting to Environment Change ......................... 158
      11.4.1 Experiment 15: Indoor Map Adaptation .............. 158
      11.4.2 Results ........................................... 159
   11.5 Discussion ............................................. 160
      11.5.1 Conclusion ........................................ 161
12. Discussion ................................................. 163
   12.1 Book Summary ........................................... 163
      12.1.1 Mapping and Navigation ............................ 163
      12.1.2 Pilot Study of a Hippocampal Model ................ 164
      12.1.3 RatSLAM: An Extended Hippocampal Model ............ 164
      12.1.4 Goal Memory: A Pilot Study ........................ 165
      12.1.5 Extending RatSLAM: Experience Mapping ............. 165
      12.1.6 Exploring, Goal Recall, and Adapting to Change .... 165
   12.2 Contributions .......................................... 166
      12.2.1 Comparative Review of Robotic and Biological Systems ... 166
      12.2.2 Performance Evaluation of Hippocampal Models ...... 167
      12.2.3 Implementation of an Extended Hippocampal Model ... 167
      12.2.4 An Experience Mapping Algorithm ................... 167
      12.2.5 An Integrated Approach to Mapping and Navigation .. 168
   12.3 Future Mapping and Navigation Research ................. 168
   12.4 Grid Cells ............................................. 170
   12.5 Conclusion ............................................. 170
Appendix A - The Movement Behaviours ........................... 173
   Search of Local Space ....................................... 173
   Pathway Identification ...................................... 173
   Velocity Commands ........................................... 176
References ..................................................... 179
List of Reproduced Figures ..................................... 187
Index .......................................................... 191
List of Figures
3.1 Maps produced with and without the EM algorithm ............ 18
3.2 The effect of the motion model on the robot’s pose belief .. 20
3.3 Candidate selection using the NBV algorithm ................ 24
3.4 Exploration method using the Spatial Semantic Hierarchy .... 25
4.1 The rodent hippocampus ..................................... 30
4.2 Firing rate of cells as a function of head direction ....... 31
4.3 Anatomy of a worker bee .................................... 35
5.1 Attractor network model of the rodent head direction system ... 42
5.2 A coupled attractor model of rodent head direction ......... 43
5.3 Bearing and place cell models .............................. 44
5.4 Model relating cues, head direction cells and place cells .. 46
5.5 A model of view cells ...................................... 47
5.6 Allothetic and ideothetic information influences place cells ... 48
5.7 Place cell firing rates during a robot experiment .......... 48
5.8 Two-dimensional continuous attractor model of place cells .. 49
5.9 Minimalist LV-PI model as implemented by Browning .......... 51
5.10 Generating a theta rhythm ................................. 52
5.11 Learning to navigate to a goal ............................ 52
6.1 High resolution occupancy maps ............................. 56
6.2 Illustrative example of place cell firing and occupancy maps ... 57
7.1 Equipment used during indoor experiments ................... 62
7.2 The Pioneer robot in the two by two metre experimental arena ... 63
7.3 Overall structure of the hippocampal model ................. 64
7.4 Head direction network ..................................... 65
7.5 Activity profile in the head direction network ............. 65
7.6 Active view cells are associated with active head direction cells ... 66
7.7 Connectivity of local view cells with head direction cells ... 67
7.8 Head direction cells excite themselves and other cells ..... 68
7.9 Path integration cells in action for counter-clockwise rotation ... 70
7.10 Odometry calibration ...................................... 71
7.11 Environment for 1D localisation and mapping experiments ... 72
7.12 Results of orientation calibration ........................ 72
7.13 Re-localisation under visual input ........................ 73
7.14 Visual reinforcement of the current orientation hypothesis ... 74
7.15 View cell connections to the place cells .................. 76
7.16 Sample activity profiles .................................. 77
7.17 View cell-place cell weight map ........................... 77
7.18 Path integration in the place cells ....................... 78
7.19 Localisation using path integration ....................... 79
7.20 Positional tracking error ................................. 80
7.21 Learning under visual ambiguity ........................... 82
7.22 Re-localisation failure under visual ambiguity ............ 82
7.23 Activity packets after further robot movement ............. 83
8.1 The RatSLAM system structure ............................... 88
8.2 The eight-arm maze used in rodent experiments .............. 88
8.3 The three-dimensional pose cell model ...................... 89
8.4 Snapshot of pose cell activity during an experiment ........ 90
8.5 The pose cell and local view cell structures ............... 92
8.6 High resolution and down sampled greyscale camera images ... 94
8.7 Local view cell activity during a 77 minute experiment ..... 95
8.8 Region of the panoramic image used to generate histograms .. 96
8.9 Sample 32 × 32 normalised red-green histograms ............. 98
8.10 Hybrid topological-metric map ............................. 100
8.11 Floor plan of testing arena ............................... 101
8.12 Testing arena and Pioneer robot with artificial landmark .. 102
8.13 Results without visual re-localisation .................... 103
8.14 SLAM results with unique landmarks ........................ 103
8.15 Results with ambiguous landmarks .......................... 104
8.16 Floor plan of loop environment ............................ 105
8.17 Robot camera captures from loop environment ............... 105
8.18 Robot localisation without visual re-localisation ......... 106
8.19 RatSLAM trajectory for loop environment ................... 107
8.20 Results of a robot kidnap ................................. 108
8.21 Complete floor environment ................................ 108
8.22 Robot camera captures from the complete floor environment ... 109
8.23 Inconsistent pose cell trajectory without visual re-localisation ... 110
8.24 RatSLAM trajectory for the office environment (η = 0.20) .. 111
8.25 RatSLAM trajectory for the office environment (η = 0.25) .. 111
8.26 CSIRO robot tractor ....................................... 112
8.27 Route for outdoor experiment .............................. 113
8.28 Robot localisation using only path integration ............ 114
8.29 Robot localisation with visual re-localisation ............ 114
9.1 Active goal cells are linked forwards and backwards in time ... 118
9.2 Floor plan and robot trajectory for Experiment 8 ........... 121
9.3 RatSLAM trajectory just before goal navigation commenced ... 122
9.4 The temporal map cells after recall of the first goal ...... 122
9.5 The path the robot followed to reach the first goal ........ 123
9.6 The temporal map cells after recall of the second goal ..... 123
9.7 The path the robot travelled to reach the second goal ...... 124
9.8 Floor plan of large indoor environment ..................... 124
9.9 Dominant packet path for a 40 × 20 × 36 pose cell matrix ... 125
9.10 Temporal map for the large indoor environment ............. 126
10.1 Experience map co-ordinate space .......................... 130
10.2 A transition between two experiences ...................... 132
10.3 An example of the behavioural and temporal information .... 133
10.4 Experience map growth over time ........................... 135
10.5 Floor plan for the experience mapping experiments ......... 137
10.6 Path of dominant activity packet (100 × 40 × 36 cell matrix) ... 137
10.7 Snapshots of the experience map at various times .......... 138
10.8 Experience map produced using a 100 × 40 × 36 cell matrix ... 139
10.9 Path of dominant activity packet (40 × 20 × 36 cell matrix) ... 140
10.10 Experience map produced using a 40 × 20 × 36 cell matrix ... 140
10.11 Experience map for outdoor environment ................... 142
11.1 Robot movement at an intersection over 80 minutes ......... 148
11.2 An example of a route calculated using a temporal map ..... 150
11.3 Behaviour arbitration during goal navigation .............. 152
11.4 Floor plan showing the robot’s path and goal locations .... 154
11.5 Temporal maps, planned and actual routes for experiment 13 ... 155
11.6 Temporal maps, planned and actual routes for experiment 14 ... 157
11.7 Floor plan for adaptation experiment ...................... 158
11.8 Trial 1 goal navigation results ........................... 159
11.9 Trial 2 goal navigation results ........................... 159
11.10 Trial 3 goal navigation results .......................... 160
A.1 Search tree node ........................................... 174
A.2 Search tree structure ...................................... 175
A.3 Path extraction process .................................... 176
A.4 Velocity dependence on angle ............................... 176
List of Abbreviations
APN        Adaptive Place Network
ASR        Absolute Space Representation
ATN        Anterior Thalamic Nuclei
CA         Cornu Ammonis
CAL        Calibration cell
CAN        Competitive Attractor Network
CF         Centreline Following
CSIRO      Commonwealth Scientific and Industrial Research Organisation
DARPA      Defence Advanced Research Projects Agency
DG         Dentate Gyrus
DP         Distinctive Place
(M)EC      (Medial) Entorhinal Cortex
EKF        Extended Kalman Filter
ELDEN      Exploration and Learning in Dynamic ENvironments
EM         Expectation-Maximisation
GM         Goal Memory
GPS        Global Positioning System
HD         Head Direction
KF         Kalman Filter
LCS        Local Control Strategies
LPM        Local Perceptual Map
LTP        Long Term Potentiation
LV         Local View
LWF        Left Wall Following
MCL        Monte Carlo Localisation
MFIS       Memory For the Immediate Surroundings
nbhd       Neighbourhood
NBV        Next-Best-View
PC         Place Cell(s), also Pose Cell(s)
PET        Positron Emission Tomography
PI         Path Integration
POMDP      Partially Observable Markov Decision Process
PoS, poSC  Postsubiculum
PTZ        Pan-Tilt-Zoom (camera)
RGB        Red-green-blue
RWF        Right Wall Following
SAD        Sum of Absolute Differences
SEIF       Sparse Extended Information Filter
SIFT       Scale Invariant Feature Transforms
SLAM       Simultaneous Localisation And Mapping
sLEC       Superficial Lateral Entorhinal Cortex
SSD        Sum of Squared Differences
VIS        Visual bearing cell
1 Introduction
This book describes the design and implementation of a vision-based Simultaneous Localisation And Mapping (SLAM) system using an extended model of the rodent hippocampus, and a mapping algorithm that integrates with this system to provide goal recall and environment adaptation capabilities. Computational models of biological systems have traditionally had limited mapping and navigation capabilities when implemented on real robots. Conventional probabilistic techniques can solve specific components of the problem such as SLAM, but do not provide an integrated solution to the entire problem of exploring, mapping, navigating to goals, and adapting to change. The aim of this research was to demonstrate that by analysing and extending models of the rodent hippocampus within a pragmatic robotics context, it is possible to create a biologically inspired model that can solve many of the key mapping and navigation problems.

This chapter describes the mapping and navigation challenge for mobile robots, the rationale for the approach taken in this body of work, and the research contributions. Sections 1.2 and 1.3 discuss four major requirements of a mobile robot: the ability to perform SLAM, explore an environment, navigate to goals, and adapt to environment change. Section 1.4 describes the rationale behind developing a biologically inspired mapping and navigation solution, and the methodology employed. The chapter concludes in Section 1.5 with an outline of the book structure.
M.J. Milford: Robot Navigation from Nature, STAR 41, pp. 1–7, 2008. © Springer-Verlag Berlin Heidelberg 2008. springerlink.com

1.1 Mobile Robots

Mobile robots are being employed in ever-increasing numbers in both domestic and industrial applications. The rapid uptake of mobile robots, especially in consumer applications, has ensured that the industrial robots used in manufacturing lines since the 1960s are no longer the most common form of robot. Over a period of only a few years, mobile robots have come to hold this distinction (UNECE 2005). The key to this explosion in mobile robot usage has been a transition in their mainstream image from novelty luxury items to practical, labour-saving machines. A good example of these new labour-saving robots is the vacuum cleaner robots produced by companies such as iRobot and Electrolux. By the end of 2004, there were an estimated 1.2 million personal domestic robots in service around the world according to
the United Nations Economic Commission for Europe survey (UNECE 2005), including more than one million vacuum cleaner robots. Of these robots, about 550,000 were installed during 2004, indicating the rapid growth in this area. Although not as prolific as vacuum cleaner robots, lawn mowing robots are also in widespread use, with an estimated 46,000 units worldwide as of the end of 2004 (UNECE 2005). The UNECE predicts that approximately seven million personal service robots will be sold during the period 2005–2008. This figure includes 4.5 million domestic robots with an estimated value of three billion US dollars, and 2.5 million entertainment and leisure robots with an estimated value of 4.4 billion US dollars (UNECE 2005). Longer term predictions include one of 39 million personal domestic robots by the year 2010, as given by the semiconductor analyst company Future Horizons (Horizons 2004).

While mobile robots are becoming a normal part of a growing number of domestic homes, they are still quite limited in their capability. Many vacuum cleaner robots are, at their core, wheeled vehicles that combine a set of reactive movement behaviours with a range of pre-programmed path behaviours, using simple back-off and turn behaviours to extract themselves from obstacles or dead ends. Although reactive techniques can be used to solve a large number of navigational problems, there are limits to what can be achieved without more advanced navigation methods. For example, many environment geometries, such as box canyons, present a significant challenge to reactive techniques. The introduction of hysteresis mechanisms (Browning 2000) or local spatial memories (Balch and Arkin 1993) can increase the potential of reactive systems, but only to a certain extent. Without state information and a map, these robots have no higher level means of reasoning about their environment.
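The statelessness that limits such robots can be illustrated with a minimal sketch. The rule set below is hypothetical, invented for illustration rather than taken from any commercial robot: because a purely reactive policy maps the current sensor reading directly to an action, identical readings always produce identical actions, which is why a symmetric dead end such as a box canyon can trap the robot in an endless cycle.

```python
def reactive_action(left_clear: bool, front_clear: bool, right_clear: bool) -> str:
    """A memoryless reactive policy: the chosen action depends only on the
    instantaneous sensor reading, never on history, state, or a map."""
    if front_clear:
        return "forward"
    if right_clear:
        return "turn_right"  # simple back-off-and-turn rule
    if left_clear:
        return "turn_left"
    return "reverse"

# With no internal state, the robot cannot notice that it has faced this
# exact situation before, so repeated readings yield repeated actions.
reading = (True, False, False)  # only the left side is clear
assert reactive_action(*reading) == reactive_action(*reading) == "turn_left"
```

Hysteresis or a local spatial memory, as cited above, amounts to adding a small amount of state to this function; a map-based system replaces it entirely with reasoning over a learned representation.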
Some of the characteristics of domestic robots are a product of the need to use cheap, mass-produced sensors and computing power in order to achieve an affordable market price. Personal service robots need to be affordable because of the price sensitivity of consumers in that area (Schofield 1999). To produce a robot retailing for $100, the factory will quite likely need to build the system from components costing a small fraction of that amount. This financial restriction significantly limits the sensor equipment that can be used, ruling out popular robot sensors such as the SICK laser scanner (Schofield 1999). According to Colin Angle, “It is easy to build a robot that is prohibitively expensive”, but to build one that is affordable to mass consumers is much more challenging (Kanellos 2004).

The more fundamental problem limiting the sophistication of mobile robots is the set of algorithms required to equip a mobile robot with advanced navigation abilities. A robot using only reactive behaviours has intrinsically limited navigation capabilities. To function intelligently in real world environments under anything but the most basic of assumptions about the environment, a mobile robot requires some way of mapping the environment and using that map to navigate and perform tasks (Thrun 1998). The development of mapping and navigation techniques that can be robustly applied on real robots will facilitate progress towards the next generation of mobile robots for industrial and domestic settings. This generation will include robots capable of exploring and mapping their surrounds, performing global tasks such as navigating to goals, and adapting to changes in the environment.

The remainder of this chapter provides an introduction to the major mapping and navigation problems that must be solved before this next generation of mobile robots
becomes a reality, the current methodologies for solving these problems, and the motivation for the approach described in this book.
1.2 Simultaneous Localisation and Mapping

One of the core problems facing autonomous mobile robots is the Simultaneous Localisation And Mapping (SLAM) problem. Although it is not the sole problem, it is one of the most significant (Dissanayake, Durrant-Whyte et al. 2000; Newman and Leonard 2003) and has been the subject of much research over the past two decades. The core SLAM problem is the requirement that a robot, starting in an unknown environment, explore in order to learn the environment (map), while simultaneously using that map to keep track of its position (localise) within the environment. SLAM is, strictly speaking, a problem, although throughout the literature the term is also used to describe the process that a robot performs in solving the problem (Thrun 2002). The SLAM problem is difficult for many reasons. Mobile robots are essentially moving sensor platforms – sensors vary in type and capability but all have limitations and all are noisy to some degree. Consider the case of a laser range finder, widely regarded as one of the most accurate and reliable sensor types in robotics. A laser range finder measures the distance to obstacles with a high degree of accuracy, but is neither exact nor infallible. Measurements are only accurate within a margin of error. In addition, measurements may be incorrect in real world environments – the laser may ‘see’ through glass walls, and beams may reflect off multiple surfaces before returning to the sensor. In outdoor settings bright sunlight may affect the sensor and introduce faulty readings. The problem becomes worse when examining the performance of sonar range sensors. Unlike the relatively focused light beam from a laser, sound waves from a sonar sensor project in a cone-like manner from the point of origin. Returned ranges are obtained for the closest reflecting object within this cone. 
The wide beam results in significant uncertainty regarding the exact position of a detected obstacle (Moravec and Elfes 1985). Surfaces with certain textures or surfaces that do not transect the sonar cone at ninety-degree angles may not reflect sufficient sound energy back to the sensor to register a reading. As with a laser range finder, readings may be caused by reflections from multiple surfaces and not represent the range to the closest object. The sensor problem is not limited to range finding sensors – vision-based sensors are subject to their own set of limitations. Cameras are notoriously poor at representing colours in images. If the image domain is reduced to greyscale images, there is still the problem of dealing with light intensity variation, especially in outdoor environments. There are a handful of techniques for dealing with lighting variation but none are completely satisfactory (Buluswar and Draper 2002; Tews, Robert et al. 2005). Camera images can also be distorted, especially the images obtained from panoramic cameras or using low-cost wide-angle lenses. Internal robot sensing is just as limited. Rotational encoders on robot wheels may accurately measure the movement of the wheel, but factors such as wheel slip ensure that this measurement does not correspond exactly to the movement of the robot along the ground. It is impossible to calibrate a system perfectly, meaning that even with a
minimum of wheel slippage, over any significant time period these slight measurement errors will accumulate to unmanageable proportions. This phenomenon is well known within the robotics community and is often referred to as ‘odometric drift’. The problem is made more severe in certain situations, such as when the robot’s environment contains a variety of ground surface types. The significance of many of these external and internal sensing problems can be reduced through engineering. Using more expensive cameras can reduce image distortion. Sensor fusion between multiple types of range sensors can help negate the effects of occasional incorrect measurements from a particular sensor type. Engineers can design robots and their control systems to minimise wheel slippage. The use of a Global Positioning System (GPS) can help avoid the need to integrate wheel encoder information in order to update a robot’s position over time. These approaches can solve or reduce the severity of some of these problems but are often not practical to implement or introduce new problems. For instance, standard GPS does not work indoors, and outdoors it is neither completely reliable nor available in all locations. The first problem can be solved by buying an indoor GPS, but this may only be suitable for certain types of indoor environments, making it a specific rather than general solution. In outdoor environments when the GPS ‘drops out’ the downtime can last long enough to severely test any alternative system. Cost concerns rule out many of the possible solutions, especially in areas such as domestic robots, where expensive sensors cannot be used. Underlying the entire sensory issue is the inescapable fact that humans and animals manage to navigate and learn environments with a set of inexact, imperfect sensors. 
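The odometric drift described above is easy to reproduce in simulation. The sketch below is purely illustrative (the Gaussian noise model and its magnitude are assumptions, not measurements from any particular robot): it integrates noisy wheel-encoder readings for a robot that intends to drive straight, and shows the average position error growing with path length.

```python
import math
import random

def dead_reckon(steps, noise_std, seed):
    """Integrate noisy encoder readings; return the final position error.

    The robot 'intends' to drive straight at 1 unit/step, but each
    measured step distance and heading change carries small zero-mean
    Gaussian noise, so the integrated pose drifts from the true pose.
    """
    rng = random.Random(seed)
    x = y = theta = 0.0
    for _ in range(steps):
        d = 1.0 + rng.gauss(0.0, noise_std)   # noisy distance increment
        theta += rng.gauss(0.0, noise_std)    # noisy heading increment
        x += d * math.cos(theta)
        y += d * math.sin(theta)
    # The noise-free pose would be (steps, 0); report Euclidean error.
    return math.hypot(x - steps, y)

# Average error over many trials grows with path length: odometric drift.
short = sum(dead_reckon(100, 0.01, s) for s in range(50)) / 50
long_ = sum(dead_reckon(10000, 0.01, s) for s in range(50)) / 50
```

Because the heading error behaves like a random walk, the position error grows faster than linearly with distance travelled, which is why dead reckoning alone is unusable over any significant time period.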
Using expensive sensors to increase accuracy and reliability is a partial engineering solution to the problem, but it is obvious that nature has managed to solve the problem in an entirely different manner. From a commercial and practical perspective it is desirable that robot systems, especially those for personal use, be able to function with simple, inexpensive sensors in order to be commercially attractive. Research into ways of solving the SLAM problem can be divided into two broad categories. One category consists of a number of mathematical probabilistic techniques that have been developed over the last twenty years. These methods use Kalman Filters, particle filters, and Expectation Maximisation algorithms as the basis for robotic SLAM techniques. This approach has been extensively researched, resulting in a wide variety of world representation (map) types and sensor domains. The research in this field has primarily focused on producing functional navigation and mapping systems on mobile robots in real world environments, or in simulation with the goal of eventual deployment on a robot. The field has produced a number of SLAM methods that can function in large environments under varying assumptions. Typical assumptions are that the robot’s environment is static and that the robot has an accurate range sensor. Most of the methods with the best practical performance require costly sensing equipment and significant computational resources (Montemerlo, Thrun et al. 2003; Grisetti, Stachniss et al. 2005). In recent work range information has been extracted from stereo vision sensors and used successfully to perform SLAM, but the process has required fairly complex management of Scale Invariant Feature Transforms (SIFT) (Sim, Elinas et al. 2005). The other area of research has focused on emulating the biological systems thought to be responsible for mapping and navigation in animals. The most studied animal has
been the rodent, with a focus on the rodent hippocampus as the probable region involved in spatial navigation tasks. Research in this area has developed a consensus on general characteristics of the hippocampus with respect to navigation in rats, although many aspects of the issue remain the subject of debate. The research trend in this area has been towards validating and improving models of the brain function, rather than producing practical, robot navigation systems (Franz and Mallot 2000). Few models have been implemented on actual robots, and those that have been were tested only in small artificial environments (Arleo 2000; Krichmar, Nitz et al. 2005). This research has, however, yielded some preliminary ideas on how a biologically inspired system might solve the SLAM problem, without the use of expensive range sensors or complex probabilistic algorithms. In summary, the SLAM problem is one of the major challenges that must be solved by higher level autonomous mobile robots. SLAM is a difficult problem because of the inherent uncertainty in the sensors a robot uses to gain knowledge about its environment. Robotics research has produced a number of solutions to the SLAM problem that perform quite well in large environments but which often rely on environment assumptions, expensive range sensors, and computationally intensive probabilistic algorithms. Existing computational models of biological systems are much less capable of solving the SLAM problem but offer the potential for an alternative solution incorporating more affordable sensory equipment.
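As a flavour of the probabilistic machinery referred to above, the sketch below implements a one-dimensional Kalman filter that fuses a noisy motion estimate with a noisy position measurement. It is a textbook illustration, not part of any system described in this book, and the noise variances `q` and `r` are arbitrary assumptions.

```python
def kalman_1d(u_moves, z_measurements, q=0.05, r=0.5):
    """Minimal 1-D Kalman filter.

    Predict with a motion command u (process noise variance q), then
    correct with a position measurement z (measurement noise variance r).
    Returns the (estimate, variance) pair after each step.
    """
    x, p = 0.0, 1.0                 # initial estimate and its variance
    history = []
    for u, z in zip(u_moves, z_measurements):
        x, p = x + u, p + q         # predict: uncertainty grows
        k = p / (p + r)             # Kalman gain: how much to trust z
        x, p = x + k * (z - x), (1 - k) * p   # correct: uncertainty shrinks
        history.append((x, p))
    return history

# Three unit moves forward, with measurements scattered around the truth.
estimates = kalman_1d([1.0, 1.0, 1.0], [0.95, 2.1, 2.9])
```

The key behaviour is that the estimate variance shrinks with each correction, so the filter ends up more certain of the robot's position than either the odometry or any single measurement alone.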
1.3 Exploration, Goal Navigation and Adapting to Change

Solving the SLAM problem is only one of many capabilities that a fully autonomous mobile robot must possess if it is to function in real world environments. A robot must also be able to explore an unknown environment, navigate to goals, and adapt to changes in the environment. However, these tasks cannot be performed efficiently without the core SLAM problem being solved. Mapping and navigation research on real robots has consequently been biased towards solving the SLAM problem, since it is difficult to properly address these other challenges without such a solution. Exploration and goal navigation techniques have been extensively researched in simulation, often with the assumption that the SLAM problem has been solved. There has been relatively little research on actually implementing these techniques on robots. There has also been only limited investigation into methods of adapting to changing environments. Because so much of the mapping and navigation research on actual robots over the past two decades has focused on the core SLAM problem, the subsequent solutions have been developed in isolation, without due consideration of other challenges such as goal navigation and adaptation to environment change. The focus on the SLAM problem has led to piecemeal attempts to implement goal recall and adaptation abilities after solving the SLAM problem, rather than the consideration of the entire mapping and navigation problem from the start of each investigation. Consequently, while there are individual techniques for solving each aspect of the problem, there are no systems that successfully solve the entire problem on robots in real world environments.
1.4 Practical Performance from a Biological Model

The work described in this book has been motivated by the desire to fill the gap between the relatively poor practical performance of biological navigation models and the better, but more specific, performance of current robotic navigation systems. Biological models are based on animals with competent all-round mapping, navigation, and adaptation abilities, whereas robotic methods tend to solve each of the problems in isolation. While there has been a significant amount of research into creating models of biological systems, only a small fraction has focused on the practical usefulness of such models in robotics (Arleo 2000; Browning 2000). Most of the research has centred on testing models of biological mechanisms, rather than on finding an optimal solution to the robot mapping and navigation problem (Franz and Mallot 2000). Models of biological systems have been formed almost entirely from evidence obtained from rat experiments in small, highly controlled environments. Typical rodent experiments involve a rat moving around a small cylindrical, rectangular, or simple maze arena with a number of well-defined artificial visual cues (McNaughton, Barnes et al. 1983; Knierim, Kudrimoti et al. 1995; Skaggs and McNaughton 1998; Hampson, Simeral et al. 1999; Kobayashi, Tran et al. 2003). Testing of computational models developed using this experimental evidence has occurred in similar types of environments. For example, in Arleo’s doctoral work, the two arenas were a 600 × 600 mm arena with bar-coded walls and an 800 × 800 mm arena with an external visual cue but no walls (Arleo 2000). Many of the major mapping and navigation problems such as closing the loop and dealing with significant ambiguity are not apparent in such small, constrained environments, so the practical usefulness of biological models has remained untested. 
As well as being a relatively unexplored research topic, creating computational models of biological systems for use on robots is attractive for two reasons. The first reason is the mapping and navigation capabilities of the animals upon which the models are based. Animals such as rodents function in complicated environments without the use of high precision range finding equipment. Biological models offer the potential for autonomous mobile robots without the expensive laser range finder sensors used in so many SLAM systems. The second reason is the nature of how animals solve the entire mapping and navigation problem. Animals appear to have an integrated solution that allows them to explore, map, navigate to goals, and adapt to environment change. Modelling biological systems involves attempting to represent these combined mechanisms as one coherent whole, rather than as a number of isolated algorithms and techniques.
1.5 Book Outline

The book chapters proceed as follows: To create a biologically inspired mapping and navigation system with practical performance requires an examination of the robotic and biological literature in this field. Chapter 2 defines the mapping and navigation problem and discusses the individual problems that it constitutes. Chapters 3 to 5 review the three major types of system that address this problem: conventional probabilistic systems, biological
systems, and computational models of biological systems. Chapter 6 provides a comparative analysis of these systems, resulting in the identification of key issues and a proposal for the research path taken in this work. The first stage of the research consisted of a pilot study implementation of a classical computational model of the rodent hippocampus, with the purpose of evaluating its mapping and navigation capabilities in a robotics context. Chapter 7 describes the model implementation and its mathematical foundations. The model was tested in a number of spatial localisation and mapping experiments. These experiments revealed the model had some fundamental mapping limitations. The chapter concludes with a review of the experimental research on rats to determine whether these limitations are a product of the model or of the process being modelled. The outcomes of this research led to the extension of the hippocampal model which is described in Chapter 8. Developing a SLAM system based on models of the hippocampus is an iterative process. Chapter 8 describes the implementation of an extended model of hippocampal mapping and navigation called RatSLAM, which is based on the orthodox model implemented in Chapter 7. This is followed by a presentation of the RatSLAM model’s performance in a range of indoor and outdoor SLAM experiments. A map is only useful if it can be used by a robot to perform some sort of purposeful behaviour. Chapter 9 describes the implementation of a goal recall system that uses the RatSLAM maps directly to perform navigation to goals. The method uses a new cell structure known as the goal memory to create temporal maps for navigation. The system was tested in a sequence of progressively larger environments under increasingly challenging conditions. This increase in complexity introduced a number of phenomena into the RatSLAM maps, including discontinuities and collisions, that make them unsuitable for direct use. 
To create a more usable representation, an algorithm known as experience mapping was developed. Chapter 10 presents the experience mapping algorithm, which uses the RatSLAM maps to create spatio-temporal-behavioural maps that are suitable for exploration, goal navigation, and adaptation strategies. The algorithm is described in detail, including the core map correction process by which it produces experience maps that are free of the discontinuities and collisions of the original RatSLAM representations. Results are presented from experiments in both indoor and outdoor environments. With the establishment of a suitable mapping algorithm, methods for exploration, goal recall, and adaptation were implemented. Chapter 11 describes the development and implementation of these methods. Results are presented to evaluate each of these methods in a number of indoor robot experiments. The final experiments tested the ability of the combined system to autonomously explore, map, navigate to goals, and adapt to environment change. The final stage of the research consisted of a review of the work performed, an analysis of the research outcomes and their significance within the research field, and a discussion of the remaining research challenges. Chapter 12 starts with a summary of the research presented in each chapter and highlights the key findings that led to the next stage of the work. This is followed by an assessment of the major research contributions, which are related back to the literature on mapping and navigation. Section 12.3 discusses future challenges for the mapping and navigation research field and ways in which they may be addressed. The book then concludes with a summary of the work performed and its contribution to the field of robot mapping and navigation.
2 Mapping and Navigation
Mobile robots and animals alike require the ability to move around in their environments. For a large number of animals such as insects, this ability is provided by reactive movement schemes. With an appropriate set of reactive behaviours an insect can function quite well in complex environments. However, there is a limit to what can be achieved through pure reactivity. A whole class of navigation problems become solvable if an animal or robot is able to form some sort of map of its environment. Research on the topic of mapping and navigation can be split into two areas. One area has focused on how animals move around and function in their habitats, whether the animal is an ant scurrying around on desert sand dunes (Muller and Wehner 1988), or an English taxi operator driving passengers around London (Maguire, Frackowiak et al. 1997). Popular study subjects in this area of research include rodents, primates, humans, and bees. By examining the capabilities of these biological agents both in nature and controlled environments, researchers have formed a number of models of the actual physical and neurological mechanisms driving their navigation systems. The extent of the research has varied, from purely theoretical analysis (Abbott 1994; Eichenbaum, Dudchenko et al. 1999; Hahnloser, Xie et al. 2001), to validation through testing of software models or testing in simulated environments (Borenstein 1994; Touretzky, Wan et al. 1994; Balakrishnan, Bhatt et al. 1998; Stringer, Rolls et al. 2002; Chavarriaga, Sauser et al. 2003; Koene, Gorchetchnikov et al. 2003), to deployment of models on actual robotic platforms (Arleo 2000; Krichmar, Nitz et al. 2005). Roboticists have generally had a different motivation driving their mapping and navigation research – the desire to produce functioning robots. 
The inherent uncertainty of the sensors used on robots has led to a convergence of the most successful robotic mapping and navigation systems – they are all probabilistic to some degree (Thrun 2000). Most methods have as their basis one or more of three different probabilistic algorithms: Kalman Filter, Expectation Maximisation, or particle filter algorithms. However, there is a large range of methods within this banner of probabilistic systems (Thrun 2002). The robot platform, sensor types, and environment all affect the way in which a mapping and navigation system is developed.

M.J. Milford: Robot Navigation from Nature, STAR 41, pp. 9–13, 2008. © Springer-Verlag Berlin Heidelberg 2008 springerlink.com
This and the following four chapters examine the research that has been conducted in both the robotic and biological mapping and navigation fields. Their purpose is as follows:

1. Introduce the mapping and navigation problem and discuss the various components of the problem.
2. Review the core probabilistic algorithms and the robotic mapping methods that use them.
3. Investigate the capabilities of biological systems.
4. Review the mapping and navigation performance of state-of-the-art models of the rodent hippocampus – the most thoroughly understood biological system in this field.
5. Identify key issues in the field of mapping and navigation and use them to compare and contrast robotic and biological systems.
6. Highlight the middle ground between the probabilistic and biologically-based research areas, and discuss the advantages of developing a method inspired by biological systems but with a primary focus on practical performance.
2.1 The Mapping and Navigation Problem

Mobile robots by definition move around in their environments. Low-level robots may function quite adequately in their environment using simple reactive behaviours and random exploration, but more advanced capabilities require some type of mapping and navigation system (Thrun 1998). The purpose of this section is to clarify exactly what is implied by the words ‘mapping’ and ‘navigation’ and to discuss the various capabilities a robot must possess to achieve these tasks.

2.1.1 Localisation and Mapping

When a robot localises, it determines its location within some representation of the environment. The act of localisation can vary greatly in complexity. Take the example of a robot that has just detected it is within a certain range of an infrared beacon. Armed with the prior knowledge that there is a single infrared beacon somewhere in its environment (and hence equipped with a map), the robot now knows that it is within a certain distance of that beacon, effectively localising itself within its map. In another environment an aerial robot may have measured its distance from four artificial landmarks placed on the ground. Armed with the prior knowledge of the landmarks’ positions within the environment (a map), the robot can localise by calculating its position from the distance measurements. It is important to note that the localisation process is entirely dependent on the map the system possesses – without a map the process has no meaning. Mapping involves creating a representation of the environment. The representation or map can vary; one map might store a set of coordinate locations of trees in a park; another might keep a millimetre resolution grid showing which parts of an office floor have objects on them; yet another may represent the environment as a sequence of images and motor actions. 
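The aerial-robot example above, localising from ranges to landmarks at known positions, can be sketched as a small least-squares trilateration. The landmark layout and range values below are invented for illustration; subtracting the first range equation from the others linearises the problem, which is then solved by 2 × 2 normal equations in pure Python.

```python
def trilaterate(landmarks, ranges):
    """Estimate a 2-D position from ranges to landmarks at known positions.

    Each range equation is (x - xi)^2 + (y - yi)^2 = ri^2; subtracting
    the first equation from the rest cancels the quadratic terms, giving
    a linear system A [x, y]^T = b solved here via normal equations.
    """
    (x0, y0), r0 = landmarks[0], ranges[0]
    A, b = [], []
    for (xi, yi), ri in zip(landmarks[1:], ranges[1:]):
        A.append((2 * (xi - x0), 2 * (yi - y0)))
        b.append(r0**2 - ri**2 + xi**2 - x0**2 + yi**2 - y0**2)
    # Normal equations: (A^T A) p = A^T b, for a 2x2 system.
    s11 = sum(a[0] * a[0] for a in A)
    s12 = sum(a[0] * a[1] for a in A)
    s22 = sum(a[1] * a[1] for a in A)
    t1 = sum(a[0] * bi for a, bi in zip(A, b))
    t2 = sum(a[1] * bi for a, bi in zip(A, b))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Four landmarks on the ground; ranges measured from a robot near (3, 4).
est = trilaterate([(0, 0), (10, 0), (0, 10), (10, 10)],
                  [5.0, 8.06226, 6.70820, 9.21954])
```

With noise-free ranges the estimate is exact; with noisy ranges the same least-squares formulation simply returns the best-fitting position, which is one reason range-based localisation is usually posed this way.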
Mapping is inherently linked with localisation – just about all mapping processes require that the robot is able to localise itself at least some of
the time. It is this inter-dependence of the two processes that leads to one of the most significant and challenging problems in mobile robotics.

2.1.2 SLAM: The Chicken and the Egg Problem

To create a map of the environment based on its sensory information, a robot must know where it is. Conversely, a robot can only calculate where it is located if it has some sort of map or world representation. When a robot is placed into an unknown environment, it starts with neither a map nor any idea where it is. In this situation a robot must create a map while simultaneously localising itself within this map. This is known as the Simultaneous Localisation And Mapping problem, or SLAM for short (Dissanayake, Newman et al. 2001). The problem is also referred to in the literature as the Concurrent Mapping and Localisation (CML) problem (Leonard, Newman et al. 2001). It is generally considered to be one of the most challenging problems in mobile robotics and one of the most significant that must be solved before truly autonomous mobile robots become a reality (Dissanayake, Durrant-Whyte et al. 2000; Leonard, Newman et al. 2001; Newman and Leonard 2003). Unlike mapping and localisation systems that require a priori knowledge of the environment or modification of the environment, SLAM systems can function in situations where map information or robot position is not initially known and environment modification is not possible. This ability makes these systems significantly more autonomous and ideal for a large range of applications including mobile mining robots, aerial vehicles, submersible robots, and planetary rovers (Dissanayake, Newman et al. 2001). Even in environments where it is possible to hand-craft a map beforehand or place artificial markers, the effort involved is a significant motivator for using an autonomous map building system.

2.1.3 Dealing with Uncertainty

From the perspective of a robot, the world is an uncertain place. 
This uncertainty comes from two main sources (Thrun 2000):

• Sensors are noisy, producing measurements with a certain expected error range, and only work properly under a limited set of conditions.
• The world is an inherently uncertain, ambiguous, and unpredictable place.

To learn about an environment a robot must use some type of sensing device; because sensors are always noisy to some degree this introduces uncertainty into the robot’s perception of the world. For instance, a laser range finder may measure distances to a precision of 10 millimetres – it has high precision, but is not exact. Other less sophisticated range sensors may have less precise specifications. Vision sensors may exhibit undesirable characteristics in varying light conditions (Tews, Robert et al. 2005). Global sensors such as GPS can have ‘jitter’ – a robot’s reported position may rapidly move around in jumps of several metres. Environments are another source of uncertainty. Ambiguity is one main cause – separate locations in an environment can appear the same through a robot’s sensory devices. Environments can also be unpredictable – unforeseen changes to the environment can occur, such as the removal or repositioning of furniture. A robot may have
to deal with the uncertain movement of dynamic objects such as people in an office building (Biswas, Limketkai et al. 2002).

2.1.4 Exploring Unknown Environments

Robots placed in unknown environments have no initial knowledge of their surroundings. To acquire this knowledge they must move around their environment, a process known as exploration. Exploration is a crucial capability for any robot that must function in new environments it has never experienced before. Some researchers consider it the fundamental problem for mobile robots, because it involves localisation, navigation, and planning (Oriolo, Vendittelli et al. 2004). A good exploration process allows a robot to efficiently gain knowledge about the environment. It also removes the time intensive task of building a priori maps by hand (Wullschleger, Arras et al. 1999), which is one of the keys to truly autonomous robots. Exploration is often a critical part of the localisation and mapping process. Some methods may require an exploration scheme that regularly returns the robot to familiar locations, limiting the continuous time spent in new parts of the environment (Arleo 2000). The exploration technique can also be linked closely to the nature of the world representations that the robot builds (Kuipers and Byun 1991). Even after the environment has been learned, the robot can discover changes in the environment by re-exploring. A robot may also use routes it has traversed during exploration to navigate to goals. As well as enabling a robot to efficiently learn the environment, a good exploration method may facilitate higher level functions such as goal navigation and adaptation to environment change.

2.1.5 Navigating to Goals

“How do I get there” is one of the essential navigation functions (Maguire, Burgess et al. 1998). Mobile robots need to be able to get from place A to place B in an efficient manner. 
The problem is a complex one; to solve it in all but the simplest of situations involves much more than just following a reactive collision avoidance procedure. The robot must act despite the inherent uncertainty in its sensors and the robot’s perception of its location in the environment. At certain times the lack of information or certainty may require the robot to blend goal navigation and information acquisition activities in order to improve its state information and hence its planning ability (Latombe 1991). If a robot needs to plan and execute paths starting with only an incomplete map or even no map at all, it must explore and build an incremental solution to the goal navigation problem (Yahja, Singh et al. 1998). The type of robot and environment can dictate the desirable characteristics of the goal navigation process. The challenge is often to navigate to the goal as quickly as possible or via the shortest route. However certain robots may have different constraints; for instance an interplanetary rover robot may need to navigate to goals using the least power or via the safest route (Guo, Parker et al. 2003). Military or domestic guard robots may need to navigate via the stealthiest route in order to minimise the chances of detection (Marzouqi and Jarvis 2004). In some situations, practical limitations of robot actuators, safe movement restrictions, and fragile payloads dictate the goal navigation requirements: routes with smooth movement trajectories and restricted accelerations (Lepetic, Klancar et al. 2003; Wei and Zefran 2005). Common to all these situations is the need for a method of navigating to locations with some notion of optimality.
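As a minimal illustration of shortest-route planning on a map (a standard textbook construction, unrelated to the particular methods discussed later in this book), the sketch below runs breadth-first search over a small, invented occupancy grid.

```python
from collections import deque

def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected occupancy grid.

    grid[r][c] == 1 marks an obstacle cell. Returns the list of cells on
    a shortest path from start to goal, or None if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}          # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:             # walk parent links back to the start
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in parents):
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

# A wall across the middle forces a detour around its right-hand end.
route = shortest_path([[0, 0, 0],
                       [1, 1, 0],
                       [0, 0, 0]], (0, 0), (2, 0))
```

Breadth-first search minimises the number of grid steps; the other constraints mentioned above (energy, safety, stealth, smoothness) correspond to swapping in different edge costs and a weighted search such as Dijkstra's algorithm or A*.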
2.1.6 Learning and Coping with Change

Being able to autonomously explore, perform SLAM, and navigate to goals are challenging tasks for any robot, but are only part of the problem faced by robots moving around in real world environments. Robots must also be able to cope with change in their environment (Fox, Burgard et al. 1999; Biswas, Limketkai et al. 2002). This is a particularly challenging problem because of the vast range of timescales involved. Change in an environment can be short-term and transient, such as a person walking past a robot, or long-term and lasting, such as a rearrangement of the furniture in an office (Yamauchi and Beer 1996). A localising and mapping process must be robust to transient change and be able to incorporate lasting change into its maps. Changes can also be defined as either perceptual or topological, or a combination of both (Yamauchi and Beer 1996). Perceptual change is a change in the sensory appearance of the environment to the robot. The type of sensors the robot is using determines what constitutes a perceptual change. For instance, a change in wall colour and texture may not be noticed by a mapping system that uses a laser range finder, but would be a significant change for a vision-based SLAM system. Conversely the movement of an object against a similarly coloured background might not be noticed by an appearance-based vision system, but would constitute a significant change to a range-based mapping system. In either situation, when the change is noticed, a previously familiar part of the environment suddenly appears different to the robot’s perception system. The robot must be able to consolidate these new changes with its previous experience in a rational manner. The problem of perceptual changes is compounded by physical changes to the structure of the environment. This is also referred to as topological change (Yamauchi and Beer 1996), and often includes perceptual change as well. 
In an indoor environment doors may be opened or closed, corridors may be blocked by temporary obstacles, and entire spaces can have their structural layout reorganised. In an outdoor setting vehicles may move around, roads or pathways may be blocked, and the fundamental shape of the landscape may change, such as at a mine site. Paths that the robot is accustomed to taking may become blocked, or change in shape. New paths through previously inaccessible areas may be introduced. In order to function under these conditions a robot must be able to recognise and learn these structural environment changes and alter its movement through the environment accordingly.

Each aspect of the problem is by itself a challenging task. Together they form a problem which is unsolved and will probably remain unsolved for years to come. The difficulty of the problem is not surprising given that humans often struggle in the same situations. For instance, a human may struggle to adapt to topological changes in their workplace due to renovations or reorganisations. When a worker shifts offices, their co-workers may travel to their previous office many times before finally learning to go to the new office location. In outdoor environments visual changes can cause navigation problems, such as those caused by the transition from day-time to night-time driving, or from a summer environment to a snow-covered winter environment. Given the difficulty of navigating in dynamic environments, it is not surprising that robotics research to date has focused only on solving limited subsets of the problem.
3 Robotic Mapping Methods
Sensor and environment uncertainty has caused most robotic mapping and navigation methods to converge to probabilistic techniques. The key strength of probabilistic techniques is their ability to deal with uncertainty and ambiguity in a robot’s sensors and environment. Any technique that has some means of handling the uncertainties faced by a mobile robot has an immense advantage over techniques that do not. Probabilistic techniques allow a robot to appropriately use sensory measurements based on their modelled uncertainties. Ambiguous features or landmarks in the environment become useful (albeit less useful than unique features) rather than becoming failure points for the mapping algorithm.
3.1 Probabilistic Mapping Algorithms

While most of the successful mapping and navigation techniques are probabilistic (Thrun 2000; Dissanayake, Newman et al. 2001; Montemerlo, Thrun et al. 2003), there is a wide range of ways in which a probabilistic approach can be implemented. There are several types of probabilistic algorithms that have been the focus of much research, with no clear 'best' one. Techniques vary in their complexity, assumptions about the environment, sensor usage, computational requirements, flexibility across robot platforms, scalability, and many other characteristics. Some are explicitly probabilistic down to the core mathematical foundations, while others are more implicit in their probabilistic characteristics, having been developed in a more heuristic manner. Some can be combined with others to form hybrid methods. This section will focus on the three most successful probabilistic algorithms, and their use in mapping and navigation methods on robots in the real world. The three types are:

• Kalman Filter (KF) algorithms,
• Expectation Maximisation (EM) algorithms, and
• particle filter algorithms.

M.J. Milford: Robot Navigation from Nature, STAR 41, pp. 15–28, 2008. © Springer-Verlag Berlin Heidelberg 2008 springerlink.com

3.1.1 Kalman Filter Methods

Kalman Filter-based methods track the locations of landmarks or features as the robot moves around the environment. Computationally they are one of the few techniques that
can maintain the full notion of uncertainty in the map, but are limited by a Gaussian assumption of uncertainty. This limitation has been partially addressed by the introduction of Extended Kalman Filters (EKFs). The complexity of the solution is O(N²), with N being the number of landmarks or features in the map. Dissanayake et al. (2001) introduced the contemporary Kalman Filter technique for solving the SLAM problem and proved that solutions would converge. To implement a vanilla Kalman Filter method, a few assumptions and a movement model are required. The usual assumption is that of a robot starting at an unknown location in an unknown environment. To model the motion of the robot through the environment, a kinematic model of the robot is required. Typically this model also incorporates some type of allowance for noise. Many methods use a linear discrete-time model to model the movement of the robot through the environment, e.g. (Fabrizi, Oriolo et al. 1998). Although the actual robot movement is usually non-linear, when segmented into short enough time periods the movement can be considered linear with minimal inaccuracy. The updated state of the robot, x_v(k+1), can be calculated as follows:

x_v(k+1) = F_v(k) x_v(k) + u_v(k+1) + v_v(k+1)        (3.1)

where the v subscript denotes that the variable is specific to the robot (v)ehicle, k is the step number, F_v(k) is the state transition matrix, u_v(k+1) is a vector of control inputs, x_v(k) is the state of the robot, and v_v(k+1) is a vector describing the errors in the movement process. The movement noise is usually modelled as having a mean of zero. Robots require a sensor that can measure the relative locations of landmarks with respect to the robot. Observations, like movement, are characterised by uncertainty. The output of the sensor for the i-th landmark, z_i(k), is given by:

z_i(k) = H_i x_a(k) + w_i(k)
       = H_{p_i} p_i − H_v x_v(k) + w_i(k)        (3.2)

where x_a(k) is a vector containing the vehicle and landmark states, w_i(k) is a vector storing the observation errors, usually with a mean of zero as with the motion process model, H_i is the observation model that maps the robot state vector to the sensor output z_i(k), and p_i is the absolute location of the i-th landmark.
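The linear process and observation models of Equations 3.1 and 3.2 map directly onto the standard Kalman Filter predict/update cycle. The following is a minimal sketch for a single landmark in a one-dimensional world; the matrices F, H and the noise covariances Q, R are illustrative values, not taken from any of the cited systems.

```python
import numpy as np

# State: [robot position, landmark position] (1-D world for clarity).
# Process model (Eq. 3.1): x(k+1) = F x(k) + u(k+1) + noise.
F = np.eye(2)                       # state transition: both states persist
Q = np.diag([0.1, 0.0])             # process noise: only the robot moves
# Observation model (Eq. 3.2): z = H x + noise, a relative range reading.
H = np.array([[-1.0, 1.0]])         # z = landmark position - robot position
R = np.array([[0.05]])              # observation noise covariance

def predict(x, P, u):
    """Propagate the state and joint covariance through the motion model."""
    x = F @ x + u
    P = F @ P @ F.T + Q
    return x, P

def update(x, P, z):
    """Correct the estimate with a relative landmark observation."""
    S = H @ P @ H.T + R             # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
    x = x + (K @ (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

x = np.array([0.0, 5.0])            # initial estimate: robot at 0, landmark at 5
P = np.diag([0.0, 4.0])             # robot pose known, landmark very uncertain
x, P = predict(x, P, np.array([1.0, 0.0]))   # command: move 1 unit forward
x, P = update(x, P, np.array([3.9]))         # observe landmark 3.9 units ahead
```

After the update, the landmark variance P[1, 1] shrinks sharply. Maintaining the full joint covariance P over robot and landmark states is precisely what gives the method its O(N²) complexity as landmarks are added.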
Vanilla Kalman Filter techniques require linear process models with Gaussian noise. To extend their applicability to non-linear processes the Extended Kalman Filter algorithm was developed (Smith, Self et al. 1990). Extended Kalman Filters use the partial derivatives of the process and measurement functions to continually linearise around the current estimate. This technique allows the original Kalman Filter techniques to be extended to simple systems with non-linear characteristics. The method however is not practical when applied to more complex non-linear systems.
Standard Kalman Filter methods cannot handle the detection of two indistinguishable landmarks at separate locations in the environment. The detection of two such landmarks introduces a multimodal distribution representing that landmark's location, which does not fit within the unimodal Gaussian noise assumption of a vanilla Kalman Filter technique. Most practical implementations sidestep this issue by ignoring non-unique environment features, with the obvious disadvantage of discarding a large amount of potentially useful information about the environment. Extensions such as the Lu/Milios algorithm described by Lu et al. (1997) use maximum likelihood data association to recover from incorrect correspondences. The method works well when errors in the robot pose remain small, but fails when the error in pose is large.

The complexity of Kalman Filter methods scales quadratically with the number of landmarks or features in the environment. Computation can rapidly become significant as environment sizes or experiment durations increase. In response to this problem the Sparse Extended Information Filter (SEIF) method was developed (Thrun, Koller et al. 2002; Eustice, Walter et al. 2005). The SEIF method is based around pruning of the information or inverse covariance matrix used in the Kalman Filter update equations, and exploitation of the sparse nature of the resulting matrix. The complexity of SEIF methods generally scales with O(N), making them more computationally tractable. There is, however, generally some degradation of the resulting map's correspondence to the environment. For instance, Eustice, Walter et al. (2005) report significant global map inconsistency using SEIF methods. They propose a modified procedure that solves the global inconsistency problem but is no longer a constant-time algorithm.
3.1.2 Expectation Maximisation Methods

The Expectation Maximisation (EM) algorithm has somewhat complementary attributes to Kalman Filter approaches (Dempster, Laird et al. 1977). Its most significant advantage is the ability to solve the correspondence problem – in ambiguous, cyclic environments robots using EM techniques are still able to generate correct maps. The EM algorithm involves two steps:

• an expectation step that calculates the posterior over the robot poses for a single map, and
• a maximisation step that calculates the most likely map given the expected robot poses.

This two-step process is iterated to provide increasingly accurate maps of the environment. Each point in the estimated path of the robot is calculated using all temporal data, including both past and future data relative to the time associated with the pose point being calculated. The maximisation step finds the map that maximises the log likelihood of both the sensor measurements and the estimated robot path. EM methods solve this problem for discrete (x, y) locations, since there is no continuous solution. EM techniques solve the correspondence problem by generating multiple hypotheses about the robot's path during the expectation step. Each path represents different correspondences between the sensory data and features in the environment. During the maximisation step these correspondences create the features in the resulting map.
In the next expectation step, features that correctly represent the environment help maintain correct estimates of the robot's path. Incorrect features generate unsupported pose estimates which gradually disappear through further iterations. This process gives EM methods the ability to map large cyclic environments, as long as the motion model is representative of the potential cumulative odometric errors accrued at the end of a long loop in the environment. Fig. 3.1 shows an experiment run by Thrun (2002), in which a robot moved around a large cyclic indoor environment containing indistinguishable landmarks. The results of applying the EM algorithm are contrasted with the raw data. The EM algorithm was able to solve the correspondence problem between ambiguous landmarks and produce the topologically correct map shown in Fig. 3.1c.
Fig. 3.1. Maps produced with and without the EM algorithm. (a) Uncorrected trajectory of robot in a large cyclic environment with indistinguishable landmarks (shown by circles) (b) Occupancy grid map built using uncorrected odometry and sonar sensors. (c) Corrected trajectory and landmark locations using EM algorithm. (d) Resultant occupancy grid map (Thrun 2002). Reprinted with permission from Elsevier.
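The two-step structure described above can be illustrated with a toy one-dimensional analogue of the mapping problem: estimating the positions of two indistinguishable landmarks (the 'map') from unlabelled, noisy range readings. All readings, initial positions, and the noise value below are invented for illustration; this is the generic EM recipe, not the specific algorithm used in the experiment of Fig. 3.1.

```python
import math

# Unlabelled readings drawn from two indistinguishable landmarks.
readings = [1.9, 2.1, 2.0, 7.8, 8.2, 8.0]
landmarks = [0.0, 5.0]        # deliberately poor initial map
sigma = 0.5                   # assumed (known) sensor noise

def gauss(z, mu):
    """Unnormalised Gaussian likelihood of reading z given landmark mu."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2)

for _ in range(20):
    # Expectation step: posterior correspondence of each reading to each
    # landmark, given the current map estimate.
    resp = []
    for z in readings:
        w = [gauss(z, mu) for mu in landmarks]
        total = sum(w) or 1e-12
        resp.append([wi / total for wi in w])
    # Maximisation step: most likely landmark positions (weighted means)
    # given the expected correspondences.
    for j in range(2):
        weight = sum(r[j] for r in resp)
        if weight > 1e-12:
            landmarks[j] = sum(r[j] * z for r, z in zip(resp, readings)) / weight

print(sorted(landmarks))      # converges towards [2.0, 8.0]
```

Each iteration sharpens both the correspondences and the map, mirroring how, in the robotic case, correct features reinforce correct path estimates while unsupported ones fade away.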
The main weakness of the EM algorithm is that, unlike Kalman Filter approaches, it does not maintain a full notion of uncertainty. This means that the algorithm can get 'stuck' in local maxima during its iterative two-step map correction process. The algorithm is also computationally expensive, relegating it to offline use. The results shown in Fig. 3.1 took several hours to generate on a low-end PC.

3.1.3 Particle Filter Methods

Particle filters represent the belief about the robot's pose using a set of samples or particles distributed according to the posterior distribution over robot poses. In the
field of robot localisation these methods are often referred to as Monte Carlo Localisation (MCL) (Thrun, Fox et al. 2000). Rather than approximating the posterior in parametric form, MCL techniques use a distribution of weighted particles that approximate the desired distribution. This removes any restrictions regarding the shape or characteristics of the posterior distribution. MCL recursively uses a Bayes filter to estimate the posterior distribution of the robot poses. The filter estimates a probability density over the state space, using the sensory data. The belief set of robot poses, Bel(x_t), is given by:

Bel(x_t) = p(x_t | d_{0...t})        (3.3)

where x_t is the robot's pose state at time t, and d_{0...t} is the sensor data from time 0 up to time t. In the context of robot localisation there are usually two types of data – internal and perceptual. Internal data is produced by sensors such as wheel encoders and inertial motion units and can be used to update information about the robot's motion. Perceptual data is produced by external sensors such as laser range finders, sonar arrays, and cameras. By splitting the data into perceptual and internal types, exploiting some assumptions, and performing repeated integration, one can arrive at the recursive update equation in the Bayes filter. One critical assumption is the Markov assumption – that the future state is independent of past states as long as the current state is known. By exploiting this assumption multiple times, the update equation for the robot's possible pose states, Bel(x_t), is obtained as:

Bel(x_t) = η p(o_t | x_t) ∫ p(x_t | x_{t−1}, a_{t−1}) Bel(x_{t−1}) dx_{t−1}        (3.4)

where η is a normalisation constant, o_t is the set of perceptual observations at time t, and a_{t−1} is the set of internal observations at time t−1 (Thrun, Fox et al. 2000). This
equation forms the basis for many of the Monte Carlo Localisation algorithms. If updates are based only on internal observations the belief in robot pose will gradually spread out due to uncertainties in the motion of the robot. Fig. 3.2 shows a progression of sample approximations to a robot's pose as it moves around a rectangular path during an experiment by Thrun, Fox et al. (2000). The robot starts well-localised but becomes less and less certain of its location with each move it makes. The distribution shapes reveal the characteristics of the sampling model – uncertainties in angular motion are more critical to the localisation process than uncertainties in translational motion.

Fig. 3.2. The effect of the motion model on the robot's pose belief. As the robot moves around a rectangular path, the samples representing the belief in robot pose spread out according to the robot's motion model (Thrun, Fox et al. 2000). Reprinted with permission from Elsevier.

The p(o_t | x_t) component of Equation 3.4 is the key to avoiding endless smearing of the sample approximation of the robot's pose. A range of probability densities are incorporated into the calculation of p(o_t | x_t) to allow for uncertainty. Thrun, Fox et al. (2000) describe three types of probability density for use with range sensors: a Gaussian distribution used to model uncertainty in a sensor range measurement, an exponential density to model random readings from dynamic objects in the environment such as people, and a narrow uniform probability density located at the maximum range specification of the sensor, for maximum-range scans when no obstacle is detected.

Once the robot has a partial map of the environment, repeated observations using the robot sensors support the particles located in plausible locations, while unsupported particles quickly disappear. Robots can localise rapidly in large complex environments using this method, with reasonable computational requirements. Another advantage of this method is the ease with which the number of samples can be changed, to suit whatever computational resources are available. The main limitation of particle filters in their raw form is that they are only a localisation technique, not a simultaneous localisation and mapping technique. Extensions of vanilla MCL methods and integration with other probabilistic algorithms have however yielded efficient solutions to the full SLAM problem. One such algorithm is FastSLAM, developed by Montemerlo, Thrun et al. (2002). Like standard MCL methods, this algorithm uses a particle filter to estimate the posterior over robot paths. This is only one of a number of estimates, however; to estimate the positions of landmarks in the environment, each particle possesses as many Kalman Filters as there are landmarks. Each Kalman Filter is used to estimate the location of a landmark conditioned on the path estimate. The resultant map stores the location of landmarks such as rocks (Montemerlo, Thrun et al. 2002) or trees (Montemerlo, Thrun et al. 2003). Computationally, FastSLAM can be far less demanding than Extended Kalman Filter techniques. The only Gaussian distributions used are two-dimensional ones representing the (x, y) locations of landmarks. EKF methods need to maintain a five-dimensional Gaussian because the distribution needs to represent the robot's (x, y, θ)
state in addition to the landmarks' (x, y) states. This advantage alone allows the computation of the FastSLAM algorithm to scale with O(MK), with M the number of particles and K the number of landmarks (and hence Kalman Filters per particle). Montemerlo, Thrun et al. (2002) use a binary tree to represent the set of Gaussians associated with each particle to reduce the computation time to O(M log K). A second version of the FastSLAM system has also been developed, known as FastSLAM 2.0 (Montemerlo, Thrun et al. 2003). This method has been proven to converge with a single particle, and is more efficient than the first FastSLAM implementation. A recent approach by Grisetti, Stachniss et al. (2005) uses Rao-Blackwellized filters (Murphy and Russell 2001), with each particle representing a possible map and robot trajectory. In contrast to FastSLAM, the Rao-Blackwellized approach is used to build grid maps showing the occupancy state of the environment in a fine-grained grid. Although computationally similar to the FastSLAM algorithm, this method does not rely on predefined landmarks. The method was able to generate accurate high resolution grid maps in a number of large indoor and outdoor environments.

The three probabilistic algorithms – Kalman Filters, Expectation Maximisation, and particle filters – form the core of most of the top robotic mapping systems. They are typically used to form occupancy grid (Grisetti, Stachniss et al. 2005) or landmark-based maps (Nieto, Guivant et al. 2003). However, one other major type of world representation has been successfully used by mobile robots. The following section reviews topological mapping methods, which represent places in an environment and their connectivity.
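Before moving on, the recursive update of Equation 3.4 reduces in practice to a predict-weight-resample loop. The one-dimensional corridor world, door landmarks, motion noise, and beam model below are all invented to show that structure; real MCL implementations operate over (x, y, θ) poses with far richer sensor models.

```python
import math
import random

random.seed(1)
N = 500
true_pos = 2.0
doors = [2.0, 5.0, 9.0]              # known landmark positions (the map)

# Global uncertainty: particles spread uniformly along the corridor.
particles = [random.uniform(0.0, 10.0) for _ in range(N)]

def likelihood(p, z):
    """p(o_t | x_t): Gaussian beam model on range to the nearest door."""
    predicted = min(abs(d - p) for d in doors)
    return math.exp(-0.5 * ((z - predicted) / 0.2) ** 2) + 1e-12

for _ in range(5):
    # Motion update: the integral term of Eq. 3.4, sampled with noise.
    true_pos += 1.0
    particles = [p + 1.0 + random.gauss(0.0, 0.1) for p in particles]
    # Measurement update: weight each particle by p(o_t | x_t).
    z = min(abs(d - true_pos) for d in doors)
    weights = [likelihood(p, z) for p in particles]
    # Resample in proportion to weight; unsupported particles disappear.
    particles = random.choices(particles, weights=weights, k=N)

estimate = sum(particles) / N        # converges near the true position
```

Individually ambiguous readings leave the particle set multimodal, but the sequence of observations along the trajectory prunes the spurious modes, which is exactly the behaviour that lets MCL robots localise globally.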
3.2 Topological Mapping Methods

Topological approaches produce maps that represent places in the environment and how these places are connected. The concept of 'place' varies between different techniques. For some, a place is a local area of open space bounded by entrances (Kuipers, Modayil et al. 2004). Other techniques determine what constitutes a place by analysing the visual scene obtained from the robot's camera. The representation of place connectivity also varies – ranging from the binary knowledge of whether two places are connected to the detailed knowledge of the motor commands or movement behaviours required to move between the two places.

In one of the seminal pieces of work on topological mapping, Matarić (1990) describes a topological system that constructs a node-based map of an environment using landmarks detected by a robot's sonar sensors. The map is literally a graph of landmarks in the environment. Links are formed between nodes to represent topological adjacency. The lack of explicit Cartesian information in the graph map resulted in the optimality problem of choosing between the shortest topological and the shortest physical path when navigating to a goal. To address this problem a notion of time-as-distance was introduced, which allowed landmark nodes to be assigned an approximate physical size based on the robot's transition time across the landmark.

Vision-based topological approaches learn the visual scenes associated with places in the environment. Often only important places in the environment are learned – for instance in a maze environment the robot may only learn visual scenes at critical junctions
in the maze, but not during corridor traverses (Franz, Scholkopf et al. 1998). Other algorithms base the learning of places directly on the visual sensory input – unique or feature-rich snapshots of the environment trigger the learning of a place. Often it is effective to simply learn places at regularly spaced intervals. Rybski, Zacharias et al. (2003) describe a mapping system that learns a new place for every 1.07 metres the robot travels, with each place associated with a panoramic image. Other systems learn new places only when the previously stored image is insufficient for representing the environment, using difference methods such as the Sum of Squared Differences (SSD) (Vassallo, Santos-Victor et al. 2002).

Hybrid topological methods use range sensors to generate local Cartesian occupancy maps which are used as places in the map, even though the global representation is topological. Kuipers, Modayil et al. (2004) describe local perceptual maps (LPMs), which are bounded occupancy grids generated using a laser range finder. LPMs are inter-connected by gateways, which are areas of minimum obstacle clearance around each LPM, and the travel actions required to move between the LPMs. This approach has the advantage of producing accurate local metrical maps while taking advantage of the computational benefits of a global topological map. However, the recognition and use of LPMs does require a high accuracy range sensor and needs to be customised to suit the type of environment. A similar approach by Jefferies and Yeap (2000) combines local Absolute Space Representations (ASRs) with a global Memory For the Immediate Surroundings (MFIS). The computational requirements of hybrid topological methods can be significantly reduced by storing space histograms in x and y, rather than storing local occupancy grid maps (Duckett and Nehmzow 1999).
Localisation is achieved by convolving the current histograms with the stored histograms to find the best matching location. This method requires a compass to remove the problem of self-orientation, but results in only minimal information loss when compared with storing the full local occupancy grids. To address the problems of perceptual aliasing and sensor noise, an iterative localisation algorithm is used to maintain multiple robot location hypotheses. Even after becoming completely lost in an ambiguous environment, a robot was able to use the algorithm to re-localise.

Transition information between places can be learned in a number of ways. Vassallo, Santos-Victor et al. (2002) describe a topological mapping system where the robot learns a motor vocabulary representing the robot's set of actions for moving around an environment. Links in the final topological map are associated with a specific motor word that describes the movement action required to go between the linked nodes. This approach integrates the motion abilities of the robot into the topological map and provides a more intuitive form of navigation, when compared to routes that are specified by (x, y, θ) co-ordinates. Other hybrid systems such as the MFIS represent transitions between places by exits, which are physical gaps at the boundary of an ASR that the robot can move through into another ASR.
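The threshold-on-difference idea behind SSD-driven place learning can be sketched in a few lines. The flat integer 'images', the threshold value, and the `observe` helper below are toy constructions; real systems such as that of Vassallo, Santos-Victor et al. (2002) compare full camera or panoramic images.

```python
def ssd(a, b):
    """Sum of Squared Differences between two equally sized images."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

places = []                  # stored snapshot images, one per learned place
NEW_PLACE_THRESHOLD = 10     # illustrative value, tuned per sensor in practice

def observe(image):
    """Learn a new place only if no stored snapshot matches well enough;
    otherwise return the index of the best-matching existing place."""
    if not places or min(ssd(image, p) for p in places) > NEW_PLACE_THRESHOLD:
        places.append(image)
        return len(places) - 1
    return min(range(len(places)), key=lambda i: ssd(image, places[i]))

observe([0, 0, 5, 5])        # first view: a new place is learned
observe([0, 1, 5, 5])        # small change: matches the stored place
observe([9, 9, 0, 0])        # very different view: a second place is learned
```

Only when the stored snapshots become insufficient to describe the current view does the map grow, which keeps the number of places bounded in visually repetitive regions.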
3.3 Exploration, Navigation, and Dealing with Change

Many methods can solve the SLAM problem using an appropriate set of assumptions, such as the assumption that the environment is static. However, SLAM in a static
environment is only one part of the complete mapping and navigation problem. This section describes the additional problems of exploring an unknown environment, navigating to goals, and learning and coping with the dynamic aspects of an environment.

3.3.1 Exploration

Exploration has been defined as "…the task of guiding a vehicle during mapping such that it covers the environment with its sensors" (Stachniss and Burgard 2003). Research into exploration methods has usually addressed the problem in isolation, rather than together with the SLAM problem. Frontier-based exploration is one of the simplest techniques and has been widely tested (Edlinger and von Puttkamer 1994; Yamauchi 1997). Using this exploration technique the robot moves towards boundaries or frontiers between known and unknown parts of the environment. The greedy implementation of this technique guides the robot towards the closest unknown frontier, which has been shown to result in reasonably constrained travel distances (Koenig, Tovey et al. 2001). Frontier-based exploration is relatively easy to integrate into most existing mobile robot architectures, although its performance in realistic long-term robot experiments with uncertainty is untested.

More sophisticated methods take into account other factors such as environment uncertainty and the knowledge gain potential of candidate locations for exploration. For instance, the Next-Best-View (NBV) algorithm evaluates new candidate locations based on the expected information gain, the motion cost required to move there, and the overlap between the current and candidate 'safe region' (Gonzalez-Banos and Latombe 2002). Unlike the greedy frontier-based technique, the robot may choose a more distant location if it considers that location to have more information gain potential than closer places. Fig. 3.3 shows the NBV candidate selection process.
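For comparison with such richer selection criteria, the greedy frontier-based strategy itself reduces to very little code. The occupancy grid encoding (0 free, 1 occupied, −1 unknown) and the Manhattan-distance tie-break below are illustrative choices, not taken from the cited implementations.

```python
# Occupancy grid: 0 = free, 1 = occupied, -1 = unknown (illustrative encoding).
grid = [
    [ 0,  0,  0, -1],
    [ 0,  1,  0, -1],
    [ 0,  0,  0, -1],
    [-1, -1, -1, -1],
]

def frontiers(grid):
    """Free cells adjacent to at least one unknown cell."""
    rows, cols = len(grid), len(grid[0])
    result = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 0:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == -1
                   for nr, nc in neighbours):
                result.append((r, c))
    return result

def closest_frontier(robot, grid):
    """Greedy strategy: head for the nearest frontier (Manhattan distance)."""
    return min(frontiers(grid),
               key=lambda f: abs(f[0] - robot[0]) + abs(f[1] - robot[1]))

target = closest_frontier((0, 0), grid)
```

Repeatedly driving to the returned target and re-sensing pushes the frontier outward until no free cell borders unknown space, at which point the environment is covered.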
The overlap criterion is an example of incorporating the requirements of the localisation and mapping system into the exploration process. In this case, the robot will only move to new regions that overlap somewhat with the current observable region, so it can remain localised. In experiments, the NBV algorithm was able to perform at a level comparable to a human operator. Other metrics apart from knowledge gain can be used to evaluate potential exploration paths, such as sensing time and planning time (Moorehead, Simmons et al. 2001).

Most of the methods described so far rely on some form of Cartesian occupancy grid map. The binary 'occupied or not' representation is unsuitable for representing the uncertainty the robot may have regarding a cell's state. Coverage maps address this uncertainty by storing a posterior over the amount of a cell that is covered by an object (Stachniss and Burgard 2003). To explore the environment the robot must reduce the entropy of the posterior for a cell in the coverage map. This measure also allows the robot to determine how thoroughly a cell has been sensed, by monitoring the reduction in cell entropy on consecutive viewings.

In metric, feature-based maps, exploration can be performed by navigating towards the open end of the feature closest to the robot's starting location, such as a wall segment (Wullschleger, Arras et al. 1999). Under this exploration method the starting location is the position in which the robot starts at the beginning of the experiment, not
Fig. 3.3. Candidate selection using the NBV algorithm. After five sensing operations the robot has identified an unexplored frontier (a), which is sampled for candidate exploration points (b). Each candidate point has a potential visibility gain (c), but is also evaluated based on other criteria such as motion cost to produce an optimal candidate (d) (Gonzalez-Banos and Latombe 2002). Reprinted with permission from Sage Publications.
the starting position before each exploration segment. This method ensures the robot regularly traverses a well-known area that is suitable for re-localisation before exploring each new area. Large complex environments can cause many SLAM methods to fail due to their inability to close the loop under significant odometric error (Lu and Milios 1997). Exploration techniques that rely purely on metric information can subsequently fail. One solution to this problem is to use exploration methods based on a hybrid map combining metric and topological information. One example is the exploration method shown in Fig. 3.4, which uses the Spatial Semantic Hierarchy (SSH) (Kuipers and Byun 1991). Distinct places in the environment are stored as nodes in the global topological map, with each node storing a local metric map. The metric map can be analysed to find the possible exploration routes from each distinct place (for instance, paths through doorways). The robot maintains an exploration agenda for each node, which stores information about where and in which direction it should explore. The method's success relies on the ability of the mapping module to distinguish between similar places; it can fail if the uniformity of the environment is too great.
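The entropy-reduction criterion used by coverage maps can be illustrated with a single cell. The three-level coverage discretisation and the sensor likelihoods below are invented for illustration; actual coverage maps (Stachniss and Burgard 2003) use finer discretisations and calibrated sensor models.

```python
import math

def entropy(posterior):
    """Shannon entropy of a discrete posterior over coverage values."""
    return -sum(p * math.log2(p) for p in posterior if p > 0)

# Posterior over coverage levels {0, 0.5, 1} for one cell (toy discretisation).
prior = [1 / 3, 1 / 3, 1 / 3]          # unexplored: maximally uncertain

def update(posterior, likelihood):
    """Bayes update of the cell posterior with a sensor likelihood."""
    unnorm = [p * l for p, l in zip(posterior, likelihood)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# A 'free' reading is most consistent with low coverage (invented likelihoods).
free_reading = [0.8, 0.15, 0.05]
after_one = update(prior, free_reading)
after_two = update(after_one, free_reading)
```

Each consistent observation reduces the cell's entropy, so monitoring the entropy drop across consecutive viewings indicates how thoroughly the cell has been sensed, and cells with high remaining entropy are the natural exploration targets.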
Fig. 3.4. Exploration method using the Spatial Semantic Hierarchy. DP – distinctive place, nbhd – neighbourhood, LCS – local control strategies (Kuipers and Byun 1991). Reprinted with permission from Elsevier.
3.3.2 Navigating to Goals

Planning a route to a goal and following it in a simulated world is a relatively easy task, and has been the subject of much research. However, success in simulation is by no means an indicator of whether a method will work well on a robot in a complex environment. This section's primary focus is on goal navigation techniques that have been tested on robots in real world environments.

Approaches to equipping a robot with a goal navigation capability have varied, depending on the underlying localisation and mapping systems. One of the few methods used on a real robot in a complex environment is the motion planning algorithm used by the interactive museum tour-guide robot Minerva (Thrun 2000). Minerva operated in large, open environments that were densely populated by moving people. Maintaining localisation was very important in such an environment, so the navigation system used a coastal planner to stay close to known objects (Roy, Burgard et al. 1999). The coastal planner method used Partially Observable Markov Decision Processes (POMDPs) to minimise the expected cumulative cost of the actions required to reach a goal; in this case, cost was the robot's ability to localise along the path. The algorithm emphasised minimisation of localisation error at the expense of path length. The planning algorithm also relied on the Cartesian correctness of the map, which was generated a priori. Minerva was nevertheless able to robustly navigate around its environment during a two-week period using this motion planning algorithm.

The Exploration and Learning in Dynamic ENvironments (ELDEN) system uses an adaptive place network (APN) to learn the topology and geometry of an environment (Yamauchi and Beer 1996). Place units in the network represent regions in Cartesian space and links between place units represent the spatial relationship between
the units and the robot's confidence that the link is traversable. Links are bidirectional and store both the transition from place unit A to place unit B and from B to A. When a goal is specified, the ELDEN system assigns each link a cost that is inversely proportional to the robot's confidence in the traversability of the link. Dijkstra's algorithm is used to find the lowest cost path from each place unit to the goal location. Each place link is then assigned an orientation indicating the movement direction required to move closer to the goal. After this algorithm has been applied, each place unit has only one link that it can traverse to move closer to the goal. The robot uses its reactive movement behaviours to follow these links to the goal.

ELDEN was tested in a 14 × 8 metre laboratory area containing a range of static and dynamic objects, including people, boxes, chairs, desks, and carts. After twenty minutes of exploration, the system navigated successfully to several goal locations even in the presence of dynamic obstacles. However, the re-localisation procedure underpinning this system is relatively crude. Localisation success relies on the robot being able to navigate back into the vicinity of its starting location, before performing a dead reckoning manoeuvre to the centre of this region and re-localising using a hill-climbing grid matching routine. The dependency on successful homing means the system cannot re-localise from large odometric errors, or close the loop. To address this limitation, Yamauchi (1996) proposed a system of storing local evidence grids with each unit, with re-localisation achieved by matching the current evidence grid to the stored grids. The proposed approach has similarities to the Spatial Semantic Hierarchy's use of local perceptual maps and would also require that local regions be uniquely identifiable.
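The goal recall step described above, assigning each link a cost inversely proportional to traversal confidence and running Dijkstra's algorithm from the goal, can be sketched as follows. The place network, confidence values, and 1/confidence cost function are illustrative only, not Yamauchi and Beer's actual parameters.

```python
import heapq

# Adjacency: place unit -> {neighbour: confidence the link is traversable}.
network = {
    'A': {'B': 0.9, 'C': 0.2},
    'B': {'A': 0.9, 'D': 0.8},
    'C': {'A': 0.2, 'D': 0.9},
    'D': {'B': 0.8, 'C': 0.9},
}

def costs_to_goal(goal):
    """Dijkstra from the goal; link cost is inversely proportional to
    the confidence that the link can actually be traversed."""
    dist = {goal: 0.0}
    queue = [(0.0, goal)]
    while queue:
        d, place = heapq.heappop(queue)
        if d > dist.get(place, float('inf')):
            continue                      # stale queue entry
        for neighbour, confidence in network[place].items():
            nd = d + 1.0 / confidence
            if nd < dist.get(neighbour, float('inf')):
                dist[neighbour] = nd
                heapq.heappush(queue, (nd, neighbour))
    return dist

def next_place(current, goal):
    """The single link to follow: minimise link cost plus remaining cost."""
    dist = costs_to_goal(goal)
    return min(network[current],
               key=lambda n: 1.0 / network[current][n] + dist[n])

step = next_place('A', 'D')   # prefers the high-confidence route via B
```

Even though the route through C has fewer uncertain hops in this toy network, the low-confidence A-C link makes the path via B cheaper, mirroring how ELDEN biases the robot towards links it trusts.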
In conducting a review of conventional goal navigation methods for robots, it quickly becomes apparent that there is little published literature on techniques that have been integrated into SLAM systems. There has been related path planning research in simulated environments (Hu, Khang et al. 1993; Marzouqi and Jarvis 2004; Niederberger and Gross 2004) and in structured physical environments (Browning 2000; Tews 2003). However, very little research has addressed the problem in scenarios where the environment is initially unknown and complex enough to effectively test a system’s SLAM abilities, such as the ability to close large loops. This is somewhat surprising given the range of algorithms with good SLAM performance, such as the Spatial Semantic Hierarchy or FastSLAM. There are two possible reasons for this lack of research. It is possible that conventional SLAM systems have fundamental characteristics that make them unsuitable for use in goal recall; this is discussed further in Chapter 6. Alternatively, these areas may simply not yet have been explored: only in the last few years has the robot mapping and navigation field reached the stage where several capable SLAM systems exist that are suitable for testing goal navigation methods. Either possibility is a good reason for the further investigation carried out in this book.

3.3.3 Dealing with Dynamic Environments

Most real-world environments are dynamic to some extent. To function fully in dynamic environments, a mapping and navigation system must address the extra issues arising from changes in the environment. The dynamic nature of an environment can
vary significantly in timescale and characteristics, making the problem of mapping and navigating challenging. The difficulty of the problem has restricted the amount of research into mapping and navigation in dynamic environments; the vast majority of SLAM research assumes the environment is effectively static. This section focuses on mapping and navigation methods that specifically address at least some aspects of the dynamic problem.

In environments where moving people are common, dynamic objects may be filtered out of the sensory data to avoid corrupting the localisation process. This is a relatively simple approach in that, although the dynamic objects are identified, they are not actively learned or used as part of the mapping and navigation process. Range sensor readings can be used or discarded based on the calculated probability that a reading represents a known or unknown dynamic object. This approach has been successfully implemented on a robot equipped with a 2D laser range finder in a museum environment with dense crowds of people (Fox, Burgard et al. 1999). Other approaches have tracked the location of people over time using a conditional particle filter and a 2D laser range finder (Montemerlo, Thrun et al. 2002). Using a pre-learned map, the algorithm was able to localise, track moving people, and detect new permanent obstacles in the environment. In this case the robot started with a reasonably current map, and performed only localisation, rather than full simultaneous localisation and mapping.

Some methods attempt to track the discrete movement, appearance, and disappearance of objects in the environment over time. These techniques focus on learning the different locations of dynamic objects rather than on tracking their actual movement. One such technique, known as ROMA, uses the Expectation Maximisation algorithm to learn models of non-stationary objects.
It uses a map differencing technique to detect changes in an environment (Biswas, Limketkai et al. 2002). The technique is sufficient to identify and track the movement of unique objects using a laser range finder, and can handle variation in the total number of objects over time. However, the method is limited to obstacles detected in the scanning plane of the laser and requires the cross section of an object to remain constant after movement, allowing only rotations around the object’s vertical axis. Objects that are placed close together also cause recognition problems. These problems are not necessarily solvable using a laser range finder, suggesting that the approach may be overly precise and that a more robust, approximate process may be more suitable.

The other main area of research into mapping and navigation in dynamic environments has focused on adapting topological representations of an environment to represent changes. The ELDEN system uses the concept of places and confidence links between them to represent the robot’s certainty that a link can be traversed (Yamauchi and Beer 1996). The links also store the direction required to traverse them. Three situations can result in a link’s confidence value being reduced when the robot attempts to travel from A to B: the robot leaves A but, through motor error or sensor noise, ends up somewhere other than B; an unexpected obstacle causes the robot to end up in a different location than B; or the pathway between A and B is completely blocked and the robot cannot make any progress. This type of approach differs significantly from those described earlier in that it implicitly rather than explicitly encodes changes in the environment. The robot’s representation of the environment is updated based on the robot’s success in moving
around it, rather than through explicit identification of dynamic obstacles. The implicit learning of change has some advantages over explicit approaches, such as removing the dependence on accurate recognition of objects. The robot does not need to know the exact shape, size, and appearance of an obstacle placed in its path, only that the obstacle has caused it to take a slight detour.
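This implicit encoding of change can be illustrated with a toy update rule that nudges a link’s confidence toward success or failure after each traversal attempt. The exponential form and the rate value below are assumptions for illustration only, not ELDEN’s actual equations.

```python
def update_confidence(confidence, success, rate=0.3):
    """Move a link's traversability confidence toward 1.0 after a
    successful traversal and toward 0.0 after a failure (a blocked
    path, an unexpected obstacle, or arrival at the wrong place).

    The exponential update and rate are illustrative assumptions;
    ELDEN's exact update rule differs.
    """
    target = 1.0 if success else 0.0
    return confidence + rate * (target - confidence)
```

Repeated failures steadily decay a link’s confidence without the robot ever identifying what is blocking the path, which is precisely the advantage of the implicit approach described above.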
3.4 Discussion

In summary, there has been a great deal of research into developing robotic mapping and navigation systems. The majority of the most successful current robotic mapping methods are to some degree probabilistic, in order to deal with the inherent uncertainty of sensors and the environment. Under appropriate conditions some of these systems can solve the core SLAM problem in large, real-world, static environments. There has been relatively little research into the concurrent problems of real-world exploration, path planning, and adaptation to environment change, which are also integral parts of the mapping and navigation problem. Each of these challenges has been extensively addressed in isolation and in simulation, but rarely in parallel with the SLAM problem on an actual robot. It remains to be seen whether the mapping algorithms used by the ‘best’ SLAM systems are indeed optimal when the entire mapping and navigation problem is considered.
4 Biological Navigation Systems
Many animals possess formidable navigation abilities. Research into the exact capabilities and limitations of the mapping and navigation systems used in nature provides two valuable outcomes: knowledge of what animals and insects can achieve using their own sensory and computing equipment, and theories and models of the mechanisms they use to do so. The first outcome is important because it demonstrates that many navigation tasks can be achieved with a surprisingly simple set of sensors. For instance, a high fidelity laser range finder is not necessary for effective navigation in a range of environments: most animals do not possess sensors that can measure the range to obstacles lying anywhere from a few millimetres to ten or more metres away, to a precision of ten millimetres. Experiments have demonstrated that rats can continue to navigate effectively even in pitch blackness, with the implication that some navigation systems can cope without a constant stream of feature-rich information (Quirk, Muller et al. 1990; Mizumori and Williams 1993). Understanding the achievements of biological agents, particularly given their sensory input, provides much insight into the robot mapping and navigation problem faced by researchers.

This chapter examines the mapping and navigation capabilities of a variety of animals. The following chapter then covers attempts by researchers to create computational models of these biological systems.
4.1 Rodents and the Cognitive Map

In the field of biological mapping and navigation, rodents are among the most studied animals. There is a well established base of research into their capabilities in a large range of experimental environments and situations. In addition, the rodent hippocampus is one of the most studied brain regions of any mammal (Fig. 4.1). This has led to a number of well established models of mapping and navigation in rodents, although there are still many areas of debate and controversy.

At the core of this research is the concept of a cognitive map, first introduced by Tolman (1948). The cognitive map concept embodies the hypothesis that an animal or insect can learn information not specifically related or useful to the current task being
Fig. 4.1. The rodent hippocampus. CA1, CA2, CA3: cornu ammonis fields 1–3; DG: dentate gyrus; EC: entorhinal cortex; f: fornix; s: septal hippocampal pole; S: subiculum; t: temporal hippocampal pole (Amaral and Witter 1995; Cheung and Cardinal 2005). Reprinted with permission from Elsevier.
performed. Tolman discussed this latent learning with respect to rats: rodents released in an environment with no food reward were able to use the information learned at that time to later navigate to food sources placed in the environment. In the context of rodent navigation, the cognitive map is often theorised to be a spatial reference that some animals may create and use to navigate in their environments.

Early work with rats identified place cells in the rodent hippocampus, which appeared to respond to the spatial location of the animal (O'Keefe and Dostrovsky 1971; O'Keefe and Conway 1978). Later work identified another set of neurons that responded to the animal’s orientation, subsequently called head direction cells (Ranck 1984). Recent research has also uncovered evidence of grid cells, which
fire at regular grid-like intervals in the environment (Fyhn, Molden et al. 2004; Hafting, Fyhn et al. 2005). However, most models of the rodent hippocampus to date have focused on head direction and place cells. The following sections describe theories of the mechanisms by which rats map and navigate around their environments.

4.1.1 Head Direction and Place Cells

Head direction cells are sensitive to the orientation of the rodent’s head, with each cell’s firing rate peaking when the rodent’s head faces a specific direction (Ranck 1984; Taube, Muller et al. 1990). The orientation of peak firing is often referred to as the cell’s preferred direction or orientation. A cell’s firing rate falls as the rodent’s head rotates away from the cell’s preferred orientation (Fig. 4.2). The firing rates depend on the allocentric orientation of the head, not on the orientation of the body (Taube, Muller et al. 1990). Firing rates appear to be mostly independent of the rat’s location or current behaviour (motionless, walking, running, eating, and grooming), although some head direction cells appear to depend on factors such as the animal’s translational and angular velocity. For instance, recent work by Sargolini (2006) found a positive relationship between rodent speed and head direction cell firing rate for a majority of the observed head direction cells.
Fig. 4.2. Firing rate of cells as a function of head direction. Recording took place over an eight minute period using 6° bins (Taube, Muller et al. 1990). Copyright 1990 by the Society for Neuroscience.
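The tuning described above and shown in Fig. 4.2 is often summarised as a bell-shaped curve centred on the cell’s preferred direction. A minimal sketch follows; the peak rate and tuning width are illustrative values only, not fits to the recorded data.

```python
import math

def hd_firing_rate(head_dir, preferred_dir, peak_rate=40.0, width=30.0):
    """Gaussian-like tuning curve of a head direction cell (degrees).

    Firing peaks when the head faces the cell's preferred direction
    and falls off with angular distance from it. The peak rate (Hz)
    and width are assumed, illustrative parameters.
    """
    # Smallest signed angular difference, wrapped into [-180, 180).
    delta = (head_dir - preferred_dir + 180.0) % 360.0 - 180.0
    return peak_rate * math.exp(-0.5 * (delta / width) ** 2)
```

The wrap-around handling matters: a head direction of 350° is only 20° away from a preferred direction of 10°, and the sketch treats it as such.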
The firing rates of head direction cells are controlled by both allothetic and ideothetic information. Vestibular and proprioceptive signals give the rat the ability to integrate rotational movement over time, allowing the firing profile across the head direction cells to be updated as the orientation of the head changes during movement in an environment. Experiments described by Mizumori and Williams (1993) observed the directional firing qualities of cells in rats in the dark. Directional information was retained even in the absence of any visual cues for short periods of time, and was updated by ideothetic input. Allothetic cues also influence the firing of head direction cells: experiments have shown that rotating a dominant visual cue within an experimental arena can cause approximately the same shift in the preferred orientation of the head direction cells (Taube, Muller et al. 1990). Like robots, rats that are forced to rely only on self-motion information inevitably build up large position errors over
time (Etienne, Maurer et al. 1996). So while ideothetic sensory information may be the primary means by which a rat updates its perceived position (Arleo 2000), it is likely that rats also rely on external cues to maintain the accuracy of their head direction system (Knierim, Kudrimoti et al. 1995; Etienne, Maurer et al. 1996).

Place cells are the two-dimensional cousins of head direction cells, corresponding to the rat’s location within the environment. Place cells fire maximally when the rat is located at a certain location in the environment, and fire to a lesser degree as the rodent moves away from this location. The area of the environment within which a particular place cell fires is known as the place field of that cell. However, the firing of place cells is also affected by a number of other factors, such as visual cues, barriers, and food reward locations (Knierim, Kudrimoti et al. 1995; Gothard, Skaggs et al. 1996; Knierim 2002). Like head direction cells, place cell activity can change to reflect movement of the rodent even in the absence of visual cues (O'Keefe and Conway 1978; Muller and Kubie 1987; Quirk, Muller et al. 1990). In experiments with rats that had been blind from an early age, Save, Cressant et al. (1998) demonstrated that place cells still exhibited normal functionality. Firing rates were, in general, lower in the blind rats than in sighted rats; one possible reason is that place cell firing rates depend on a combination of sensory inputs, with the deprivation of visual input resulting in a lower overall firing rate. Place cells also respond to visual cues: when distal cues are rotated, place fields have been observed to rotate a corresponding amount (O'Keefe and Conway 1978; Muller and Kubie 1987). Local visual cues appear to have less influence, except when they are proximal to a reward location of some kind (Redish 1999).
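The interplay described above, with ideothetic integration providing the primary update and allothetic cues correcting accumulated drift, can be sketched as a simple complementary update on a heading estimate. The gain value and the degree-based representation are assumptions for illustration, not a model of any specific cell population.

```python
def update_heading(heading, angular_velocity, dt, cue_bearing=None,
                   cue_gain=0.2):
    """Integrate ideothetic (vestibular/proprioceptive) angular velocity,
    then nudge the estimate toward an allothetic visual cue when one is
    visible. All angles are in degrees; the gain is an assumed value.
    """
    # Ideothetic update: dead-reckoned rotation over the timestep.
    heading = (heading + angular_velocity * dt) % 360.0
    if cue_bearing is not None:
        # Allothetic correction: partial pull toward the cue's bearing.
        error = (cue_bearing - heading + 180.0) % 360.0 - 180.0
        heading = (heading + cue_gain * error) % 360.0
    return heading
```

In darkness (`cue_bearing=None`) the estimate drifts with any error in the integrated velocity; with a cue visible, each update removes a fraction of the accumulated error, mirroring the cue-anchoring behaviour reported in the experiments above.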
Inconsistent path integration and external cue information can also compete to control place cell firing, as demonstrated in experiments by Gothard, Skaggs et al. (1996). Place fields take different shapes in geometrically different environments; for instance, Muller, Kubie et al. (1987) describe experiments in cylindrical and rectangular environments that produce differently shaped place fields. These experiments also introduced barriers of varying size into an arena. A barrier’s effect depended on its size: small barriers that did not change the rat’s movement significantly had little effect, but large barriers disrupted place fields that overlapped the barrier. This suggested that locomotion information could influence the formation and shape of place fields (Arleo 2000).

One important characteristic of the hippocampal electroencephalogram is the theta rhythm, an observable sinusoidal rhythm of between 7 and 12 Hz (O'Keefe and Recce 1993). This rhythm manifests itself in place cell firing through a phase correlation. As the rat enters the place field of a specific place cell, the place cell initially fires late in the theta period. As the rat moves through the place field, the cell fires at progressively earlier times in the theta period. This phase shift is known as phase precession. While the firing of a place cell merely locates the rat somewhere within that cell’s place field, the phase precession gives the location of the rat within that place field. The theta rhythm is thought to possibly play a role in goal navigation processes, which are discussed in the following section.

Many of these theories of mapping and localisation in the rodent hippocampus have been tested using computational models. This work is explored in detail in Chapter 5.
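The extra information carried by phase precession can be made concrete with a toy linear model: firing phase as a function of how far through the place field the rat has travelled. The entry and exit phases below are assumed, illustrative values; real precession is not exactly linear.

```python
def theta_phase(frac_through_field, entry_phase=340.0, exit_phase=40.0):
    """Firing phase (degrees within the theta cycle) of a place cell as
    the rat crosses its place field: late in the cycle on entry,
    precessing to earlier phases toward the exit.

    frac_through_field: 0.0 at field entry, 1.0 at field exit.
    Entry/exit phases are illustrative assumptions.
    """
    assert 0.0 <= frac_through_field <= 1.0
    return entry_phase + frac_through_field * (exit_phase - entry_phase)
```

Reading the phase thus recovers position within the field, whereas the firing rate alone only says the rat is somewhere inside it.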
4.1.2 Exploration, Navigation, and Dealing with Change

Compared to the wealth of experimental evidence regarding spatial mapping in the hippocampus, there is relatively little information suggesting how a rat uses these internal representations to navigate between places. Research regarding goal recall tends to be more speculative in nature: it is relatively easy to observe the correspondence between the firing rates of different cells and the location of a rat, but much harder to link these firing rates to a mechanism of purposeful navigation. The literature on this topic contains a number of interesting observations and proposed models of task-oriented behaviour.

In T-maze experiments, rodent hippocampal cells were found to fire differently while the rat was traversing the central maze stem depending on whether it subsequently turned left or right at the T-intersection (Wood, Dudchenko et al. 2000). Further tests and statistical analysis found that the firing difference was most dependent on the turn taken, rather than on other factors such as speed, heading, or position. These results suggest that the hippocampus may encode specific episodic memories which might be used in performing navigation tasks. Other studies have found cells that rely on a larger number of variables: the rodent’s location, the significance of that location, and behaviour (Markus, Qin et al. 1995). In tests in alternating dark and light environments, some place cells appeared to fire based on the recent experience of the rat, indicating the hippocampus has access to short term memory (Quirk, Muller et al. 1990). When a rat repeatedly navigates between two reward locations, some hippocampal neurons gradually change their firing patterns and eventually fire maximally when near one of the reward locations (Kobayashi, Tran et al. 2003).
When such rats were placed in a different environment without rewards, the firing patterns were maintained if the rat navigated purposefully towards an expected goal location, but were not observed in randomly moving rats. The experimental results supported the theory that during purposeful behaviour the rat develops hippocampal cells that encode a specific action of the rat at a specific location: in effect, place-action cells. It is quite difficult to account for the full range of experimental observations of rats using only a spatial or only a non-spatial model of hippocampal function; combined models are the most successful at explaining the wide range of observations.

Some research has focused on developing a plausible framework for how this information is distributed in the hippocampus. One model posits distinct clusters of neurons distributed longitudinally along the hippocampus (Hampson, Simeral et al. 1999). In the model, each cluster contains neurons that respond to a specific task involving spatial navigation. Spread throughout all these clusters are also cells that encode the non-spatial aspects of the tasks. The stronger anatomical structuring of spatially-related cells suggests their use as a base mode of spatial-only navigation. The authors suggest that repeated specific activity would form neuronal firing patterns within each of these clusters in response to behaviourally significant events, through behavioural or cognitive manipulations.

Some simulated models have used the hippocampal theta rhythm as the basis for goal-directed spatial navigation (Gorchetchnikov and Hasselmo 2002). In simulation experiments, long term potentiation (LTP) of excitatory recurrent connections between two bodies of cells occurred while the simulated rat performed food foraging behaviour in a T-maze (Koene, Gorchetchnikov et al. 2003). To recall known paths, cell activity
was introduced into these two bodies of standard cells in order to trigger predictive firing of a third body of cells that coded for the goal direction. The model produced several useful insights into learning and goal recall mechanisms. For instance, it suggests that the theta rhythm can create distinct learning and retrieval phases, and that it can help maintain a buffer of cell activity of the kind required for episodic memory.
4.2 Other Animals and Insects

Since this book takes a pragmatic rather than biologically faithful approach to solving the mapping and navigation problem, it is useful to examine the capabilities of animals other than rats. This section describes the mapping and navigation abilities of bees, ants, primates, and humans.

4.2.1 Bees

Insects are generally studied somewhat differently from other animals due to their size and lack of a mammalian brain; much of the research focuses on their navigation capabilities, with theories regarding the models or mechanisms by which they navigate mostly inferred from their actions in various experiments.

Honey bee workers fly many hundreds of kilometres during their lifetime foraging for nectar and pollen. They are equipped with a range of different sensors to help them move around in their environment. Their primary sensors are two compound eyes, each made up of thousands of light-sensitive cells. The compound eyes can detect both the quantity of light and colour. In contrast, the three simple eyes arranged in a triangle on the top of the bee’s head are thought to primarily detect the amount of light present. The antennae on the bee’s head are thought to be used for both odour detection and flight speed measurement. Fig. 4.3 shows some of the basic anatomy of a honeybee worker.

Foraging honeybees are known as central-place foragers because they have a common point at which to start and finish their foraging trips. Typically, the locations to which they forage are within two to three kilometres of this central location, although they have been observed to forage up to 13.5 km from their colony (Frisch 1967; Ratnieks 2000). A large body of work has focused on what type of navigation methods bees might use to perform their impressive feats of navigation.
Most models of bee navigation have been developed by subjecting bees to carefully devised sets of tests aimed at isolating various aspects of the navigation problem. This process of reverse engineering is not an exact science, and seemingly identical tests by different researchers have produced significantly different results. As a consequence, there is a wide range of theories of bee navigation. Researchers do agree on some components of the bee’s navigation process, such as the method of spatial communication between bees (Frisch 1967; Michelsen 1993). Upon returning to the hive, scout bees perform either a round dance or a waggle dance to inform other foragers about the direction and distance of food from the hive. The waggle dance conveys this information in two ways. The direction of the straight dance segment, relative to the vertical surface the bee is dancing on, tells other bees the direction of the food source relative to the sun. The duration of the
Fig. 4.3. Anatomy of a worker bee
waggle part of the dance indicates how far away the food source is from the hive. The simpler round dance is used for food sources close to the hive and indicates only direction.

Models of a bee’s mapping and navigation processes vary from one extreme to another. Wehner, Bleuler et al. (1990) propose the local rule concept: that bees navigate using a number of independent memories. Under this model, bees navigate using a combination of dead-reckoning and route-specific landmark memories. Route-specific landmarks are used in the temporal order in which they were learned. They are not stored in a general map context and hence cannot be used to generate novel behaviour; instead, the local cue memories prompt changes of behaviour in particular situations. The cognitive map models proposed by Gould (1986) are at the other end of the spectrum. Gould presents experiments suggesting that it is not only vertebrates that are able to form and use some sort of cognitive map, but also some invertebrates such as bees.

Because of the number of possible factors involved, hypothesising models of bee navigation is somewhat speculative. There has been extensive debate in the literature about many aspects of the field, such as whether bees use a cognitive map or instead rely on dead reckoning and cue-based navigation (Gould 1986; Gould 1988; Wehner, Bleuler et al. 1990). Often there is an inability to replicate the results of earlier experiments; in other cases new aspects of the problem have rendered previous experiments inappropriate or inconclusive. Only after rigorous and exhaustive testing can some aspects of these models be validated. One example is the extensive work by Esch and Burns (1996) and Srinivasan, Zhang et al. (2000) on the theory of bee distance estimation. Results from a large number of experiments
strongly suggest that honeybees use image motion to determine their flight distances, rather than energy consumption as suggested by early work in the field (Heran 1956).

4.2.2 Ants

In order to be able to return to their home, desert ants perform their own form of path integration when they leave the nest. Experiments by Muller and Wehner (1988) using the species Cataglyphis fortis suggest that an ant keeps track of the average of all directions in which it has moved, weighted by the distance moved in each direction. The reference vector used for calculating the new movement angle is the previous homeward course direction. This model matches the experimental results obtained with real ants to a very high degree and also predicts the systematic errors that an ant’s navigation system accumulates. The navigation system leads to only small errors in most situations, except when an ant performs sudden backward turns, which the experiments show happens only rarely. Since the angular errors are systematic, equally large clockwise and anticlockwise turns generate errors that cancel out. The experiments also reveal that the ants do not usually have a direction bias during their foraging activities.

Wehner, Gallizzi et al. (2002) describe the navigation strategy that desert ants have adopted to deal with errors in their path integration process. When returning to the nest from a foraging location, the ants always use their home vector to return to the supposed location of the nest. The home vector is calculated during the outbound trip using weighted angular averaging, as described previously. Usually, for longer trips, there is some path integration error and the ant returns to a location near, but not at, the nest. It then embarks upon a systematic search to find the nest within that local area. It is interesting to note that repeated foraging trips to the same location result in only minimal correction of the home vector.
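The home vector idea can be sketched as an idealised, error-free path integrator that accumulates (heading, distance) steps and returns the bearing back to the nest. Note this Cartesian integration is a simplifying assumption: the weighted angular-averaging model of Muller and Wehner (1988) only approximates it, which is precisely what predicts the ants' systematic errors.

```python
import math

def integrate_path(steps):
    """Accumulate an outbound path as (heading_deg, distance) steps and
    return the home vector as (heading_deg, distance) pointing back to
    the nest. Idealised Cartesian path integration for illustration;
    the ant's actual mechanism is an approximation of this.
    """
    x = y = 0.0
    for heading, dist in steps:
        x += dist * math.cos(math.radians(heading))
        y += dist * math.sin(math.radians(heading))
    # The home vector is the reverse of the net displacement.
    home_heading = math.degrees(math.atan2(-y, -x)) % 360.0
    return home_heading, math.hypot(x, y)
```

An ant that walks 3 units east then 4 units north is 5 units from the nest, and the home vector points back into the third quadrant; any error in the integration instead leaves the ant near, but not at, the nest, triggering the systematic search described above.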
However, through repeated trips the area searched during the systematic search process does become biased towards the true nest location. It seems that desert ants rely heavily on the systematic search process even when given the opportunity to correct errors in their home vector.

Desert ants also seem to adjust their navigation behaviour in a number of ways based on landmarks in their environment. Collett, Collett et al. (2001) describe experiments in which an extended landmark (rather than a point landmark) was placed between the ants’ nest and a food source. By kidnapping ants at various stages during their journey from the food source back to the nest, the researchers were able to test the effect of the visual cues on the ants’ navigation. When the local movement vector dictated by the remembered path back to the nest had approximately the same orientation as the vector suggested by the visual cue, the ants used a weighted average of the two to determine their movement. However, when there was a significant discrepancy between the two vectors, the ants appeared to ignore the local movement vector and use the vector suggested by the visual cue.

4.2.3 Primates

Studies of insects have focused on analysing their actions during experiments and proposing plausible models of mapping and navigation that support the experimental
observations. Mammals have much larger and more accessible brains than insects, which allows researchers to examine neuronal activity during experiments. This has resulted in a large amount of research focusing on the neuronal mechanisms by which mammals achieve mapping and navigation.

Primates possess head direction cells, located in the presubiculum in the Macaca mulatta monkey. Primate head direction cells are quite similar to those found in rodents. The firing rates of the cells depend on the head direction of the monkey, but not its eye direction, as shown in experiments by Robertson, Rolls et al. (1999). Cells fire maximally for a certain head direction, with their firing rate dropping off in a Gaussian-like manner as the monkey’s head direction moves away from the cell’s preferred direction. Rolls (2005) describes the mean half-amplitude width of the cells’ tuning curves as 76°. Head direction cells are not affected by the location of the monkey within the environment, and will give similar responses for the same head direction even if the monkey has been moved and the visual scene is completely different.

Primates also possess spatial view cells, whose responses are based on where the monkey is looking. The information regarding where the monkey is looking is encoded in allocentric coordinates, storing the absolute location of the viewed place rather than its location relative to the monkey’s current position and head orientation. This makes them quite different from the place cells observed in rats, which encode the rat’s current location within the environment. Spatial view cells exhibit useful characteristics in certain situations. Experiments by Robertson, Rolls et al. (1998) tested the response of these cells when the view details were partially or completely obscured by either curtains or darkness.
Spatial view cells in the CA1 region of the hippocampus, the parahippocampal gyrus, and the presubiculum responded when the monkey was gazing towards a certain region even if none of the original view details could be seen. Other cells, however, such as those in the CA3 region, appeared to rely on the familiar visual input, and hence did not respond when the view details were obscured. The characteristics of the first set of cells suggest a possible role in episodic memory, for instance remembering where an object was last seen. Because the memory is independent of the location of the monkey within the environment, it is potentially quite useful: the monkey can remember where something was regardless of where the monkey is located. Cells that respond to the monkey looking at a certain spatial location, regardless of the current view, suggest the possibility of recalling where an object is through memory, even without current visual confirmation.

O'Mara, Rolls et al. (1994) describe an experiment in which a monkey was passively moved on a robot platform in a cue-controlled 2 × 2 × 2 metre test chamber. The experiments determined that there are cells in the primate hippocampus that respond to whole-body motion. Cells responded selectively to different types of motion: some cells were sensitive only to linear motion, while others responded to axial motion, and different linear movement directions triggered responses in different cells. Some cells’ responses were modulated by visual input; when the visual input was removed, some cells continued to respond to whole-body motion, while the responses of others were much diminished or non-existent. One neuron responded only to external rotation of the environment and cues while the monkey was stationary. The researchers pointed out the relevance of these results to situations where part of the environment moves
but with no associated movement of the observer, or extended periods of movement when there are no vestibular signals (such as on a train journey).

4.2.4 Humans

Humans generally possess good vision sensors but relatively poor self-motion sensors. Their main advantage over animals and insects is the sophistication of their brain and higher level thinking processes. Despite lacking absolute sensory equipment such as a magnetic field detector, and despite inaccurate path integration ability, most humans are able to navigate effectively in huge, complex, and dynamic environments. The mapping and navigation processes in humans are relatively unknown, especially when compared with well-studied animals such as rodents. The hippocampus, although strongly associated with episodic memory, is also believed to play a role in navigation, but the specifics remain unknown. One hindrance to further understanding of navigation processes in humans is the inability to conduct invasive examination in order to use the standard brain analysis techniques employed on rodents or monkeys. Instead, less invasive techniques such as Positron Emission Tomography (PET) are used; ethical and safety considerations mean only a subset of the techniques used on animals can be applied to humans. Despite these limitations some initial findings have been published regarding proposed models of navigation in the human brain. In experiments involving taxi drivers in London (Maguire, Frackowiak et al. 1997), PET was used while drivers recalled and drove along complex routes around the city. Only experienced taxi drivers were used, whose knowledge of the streets had been reinforced over years of taxi work. Drivers were also required to recall famous landmarks from locations they had never visited, and to describe the plot progression and individual scenes of films.
Each task involved either topographical (route recall) or non-topographical knowledge (landmark recall), and either could contain a sequential component (plot progression) or not (scene recall). The right hippocampus was active during route recall tasks that involved some use of a spatial map. Straight landmark recall in a non-spatial context did not activate the right hippocampus. However, the right hippocampus was also found to be activated during topographical unlearning. Maguire, Burgess et al. (1998) further describe a set of experiments involving humans navigating in a virtual reality town (a level from the computer game Duke Nukem 3D). After familiarisation periods of exploration in the environment, human subjects performed a number of navigation tasks: navigation directly to an unobscured goal, indirect navigation to a goal around obstacles placed in the virtual world, and path-following navigation along a trail of visual arrows. The right hippocampus was found to be more active during navigation than trail-following. Better navigation performance resulted in higher levels of activation in the right hippocampus. The researchers proposed that this part of the brain enabled navigation to unseen goals. The left hippocampus was generally more active during successful navigation than unsuccessful navigation, whether trail-following or goal finding. This activity was interpreted as corresponding to maintenance and recall of possible routes to the goal.
The real world environment was significantly larger and more complex than the simulated environments; for example, there were many more possible routes in the London environment than in the virtual reality town. Maguire, Burgess et al. (1998) proposed that the right hippocampus may be involved in higher level spatial tasks and decision making. The semantic memory tasks, in which a driver was required to recall a film plot, activated mainly the left side of the brain, with different activity to that displayed during route recall.
4.3 Discussion

Within the body of knowledge on mapping and navigation in animals, it is clear that the rodent is both the most studied animal and the best understood, in terms of both the neural mechanisms by which it navigates and creates a map and its real world capabilities. As a result the rodent is the 'animal of choice' in attempts to create artificial models of biological systems. Although the navigation capabilities of insects are well known, little is known about the neuronal mechanisms they use to perform that navigation, which makes modelling these mechanisms a speculative process. Because of ethical considerations, relatively little is known about the neuronal mapping and navigation mechanisms in primates or humans either. Given the focus of this book on models of biological systems that can be applied in a practical robot context, it seems reasonable that further work focus on the rodent hippocampus. As such, the following chapter describes state of the art models of mapping and navigation in the rodent hippocampus.
5 Emulating Nature: Models of Hippocampus
This chapter covers the state of the art models of the rodent hippocampus and their ability to solve each component of the mapping and navigation problem. Rodent hippocampal models can be separated into two groups. The first contains models that have been developed using theories of the neuronal mechanisms by which rats navigate and create a map. By testing these models in simulation or on a robot these theories can be validated or disproved. These models can also be used to make predictions that can be tested by further rodent experiments. Most of the models described in this chapter fall into this group. The second group of models is much smaller and contains models that have been developed for practical purposes, such as for controlling a mobile robot. These models vary in their trade-off between biological faithfulness and practical performance, with none matching the mapping performance of the top probabilistic techniques. This chapter also describes these models, with Chapter 6 discussing the implications of their limited practical performance.
5.1 Head Direction and Place Cells – State of the Art

Models of the rodent hippocampus often consist of head direction and place cell structures. Because head direction cells relate to the rodent's one-dimensional head orientation state, the computational algorithms required to simulate them are generally less complex than those required to simulate place cells, which relate to the rat's two-dimensional location state.

5.1.1 Attractor Networks

Models of head direction and place cells often use some form of attractor network (Redish, Elga et al. 1996; Zhang 1996; Samsonovich and McNaughton 1997; Stringer, Rolls et al. 2002; Stringer, Trappenberg et al. 2002). Typically an array of cells is used, with nearby cells linked by strong excitatory connections, and distant cells linked by inhibitory connections. The stable state of such a network is a single cluster of active cells with a peaked activity distribution. The orientation represented by the current activity state can be determined in several ways, such as through population vector decoding or by simply picking the most highly activated cell.
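To make the idea concrete, a one-dimensional ring attractor of this kind can be sketched in a few lines. All parameters here (cell count, weight width, inhibition strength, update scheme) are illustrative assumptions rather than values from any published model:

```python
import numpy as np

N = 72                                    # one cell per 5 degrees (assumed)
theta = np.arange(N) * 2 * np.pi / N      # preferred direction of each cell

# Local excitation minus uniform inhibition ('Mexican hat' style weights)
d = np.angle(np.exp(1j * (theta[:, None] - theta[None, :])))
W = np.exp(-d**2 / (2 * 0.3**2)) - 0.2

r = np.full(N, 0.05)                      # weak uniform background activity
r[20] = 1.0                               # a brief cue seeds activity at cell 20
for _ in range(200):
    r = np.maximum(0.0, W @ r / N)        # rectified linear dynamics
    r /= r.max() + 1e-9                   # crude global normalisation

# Population vector decoding of the represented orientation
decoded = np.angle(np.sum(r * np.exp(1j * theta)))
```

After the network settles, distant cells are silenced by inhibition and a single activity cluster remains centred on the seeded cell; `decoded` and `np.argmax(r)` give the same answer at fine and coarse resolution respectively.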
Fig. 5.1. Attractor network model of the rodent head direction system. Activity in the outer ring of head direction cells encodes orientation. The inner two rings are vestibular cells that respond to angular velocities. Links between visual cells and head direction cells are modified using a Hebbian learning rule. Head direction cells also have strong intrinsic connections that stabilise the system to a single localised cluster of active cells (Skaggs, Knierim et al. 1995). Reprinted with permission from The MIT Press.
5.1.2 Path Integration

Activity injected into an attractor network near either side of the current activity peak will cause the peak to shift towards the injected activity through the network dynamics. This characteristic is commonly used as a means of path integration, although the technique by which new activity is injected can vary. One method by Skaggs, Knierim et al. (1995) uses two sets of 'rotation' cells, one set each for the clockwise and counter clockwise directions of rotation, as shown in Fig. 5.1. Cells in these sets have both a preferred orientation and a preferred angular velocity direction. When the orientation and angular velocity match the cell's preferred state, the cell projects energy to nearby cells. Clockwise cells
project energy to head direction cells in a clockwise direction, and vice versa for counter clockwise cells. Although no results were presented for this model, a more complex implementation has been tested on a robot and found to be successful (Arleo 2000). The coupled attractor model of rodent head direction produces tuning curves that closely match those observed in two sections of the rodent brain (Redish, Elga et al. 1996). This model uses two populations of cells to represent the postsubiculum (PoS) and anterior thalamic nuclei (ATN) in the rodent brain (Fig. 5.2). When there is no angular rotation, strong reciprocal matching connections between cells of the same preferred orientation in the PoS and ATN ensure stable, synchronised Gaussian activity peaks in each cell body. Offset connections point from cells in the PoS to offset cells in the ATN, and have their strength modulated by the sign and magnitude of the angular velocity. During turning, activity in the ATN population leads that in the PoS population. This lead behaviour is supported by experimental findings that ATN cell activity is correlated with the head direction of a rodent approximately 20–40 ms in the future (Blair and Sharp 1995).
Fig. 5.2. A coupled attractor model of rodent head direction. A turn to the right results in active right-offset connections but no left-offset connections from the PoS cells to the ATN cells. Strong matching connections are always active between cells in the two bodies with the same preferred orientation (Redish, Elga et al. 1996). Reprinted with permission from Informa Healthcare.
Path integration can be achieved in other ways, such as by adding a slight asymmetry to the intrinsic head direction excitatory links (Zhang 1996). The introduction of an 'odd' weight component causes activity to shift at a rate dependent on the magnitude of this odd component. For Zhang's model particular care was taken to ensure that the shift in the activity profile of the head direction cells preserved its shape, and that the angular speed corresponding to the shift was proportional to the magnitude of the odd component. Zhang supported his decision to focus on maintaining a symmetric activity profile even during movement with experimental findings on rats (Blair and Sharp 1995). Other experimental evidence, however, suggests that firing curves are distorted as a function of the angular velocity (Blair, Lipscomb et al. 1997), and that firing rates are at least somewhat dependent on the magnitude of the angular velocity (Taube 1995).
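The effect of an odd weight component can be demonstrated with a toy ring network: adding an antisymmetric (derivative-like) term to an otherwise symmetric kernel makes the activity packet rotate. This is a sketch of the principle only; the kernel shape, inhibition level, and the asymmetry gain `gamma` are invented for illustration, not Zhang's published parameters:

```python
import numpy as np

N = 72
theta = np.arange(N) * 2 * np.pi / N
d = np.angle(np.exp(1j * (theta[:, None] - theta[None, :])))
even = np.exp(-d**2 / (2 * 0.3**2))          # symmetric ('even') weights
odd = d * even                               # derivative-like 'odd' component

def settle(gamma, steps=100):
    """Run the ring with weight asymmetry gamma; return cumulative drift (rad)."""
    W = even + gamma * odd - 0.2             # uniform inhibition keeps a bump
    r = np.exp(-d[:, 20]**2 / (2 * 0.3**2))  # activity packet seeded at cell 20
    drift, prev = 0.0, theta[20]
    for _ in range(steps):
        r = np.maximum(0.0, W @ r / N)
        r /= r.max() + 1e-9
        cur = np.angle(np.sum(r * np.exp(1j * theta)))
        drift += np.angle(np.exp(1j * (cur - prev)))   # unwrapped step drift
        prev = cur
    return drift
```

With `gamma = 0` the packet stays put; a positive `gamma` makes it rotate steadily in the positive direction, which is the behaviour exploited for path integration.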
5.1.3 Head Direction Correction Using Allothetic Information

Purely ideothetic updates are vulnerable to cumulative error and require the addition of some sort of allothetic correction mechanism in order to provide long term functionality. One such correction mechanism involves associating visual cues with the head direction cells corresponding to the cue's direction in an egocentric rather than allocentric reference frame (Skaggs, Knierim et al. 1995). This process only works for distant cues, for which the relative orientation to the cue is the same at all environment locations. When visual cues are not sufficiently distant, their relative bearing can change depending on the robot's location within the environment. However, if visual cues are rotationally symmetric (appear the same from all directions) and the mapping and navigation system retains a Cartesian sense of the robot's pose, then the egocentric bearing to a visual cue can be converted into an allocentric bearing. In Arleo's (2000) hippocampal model visual bearing cells are used to encode the robot's egocentric bearing; cells fire as a function of the angle between the robot's heading and the bearing to a light cue. During learning, active visual bearing cells and place cells are associated with calibration (CAL) cells, which are in turn associated with head direction (poSC) cells (Fig. 5.3a). Later, after learning is complete (the 'non-learning' stage), CAL cells fire only when the robot has a specific orientation relative to the visual cue and a specific spatial location (Fig. 5.3b). In effect the CAL cells associate the allocentric head directions of the robot with robot locations and relative bearings to a landmark. The calibration process consists of aligning the activity in the head direction cells with the activity in the CAL cells, which is achieved through the CAL–head direction connections.
The learning of place-orientation contexts and the calibration of head direction are separate processes that do not occur simultaneously.
Fig. 5.3. Bearing and place cell models. (a) Both visual bearing (VIS) and place cells (sLEC) are connected to calibration (CAL) cells, and each calibration cell projects to all head direction (poSC) cells. (b) The calibration cells learn place-orientation contexts through VIS and sLEC inputs passing through separate neuroreceptors nr1 and nr2 (Arleo 2000). Reprinted with permission of the author and Verlag-Dissertation.
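The geometric core of this calibration idea, correcting a drifting heading estimate from a cue whose allocentric bearing is known at the current place, reduces to a short sketch. This illustrates the principle only, not Arleo's neural implementation; the function names and correction gain are assumptions:

```python
import math

def wrap(a):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def corrected_heading(hd_estimate, ego_bearing, cue_allo_bearing, gain=0.2):
    # A cue seen at egocentric bearing `ego_bearing`, whose allocentric bearing
    # from this place is known, implies heading = cue_allo_bearing - ego_bearing.
    implied = wrap(cue_allo_bearing - ego_bearing)
    # Nudge the path-integrated estimate toward the cue-implied heading.
    return wrap(hd_estimate + gain * wrap(implied - hd_estimate))
```

Applied repeatedly as the cue is re-observed, the estimate converges on the cue-implied heading, which is exactly the role the CAL–head direction connections play in the model.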
5.1.4 Place Cells – State of the Art

Apart from their dimensional encoding differences, place cells and head direction cells have some similarities. By extending the one-dimensional head direction algorithms to two dimensions, it is possible to create a simple place cell model that emulates some of the observed properties of place cells. However place cells also have many unique observed properties. Consequently many of the state of the art place cell models have been developed independently of the head direction models in an attempt to emulate some of these unique properties.

5.1.5 Place Cells Through Allothetic Cues

Many of the early place cell models form place cell maps through sensory input from visual cues in the environment. For instance, one model uses the distance to cues in the environment to form the firing fields of simulated place cells during exploration (Burgess, Recce et al. 1994). A set of 'sensory' cells is assigned to each cue in the environment, with each cell responding maximally to the cue at a different distance. The cells use a broad tuning curve so that they still respond, to a lesser degree, at similar distances. An intermediate layer of entorhinal cells (EC) receives input from the sensory cells to produce a large firing field, as shown in Fig. 5.4. The entorhinal cells project to the place cells in the model, which are connected to subicular cells. Place cells and subicular cells have forward and backward inhibition; however, the place cell inhibition is stronger, resulting in smaller place fields than in the subicular cells. The model differs from many others in that cells have discrete firing responses measured in terms of firing spikes rather than a graded response. Cells are grouped around those with the largest inputs – the cell with the largest input in each group is active but the others are silent. A simple set of goal cells was used to associate directions to goals with the current simulated rat's location as given by the subicular cells.
This model was tested in a simulated 1.5 × 1.5 m environment, including a 150 mm border, in which up to 16 cues were placed. The simulated rat moved around the environment at a constant speed of 600 mm/s to simplify parts of the modelling process, rather than in a more biologically plausible stop-start manner. The simulated rat was allowed to explore the environment for 30 seconds before a goal was inserted, although the simulated rat would not always encounter the goal location during this time, meaning exploration periods were often longer than 30 seconds. After encountering the goal the simulated rat was required to navigate to it from eight different starting locations around the edge of the environment. This procedure, repeated five times with different goal locations, was known as a block of trials. On average only 30 to 60 seconds of exploration in the absence of a goal enabled the simulated rat to navigate back to the goal after encountering it once. The average time taken to reach the goals in each block of trials, or escape latency, reduced as successive goal recall trial blocks were performed with no further exploration. The average escape latency also reduced as the number of cues in the environment increased from 4 to 16.
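The distance-tuned 'sensory' cells at the input of this model can be sketched as broad Gaussian tuning curves. The tuning width and preferred distances below are illustrative assumptions, not the published parameters:

```python
import numpy as np

def sensory_cells(cue_distance, preferred, width=0.2):
    """Broadly tuned 'sensory' cells: each fires maximally when the cue sits
    at its preferred distance, with Gaussian fall-off for nearby distances."""
    return np.exp(-(cue_distance - preferred)**2 / (2 * width**2))

# Responses of five cells tuned to distances 0.2..1.0 m for a cue 0.55 m away
preferred = np.linspace(0.2, 1.0, 5)
response = sensory_cells(0.55, preferred)
```

The broad overlap between neighbouring tuning curves is what lets the downstream entorhinal layer combine several partially active cells into a large, smooth firing field.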
Fig. 5.4. Model relating cues, head direction cells and place cells. Environmental cues determine the input to the network which feeds through to the paired place and subicular cell networks to produce place field firing patterns (Burgess, Recce et al. 1994). Reprinted with permission from Elsevier.
The main limitation of the model created by Burgess, Recce et al. (1994) was its dependence on visual cues and its lack of an update mechanism such as path integration in the absence of visual input. Unique point-like cues were used and were fairly evenly distributed, removing the problem of coping with ambiguity or real world features. However its rapid one-shot learning of goals, latent learning ability and short cut ability are all characteristics that are desirable on a mobile robot. Place cell firing activity can become somewhat direction dependent in narrow environments (Muller, Bostock et al. 1994; Markus, Qin et al. 1995). This has prompted the development of directional place cell or view cell models. In these models, such as the one shown in Fig. 5.5, the view cells merge the ‘where’ and ‘what’ information (Gaussier, Revel et al. 2002). Perirhinal (Pr) and parahippocampal (Ph) cells take the ‘what’ from an object recognition network as input and the ‘where’ from an object azimuth detection network. The Pr-Ph activity is calculated by multiplying the recognition level of the feature by the angular distance to the learned azimuth. Place cell activities are determined by normalising the sum of these activated Pr-Ph cells during place learning.
This model differs from experimental observations in that it produces very large place fields, about two metres wide, much larger than the 100 to 200 mm wide fields observed in rats (Gaussier, Revel et al. 2002). The authors point out that since place cells only fire when the rat is in close proximity to their fields, the rat can only recognise its exact location and has no concept of distances to other locations in the environment, since no cells are firing that encode those other locations. In other models additional mechanisms or representations have been required in order to perform goal navigation (Touretzky and Redish 1995; Endo and Arkin 2003; Cuperlier, Quoy et al. 2005; Hafner 2005). However, with the large place fields produced by this model, local goal navigation can be achieved by using a gradient climbing procedure through the place field of the goal place cell.
Fig. 5.5. A model of view cells. The ‘what’ – object recognition – and ‘where’ – object location – inputs to the perirhinal and parahippocampal cells. The normalised sum of the Pr-Ph activity produces the place cell activity (Gaussier, Revel et al. 2002). Reprinted with kind permission from Springer Science and Business Media.
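Gradient climbing through a broad goal place field can be sketched by sampling the goal cell's activity at candidate headings and stepping toward the highest value. The sampling scheme, field width, and step size are assumptions for illustration, not the published method:

```python
import numpy as np

def goal_activity(pos, goal, sigma=1.0):
    """Broad Gaussian place field of the goal place cell (width assumed)."""
    return float(np.exp(-np.sum((np.asarray(pos) - goal)**2) / (2 * sigma**2)))

def step_toward_goal(pos, goal, step=0.1, n_headings=16):
    # Evaluate the goal cell's firing at a ring of candidate positions and
    # move to the one where it fires most strongly (gradient climbing).
    headings = np.linspace(0.0, 2 * np.pi, n_headings, endpoint=False)
    candidates = pos + step * np.stack([np.cos(headings), np.sin(headings)], axis=1)
    return max(candidates, key=lambda p: goal_activity(p, goal))

goal = np.array([1.0, 1.0])
pos = np.array([0.0, 0.0])
for _ in range(30):
    pos = step_toward_goal(pos, goal)
```

Because the field is wide enough to be non-zero far from the goal, the gradient is informative everywhere the agent is likely to be, which is exactly why the model's large fields make this simple strategy viable.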
5.1.6 Place Cells Through Ideothetic Information

Place cell firing fields can also be established through the use of ideothetic information such as wheel encoder counts from a mobile robot. In this methodology the simulated rat or robot initially relies completely upon path integration to establish the firing properties of its place cells. Over time, the growing population of place cells can become associated with local views of the environment, with both associations shown by the model in Fig. 5.6 (Arleo 2000). In this model place cells are recruited incrementally as the robot explores the environment. A threshold limits the number of place cells that can be active at any one time – new cells are not recruited if the area is already sufficiently represented within the place cell network. Connections are learned between sets of cells representing both ideothetic and allothetic stimuli. During exploration the robot is given a monotonically
Fig. 5.6. Allothetic and ideothetic information influences place cells. Initially inertial stimuli drive the recruitment of place cells, but gradually visual stimuli start to become more dominant (Arleo 2000). Reprinted with permission of the author and Verlag-Dissertation.
Fig. 5.7. Place cell firing rates during a robot experiment. (a) Coverage of the arena by place fields, with each dot the centre of a place cell’s place field. The dot colour corresponds to the activation level of the place cells associated with each place field centre. (b) 600 × 600 mm testing arena, with bar-coded walls. The robot’s position corresponds to the firing activity shown in (a). (c) A sample activity profile for a robot located in the top right corner of the arena (Arleo 2000). Reprinted with permission of the author and Verlag-Dissertation.
growing sense of uncertainty, which eventually prompts it to return to its home location. The robot then recalibrates its path integrator using allothetic input learned at its home location. This approach avoids the problem of sensory aliasing, because ideothetic input can disambiguate situations of ambiguous allothetic stimuli. In addition, place cell activity can be updated in the absence of visual input. Well-conditioned place cells are produced through their dependence on two coherent allothetic and ideothetic spatial representations. An example of the place cell array firing rates is shown in Fig. 5.7.

The influence of stimuli on the place cells can also develop in the opposite order. In models where both allothetic and ideothetic stimuli drive place cell activity, cell activity can initially be driven solely by allothetic cues, with internal ideothetic connections evolving over time. One such model is shown in Fig. 5.8 (Stringer, Rolls et al. 2002). Like purely allothetic models, activity is initially driven by visual input. Each place cell is associated with the unique location in the environment at which the cell is maximally activated as a result of visual input. The tuning curve of each place cell due to visual input is Gaussian shaped.
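The incremental recruitment scheme described above for Arleo's model, adding a place cell only when no existing cell already represents the current path-integrated position strongly enough, can be sketched as follows (the field width and recruitment threshold are assumed values, not those of the model):

```python
import numpy as np

class PlaceCellLayer:
    def __init__(self, sigma=0.3, threshold=0.6):
        self.centres = []          # firing-field centres in path-integration coords
        self.sigma = sigma
        self.threshold = threshold

    def activities(self, pos):
        """Gaussian firing rate of every recruited cell at position pos."""
        pos = np.asarray(pos, dtype=float)
        return np.array([np.exp(-np.sum((pos - c)**2) / (2 * self.sigma**2))
                         for c in self.centres])

    def visit(self, pos):
        # Recruit a new cell only if the area is not yet represented strongly.
        a = self.activities(pos)
        if a.size == 0 or a.max() < self.threshold:
            self.centres.append(np.asarray(pos, dtype=float))

layer = PlaceCellLayer()
for x in np.linspace(0.0, 2.0, 41):    # simulated exploration along a line
    layer.visit((x, 0.0))
```

The threshold produces a roughly even spacing of fields: each new cell is recruited just outside the coverage of its predecessors, so the number of cells grows with the explored area rather than with the time spent exploring.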
Fig. 5.8. Two-dimensional continuous attractor model of place cells. There are three types of input; visual, head direction cells that provide orientation information, and forward velocity cells that provide translational velocity information (Stringer, Rolls et al. 2002). Reprinted with permission from Informa Healthcare.
These visually driven firing profiles are used to form appropriate recurrent links within the place cells and links from other cells that encode ideothetic information. The model learns these links using two different processes. Hebbian learning is used to update links between co-activated place cells, with the rationale that cells active at the same time probably represent close physical locations, due to the hard-coded firing profiles caused by visual stimuli. In addition, a trace rule is used to update the
recurrent weights based not only on their current activity but also their past firing activity. The second rule is used to produce broader firing profiles for the place cells than can be obtained by using the Hebbian rule, even if the initial visual stimuli produce narrow activity profiles (Stringer, Trappenberg et al. 2002). To ensure stable activity packets in the network for periods when the simulated agent is not moving, current activity in the network is reinforced. This ensures that there is no activity packet drift due to the weight density irregularities in the rectangular place cell matrix. During initial exploration, synaptic weights are also updated, based on the activity in pairs of place cells as well as the activity in the head direction and forward velocity cells (Eq. 5.5). The learning rule associates place cells that were active slightly in the past with currently active place cells, the currently active head direction cells, and the currently active forward velocity cells. The change in the synaptic weight between two place cells,
\delta w^{FV}_{ijkl}, is given by:

\delta w^{FV}_{ijkl} = k \, r^{P}_{i} \, \bar{r}^{P}_{j} \, r^{HD}_{k} \, r^{FV}_{l}    (5.5)

where k is the learning rate, r^{P}_{i} is the firing rate of place cell i, \bar{r}^{P}_{j} is the trace firing rate of place cell j, r^{HD}_{k} is the firing rate of head direction cell k, and r^{FV}_{l} is the firing rate of forward velocity cell l. Learning this association ensures that place cell i fires when place cell j, head direction cell k, and forward velocity cell l are all active. In this way co-firing of all three sets of cells evolves links that stimulate place cell activity to appropriately represent the movement of the agent through the environment. Simulations showed that the model can self-organise during the initial learning phase based on visual input, and produce stable place cell firing patterns. The simulated agent was also able to update its state when placed in simulated darkness (all visual cues removed) using only ideothetic cues, hence demonstrating the suitability of the learned ideothetic connections. The system's performance was found to rely on the shape of the hard-coded activity profiles caused by visual input, and the implicit Cartesian information stored in the relative locations of these firing fields.

In work by Browning (2000), a minimalist place memory learning system was developed that could bind visual stimuli to place representations. The system achieved localisation through binding of visual information directly to spatial representations of position and heading, as well as the use of a path integration module. Fig. 5.9 shows the overall system, known as a minimalist LV-PI (Local View-Path Integration) model. During robot experiments in a number of small arenas, the system was able to localise with respect to heading but not location. The failure was primarily attributed to shortcomings in the view processing system; to address these Browning proposed a more advanced vision system inspired by fixation of interest, shifting of interest, and feature tracking.
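The four-way product of Eq. 5.5 maps directly onto an outer product over the four rate vectors. The array sizes, the binary rate vectors, and the learning rate in this sketch are arbitrary illustrative choices:

```python
import numpy as np

def trace_rule_update(w, r_place, r_trace, r_hd, r_fv, k=0.01):
    """Eq. 5.5: dw[i,j,k,l] = k * r_i^P * rbar_j^P * r_k^HD * r_l^FV."""
    dw = k * np.einsum('i,j,k,l->ijkl', r_place, r_trace, r_hd, r_fv)
    return w + dw

# Toy example: place cell 0 active now, place cell 1 active in the trace,
# head direction cell 0 and forward velocity cell 1 active.
w = np.zeros((2, 2, 2, 2))
w = trace_rule_update(w, np.array([1.0, 0.0]), np.array([0.0, 1.0]),
                      np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

Only the single synapse whose four pre- and post-synaptic cells all fired is strengthened, which is the conjunctive association the rule is designed to learn.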
Fig. 5.9. Minimalist LV-PI model as implemented by Browning (2000). Path integration (PI) is used to create the world representations used by the robot. Place is represented in the Place Code (PC) cells and heading is represented in the Head Direction (HD) cells. Output from the vision processing module is bound to both representations through associative learning. Reprinted with permission of B. Browning.
5.1.7 Navigation

The spatial characteristics of place cells make them suitable for use in navigation. Arleo's doctoral work developed very detailed models of head direction and place cells for use in navigation (Arleo 2000). The models were created with a strong emphasis on recreating experimental observations of rats. Towards this end, the system used biologically plausible path integration mechanisms and allothetic visual cue responses. A temporal difference learning technique known as Q-learning was applied to the spatial representations to create navigational maps that could be used for goal recall. Place cells were used to drive locomotor action cells within a reward-based model of navigation. The models were all tested on a Khepera mobile robot in a small 600 × 600 mm arena with bar-coded walls and an 800 × 800 mm arena with an external visual cue but no walls. The work contributed a number of insights into the neurophysiological processes involved in the spatial cognition of a rat, as well as a computational robot brain that could map and navigate within its limited environments.

Other characteristics of the rodent hippocampus, such as the theta rhythm, have also been modelled. In the model by Burgess, Recce et al. (1994), firing fields are constructed during exploration by using cells that are tuned to respond to visual cues at various distances from the rat. Each field is seen as an approximate radial basis function, with the fields together representing the surface of the environment. To create the theta rhythm relationship, the firing phase of cells is related to the average angle between the rat's heading and two visual cues (Fig. 5.10). This creates the desired effect of cells firing later when their place field is directly ahead of the rat, and earlier when the place field is mostly behind the rat. Intermediate firing times are obtained for place fields mainly to the side of the rat.
The model also contains a body of cells corresponding to cells observed downstream of the hippocampus. These cells use the firing rate of the place cells to form a population vector representing the direction of the simulated rat with respect to a previously encountered reward source. This enables the simulated rat to navigate to goals within a small arena. The researchers noted that it is equally possible that the phase effect in rodents is created through some intrinsic property of place cells rather than the mechanism that they have proposed.
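The Q-learning technique used for goal recall in Arleo's work can be illustrated with a minimal tabular sketch. The grid world, reward scheme, and parameters below are invented for illustration and are not Arleo's actual state or action spaces:

```python
import numpy as np

def q_learning(n=5, episodes=400, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on an n x n grid with a single rewarded goal cell."""
    rng = np.random.default_rng(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    Q = np.zeros((n, n, 4))
    goal = (n - 1, n - 1)
    for _ in range(episodes):
        x, y = 0, 0
        for _ in range(200):                  # cap episode length
            a = int(rng.integers(4))          # random exploration (off-policy)
            dx, dy = moves[a]
            nx = min(max(x + dx, 0), n - 1)   # walls clamp the agent inside
            ny = min(max(y + dy, 0), n - 1)
            reward = 1.0 if (nx, ny) == goal else 0.0
            # Temporal difference update toward the bootstrapped target
            Q[x, y, a] += alpha * (reward + gamma * Q[nx, ny].max() - Q[x, y, a])
            if (nx, ny) == goal:
                break
            x, y = nx, ny
    return Q
```

After learning, greedily following `np.argmax(Q[x, y])` from any cell leads to the goal, the tabular analogue of place cells driving locomotor action cells toward a rewarded location.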
Fig. 5.10. Generating a theta rhythm. A theta rhythm can be created by relating the phase firing of a cell to the average angle, α, from the rat’s heading direction to two visual cues. An average angle of 0 degrees means the rat’s heading direction is in between the two cue locations, corresponding to a place field located ahead of the rat and hence a late firing phase (Burgess, Recce et al. 1994). Reprinted with permission from Elsevier.
Fig. 5.11. Learning to navigate to a goal. (a) Navigational map progress after 2, 20, and 21 trials. Navigational vectors start to point towards the goal as trials progress. During the final trial the platform was removed; the simulated rat rapidly made its way to the remembered platform location and stayed in that area. (b) Graph showing times taken to reach the platform over the 20 trials (Blum and Abbott 1996). Reprinted with permission from The MIT Press.
Some models use simulated place cells to encode a different type of information. Blum and Abbott (1996) propose the concept of a map suggesting future movement directions based on the knowledge of the current location. Long term potentiation (LTP) is used to store temporal and spatial information about a rat’s motion through modification of synaptic strengths between place cells. When traversing a familiar route, a delay in LTP induction allows presynaptic cells to link with postsynaptic cells further along the path. When retracing that route, current place cell activity drives activity in cells corresponding to places slightly in front of the rat. To navigate to the end of the route, the simulated rat heads towards the position encoded by this anticipatory cell activity. Testing of this model in a simulated Morris water maze experiment produced decreasing platform acquisition times over the first 10 trials followed by consistent rapid platform finding behaviour (Fig. 5.11). Later work on this model extended it to multiple goal locations using the same map (Gerstner and Abbott 1997). This was achieved by modulating place cell activity based on goal location, in addition to each cell’s dependence on the location of the rat relative to the cell’s place field. Different goal locations triggered different navigational maps. This also allowed a form of navigational map interpolation – goal locations could be specified that were never learned in exploration. The system would produce a map that was a combination of maps to learned goal locations.
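The temporally asymmetric strengthening that underlies such a map can be sketched with a toy rule: cells active earlier on a route strengthen their synapses onto cells active slightly later, so recalling the current place activates places ahead. The window length and learning rate here are arbitrary illustrative choices, not values from Blum and Abbott's model:

```python
import numpy as np

def learn_route(route, n_cells, window=3, rate=0.2):
    """LTP-like rule: each traversed cell strengthens its links to cells visited
    up to `window` steps later along the route, more strongly for short delays."""
    W = np.zeros((n_cells, n_cells))
    for t, pre in enumerate(route):
        for dt in range(1, window + 1):
            if t + dt < len(route):
                W[pre, route[t + dt]] += rate / dt
    return W

# Learn a route through place cells 0 -> 1 -> 2 -> 3 -> 4
W = learn_route([0, 1, 2, 3, 4], n_cells=5)
anticipated = int(np.argmax(W[1]))   # place predicted ahead of place 1
```

Because the weights are asymmetric in time, activity at the current place drives activity slightly ahead along the learned route; heading toward the position encoded by that anticipatory activity reproduces the route-following behaviour described above.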
5.2 Discussion

During the development of many of these computational models, the focus has been on recreating the neuronal firing patterns observed in the rat brain during experiments. Many of the observed neuronal firing phenomena, such as peak deformation (Goodridge and Touretzky 2000) and theta precession (Burgess, Recce et al. 1994), have been successfully replicated in these models. However, while the biological faithfulness of these models has been thoroughly tested, far less work has investigated the practical mapping and navigation potential of these artificial models, especially on actual robots rather than in simulation. When these models have been implemented on robots, the experiments have occurred in small arenas with artificial visual cues. This is no doubt partly due to the main focus of this work on proving theories of biological navigation and mapping mechanisms – most of the experimental data on rodents has been gathered in small arenas (Fyhn, Molden et al. 2004; Hafting, Fyhn et al. 2005). In contrast, many mobile robots are expected to operate in large, complex environments. Little research to date has investigated the capability of biologically inspired computational models in such environments.
6 Robotic or Bio-inspired: A Comparison
The last four chapters have discussed the mapping and navigation problem and the range of system solutions that exist both in the animal world and on robot platforms. There are many forces driving the diversity that is apparent when examining these systems. In nature, creatures and their navigation systems have evolved to suit their environments and their own physical and sensory characteristics; the range of body types, sensory equipment, environments, and lifestyles has produced a myriad of solutions to the problems facing a creature that needs to move around effectively in its environment. Likewise, in research laboratories, domestic homes, and industrial workplaces, mapping and navigation systems have developed in ways that suit the environments, the sensors available, and the purpose of the robot. Given this diversity, it seems a challenging task to identify one research direction that is most likely to yield a complete robot mapping and navigation system. The specific mechanisms and representations of each system differ so greatly that direct comparison is not possible. Fortunately, a number of fundamental mapping and navigation issues are common to all these systems. Through consideration of these issues it is possible to define goals for future research in this field, and in the process identify one means by which these goals may be met. The following sections compare and contrast biological systems and models of biological systems with robotic mapping algorithms in a range of areas. The comparison highlights the shortcomings of both types of system as well as their complementary characteristics. These shortcomings are particularly relevant when considering where future research into the mapping and navigation problem should focus.
After a discussion of possible future research approaches, the final section describes the line of research that was pursued in the work described in this book.
6.1 Robustness Versus Accuracy

M.J. Milford: Robot Navigation from Nature, STAR 41, pp. 55–60, 2008. © Springer-Verlag Berlin Heidelberg 2008 springerlink.com

One of the most significant differences between biological mapping systems and probabilistic methods is the world representations that they use. Many probabilistic systems incorporate high resolution occupancy grid maps, such as those shown in Fig. 6.1. In work by Grisetti, Stachniss et al. (2005) in the Freiburg campus environment, the occupancy grid contained 6.25 million squares. By contrast, the place cells found in the rodent hippocampus appear quite coarse, and likely do not encode information to such a precise degree. Furthermore, many probabilistic mapping methods attempt to produce a Cartesian map with direct spatial correspondence to the physical environment. A 2.5 metre wide, 16.7 metre long corridor should appear as an identically sized region of free space in an occupancy grid map, with occupied cells along its edges. Biological systems relax the requirement that there be direct Cartesian correspondence between a map and the environment. In rodents, place field size and shape can vary significantly depending on the type of environment. For instance, the addition of straight barriers to an environment resulted in the destruction of the place fields located where the barrier was added (Muller and Kubie 1987). Barriers can also cause an apparent remapping from the affected place cells into new place cells. Many animals appear to rely more on robust coping strategies than on precise representations of their world. One such example is the desert ant; it navigates back to the general vicinity of its nest and then uses a systematic search procedure to find it (Wehner, Gallizzi et al. 2002).
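As a quick arithmetic check on the grid size quoted above, the cell count follows directly from the map dimensions and resolution given in Fig. 6.1:

```python
# 250 x 250 m Freiburg campus map at 0.1 m grid resolution.
width_m, height_m, resolution_m = 250.0, 250.0, 0.1
cells = round(width_m / resolution_m) * round(height_m / resolution_m)
assert cells == 6_250_000  # the 6.25 million squares quoted in the text
```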
Fig. 6.1. High resolution occupancy maps. (a) Grid map of the University of Freiburg campus, measuring 250 × 250 m, with a grid resolution of 0.1 m. (b) Grid map of the MIT Killian Court, measuring 250 × 215 m with a resolution of 0.1 m. Both maps were generated using the extended Rao-Blackwellized particle filter method (Grisetti, Stachniss et al. 2005). © 2005 IEEE.
6.2 Map Friendliness Versus Map Usability

Maps that are highly accurate, precise, and Cartesian have a number of advantages and disadvantages. One immediate observation is that they are ‘human-friendly’. Looking at the map in Fig. 6.1b, a human can immediately make a number of high level observations about the environment – there are multiple rectangular loops in the environment, the environment is almost entirely made up of corridors running at discrete ninety degree orientations, and so on. It is easy for a human to relate parts of the map back to physical places in the environment, and also to compare the map and environment on a global scale. In contrast, it is hard to present an equivalent map for any animal or insect. Place fields observed from brain readings in the rodent hippocampus convey only the
observed locations in the environment to which a single place cell responds (Fig. 6.2). There is no definitive method of combining the environmental features that a rodent perceives with the place fields of each place cell to produce a single ‘conventional’ map. An occupancy grid map could be created by using place cell activity and externally determined ranges to obstacles. It is unlikely, however, that such a map would be representative of the rodent’s actual mapping process. The extent to which rats rely on “range-to-obstacle” concepts is unknown, as is whether rats use a linear geometric scale for ordering the map.
Fig. 6.2. Illustrative example of place cell firing and occupancy maps. Darker squares indicate higher firing rates / increased occupancy periods. (a) Time spent by rat at each location in the circular environment. (b) Spike map showing firing rate of one cell. (c) The firing rate array is obtained by dividing the spike map by the time spent in each location, to reveal a single peak of activity.
The maps shown in Fig. 6.1 are human-friendly in that a human can easily understand them. A human could use these maps to navigate efficiently to various locations. However, in doing so a number of high level classification and planning processes would be used. The long narrow rectangular sections of free space in Fig. 6.1b might immediately be classified as corridors, and the large obstacle shapes in Fig. 6.1a as buildings. A human might consciously trace along the paths shown in the map, before choosing and following the shortest one. The overall process would be very different to the subconscious navigation that occurs as a person drives to and from work, walks around an office building, or moves around in their house and garden. This distinction is an important one – the form of most maps created by probabilistic methods is governed by the underlying mathematical algorithms, or by human requirements that the map be in a form suitable for performance analysis. In contrast, biological mapping systems have evolved to produce representations that are suited to the navigation tasks animals must perform every day.
6.3 Sensory Differences

Animals and robot platforms have different sensory equipment and movement capabilities. Probabilistic methods exploit the particular characteristics of the sensors they use. Biological sensors have evolved for important activities such as finding food and mating, with biological navigation systems evolving to suit the sensors and activities.
Models of biological systems emulate the biological mapping and navigation mechanisms while using robotic sensors. Some researchers have applied compensatory algorithms to the output of the robotic sensors in order to more closely emulate the biological systems. For instance, the use of certain colour spaces minimises the effect of illumination differences (Tews, Robert et al. 2005), a task that many biological vision sensors achieve effortlessly. Other research approaches have involved plausibly modifying the models so that they take advantage of the differences in sensing equipment. For instance, instead of using a camera, a set of sonar sensors can be used to detect bearings and ranges to environment cues (Recce and Harris 1998). In the search for a biologically inspired mapping and navigation system, there are two ways to approach this problem of sensory differences. One approach is based on the fact that biological systems manage to perform quite well despite the specific limitations of some of their sensors. Although biological vision sensors can be quite advanced, there is a lack of accurate range sensing equipment (with the exception of such animals as bats). It seems reasonable that given the rapidly increasing computational power of modern computers, it should eventually be possible to create an artificial system equipped only with biologically faithful sensors that can match the navigation performance of animals. This approach dictates the meticulous emulation of the sensory equipment and theorised navigation strategies of animals even if this means purposefully handicapping certain aspects of the systems. The other approach involves trying to exploit the superior characteristics of some robotic sensors by extending the mapping and navigation models. 
Given the current inability to match the capabilities of many biological sensors, it also seems reasonable that the superior characteristics of some robotic sensors be exploited to make up for other shortcomings in the models. In this approach the focus is on creating functional robotic systems, rather than faithful replication of proposed biological mechanisms. Biological mapping and navigation mechanisms are modified to accommodate the different characteristics of robot sensors. This approach has received relatively little attention, unlike the fields focusing on biologically inspired mechanical robot design. There has been only a limited amount of research into developing practical biologically inspired robot mapping and navigation systems (Gaussier and Zrehen 1994).
6.4 Capability in Real World Environments

Biological navigation systems perform well enough to allow the millions of species of animals using them to function and survive every day. These systems combine advanced sensing, clever mapping and robust navigation mechanisms. There is a reasonable amount of knowledge about the capabilities of their sensors, and experiments have gathered a significant amount of knowledge about the navigation capabilities of these animals. Many theories have been devised to account for their capabilities, but in some areas research is only starting to scratch the surface. However, there is no question that animals can navigate well in a wide range of complex, dynamic environments. In most areas the state of the art in robotic mapping and navigation systems has not yet come close to matching the abilities of animals. In specific subsets of the problem and under certain conditions, these systems may outperform their biological counterparts, but only with the sacrifice of robustness and flexibility, and usually
accompanied by a host of assumptions and the use of advanced sensing equipment. Nevertheless the top methods can produce impressive maps of large indoor and outdoor environments (Thrun 2000; Montemerlo, Thrun et al. 2003; Grisetti, Stachniss et al. 2005), and some robots can even navigate effectively using their maps (Yamauchi and Beer 1996; Thrun 2000). Most of the major mapping and navigation problems, such as closing the loop or coping with ambiguity, can be solved by one or another of these methods with appropriate assumptions. In contrast to the robotic systems, computational models of the rodent hippocampus have only been used in simulation or on robots in small structured environments, and are yet to solve many of the major mapping and navigation problems. The small size and limited complexity of these environments reduces the major mapping and navigation challenges, such as closing the loop, dealing with extended sensory ambiguity, and navigating to goals, to trivial or non-existent problems. None of these models have been tested in or successfully applied to the large unmodified environments in which one might reasonably expect an autonomous mobile robot to function, such as an office floor or an outdoor set of pathways. Biologically inspired models are partly limited because the goal of much of the research is to validate or otherwise test navigational theories for a particular animal, rather than to produce a fully functional robot system (Franz and Mallot 2000). The uncertainty about biological systems and subsequent speculation has produced models that may only be partially faithful to the biology, with resulting navigation performance that is inferior to that of the animal. In pursuing the development of a biologically plausible model, it is unlikely that a researcher will stumble upon a better performing model by chance – current biological systems are the product of a long sequence of evolution.
6.5 One Solution

Given the current state of robotic and biological mapping and navigation systems, several conclusions can be drawn. It is unlikely that research in the near future will create a perfect model of the higher level mapping and navigation systems, such as those of a rodent, primate, or human. Animal brains are only partially understood; researchers create theories from the readings of a few dozen electrodes, but the theories are far from being comprehensively proven. Even though models may recreate some aspects of animal navigation behaviour, there can be no real confidence that the underlying mechanisms driving the artificial system are the same as those in the real animal. Furthermore, even biological systems do not necessarily possess all the capabilities autonomous robots require to function in the challenging environments earmarked for their deployment. Conventional robot mapping and navigation research is also facing many challenges. The most impressive recent demonstrations of mobile robotics have been largely made possible by good engineering and incremental improvements in algorithms, sensors, and computational power. The Defence Advanced Research Projects Agency (DARPA) Grand Challenge of 2005 is one prime example of this; while the onboard navigation and mapping systems were state of the art, it was only with an impressive array of expensive, high precision sensing equipment that this specific task
could be solved (Crane III, Armstrong Jr et al. 2004; Ozguner, Redmill et al. 2004). Some may argue that the continual improvement in sensors and computer power may eventually lead to navigation systems that surpass all the abilities of one of the most adept of navigators – humans. However, it is perhaps more likely that this milestone will be achieved through fundamental methodology changes, rather than steady computational and sensory improvements, although such changes will definitely facilitate the process. So, where to look for a solution to the mapping and navigation problem? This book proposes that an eventual solution may be found using a biologically inspired yet completely pragmatic approach. This is not a novel proposal, with previous research investigating this hypothesis (Browning 2000). However, research on pragmatic models of biological systems to date has still had a heavy emphasis on biological plausibility and has had limited practical success. No research has developed or tested a biologically inspired model under the same conditions and criteria as conventional robot mapping and navigation systems. For example, the biologically inspired model developed by Arleo (2000) can perform localisation, mapping and goal recall, but only in a very small artificial arena, with continuous visual input from artificial cues or a distal cue source. Other less biologically faithful models have displayed limited re-localisation ability in robot experiments in an artificial arena with visual cues, and have been able to navigate to goals in simple T and two arm mazes in simulation (Browning 2000). These approaches have also been fundamentally limited by the rodents on which they are based. The capabilities of rodent mapping and navigation systems do not necessarily fulfil all the desired capabilities of an autonomous mobile robot. 
If a biologically inspired system is ever to compete with conventional probabilistic methods, it must also contain solutions to the navigational shortcomings of animals. This work therefore sought to further investigate whether models of the rodent hippocampus can serve as a basis for a complete robot mapping and navigation system. The first major step towards this goal was to investigate the practical capabilities and limitations of existing state of the art models of the rodent hippocampus. Towards this end, Chapter 7 presents the results and findings of a pilot study that implemented a conventional model of the mapping and navigation mechanisms in the rodent hippocampus.
7 Pilot Study of a Hippocampal Model
Rodents are the most thoroughly understood animal with respect to both their mapping and navigation capabilities and the neuronal mechanisms by which they are achieved. However, of the few studies that have actually implemented a hippocampal model on a robot (Arleo 2000; Krichmar, Nitz et al. 2005), none have investigated the full extent to which the model can be used as a practical mapping and navigation system. This chapter presents the implementation and experimental testing of a classical model of the mapping and navigation processes in the rodent hippocampus. The model is presented in stages, starting with its representation of spatial orientation, followed by its representation of spatial location. Results from the model’s performance in robot experiments at each stage are presented. These results lead into a discussion of the limitations of the model, including one key finding. This is followed by an examination of the literature on rodent navigation to determine the plausibility of these findings. The chapter concludes by discussing the significance of these limitations with respect to creating a practical robot mapping and navigation system.
7.1 Robot and Environment

The model was initially developed in simulation before being deployed on a robot. For reasons of brevity, this chapter only describes the experimental setup for the robot experiments, which was mimicked in the simulation experiments. Experiments were run on a Pioneer 2DXe mobile robot from Mobile Robots Inc (formerly ActivMedia Robotics), as shown in Fig. 7.1. At this stage of the project the robot was equipped with a SICK laser range finder, a ring of eight sonar sensors, a Sony Pan-Tilt-Zoom (PTZ) camera, and wheel encoders. The robot also had a full onboard computer with a 400 MHz Athlon K6 processor and 128 MB of RAM. An 802.11b wireless ad-hoc network provided communications with a 1.1 GHz Intel Pentium 3 laptop which ran the majority of the model computation. Table 7.1 gives a summary of the equipment used in these experiments. The laser range finder and sonar sensors were not used in this pilot study.
Table 7.1. Summary of equipment used
Equipment                          Description                                          Used in Pilot Study
SICK laser range finder            Range readings in a 180° arc                         No
Ring of eight sonar transducers    Array covers approximately 210° of arc; each         No
                                   sensor detects obstacles within a 30° cone.
Sony PTZ camera                    Field of view 48.8° horizontal by 37.6° vertical;    Yes
                                   768 × 576 resolution.
Motor encoders                     500 ticks per revolution                             Yes
Athlon K6 400 MHz processor        On-board computer for local processing of data       Yes
Intel Pentium 3 1.1 GHz laptop     Off-board computer for running main model            Yes
                                   computation
Fig. 7.1. Equipment used during indoor experiments. A Pioneer 2DXe robot was used with a wirelessly connected laptop.
The vision problem was simplified by using a system of artificial visual landmarks. Coloured sheets of paper were rolled into 210 mm tall cylinders with a diameter of 90 mm. Cylinder colour schemes were split between the upper and lower halves of the cylinder. Four different colours – red, green, blue, and magenta – provided a total of 16 unique landmark types. The vision system could report the bearing, range, and colour of cylinders within its field of vision, along with an associated uncertainty value for each reading (Prasser and Wyeth 2003). Cylinders were consistently visible at ranges between one and three metres, with a distance uncertainty of about 10%. The testing environment was a flat two by two metre area of a laboratory floor, with masking tape marking the boundaries (for the observer’s benefit only). The coloured cylinders were used as visual cues for the robot and were placed just outside the arena in various configurations, such as the one shown in Fig. 7.2. Because the vision system was trained to recognise rectangular areas of a solid uniform colour, it did occasionally incorrectly recognise people’s socks and posters in the background as coloured cylinders.

Fig. 7.2. The Pioneer robot in the two by two metre experimental arena. Coloured cylinders are arranged around the perimeter.
7.2 Complete Model Structure

There are a variety of theories on how rodents map and navigate in their environments, and consequently a range of models of these processes. However, there is some degree of consistency across most models regarding the major inputs, components, and functions. The model described in this chapter follows the general consensus that rodents use both external and internal sensory input, and that they have neural structures for representing their spatial orientation and location, as described in Section 4.1.1. Fig. 7.3 shows the overall model structure. The robot’s pose is encoded in two modules known as the head direction (HD) and place networks. The head direction network encodes the robot’s orientation, and the place network the robot’s location. Ideothetic information in the form of wheel encoder input drives the path integration process, which
updates the robot’s state information in both the head direction and place networks. Allothetic information in the form of camera images is processed to form a representation known as the local view (LV), which calibrates the robot’s state information stored in the head direction and place networks.
Fig. 7.3. Overall structure of the hippocampal model
7.3 A Model of Spatial Orientation

This section describes the implementation of a model of spatial orientation inspired by head direction cells in the rodent hippocampus. The word ‘inspired’ is used because only the major characteristics of the biological head direction cells were modelled. The biology served as a seed for the development of a simple, intuitive model of spatial orientation, with the aim of providing a functional representation of the orientation of a mobile robot.

7.3.1 Representing Orientation

The core component of the spatial orientation model is an array of neural units or ‘cells’ roughly corresponding to biological head direction cells. Each cell is tuned to be maximally active when the robot’s orientation matches the cell’s preferred direction, as shown in Fig. 7.4. The cell’s activity level reduces as the robot’s orientation rotates away from this preferred direction. The cell arrangement reflects their associated robot orientations – nearby cells encode similar robot orientations. When the ensemble activity of the head direction cells is viewed as a bar graph, one can see a ‘packet of activity’ that resembles a Gaussian curve (Fig. 7.5). The centre of this ‘activity packet’ represents the current perceived orientation of the robot.

Fig. 7.4. Head direction network. The cell activation levels are shown by the darkness of the shading. The ensemble cell activity encodes the robot’s orientation.

Fig. 7.5. Activity profile in the head direction network. The packet is slightly skewed because the robot is rotating.

To create the desired cell firing profile, the cells are connected together in a competitive attractor network (CAN). Attractor networks, which are introduced in Chapter 5, have often been used to model head direction cells due in part to their desirable network dynamics (Redish, Elga et al. 1996; Stringer, Trappenberg et al. 2002). In this model 81 head direction cells are used to encode the full range of robot orientations. This means each cell is maximally active for a spread of about 4.4 degrees in the
robot’s range of possible orientations. One full iteration of the head direction CAN consists of the following steps:

1. Association of allothetic cues with robot orientation
2. Calibration of robot orientation using allothetic cues
3. Internal competitive attractor network dynamics
4. Update of orientation using ideothetic input
5. Global normalisation of network activity
Each of these steps is described in the following sections.

7.3.2 Learning Allothetic Cues

Without recalibration the robot’s perception of its orientation will gradually become incorrect. Recalibration is a two step process; the robot must first learn associations between its orientation and the input from its external sensors, and then use that information to correct its orientation state information when it later receives that same sensor input. The robot’s camera and vision processing module can see coloured cylinders and report on their colour, distance, and relative bearing from the robot, and associated uncertainties (Prasser and Wyeth 2003). A three-dimensional matrix of local view cells encodes the cylinder colour (type), distance and bearing. Activated local view cells are constantly being associated with active head direction cells through strengthening of weighted connections between them (Fig. 7.6). Although the vision system reports geometric information, the learning and re-localisation system does not explicitly use this information. It relies instead on the consistency in visual appearance of the environment at each robot pose. The link weight β_ijkl is updated by:

β_ijkl^(t+1) = β_ijkl^t + ς(E_l V_ijk − δ_decay)    (7.1)

Fig. 7.6. Active view cells are associated with active head direction cells
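The weight update can be sketched in a few lines. The array shapes and constants below are illustrative assumptions, not the tuned values used in the experiments:

```python
import numpy as np

# Local view cells V[i, j, k] index cylinder colour, distance, and bearing;
# E[l] holds head direction cell activity; beta[i, j, k, l] the weights.
X, Y, Z, N = 4, 5, 6, 81
rng = np.random.default_rng(0)
V = rng.random((X, Y, Z))
E = rng.random(N)
beta = np.zeros((X, Y, Z, N))
zeta, delta_decay = 0.1, 0.01   # learning rate and decay constant

# Eq. 7.1: beta_ijkl <- beta_ijkl + zeta * (E_l * V_ijk - delta_decay),
# computed for all cell pairs at once via broadcasting.
beta += zeta * (V[..., None] * E - delta_decay)
```

Because of the decay term, weights between co-active cells strengthen while unused associations slowly fade.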
where the link weight is increased by an amount proportional to the product of the activity in the head direction cell E_l and view cell V_ijk, less a decay constant δ_decay.

7.3.3 Re-localisation Using Allothetic Cues

Once the robot has learned the associations between sensory input and orientation, it can use them to correct its perceived orientation. Whenever a familiar visual scene is encountered, activity is injected into the head direction cells associated with the scene. If the robot’s perceived orientation is correct, this has the effect of reinforcing its belief. However, if the current perceived orientation is incorrect, the injected activity initiates an alternative hypothesis regarding the robot’s orientation, which then competes with the current hypothesis under the competitive attractor network dynamics. If the robot is rotating, the associated activity from the local view must be shifted somewhat according to the robot’s rotational velocity before being injected into the head direction cells. This lead stops the injected activity from affecting the path integration mechanism which is described in Section 7.3.5. The amount of shift required, e, is linearly related to the rotational velocity ω by a constant value k_p:

e = k_p ω    (7.2)
The current view cell activity V is projected through the association weights β and a vision influence constant α into the head direction cells E.
E_l^(t+1) = E_l^t + α L_l^t    (7.3)

L_l^t = Σ_{i=0}^{X} Σ_{j=0}^{Y} Σ_{k=0}^{Z} β_ijkl V_ijk^t    (7.4)
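This projection can be sketched as follows (array shapes and the vision influence constant are illustrative assumptions; `tensordot` contracts the three view cell indices):

```python
import numpy as np

X, Y, Z, N = 4, 5, 6, 81
rng = np.random.default_rng(1)
V = rng.random((X, Y, Z))        # current local view cell activity
beta = rng.random((X, Y, Z, N))  # learned association weights
E = rng.random(N)                # head direction cell activity
alpha = 0.05                     # vision influence constant

L = np.tensordot(V, beta, axes=3)  # Eq. 7.4: L_l = sum_ijk beta_ijkl V_ijk
E = E + alpha * L                  # Eq. 7.3: inject allothetic activity
assert L.shape == (N,)
```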
Fig. 7.7. Connectivity of local view cells with head direction cells. Local view cell activity is projected through association links to a cell L_l in an array storing allothetic input. Activity in L is injected directly into the head direction cells.
Fig. 7.7 shows activity in the local view cells being projected into a copy of the head direction cell array, L_l, that stores activity caused by allothetic cues. This activity is injected directly into the actual head direction network.

7.3.4 Internal Dynamics

To achieve stable activity profiles in the head direction network, a system of localised excitation and global inhibition is used. Each cell is connected to all the cells in the network, including itself, via excitatory links. The self-connected weight is the largest, with weight values dropping off in a discrete Gaussian manner for connections to more distant cells. The matrix storing the excitatory weight values between all cells, δ, is given by:

δ_ij = (1/u) e^(−(i−j)²/s²),  where u = Σ_{i=1}^{n} e^(−i²/s²)    (7.5)

where s is the variance of the Gaussian distribution used to form the weights profile. During an experiment, activity in each of the head direction cells is projected through the appropriate excitatory links back into the head direction cells. The updated cell activity level after excitation, E_i^(t+1), is given by:

E_i^(t+1) = E_i^t + Σ_{j=0}^{N} δ_ij E_j^t    (7.6)

where E_j is the activity of excitatory cell j and δ_ij is the excitatory link weight from cell j to cell i.
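The excitation step can be sketched as below. The network size is from the text, but the value of s is an illustrative assumption, and the sketch applies Eq. 7.5 literally, without the wrap-around at the array edges that a fully circular head direction network would use:

```python
import numpy as np

n, s = 81, 5.0                          # 81 head direction cells
i = np.arange(n)
dist2 = (i[:, None] - i[None, :]) ** 2  # (i - j)^2 between cell pairs
u = np.exp(-(np.arange(1, n + 1) ** 2) / s**2).sum()  # normaliser (Eq. 7.5)
delta = np.exp(-dist2 / s**2) / u       # excitatory weight matrix

E = np.zeros(n)
E[40] = 1.0                             # a lone active cell
E = E + delta @ E                       # Eq. 7.6: local excitation
assert np.argmax(E) == 40               # self-connection weight is largest
```

A single active cell spreads a small Gaussian halo of activity onto its neighbours, which is what allows a coherent packet to form and persist.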
Fig. 7.8. Head direction cells excite themselves and other cells. The total activity sum is used to inhibit all cells equally.
Global inhibition based on the total activity in the network is performed to ensure that, without input from allothetic stimuli, smaller activity packets will gradually ‘die out’. The updated activity level of a cell after inhibition, E_i^(t+1), is given by:

E_i^(t+1) = max(E_i^t + (1.0 − v), 0),  where v = Σ_{i=0}^{N} E_i^t    (7.7)
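A numeric sketch of the inhibition step, with made-up activity levels:

```python
import numpy as np

E = np.array([0.05, 0.6, 0.5, 0.05])   # illustrative cell activities
v = E.sum()                             # total network activity (here 1.2)
E = np.maximum(E + (1.0 - v), 0.0)      # Eq. 7.7
# Weakly active cells are clipped to zero; the strong packet survives.
assert np.allclose(E, [0.0, 0.4, 0.3, 0.0])
```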
The internal dynamics of the head direction cells are shown in Fig. 7.8. To ensure that the total activity sum in the head direction network stays constant, a global normalisation step is performed after the allothetic association, allothetic calibration, and ideothetic update steps. The cell activity levels after normalisation, E_i^(t+1), are given by:

E_i^(t+1) = E_i^t / Σ_{i=0}^{N} E_i^t    (7.8)
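A one-line sketch of the normalisation step, with illustrative activity values:

```python
import numpy as np

E = np.array([0.0, 0.4, 0.3, 0.0])  # illustrative post-inhibition activities
E = E / E.sum()                      # Eq. 7.8: rescale so activity sums to one
assert np.isclose(E.sum(), 1.0)
```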
7.3.5 Path Integration Using Ideothetic Information Because external sensory input is not always available or useful, the robot must have some means of updating its state information based on internal sensory information. This is achieved by maintaining two sets of ideothetic input cells that project energy into the head direction network, as shown in Fig. 7.9. One set of cells is active for clockwise rotation, the other for counter-clockwise rotation, much like the model by Skaggs, Knierim et al. (1995). When the direction of rotation does not match the direction associated with the set of cells, all cells are forced to zero activity. Every cell in each network is initially activated according to the angular velocity of the robot. In addition, the activity profile of the head direction cells from the last iteration is also projected onto each set of ideothetic cells. The activity levels of the clockwise cells, Ec , and counter-clockwise cells, Ecc , are given by:
$$E_c = k_\omega \omega_c + k_E E, \qquad E_{cc} = k_\omega \omega_{cc} + k_E E \tag{7.9}$$
where $\omega_c$ and $\omega_{cc}$ are the detected clockwise and counter-clockwise angular velocities of the robot, $k_\omega$ is an activation constant, $k_E$ is a projection constant, and $E$ is the current activity profile of the head direction cells. $\omega_c$ and $\omega_{cc}$ are limited to values of zero or larger. The activity from the ideothetic cells is injected into the head direction network through offset weighted connections. These weighted connections project the activity in the ideothetic cells into appropriately offset clockwise or counter-clockwise positions in the head direction network. This in effect allows the injected activity to lead the current perceived robot orientation by a certain amount. For the implementation described here the offset size was five cells, or about 22 degrees, which was found to provide desirable path integration dynamics in the networks. After injection, the activity level of the head direction cells, $E_i^{t+1}$, is given by:

$$E_i^{t+1} = E_i^t + k\left[(E_c)_{i+\delta} + (E_{cc})_{i-\delta}\right] \tag{7.10}$$

where $k$ is a constant determining the influence of ideothetic stimuli, and $\delta$ is the weight offset.
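The ideothetic update of Eqs. 7.9–7.10 can be sketched as follows. The constants $k_\omega$, $k_E$, $k$ and the network size are illustrative assumptions, and `np.roll` stands in for the offset weighted connections.

```python
import numpy as np

N = 80            # head direction cells (assumed)
OFFSET = 5        # weight offset delta (five cells, about 22 degrees here)
K_OMEGA, K_E, K = 0.1, 0.5, 0.2   # assumed k_omega, k_E, k

def ideothetic_update(E, omega):
    """E: head direction activity; omega: signed angular velocity,
    positive for clockwise rotation."""
    omega_c, omega_cc = max(omega, 0.0), max(-omega, 0.0)
    # Eq. 7.9: ideothetic cells combine the velocity drive with the
    # projected head direction activity profile
    E_c = K_OMEGA * omega_c + K_E * E
    E_cc = K_OMEGA * omega_cc + K_E * E
    # cells whose direction does not match the rotation are forced to zero
    if omega_c == 0.0:
        E_c = np.zeros_like(E)
    if omega_cc == 0.0:
        E_cc = np.zeros_like(E)
    # Eq. 7.10: inject through connections offset by +/- OFFSET cells
    return E + K * (np.roll(E_c, OFFSET) + np.roll(E_cc, -OFFSET))

E0 = np.zeros(N)
E0[40] = 1.0                               # packet centred on cell 40
E1 = ideothetic_update(E0, omega=10.0)     # clockwise rotation
print(np.argmax(E1 - E0))                  # -> 45: injection leads the packet
```

The strongest injected activity appears five cells ahead of the packet, illustrating how the offset lets the injected activity lead the perceived orientation.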
Fig. 7.9. Path integration cells in action for counter-clockwise rotation. Moving outwards from the centre, the first ring of cells contains the head direction cells, followed by the counter-clockwise path integration cells and the clockwise path integration cells.
7.4 Model Performance

This section presents results from two simple experiments using this hippocampal model – a path integration module calibration experiment, and a one-dimensional SLAM experiment. Experiments were conducted both in simulation and on the real robot, with comparable results; but due to the easy availability of 'ground truth' in simulation, the results presented here are from the simulation experiments.

7.4.1 Experiment 1: Path Integration Calibration

Calibration of the path integration system can be achieved through two means – modification of the ideothetic stimuli influence constant $k$, or modification of the weight offset size $\delta$. In this context calibration refers not to correction of odometric drift but to tuning of the path integration system to produce shifts in head direction activity that correctly correspond to the actual rotation of the robot. Like many of the biological models of path integration in the rodent hippocampus, the model used in this work
does not scale to all velocities. However, for the range of velocities encountered in operation of the robot (0 to 50°/s), it is sufficiently well behaved. In this particular implementation the ideothetic stimuli influence constant $k$ was modified to calibrate the system. To calibrate the constant, the robot was instructed to spin at five degrees per second in 125 degree alternating turns. The velocity of the robot was obtained by filtering the velocity of the activity peak in the local view network corresponding to coloured cylinders in the environment. The difference between this visual velocity and the velocity of the peak activity packet in the head direction network was used to drive correction of the constant $k$. Path integration in the head direction network was initially quite poor, but had corrected itself significantly by twenty-five seconds into the trial and was accurate after fifty seconds, as shown in Fig. 7.10. The actual encoder velocities from the robot varied by about 300% during a supposedly constant-velocity turn and as such were not recorded. The long calibration times were partly due to the scarcity of the cylinders – only two in this particular trial – and the camera's small field of view, which meant cylinders were only in sight for about eight seconds at a time.
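The calibration loop described above can be caricatured as a simple feedback update on $k$, driven by the error between the visually observed rotation rate and the rate at which the head direction packet moves. The linear packet-velocity model, `TRUE_GAIN`, and `LEARN_RATE` are invented for illustration; they are not values from the text.

```python
TRUE_GAIN = 2.0     # assumed: packet velocity = TRUE_GAIN * k * omega
LEARN_RATE = 0.05   # assumed correction gain

def calibrate_k(k, omega, visual_velocity):
    """One correction step driven by the vision/packet velocity error."""
    packet_velocity = TRUE_GAIN * k * omega      # packet drift rate (deg/s)
    error = visual_velocity - packet_velocity    # from filtered vision
    return k + LEARN_RATE * error / max(omega, 1e-6)

k = 0.1                      # initial, badly calibrated constant
omega = 5.0                  # commanded spin rate (deg/s)
for _ in range(200):
    # a well-calibrated system moves the packet at the visually observed rate
    k = calibrate_k(k, omega, visual_velocity=omega)
print(round(TRUE_GAIN * k, 3))   # -> 1.0: packet now tracks the robot
```

Under these assumptions $k$ converges geometrically to the value at which the packet velocity matches the visual velocity, mirroring the gradual correction visible in Fig. 7.10.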
Fig. 7.10. Odometry calibration. The calibration was mostly complete after fifty seconds. Once corrected the calibration parameters remained constant as shown by the last two cycles in the graph.
7.4.2 Experiment 2: Localisation and Mapping in 1D

In the second experiment the robot was set to rotate at five degrees per second in the centre of the arena, with each full rotation taking approximately 72 seconds. Two identical cylinders were placed just outside the arena as visual cues, as shown in Fig. 7.11. During the first four rotations, the robot was allowed to learn associations between these visual cues and its orientation. Using the pre-calibrated path integration parameters,
Fig. 7.11. Environment for 1D localisation and mapping experiments
Fig. 7.12. Results of orientation calibration. Note the re-localisation at t = 324 seconds. The 200° jump corresponds to the 160° robot kidnap.
the activity peak in the head direction network closely tracked the actual robot orientation, as shown by Fig. 7.12. Because the learning and recalibration processes occur concurrently, the model also used the visual cue information to maintain localisation during this time. Halfway through the fifth lap the robot was kidnapped in orientation by 160 degrees, shown by the straight vertical segment of the robot orientation curve. This introduced a 160 degree error in the head direction network’s representation of robot orientation. Then at 318 seconds into the experiment, the robot saw the rightmost
coloured cylinder for the third time. Due to the ambiguity of having two identical cylinders in the environment, this vision information introduced two similarly weighted robot orientation hypotheses in the form of peaks of visually driven activity. Fig. 7.13 shows these hypotheses and how the activity in the head direction network changed over a period of about six seconds under this visual input. The size of these visually driven peaks also varied during this time. Although both visual peaks were dominant at different times, the network dynamics resulted in the leftmost hypothesis, which happened to be correct, becoming dominant.
Fig. 7.13. Re-localisation under visual input. Seeing the rightmost cylinder introduced two robot orientation hypotheses at t = 318 seconds. Although both hypotheses were dominant at different times, the network dynamics resulted in the robot re-localising to the leftmost hypothesis.
Six seconds after completing this first re-localisation, the robot saw the top cylinder for the third time, at t = 332 seconds. The vision system once again introduced two activity peaks regarding the robot’s orientation (Fig. 7.14). Between t = 332 seconds and t = 337 seconds, both visual peaks were alternately dominant. However, in contrast to the situation shown in Fig. 7.13, one visual peak overlapped and reinforced the current robot orientation hypothesis. This combination of a visual hypothesis with the current hypothesis reduced the effect of the other visual peak, which only introduced a minor peak in the head direction cell activity. The head direction network then correctly tracked the actual robot orientation for the remainder of the experiment (Fig. 7.12, t = 338 – 530 seconds).
Fig. 7.14. Visual reinforcement of the current orientation hypothesis. Seeing the top cylinder introduced two visually driven activity peaks at t = 332 seconds. The right visual peak reinforced the current, correct orientation hypothesis, while the left peak introduced only minimal activity in the head direction network.
These results illustrate that:

• path integration parameters can self-calibrate based on visual input during robot movement,
• ideothetic wheel encoder input can sensibly update the robot's perceived orientation,
• allothetic visual input can correct localisation errors, and
• one-dimensional SLAM can be performed in a simple environment with ambiguous visual cues.
7.5 A Model of Spatial Location

This section describes the implementation of a model of spatial location inspired by place cells in the rodent hippocampus. Some of the mathematical basis for the model is a two-dimensional version of that used for the head direction model and hence is not discussed in detail, having already been covered in Section 7.3.

7.5.1 Representing Location

The place cells are modelled as a two-dimensional matrix of cells, with each cell tuned to be maximally activated when the robot is at a specific location. A coarse
representation is used, with the path integration system tuned so that each place cell represents a physical area of approximately 250 mm by 250 mm. This is quite different to most previous approaches involving place cells, which have used a much finer resolution. The entire Pioneer robot covers an area represented by only four place cells. However, the place field size to robot size ratio compares favourably with that observed in nature – the stable place fields have areas of about 0.3 m², compared to a robot area of about 0.16 m², a ratio of about two to one. Place fields in healthy rats have been recorded as having an area of about 0.056 m² (Save, Cressant et al. 1998), compared to a rat 'area' of about 0.015 m², a ratio of about three to one. To provide the appropriate system dynamics the model uses a competitive attractor network arrangement similar to that used for the head direction network. The cells are arranged in a two-dimensional matrix with full excitatory interconnectivity between all cells. The excitatory weights are created using a two-dimensional version of the discrete Gaussian distribution in Equation 7.5. This ensures the activation of each place cell reduces as the robot moves away from the cell's preferred location. One full iteration of the place CAN consists of the same five steps used in a head direction CAN iteration (see Section 7.3.1), with orientation replaced by location:

1. Association of allothetic cues with robot location
2. Calibration of robot location using allothetic cues
3. Internal competitive attractor network dynamics
4. Update of robot location using ideothetic input
5. Global normalisation
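The two-dimensional Gaussian excitatory weight construction mentioned above can be sketched as follows, without wraparound. Here `eps[i, j, k, l]` is the excitatory link from place cell $(k, l)$ to place cell $(i, j)$; the grid size and width $\sigma$ are illustrative assumptions.

```python
import numpy as np

def place_weights(n, sigma=1.5):
    """2D Gaussian excitatory weights between all pairs of place cells."""
    idx = np.arange(n)
    di = idx[:, None] - idx[None, :]               # pairwise index distance
    # squared Euclidean distance between cell (i, j) and cell (k, l)
    d2 = di[:, None, :, None]**2 + di[None, :, None, :]**2
    return np.exp(-d2 / (2 * sigma**2))            # shape (n, n, n, n)

eps = place_weights(8)
# a cell excites itself most strongly, nearby cells more than distant ones
print(eps[4, 4, 4, 4], eps[4, 4, 4, 5] > eps[4, 4, 4, 6])   # -> 1.0 True
```

This produces the required behaviour: each place cell's excitation of its neighbours falls off smoothly with distance from its preferred location.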
Each of these steps is described in the following sections.

7.5.2 Learning Allothetic Cues

The need for correction of the robot's state using visual input is especially important when considering its location, since slight errors in orientation quickly lead to large positional errors as translation occurs. Unlike the head direction system, there is no compensation for the current velocity of the robot when injecting allothetic input into the place cells. The place cells are relatively coarse when compared with the head direction cells and as such are not as sensitive to a slight lag in allothetic input. The learning of vision-place associations is essentially a two-dimensional version of the process described in Section 7.3.2. Active local view cells become associated with active place cells by a process similar to the one shown in Eq. 7.1.

7.5.3 Re-localisation Using Allothetic Cues

Active local view cells inject activity into the place cells through association links. The place cell activity after allothetic injection, $C_{lm}^{t+1}$, is given by:

$$C_{lm}^{t+1} = C_{lm}^t + \rho \sum_{i=0}^{X} \sum_{j=0}^{Y} \sum_{k=0}^{Z} \beta_{ijklm} V_{ijk} \tag{7.11}$$
Fig. 7.15. View cell connections to the place cells. Activity in the local view cells is projected to each place cell through weighted connections.
where $V_{ijk}$ is the local view cell activation level, $\beta_{ijklm}$ is the association link between local view cell $ijk$ and place cell $lm$, and $\rho$ is the overall allothetic influence constant. Fig. 7.15 shows active local view cells projecting to the place cells. The nature of the competitive attractor dynamics ensures that activity in the place cell network tends to converge into peaks of activity, or activity packets. Fig. 7.16 shows two examples of place cell activity. Fig. 7.16a shows the activity levels in a 51 by 51 grid of place cells. There is one major peak representing the most probable location of the robot and another much smaller peak. This small peak has not been created through ideothetic input, but rather by the allothetic input shown in Fig. 7.16b. From a purely visual perspective, the robot is equally sure it is in two different locations, represented by the two peaks of the same size. This visual input is driving the activity in the place cells. The larger place cell peak represents the current place hypothesis and is being supported by one of the visual input peaks, while the smaller second peak has been started by the other visual input peak. Because of their close proximity, without continuous visual stimulus the attractor dynamics would tend to merge the two peaks rather than have them compete against each other. In this particular case, the small peak in Fig. 7.16a will compete with the larger peak but will not win unless future allothetic input supports the smaller peak more strongly than the larger one. The place network deals with ambiguous visual input through this allothetic injection and the competitive attractor network dynamics. View cells can project to more than one location in the place cell network, as shown by Fig. 7.17. The plot shows the strength of allothetic connections between a local view cell and a number of place cells. Although the view cell is most strongly associated with one cluster of place cells, it has weaker connections to several other areas.
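The allothetic injection of Eq. 7.11 amounts to a tensor contraction over the local view indices. In the sketch below, the local view dimensions, place grid size, $\rho$, and the single learnt association are illustrative assumptions.

```python
import numpy as np

X, Y, Z = 4, 4, 3    # local view cell array dimensions (assumed)
L = 8                # place cell grid is L x L (assumed)
RHO = 0.5            # overall allothetic influence constant (assumed)

beta = np.zeros((X, Y, Z, L, L))   # association links beta_ijklm
beta[1, 2, 0, 3, 3] = 1.0          # one learnt view -> place association

def allothetic_inject(C, V):
    # Eq. 7.11: C[l, m] += rho * sum_ijk beta[i, j, k, l, m] * V[i, j, k]
    return C + RHO * np.einsum('ijklm,ijk->lm', beta, V)

C = np.zeros((L, L))
V = np.zeros((X, Y, Z))
V[1, 2, 0] = 1.0                   # the associated local view cell is active
C = allothetic_inject(C, V)
print(C[3, 3])                     # -> 0.5: activity injected at the place
```

A view cell with several non-zero entries in `beta` would inject activity at several place cell locations at once, which is exactly how ambiguous views seed multiple location hypotheses.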
This type of firing profile for a visual cell can be caused by ambiguous visual information in the environment, where the robot thinks it is seeing the same visual scene at several different locations in the environment.

7.5.4 Internal Dynamics

Stable activity profiles are maintained in the place cell network through competitive attractor network excitation and inhibition. The dynamics are a two-dimensional version of those used for the head direction cells. A two-dimensional Gaussian distribution is used to create the excitatory weights $\varepsilon$ for each cell. The updated place cell activity levels after excitation, $C_{ij}^{t+1}$, are given by:

$$C_{ij}^{t+1} = C_{ij}^t + \sum_{k=0}^{I} \sum_{l=0}^{I} \varepsilon_{ijkl} C_{kl}^t \tag{7.12}$$
Fig. 7.16. Sample activity profiles. (a) Activity in a 51 by 51 place cell grid. Total cell activity is normalised, so cell activity can be thought of as a rudimentary probability of being located at the cell's corresponding physical location. (b) The activity caused by visual input for the same place cell grid (before scaling). From a vision perspective there is equal probability of being located in two different locations.
Fig. 7.17. View cell-place cell weight map. Lighter shading indicates stronger connections from the view cell to the place cell. When activated, this local view cell injects activity strongly into one area of the place cells and weakly into several other areas.
The main difference between the place cell and head direction networks is that there is no wraparound for the place network – cells at one edge of the matrix do not project strongly to cells at the opposite edge of the matrix (they do project very weakly to each other, because each cell excites every other cell to some degree). To minimise boundary effects a buffer layer of cells is used around the edges of the place cell matrix; these cells can be active and affect cells in the matrix proper, but do not themselves have any spatial correspondence to the physical world.

7.5.5 Path Integration Using Ideothetic Information

The current activity packet in the place network is projected to its anticipated future position based on the velocity and orientation of the robot. Fig. 7.18 shows the overall path integration process. The robot orientation is not usually known exactly – rather it is represented by a number of activated head direction cells. The ideothetic input is processed on an individual basis for each head direction cell and its associated activity level. The place cell activity after path integration, $C_{ij}^{t+1}$, is given by:

$$C_{ij}^{t+1} = C_{ij}^t + \sum_{z=0}^{N} E_z \sum_{x=0}^{1} \sum_{y=0}^{1} f(v)\, C^t_{x+i+i_s,\; y+j+j_s}, \qquad v = f(V_{abs}, z) \tag{7.13}$$
The activity levels in each of four cells $C$ are multiplied by a fractional function $f(v)$ that is dependent on velocity $v$, which is itself a function of head direction cell index $z$ and the absolute velocity of the robot $V_{abs}$. The four cells are selected based on
Fig. 7.18. Path integration in the place cells. Current place cell activity is updated under path integration by orientation information stored in the head direction network and translational velocity information.
the offsets $i_s$ and $j_s$, which are functions of $v$. The sum of this activity is then multiplied by the activity level in each head direction cell $E$. This process is repeated for all head direction cells, with the sum of its output injected into the place cells as ideothetic input. The coarseness of the place cell representation reduces the value of explicitly representing the uncertainty in the robot's translational velocity with a probability distribution, as is often done with particle filters (Thrun, Fox et al. 2000).
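One way to realise the four-cell fractional update of Eq. 7.13 is as a bilinear shift of the activity grid, repeated for each active head direction cell. In the sketch below the displacement model, grid sizes, and cell size are illustrative assumptions, and `np.roll` wraps at the edges, whereas the model itself uses a buffer layer instead of wraparound.

```python
import numpy as np

def shift_activity(C, dx, dy):
    """Shift activity C by a possibly fractional (dx, dy), in cells."""
    ix, fx = int(np.floor(dx)), dx - np.floor(dx)
    iy, fy = int(np.floor(dy)), dy - np.floor(dy)
    out = np.zeros_like(C)
    for ox, wx in ((0, 1 - fx), (1, fx)):        # the four-cell fractional
        for oy, wy in ((0, 1 - fy), (1, fy)):    # weights f(v) of Eq. 7.13
            out += wx * wy * np.roll(np.roll(C, ix + ox, axis=0),
                                     iy + oy, axis=1)
    return out

def path_integrate(C, E_hd, v_abs, cell_size=0.25):
    """Sum the shifted activity over all head direction cells z,
    weighted by each cell's activity E_z."""
    out = np.zeros_like(C)
    for z, E_z in enumerate(E_hd):
        if E_z == 0.0:
            continue
        theta = 2 * np.pi * z / len(E_hd)    # preferred direction of cell z
        d = v_abs / cell_size                # displacement in cell units
        out += E_z * shift_activity(C, d * np.cos(theta), d * np.sin(theta))
    return out

C = np.zeros((11, 11))
C[5, 5] = 1.0                                # packet at the grid centre
E_hd = np.zeros(36)
E_hd[0] = 1.0                                # robot certain it faces theta = 0
C = path_integrate(C, E_hd, v_abs=0.5)       # translate 0.5 m = 2 cells
print(np.argwhere(C == C.max())[0])          # -> [7 5]
```

When the head direction activity is spread over several cells, the loop over `z` smears the shifted packet across the corresponding directions, which is precisely the behaviour that later causes trouble with multiple pose hypotheses.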
7.6 Model Performance

To test the ability of the place cell model to track a robot's position in space over time, experiments were performed in the same arena as described in Section 7.4. Experiments were performed both in simulation and on a real robot. This section presents the results obtained from the simulation experiments, in which the robot's actual location could be continuously tracked, enabling the demonstration of the key finding of this part of the work. A brief discussion of the real robot tests is included at the end of the section.

7.6.1 Experiment 3: Localisation and Mapping in 2D

The simulated robot was instructed to move around its two by two metre environment using random translations and rotations. Only one action was performed at a time, so
Fig. 7.19. Localisation using path integration. Over the short term the place cell model was able to track the robot’s position effectively.
the robot never followed any curved trajectories. For these experiments the path integration parameters were all hand-tuned before the experiment, due to the increased complexity of the self-calibration process in two dimensions. The robot was able to track its position quite well for short periods of time, as shown in Fig. 7.19. The place network kept the robot localised over the short term by using path integration. The system was also sometimes able to re-localise when kidnapped; the chances of successfully re-localising reduced as the period between the kidnap and the detection of a familiar visual scene increased.

In longer experiments the network's tracking ability proved to be unstable. Over the period of an hour the robot became lost and its perceived location moved well outside its two by two metre arena. The positional error of the simulated robot during the experiment is shown in Fig. 7.20. The error appeared relatively stable during the first 20 minutes of the experiment, but quickly grew in later stages to be larger than the size of the experimental arena.

The system was also implemented on the real robot with similar results, although the trial could not be run for as long. The place cell model was able to keep the robot localised for short test durations only. Because the robot was moving based on its perceived position, not its actual position, a small error in pose usually compounded quite rapidly and the tests had to be terminated before the robot collided with objects outside the arena. The following section discusses the reasons for this tracking failure and highlights two problems with the current system.
Fig. 7.20. Positional tracking error. The error remained small for about 20 minutes before growing in an unbounded manner.
7.7 Discussion and Summary

When the system re-localised its position through viewing of a familiar scene, it did not always re-orientate to the associated robot orientation as well. The opposite situation could also occur: re-orientation but no re-localisation. In both cases, incomplete re-localisation resulted from the head direction and place cell networks not being rigidly coupled to each other. Incomplete re-localisation played some role in the failure of the system to track the robot's location over time. Arleo (2000) addressed this problem by imposing an extra recalibration constraint on the place and head direction networks: the head direction network not only required a familiar scene to recalibrate orientation, but also required that the robot perceive itself to be located at the position associated with that scene. This approach solves the localisation problem, assuming the robot has sufficient time to sequentially re-localise position and then orientation. Arleo also proposed that it would allow the robot to recover from kidnapping, although no results were presented. The kidnapping scenario can, however, be analysed in a thought experiment. If the robot were kidnapped it would wander until seeing a familiar scene. At that moment the robot would not immediately re-orientate itself, since its perceived location would not be the one associated with that scene. However, it would re-localise its position, after which it could then re-orientate itself, since it would now have the perceived position associated with the scene. The robot would need to slow or stop to allow the re-orientation to occur. This problem is a relatively minor one and could be solved in a similar way for the model presented here. However, the model has a much more significant limitation: it cannot maintain multiple pose hypotheses.
Although it can maintain multiple location and orientation hypotheses, each place cell activity packet is associated with the same head direction activity profile – different place cell activity packets cannot have different orientations associated with them. If ambiguous visual input suggests two possible poses, these poses cannot be verified or discarded through further robot movement and visual input, because path integration will move all place cell activity packets in the same net direction formed from all possible head directions. This renders the model useless in any environment where the robot regularly has long periods of ambiguous allothetic information, and it was the major cause of the localisation failures in the experiments described in Section 7.6.1. This problem is illustrated in Fig. 7.21, which shows a schematic of the robot at two different moments in time when it has associated its current place and head direction network activity packets with an identical view of two cylinders in the environment. Fig. 7.22 shows the robot some time later when it encounters the same visual scene again, prompting two possible robot pose hypotheses. The robot needs to move to receive further visual input so it can determine which position and orientation is correct. However, there is only one head direction network, containing both orientation possibilities. Each activity packet in the place cells is associated with both these orientation possibilities. Ideally, to maintain multiple pose hypotheses under further robot movement, the pose hypothesis in the left of Fig. 7.22 would move downwards as the robot
moves forward. However, because it now has a second direction associated with it (dotted upward arrow) that should only be associated with the other pose possibility, it will move in the net direction (shown by the short thick arrow), and will spread both upwards and downwards. The right pose hypothesis will also spread and move in the same manner. Fig. 7.23 shows the activity packets after further robot movement. Without appropriate binding, the separate representations of place and orientation cannot sensibly maintain multiple hypotheses of pose. The neurophysiological separation of the place and head direction cells (O'Keefe and Conway 1978; Ranck 1984) renders this a spatial memory form of the binding problem (Thiele and Stoner 2003). While mechanisms such as synchronisation have been proposed as a way of binding the information of different neurons in other domains (Gray, Konig et al. 1989), we observed that a simple modification of the model offered an elegant engineering solution to the spatial binding problem.
Fig. 7.21. Learning under visual ambiguity. An overhead view of the robot (large circle) at two moments in time where it has associated the place cell activity packet (grey shading) and orientation (long solid arrow) of the robot with an identical view of two cylinders. In Fig. 7.22 the robot re-encounters this scene.
Fig. 7.22. Re-localisation failure under visual ambiguity. The robot encounters the same scene as in Fig. 7.21 and injects activity supporting the two possible locations and two possible orientations. Further robot movement results in both activity packets smearing upwards and downwards and moving in the net direction shown by the short thick arrow.
Fig. 7.23. Activity packets after further robot movement. Both activity packets have smeared upwards and downwards and have moved in the same erroneous direction.
7.7.1 Comparison to Biological Systems

Are rats able to maintain multiple pose hypotheses? The research literature was examined to determine whether this problem might be solved using any theories developed from experimental observations. Early work by Bostock, Muller et al. (1991) investigated the effects of substituting a new stimulus for a familiar stimulus in the environment. Rats performed food foraging activities in a cylindrical arena with a white cue card on the wall. After stable firing fields had developed, the white cue card was replaced by a geometrically identical black cue card. Place cells responded in two different ways and were classified by the researchers as "rotational" or "complex". The firing fields of the rotational cells could be rotated to superimpose on the original firing fields in the presence of the white cue card. However, the firing fields of the complex cells either disappeared or changed so that they could not be superimposed on the original fields even through rotation. The researchers concluded that in a new environment the hippocampal spatial representation initially generalised through a rotation of the previous map before discriminating with a complex remapping of the environment.

Changes in environment shape can also lead to gradual remapping. In experiments where rats foraged in both square and circular walled arenas, place cells took several days or weeks to differentiate between the two environments (Lever, Wills et al. 2002). Faster remapping occurred when the environments were further differentiated in colour and texture, with the majority of observed cells differentiating between the two environments within a day (Wills, Lever et al. 2005).
Experiments in which rats are pre-trained to search at the midpoint of two markers, but are then given only one in the test, show the rat searching in two possible locations relative to the single marker, as if treating that marker as first being one and then the other of the original two (Redish 1999). This behaviour suggests that rats may switch internal maps, although it could also be explained by the rat maintaining two pose hypotheses when confronted by ambiguous visual input. When exposed to environments with sections that are identical in both appearance and physical orientation, rats appear to construct two distinct hippocampal maps that overlap yet are significantly different (Skaggs and McNaughton 1998). When rats were kidnapped from one identical region to another, their place fields initially fired as if the rat had not been kidnapped, but then in some of the rats the place cell firing changed to represent the rat's true physical location – partial remapping occurred when faced with an ambiguous situation. When faced with a conflict of distal and local cues, the entire spatial representation in the hippocampus can split into opposing representations, as shown by experiments performed by Knierim (2002). The results of these experiments show that several predictions made by current competitive attractor models do not occur, although Knierim points out that there is also a wealth of specific evidence and theory that supports these predictions. The author has not yet come across any place cell activity recordings from a real rat that show multiple peaks being maintained for any length of time. Although it appears that rats can switch between multiple maps maintained by the same set of place cells, no evidence of them concurrently updating multiple maps has been found (Bostock, Muller et al. 1991). Rats seem to keep only one map in 'active' memory at any one time, switching maps when prompted by significantly inconsistent visual input.
The experimental evidence to date suggests that the hippocampus may deal with ambiguity and perceptual inconsistency by learning and switching between multiple maps, rather than by maintaining multiple pose hypotheses within the one map.

7.7.2 Comparison to Other Models

This pilot study has revealed a fundamental limitation in the hippocampal model described in this chapter that makes it unsuitable for use as a robot mapping and navigation system. Before attempting to address the model's shortcomings, it is beneficial to compare and contrast its structure and performance with other models of the rodent hippocampus. This section focuses on two issues – the balance between self-calibration and visual cue reliance, and the ability to perform SLAM in large and ambiguous environments. Almost all robot-based research into hippocampal models has been performed in small, highly controlled environments. Artificial visual cues are often used, and in many cases every visual cue is unique, making the problem of localisation and mapping significantly less difficult. Some models also rely on distal cues that are effectively invariant in absolute bearing from all locations inside the small experimental arena (Arleo 2000). The nature of these environments is at least partially the reason why many models rely on continuous recognition of well-defined cues, and define
their path integration characteristics through initial visual exploration of the environment. This approach removes the need for a hard-coded path integration module, but is not necessarily ideal for implementation on robots in real world environments.

The compromise between the level of reliance on visual cues and the ability of a model to self-calibrate its path integration system exists throughout the computational models of the rodent hippocampus. The model described in this chapter has a reduced dependence on visual cues at the expense of requiring a hard-coded path integration system. For robots, this approach is sensible because many robotic platforms have quite well-defined movement characteristics. It is generally much easier to create a motion model for a wheeled robot than for a quadruped animal. The movement of a wheeled robot can be tracked accurately in the short term by using these motion models. Even on non-wheeled robots, path integration can be performed using other sensors such as inertial measurement units and compasses. In addition, these motion models can capture the uncertainty in movement due to wheel slippage and other causes, giving awareness of both the robot's new position and the uncertainty in that position.

What many robots do not have is the ability to continually track the position of features in the environment. Without accurate laser range finders, a robot must deduce ranges from visual stimuli. Range-finding from stereo vision is possible but requires that the robot reliably track visual features as it moves, which may require assumptions about the environment. Extracting reliable range information using only a single vision sensor is an even more difficult task (Davison, Reid et al. 2007). In addition, a robot may not have the time or necessary conditions to perform its path integration calibration in a real world environment.
Vision may only be intermittently available due to the environment being obscured by people or dynamic objects. A robot may need to start functioning as soon as it is placed in a new environment, with no time for tuning. It seems reasonable, then, that any mapping and navigation system for a real robot should exploit the motion models that exist for robot platforms of various types. The second major issue with computational models of the rodent hippocampus is that none have been successfully used in a large, unmodified, and visually ambiguous real world environment. For example, no model has been demonstrated to recover from the large localisation errors that result from long movement periods without definitive vision information. This ability to 'close the loop' is a fundamental requirement of any autonomous mobile robot operating in unmodified real world environments. Most importantly, few if any models have the ability to robustly handle significant sensory ambiguity in the environment. As discussed in Section 7.7, any model based on a separate head direction-place network structure, without appropriate binding between the two, is unable to sensibly maintain multiple hypotheses of pose over time. This is where the biological models perform most poorly when compared with state-of-the-art probabilistic mapping and navigation methods. The next chapter describes a practical extension of the rodent hippocampal model known as the RatSLAM model. RatSLAM retains the desirable aspects of the model presented in this chapter, such as the competitive attractor dynamics. However, it also extends the concept of place and head direction cells to solve the problems of maintaining multiple hypotheses of pose and re-localising simultaneously in orientation and location.
7.7.3 Conclusion

This chapter has presented the implementation of a hippocampal model in a simple environment with artificial visual cues. The model was able to perform SLAM in one dimension but could not maintain localisation when the problem was extended to the full robot state. The inability to perform full pose SLAM was due to a fundamental limitation of the model: it cannot correctly maintain multiple hypotheses of pose. This is a key requirement if a robot is to function in ambiguous environments. A review of the literature revealed that rats may use multiple overlapping maps in ambiguous environments or switch maps when prompted by visual evidence, but there was no strong evidence supporting the hypothesis that rats maintain multiple pose hypotheses over time. The analysis of this model and other hippocampal models suggested that, in their current form, they are poorly equipped to deal with large, ambiguous environments.
8 RatSLAM: An Extended Hippocampal Model
Separate representations of robot orientation and spatial location are inherently unsuitable for mapping and navigation in large, ambiguous environments, as demonstrated by the work presented in Chapter 7. This chapter describes the implementation of an extended hippocampal model known as RatSLAM, which combines the concept of head direction and place cells to form a new type of cell known as a pose cell. The chapter starts with a description of the extended pose cell model. RatSLAM can use a number of vision processing systems, which are described in Section 8.2. Section 8.3 discusses a way of visualising the system’s non-metric performance. RatSLAM was tested in a range of indoor and outdoor experiments, which are described in Sections 8.4 and 8.4.4. The chapter then concludes in Section 8.5.
8.1 A Model of Spatial Pose

This section describes the overall structure and individual components of the RatSLAM system.

8.1.1 Complete Model Structure

The RatSLAM system contains several major components, as shown in Fig. 8.1. The robot's pose is encoded in a single CAN module known as the pose cell network. Both ideothetic and allothetic stimuli influence the activity in the pose cells through their respective processing modules.

8.1.2 Biological Evidence for Pose Cells

There is some limited biological evidence for cells firing in both a directionally and spatially selective manner in the rodent brain. Directionally selective place cell discharges have been reported in rodent experiments in an eight-arm maze, as shown in Fig. 8.2 (McNaughton, Barnes et al. 1983). One theory put forward by this work is that place cells may become directional in narrow geometric environments such as corridor sections. In such an environment, the rat is unlikely to turn around except at either end, and hence only experiences stimuli in one of two directions as it
Fig. 8.1. The RatSLAM system structure. The pose cell network replaces the separate head direction and place cell networks.
moves up and down the corridor. Conversely, in open environments such as a cylindrical arena, place cells are mostly non-directional (Muller, Kubie et al. 1987). Taube (1990) reported cells with a firing rate that was modulated approximately equally by both head direction and location, although the overall correspondence was weak. A computational model has been developed that predicts many of the experimental observations of directional place cells (Sharp 1991). In simulation experiments this model developed directional place cells in an eight-arm maze but non-directional cells in an open cylinder arena. Recently, 'grid cells' have been found in the entorhinal cortex (Fyhn, Molden et al. 2004; Hafting, Fyhn et al. 2005; Sargolini, Fyhn et al. 2006); these cells fire at regularly spaced intervals in the environment. Some grid cells are also conjunctive, responding only at specific orientations as well as spatial locations. However, their precise role in mapping is the subject of much current speculation, and further experimental work will surely yield more information about their function. Grid cells are discussed further in Section 12.4.
Fig. 8.2. The eight-arm maze used in rodent experiments (Carlson 1999)
Excepting grid cells, the experimental evidence does not support the concept of inherently directionally selective place cells, and the pose cell model presented here cannot be described as biologically plausible. Practical performance rather than biological plausibility was the aim of this research. While these experimental results helped retain some perspective on the concept of pose cells, the model described here is the result of a pragmatic approach to developing a model of cells that encode pose.

8.1.3 Representing Pose

To form a population of pose cells, the competitive attractor network structures that were used to model head direction and place cells were combined into a three-dimensional structure. The pose cells are arranged in an $(x', y', \theta')$ pattern, allowing
the robot to simultaneously represent multiple pose estimates in $x'$, $y'$, and $\theta'$ (Fig. 8.3). Primed co-ordinate variables are used because although the correspondence between cells and physical co-ordinates is initially Cartesian, this relationship can become discontinuous and non-linear as the system learns the environment. For example, in indoor experiments, each pose cell initially fires maximally when the robot is in a 0.25 × 0.25 m area and within a 10 degree band of orientation. However, as an experiment progresses, the pose volume each pose cell corresponds to can grow, shrink, warp, or even disappear under the influence of visual information. The network's $(x', y')$ plane can be arranged in any shape that tessellates, such as a square (Fig. 8.3), rectangle (Fig. 8.4), or hexagon. Because the wraparound connections are relatively complex, a hexagonal arrangement may be optimal: it requires the fewest wrapping connections for a given network size.
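The toroidal arrangement of a square pose cell grid can be illustrated with a short sketch. The grid dimensions below match the indoor experiments described later in this chapter, and `wrap_index` is a hypothetical helper, not part of the original system.

```python
import numpy as np

# Pose cell grid dimensions (taken from the indoor experiments:
# 100 x 50 x 36 cells over 25 m x 12.5 m x 360 degrees).
NX, NY, NTH = 100, 50, 36

pose_cells = np.zeros((NX, NY, NTH))

def wrap_index(x, y, th):
    """Map arbitrary integer co-ordinates onto the toroidal pose cell
    grid: all three dimensions wrap around."""
    return x % NX, y % NY, th % NTH
```

A cell just past the 'top' edge of any dimension wraps around to the 'bottom' layer, which is what allows activity packets to move off one face of the cube and re-enter on the opposite face.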
Fig. 8.3. The three-dimensional pose cell model. Each dimension corresponds to one of the three state variables of a ground-based robot.
Cells in the pose cell matrix have a continuous range of activity between 0 and 1. The activity in each cell encodes the probability of the robot having the specific pose associated with that cell. On a cell-by-cell basis this does not have much meaning, but when viewed as a whole, the ensemble activity can be decoded to provide an estimate of the robot's most likely pose. Fig. 8.4 shows a snapshot of cell activity in the pose cell matrix during an experiment. The largest and most strongly activated cluster of cells
Fig. 8.4. Snapshot of pose cell activity during an experiment. Several activity packets of varying size are evolving under the influence of the competitive attractor dynamics.
encodes the most likely pose of the robot. Other less dominant clusters or activity packets encode other possible hypotheses regarding the robot's pose.

8.1.4 Internal Dynamics

As with the head direction and place networks, competitive attractor network dynamics control the activity in the pose network. There are three stages to the internal dynamics:
• an excitatory update,
• global inhibition of all cells, and
• normalisation of pose cell activity.
Due to the three-dimensional nature of the pose network structure, the algorithms in each stage are more complex than those used in the pilot study, but the principles are generally the same.

Excitatory Update

A three-dimensional discrete Gaussian distribution is used to create the excitatory weight matrix, $\varepsilon$, which each pose cell uses to project activity to all other cells in the pose cell matrix. The distribution, $\varepsilon_{abc}$, is the product of a two-dimensional Gaussian distribution corresponding to excitation in the $(x', y')$ pose cell plane, and a one-dimensional distribution corresponding to excitation in the $\theta'$ dimension, and is calculated by:

$$\varepsilon_{abc} = e^{-(a^2+b^2)/k_{x'y'}} \, e^{-c^2/k_{\theta'}} \tag{8.1}$$

where $k_{x'y'}$ and $k_{\theta'}$ are the variance constants for the distributions in the $(x', y')$ plane and the $\theta'$ dimension, and $a$, $b$, and $c$ are the $x'$, $y'$, and $\theta'$ co-ordinates
within the distribution. The change in pose cell activity caused by excitation, $\Delta P_{x'y'\theta'}$, is given by:

$$\Delta P_{x'y'\theta'} = \sum_{a=0}^{N_{x'}} \sum_{b=0}^{N_{y'}} \sum_{c=0}^{N_{\theta'}} \varepsilon_{(a-x')(b-y')(c-\theta')} \, P_{abc} \tag{8.2}$$
where $N_{x'}$, $N_{y'}$, and $N_{\theta'}$ are the three dimensions of the pose cell matrix in $(x', y', \theta')$ space. The pose cells 'wrap' in all three directions; for instance, the 'top' pose cell layer in Fig. 8.3 excites both the layers directly below it and the layers at the 'bottom' of the diagram.

Global Inhibition

Because multiple pose hypotheses (represented by multiple activity packets) require time to compete and be reinforced by further visual input, inhibition is relatively gentle, meaning rival packets can co-exist for significant periods of time. The activity levels in the pose cells after inhibition, $P^{t+1}_{x'y'\theta'}$, are given by:

$$P^{t+1}_{x'y'\theta'} = \max\!\left[\, P^{t}_{x'y'\theta'} + \varphi\left( P^{t}_{x'y'\theta'} - \max(P^{t}) \right),\; 0 \,\right] \tag{8.3}$$

where the inhibition constant $\varphi$ controls the level of global inhibition. Activation levels are limited to non-negative values.

Normalisation

Normalisation maintains the sum of activation in the pose cells at one after visual and path integration input. The activity level in the pose cells after normalisation, $P^{t+1}_{x'y'\theta'}$, is given by:

$$P^{t+1}_{x'y'\theta'} = \frac{P^{t}_{x'y'\theta'}}{\displaystyle\sum_{a=0}^{N_{x'}} \sum_{b=0}^{N_{y'}} \sum_{c=0}^{N_{\theta'}} P^{t}_{abc}} \tag{8.4}$$
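The three-stage update of Eqs. 8.1-8.4 could be sketched in NumPy as follows. This is a minimal illustration, not the original implementation: the kernel radius and variance constants are assumed values, and the wraparound excitation is realised with circular shifts.

```python
import numpy as np

def make_excitatory_weights(r, k_xy, k_th):
    """Discrete 3-D Gaussian weight kernel of radius r cells (Eq. 8.1)."""
    a, b, c = np.mgrid[-r:r + 1, -r:r + 1, -r:r + 1]
    return np.exp(-(a**2 + b**2) / k_xy) * np.exp(-c**2 / k_th)

def can_update(P, eps, phi):
    """One iteration of the competitive attractor dynamics:
    excitation (Eq. 8.2), global inhibition (Eq. 8.3), and
    normalisation (Eq. 8.4). All three dimensions wrap around."""
    r = eps.shape[0] // 2
    dP = np.zeros_like(P)
    # Excitatory update: each cell projects activity through the kernel.
    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            for dth in range(-r, r + 1):
                w = eps[dx + r, dy + r, dth + r]
                dP += w * np.roll(P, (dx, dy, dth), axis=(0, 1, 2))
    P = P + dP
    # Gentle global inhibition, clipped to non-negative values.
    P = np.maximum(P + phi * (P - P.max()), 0.0)
    # Normalise total activity to one.
    return P / P.sum()
```

With a small positive $\varphi$, cells below the global maximum are suppressed only slightly each step, so rival activity packets can co-exist for many iterations before one wins.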
8.1.5 Learning Visual Scenes

The local view module is a collection of neural units that represent the input to the robot's external sensors in a form that is usable by the mapping and localisation system. Two simultaneous interactions occur between the local view cells and pose cells: mapping via associative learning, and localisation by the injection of activity into the pose cells. Fig. 8.5 shows the local view and pose cell structures and the links that form between them.
Fig. 8.5. The pose cell and local view cell structures. Associative links are learnt between an activated local view cell $V_i$ and an activated pose cell $P_{x'y'\theta'}$. Active local view cells inject energy into the pose cells through these links.

Visual scenes are associated with the robot's believed position by a learning function which increases the connection strengths between co-activated local view and pose cells. The updated connection strength, $\beta^{t+1}_{i,x'y'\theta'}$, is given by:

$$\beta^{t+1}_{i,x'y'\theta'} = \max\!\left( \beta^{t}_{i,x'y'\theta'},\; \lambda V_i P_{x'y'\theta'} \right) \tag{8.5}$$

where $V_i$ is the activity level of the local view cell and $P_{x'y'\theta'}$ is the activity level of
the pose cell. These links form the pose-view map shown in Fig. 8.1.

8.1.6 Re-localising Using Familiar Visual Scenes

The pose-view map can be used to inject activity into the pose cells, which is the mechanism that RatSLAM uses to maintain or correct its believed pose. Active local view cells project their activity into the pose cells to which they are associated, by an amount proportional to the association strength. The change in pose cell activity, $\Delta P_{x'y'\theta'}$, is given by:

$$\Delta P_{x'y'\theta'} = \sum_{i} \beta_{i,x'y'\theta'} V_i \tag{8.6}$$
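Eqs. 8.5 and 8.6 amount to a max-rule association and a weighted injection, which can be sketched as follows. The array shapes, function names, and the constant $\lambda$ are assumed for illustration; this is not the original implementation.

```python
import numpy as np

def learn_view_pose(beta, V, P, lam=0.1):
    """Strengthen local view -> pose cell links (Eq. 8.5).

    beta: (n_views, NX, NY, NTH) link strengths
    V:    (n_views,) local view cell activities
    P:    (NX, NY, NTH) pose cell activities
    lam is an assumed learning constant."""
    candidate = lam * V[:, None, None, None] * P[None, :, :, :]
    return np.maximum(beta, candidate)

def inject_view_activity(beta, V):
    """Inject energy into the pose cells from active local view cells
    through the learnt links (Eq. 8.6): a sum over the view index i."""
    return np.tensordot(V, beta, axes=1)
```

Note that the max rule means a link, once learnt, is never weakened; repeated co-activation can only strengthen it.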
The learning method that RatSLAM uses to build the pose-view map cannot usefully associate raw camera data with pose cells. The data must be processed to reduce the dimensionality of the camera image while preserving distinctive information. The single layer learning mechanism between the local view and pose cells works best when the local view is a sparse feature vector, as this avoids problems with linearly inseparable inputs. To constrain the local view structure to a practical number of cells, the vision processing system should perform spatial generalisation; activity in the
local view cells should not change significantly for small changes in robot pose, as discussed in Section 8.2.1.

8.1.7 Intuitive Path Integration

The path integration process updates the pose cell activity profile by shifting activity based on wheel encoder velocities and the $(x', y', \theta')$ co-ordinates of each pose cell. The advantage of a unified pose representation is that each cell encodes its own preferred orientation. This allows multiple hypotheses of the robot's pose to propagate in different directions, unlike in a more conventional head direction-place cell model. Many computational models of the rodent hippocampus achieve path integration by projecting existing cell activity to an anticipated future location, and allowing the competitive attractor network dynamics to shift the current activity towards this new activity. In a three-dimensional competitive attractor network it is difficult to achieve well-behaved path integration characteristics while still retaining suitable internal network dynamics. Consequently, a more pragmatic approach was taken with this extended hippocampal model. The new path integration method shifts existing pose cell activity rather than projecting a copy of the current activity forwards in time. This makes its performance independent of variable sensory update rates and robot velocity, produces more accurate robot paths, and removes the need for parameter tuning. The method has no real biological basis, unlike other methods in more biologically faithful models (Arleo, Smeraldi et al. 2001; Stringer, Rolls et al. 2002). After path integration the updated pose cell activity, $P^{t+1}_{x'y'\theta'}$, is given by:

$$P^{t+1}_{x'y'\theta'} = \sum_{a=\delta x'_o}^{\delta x'_o + 1} \; \sum_{b=\delta y'_o}^{\delta y'_o + 1} \; \sum_{c=\delta\theta'_o}^{\delta\theta'_o + 1} \alpha_{abc} \, P^{t}_{(x'+a)(y'+b)(\theta'+c)} \tag{8.7}$$

where $\delta x'_o$, $\delta y'_o$, and $\delta\theta'_o$ are the rounded-down integer offsets in the $x'$, $y'$, and $\theta'$ directions. The amount of activity injected is based on the product of the activity of the sending pose cell unit, $P$, and a residue component, $\alpha$. The residue component is spread over a 2 × 2 × 2 cube to account for the quantisation effects of the grid representation. It is based on the fractional components of the offsets, $\delta x'_f$, $\delta y'_f$, and $\delta\theta'_f$, which themselves are based on the translational velocity $v$, rotational velocity $\omega$, and preferred cell orientation $\theta'$. The integer and fractional offsets are given by Eqs. 8.8 and 8.9 respectively:
$$\begin{bmatrix} \delta x'_o \\ \delta y'_o \\ \delta\theta'_o \end{bmatrix} = \begin{bmatrix} \lfloor k_{x'} v \cos\theta' \rfloor \\ \lfloor k_{y'} v \sin\theta' \rfloor \\ \lfloor k_{\theta'} \omega \rfloor \end{bmatrix} \tag{8.8}$$

$$\begin{bmatrix} \delta x'_f \\ \delta y'_f \\ \delta\theta'_f \end{bmatrix} = \begin{bmatrix} k_{x'} v \cos\theta' - \delta x'_o \\ k_{y'} v \sin\theta' - \delta y'_o \\ k_{\theta'} \omega - \delta\theta'_o \end{bmatrix} \tag{8.9}$$
where $k_{x'}$, $k_{y'}$, and $k_{\theta'}$ are path integration constants. The residue matrix $\alpha$ is calculated in Eqs. 8.10 and 8.11:

$$\alpha_{abc} = g(\delta x'_f,\, a - \delta x'_o)\; g(\delta y'_f,\, b - \delta y'_o)\; g(\delta\theta'_f,\, c - \delta\theta'_o) \tag{8.10}$$

$$g(u, v) = \begin{cases} 1 - u, & \text{if } v = 0 \\ u, & \text{if } v = 1 \end{cases} \tag{8.11}$$
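The path integration scheme of Eqs. 8.7-8.11 can be approximated by shifting each $\theta'$ layer along its own preferred direction and spreading activity bilinearly over neighbouring cells. The sketch below is an illustration only: it assumes activity shifts in the direction of motion, uses circular shifts for the wraparound, and the calibration constants are placeholder values.

```python
import math
import numpy as np

def shift_layer(layer, dx, dy):
    """Shift a 2-D activity layer by a continuous offset (dx, dy),
    spreading activity over the four neighbouring cells with the
    g(u, v) residue weights of Eqs. 8.10-8.11, with wraparound."""
    xo, yo = math.floor(dx), math.floor(dy)
    xf, yf = dx - xo, dy - yo
    out = np.zeros_like(layer)
    for a, ga in ((0, 1 - xf), (1, xf)):        # g(u,0)=1-u, g(u,1)=u
        for b, gb in ((0, 1 - yf), (1, yf)):
            out += ga * gb * np.roll(layer, (xo + a, yo + b), axis=(0, 1))
    return out

def path_integrate(P, v, omega, kx=1.0, ky=1.0, kth=1.0):
    """Shift pose cell activity by velocities (v, omega). Each theta'
    layer moves along its preferred orientation; rotation spreads
    activity across the two neighbouring theta' layers."""
    NX, NY, NTH = P.shape
    out = np.zeros_like(P)
    for th in range(NTH):
        heading = 2 * math.pi * th / NTH        # preferred orientation
        shifted = shift_layer(P[:, :, th],
                              kx * v * math.cos(heading),
                              ky * v * math.sin(heading))
        dth = kth * omega
        tho, thf = math.floor(dth), kth * omega - math.floor(dth)
        out[:, :, (th + tho) % NTH] += (1 - thf) * shifted
        out[:, :, (th + tho + 1) % NTH] += thf * shifted
    return out
```

Because the residue weights in each dimension sum to one, the total activity is conserved by the shift, which is what makes the update independent of velocity and update rate.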
8.2 Generation of Local View

Three visual processing methods were used in the course of this work: a cylinder recognition system; a sum of absolute differences matcher; and a histogram matcher. The cylinder recognition system was used in the pilot study and has already been described in Section 7.1. This section gives an overview of the other two local view generation processes, which were developed by David Prasser, another researcher on the project (Prasser, Wyeth et al. 2004; Prasser, Milford et al. 2005).

8.2.1 Sum of Absolute Differences Module

In indoor environments, images from the Pioneer robot's camera are converted to 8-bit greyscale images and down-sampled to a resolution of 12 × 8 pixels, as shown in Fig. 8.6. Down-sampling serves two main purposes: the coarseness of the image renders it slightly invariant to the camera location and orientation, producing limited spatial generalisation, and the small size of the image reduces the computational resources required to process it. The 96-pixel images are processed by a Sum of Absolute Differences (SAD) module that produces the local view (Prasser, Wyeth et al. 2004). The SAD module compares each image with its repository of stored template images. New images that are
Fig. 8.6. High-resolution and down-sampled greyscale camera images. The low-resolution 12 × 8 pixel images are used by the RatSLAM system.
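The down-sampling step can be approximated by block averaging, as in the sketch below. The original system's exact resampling method is not specified here, so this is only an assumed stand-in.

```python
import numpy as np

def downsample_grey(image, out_w=12, out_h=8):
    """Block-average a greyscale image down to out_w x out_h pixels.

    image: 2-D array of 8-bit intensity values.
    Any rows/columns that do not fit a whole block are cropped."""
    h, w = image.shape
    bh, bw = h // out_h, w // out_w
    image = image[:bh * out_h, :bw * out_w]
    blocks = image.reshape(out_h, bh, out_w, bw)
    return blocks.mean(axis=(1, 3)).astype(np.uint8)
```

Averaging over blocks provides the mild spatial generalisation mentioned above: small shifts of the camera change each coarse pixel only slightly.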
sufficiently similar to template images are re-recognised as such, while significantly different images are learned as new image templates and added to the repository. Using this vision module, the local view consists of a one-dimensional array of cells, with each cell corresponding to a particular image template. The activity level of a cell, $a_i$, varies between 0 and 1, and is given by:

$$a_i = \begin{cases} 0, & d_i > d_{max} \\ 1/(d_i + \varepsilon), & d_i \le d_{max} \end{cases} \tag{8.12}$$

where $\varepsilon$ is introduced to avoid infinite activation levels when the distance is zero, and $d_{max}$ is a distance threshold. The distance, $d_i$, is the sum of absolute differences between pixel intensity values in the template and the current image and is given by:

$$d_i = \sum_j \left| p_j - p^*_j \right| \tag{8.13}$$

where $p_j$ is the intensity value of the $j$th pixel in the visual template associated with cell $i$, and $p^*_j$ is the value for the $j$th pixel in the current camera image. The total activity level in the set of local view cells is normalised to unity. The number of local view cells grows during an experiment, as shown in Fig. 8.7. The graph plots the active local view cells by index against time during a real world
Fig. 8.7. Local view cell activity during a 77 minute experiment. The vertical axis gives the index of the most active local view cell throughout the experiment.
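The SAD matching scheme of Eqs. 8.12 and 8.13, including the learning of novel templates, can be sketched as follows. The threshold and regularisation constants are assumed values, not those used in the original system.

```python
import numpy as np

def sad_local_view(image, templates, d_max=3000.0, eps=1.0):
    """Sum-of-absolute-differences template matching (Eqs. 8.12-8.13).

    image: flattened 12 x 8 greyscale frame (96 intensity values)
    templates: list of stored 96-value templates (mutated in place)
    Returns (normalised activities, templates); a new template is
    learned when nothing in the repository is similar enough."""
    acts = []
    for t in templates:
        d = np.abs(image - t).sum()                      # Eq. 8.13
        acts.append(0.0 if d > d_max else 1.0 / (d + eps))  # Eq. 8.12
    if not any(acts):                                    # novel scene
        templates.append(image.copy())
        acts.append(1.0 / eps)
    acts = np.array(acts)
    return acts / acts.sum(), templates                  # sum to unity
```

This captures the behaviour seen in Fig. 8.7: early in an experiment most frames fail the similarity test and become new templates, while later frames increasingly re-activate existing cells.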
robot experiment. At the beginning of the experiment the graph rises steeply, representing a rapid accumulation of new image templates. About 500 seconds into the experiment the robot repeats a familiar path for the first time, indicated by re-recognition of the first 50 or so image templates. As the experiment progresses, the robot's knowledge of the environment becomes increasingly complete; there are more frequent periods of re-recognition, and the rate of new template acquisition drops significantly, from 2 templates per second to around 0.4 templates per second. The SAD vision processing method suited the indoor environments and robot movement scheme. The corridor and wall following movement behaviours and the small-field-of-view, forward-facing camera ensured that the robot only learned forward and backward representations of large portions of the environment, rather than complete panoramic visual representations. The advantage of the scheme is its simplicity: it is a low-resolution, appearance-based method that requires only consistency in the visual appearance of the environment. There is no geometric processing of scenes and no feature extraction or tracking. The disadvantages of the scheme are its sensitivity to illumination variation and its inability to recognise new images as rotated versions of stored template images. These issues were addressed in outdoor environments by the second visual processing method, described in Section 8.2.2.

8.2.2 Image Histograms

Outdoor experiments were performed on a robot platform with a panoramic camera. The larger field of view and greater variations in illumination required a more sophisticated vision processing system. A histogram matching system was developed for generating the local view. Histograms are invariant to rotations about the camera axis, allowing recognition of familiar places with novel robot orientations. The red-green-blue colour space
Fig. 8.8. Region of the panoramic image used to generate histograms
was used because it is more robust to changes in illumination in the testing environments than the camera’s native YUV colour space (Tews, Robert et al. 2005). Only part of the camera’s panoramic image is used to produce an image histogram, as shown in Fig. 8.8. Other areas of the image contain no useful information. After conversion from YUV to a normalised red-green-blue colour space, the image was used to calculate a 32 × 32 two-dimensional histogram of the red-green components. Fig. 8.9 shows some sample histograms created from images in different environment locations. Most of the histogram bins are typically empty, with a small peak area that usually corresponds to uniformly coloured ground areas. From a local view perspective, these histograms serve the same purpose as the 12 × 8 image templates described in the previous section. As the robot moves around the environment, the current image histogram is either matched to an already learned histogram or added to the reference set. The distance between two histograms,
$\chi^2_i$, is calculated using a modified $\chi^2$ statistic:

$$\chi^2_i = \sum_j \begin{cases} \dfrac{(h_{ij} - c_j)^2}{h_{ij} + c_j}, & h_{ij} + c_j \ne 0 \\ 0, & h_{ij} + c_j = 0 \end{cases} \tag{8.14}$$

where $h_i$ is the reference histogram and $c$ is the current histogram. The $j$ subscript refers to the bin number within each histogram. Each reference histogram has an associated local view cell that is activated by similar new histograms. New local view cells are created for each new reference histogram. The activity level of a local view cell, $V_i$, is calculated using the $\chi^2_i$ distance, and normalised:

$$V'_i = \begin{cases} 1 - \chi^2_i / d_m, & \chi^2_i < d_m \\ 0, & \chi^2_i \ge d_m \end{cases} \qquad V_i = \frac{V'_i}{\sum_j V'_j} \tag{8.15}$$
The constant $d_m$ is a distance threshold beyond which a cell is inactive.

Extracting Orientation Information

Image histograms depend on the robot's location rather than its full pose. Pose cells, however, are directional: they represent a specific location and orientation of the robot. Consequently two learning processes are possible. The first involves associating histograms with entire columns of pose cells, representing all possible orientations of the robot at a certain environment location. Subsequently seeing that template would cause injection of activity into several columns of the pose cells. This activity would then propagate outwards in a spiral pattern under further robot movement. This approach is likely better suited to a particle-filter-style method. Under competitive attractor network dynamics the long-term propagation of a spiralling column of
Fig. 8.9. Sample 32 × 32 normalised red-green histograms. Each histogram was produced from an image obtained in a different environment location. The peak in each histogram corresponds to the uniformly coloured ground.
activity is challenging; the system tends to resolve the 360 degrees of pose hypotheses relatively quickly. The approach that was used incorporates a compass to provide robot orientation information. The compass is assumed only to be locally consistent rather than globally accurate. Other methods of determining relative orientation, such as feature tracking in the panoramic images, would also be suitable under this approach. Upon creation, each local view cell stores not only the image template but also the compass orientation. The compass orientation when each image template $i$ is first stored becomes the reference orientation $\alpha_i$. When the robot passes through the same environment location again, the difference in orientation, $\gamma_i$, is calculated and converted into the discrete heading space of the pose cells:
$$\gamma_i = \left\lfloor (\alpha_i - \alpha) / \phi \right\rfloor \tag{8.16}$$
where $\gamma_i$ is the relative orientation, $\alpha$ is the current compass orientation, and $\phi$ is the angular width of a pose cell. $\phi$ has a value of 10 since each pose cell initially represents 10 degrees of orientation. Eqs. 8.5 and 8.6 were adjusted to incorporate the extra relative orientation information stored with each local view cell. Equation 8.5 became:
$$\beta^{t+1}_{i,x'y'(\theta'-\gamma_i)} = \max\!\left( \beta^{t}_{i,x'y'(\theta'-\gamma_i)},\; \lambda V_i P_{x'y'\theta'} \right) \tag{8.17}$$
This equation reinforces the association between the reference orientation version of the histogram template and the pose cells that were active the first time it was seen. Multiple viewings of an image always reinforce the original template association, regardless of the orientation of the robot each time. To re-localise the robot, active local view cells inject energy into the pose cells. Which pose cells are activated depends on the relative orientation $\gamma_i$ of the current image, which determines the target of the LV-PC links. The change in pose cell activation due to visual input, $\Delta P_{x'y'(\theta'+\gamma_i)}$, is given by:

$$\Delta P_{x'y'(\theta'+\gamma_i)} = \sum_i \beta_{i,x'y'\theta'} V_i \tag{8.18}$$

The $(\theta'+\gamma_i)$ subscript shifts the activity injection point within the pose cells to adjust for the shift in orientation between the original vision template and the current image.
8.3 Visualising SLAM in a Hippocampal Model

There are many different ways of visualising and evaluating the mapping process of a SLAM system. Because there is significant variability between methods, a visualisation technique that suits one method may be unsuitable, or even nonsensical, when applied to another. There are more abstract ways of evaluating a system, such as testing whether the system can use its maps to navigate to goals; however, at this stage of the research this was not yet possible using the RatSLAM model. Some way of visualising the mapping and localisation performance of the system was still required. Many researchers choose to generate a human-friendly map of the environment using their SLAM method. Human-friendly signifies that a human observer can relate the map to the environment. A Cartesian occupancy grid map, for instance, has direct Cartesian correspondence to the environment, and an observer can easily relate parts of the map to their personal experience in the environment (see Fig. 6.1). It is also easy to qualitatively analyse the accuracy of the map, although devising a quantitative measure is more challenging. One way of quantitatively assessing the accuracy of a map is to compare the robot's perceived trajectory with the actual robot trajectory. More abstract maps may involve topological representations of significant places in the environment. A map may consist of nodes representing open spaces or rooms in
an indoor office environment, with lines representing pathways or corridors. This sort of map is usually only qualitatively assessed by checking that the relative arrangement and connectivity between places in the environment is correct. For hybrid topological-metric maps both the global topological correctness and local metrical correctness can be examined, such as for the map shown in Fig. 8.10.
Fig. 8.10. Hybrid topological-metric map. The map blends local occupancy grid maps with a global topological arrangement (Kuipers, Modayil et al. 2004). © 2004 IEEE.
In the case of the RatSLAM model, the pose cell structure initially has a direct relationship to the spatial arrangement of the environment due to the path integration mechanism. However as the robot moves around the environment and is subject to both odometric drift and visual re-localisations, the arrangement of the pose cells becomes less and less Cartesian, especially on a global scale. Therefore comparing the true location of the robot to its mapped location is meaningless. It is possible to generate a coarse occupancy grid map using the robot’s sonar or laser sensors. Because of the eventual non-Cartesian nature of the pose cell representation of the environment, the map’s appearance is quite different to conventional occupancy grid maps. For example, discontinuities in the pose cell map can result in misaligned corridor sections in the occupancy grid map. Any conventional metric comparing the shape of obstacles and spaces in the environment to the mapped shapes is therefore not suitable. Because most conventional visualisation methods are unsuitable, experimental results obtained using the RatSLAM system are evaluated using an alternative set of performance indicators. The RatSLAM world representations are analysed for their consistency and stability over time. The consistency of the robot’s perceived trajectory during an experiment is used as the primary indicator of performance, as represented by the path of the maximally active pose cell through the pose network.
Other visualisations are also used to reveal certain characteristics of the system. For the experiments involving artificial landmarks, the consistency in the perceived location of each landmark is used as an additional consistency indicator. Together the visualisations give a detailed account of the mapping and localisation performance of the extended hippocampal model.
8.4 SLAM in Indoor and Outdoor Environments

This section describes SLAM experiments run in a number of indoor environments, both with and without artificial visual landmarks. The experiments used the Pioneer robot described in Section 7.1.

8.4.1 Experiment 4: SLAM with Artificial Landmarks

The first set of experiments used the coloured cylinders as artificial landmarks. Experiments were performed with unique and ambiguous landmark configurations.

Experimental Setup with Artificial Landmarks

The test environment was a carpeted corridor area in a campus building, with dimensions of approximately 17 m × 8 m (Fig. 8.11). The only modifications made to the environment were the addition of cardboard boxes at exits to prevent the robot leaving the test arena and the placement of coloured cylinders (Fig. 8.12). Cylinders positioned with bright light sources behind them (such as windows) were almost invisible to the robot from some orientations but easily seen from others. Rectangular patches of sunlight on the floor were sometimes picked up as being cylinders and learned as landmarks.
Fig. 8.11. Floor plan of testing arena. Dotted lines indicate where cardboard boxes have been used to block corridors. Cylinder sizes are exaggerated. The colours of the cylinders were red (R), green (G), magenta (M) and blue (B).
Fig. 8.12. Testing arena and Pioneer robot with artificial landmark
The robot performed wall following to circumnavigate the test environment. Additional parallel behaviours included obstacle avoidance and homing on the cylinders. The tests were performed during the day, and as such there was a reasonable volume of human traffic through the area. The robot sometimes had to go around people who tried to obstruct it, or deal with obscured landmarks and false landmarks generated by people in brightly coloured pants. At this stage in the project the robot did not have a laser, and sonar returns off the smooth walls were poor; consequently, the movement of the robot lacked smoothness.

The robot was not given any information about the layout of the environment before the start of the test. Learning, recall and map maintenance all occurred concurrently. There was no user intervention during the tests, and no 'phases' of learning in the trial. All tests were of 40-minute duration. Two types of tests were performed, one with unique landmarks and one with ambiguous landmarks. An odometry-only test was performed to benchmark the accuracy of the path integration system.

The dimensions of the pose cell structure for these experiments were 100 × 50 × 36 cells, initially corresponding to a physical space of 25 m × 12.5 m × 360° (a significantly larger area than the test environment, so that the map could be learned without wrapping). The path integration module was calibrated so that cells initially encoded approximately a 0.25 × 0.25 m area. The initial angular resolution of each cell was 10 degrees.

SLAM Results with Unique Landmarks

The first set of tests used unique landmarks; each of the four cylinders in the environment was a different colour. However, there was still a level of ambiguity in the landmarks during this test: the cylinders looked identical from any orientation, meaning that a unique local view representation corresponds to a ring of possible robot locations, and a helix (in pose cell space) of possible pose representations.
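The ring-and-helix ambiguity can be made concrete: for an orientation-invariant cylinder seen at a given range and relative bearing, the consistent robot poses form a circle in (x, y), and adding the required heading traces a helix in (x, y, θ). A small sketch, with the landmark placed at the origin and all names being illustrative assumptions:

```python
import numpy as np

def pose_hypotheses(r, bearing, n=8):
    """Poses consistent with seeing an orientation-invariant cylinder at
    distance r and relative bearing `bearing` (landmark at the origin).
    The (x, y) components form a ring; adding heading gives a helix."""
    phis = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    x = r * np.cos(phis)
    y = r * np.sin(phis)
    # The robot must face so the landmark appears at `bearing`: the angle
    # from robot to the origin is phi + pi, so heading = phi + pi - bearing.
    heading = (phis + np.pi - bearing) % (2 * np.pi)
    return np.stack([x, y, heading], axis=1)

hyps = pose_hypotheses(2.0, 0.0, n=4)
print(np.round(hyps, 2))
```

Because the heading varies linearly around the ring, the hypotheses form a single turn of a helix in pose cell space, matching the description above.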
As expected, there was an unbounded drift in localisation when only path integration was used, as shown in Fig. 8.13a. When the local view module was activated, the RatSLAM system bounded the trajectories, as shown in Fig. 8.14a. The trajectory shown is the path of the most activated pose cell during the experiment, projected onto the (x′, y′) plane of the pose cell structure and superimposed on the floor plan for reference. At points in the experiment the robot lost localisation, perhaps because the cylinders were temporarily obscured by passers-by. However, once visual input was received the robot re-localised, as shown at point A. The mapped cylinder positions in Fig. 8.14b show tight clustering, which indicates that the robot maintained consistent localisation throughout the test.
Fig. 8.13. Results without visual re-localisation. (a) Uncorrected odometry trajectory. (b) Mapped cylinder positions based on uncorrected odometry.
Fig. 8.14. SLAM results with unique landmarks. (a) RatSLAM trajectory. (b) Mapped cylinder positions using RatSLAM.
SLAM Results with Ambiguous Landmarks

In the second set of tests the landmark ambiguity was increased by using two identical landmarks: the magenta cylinder in Fig. 8.11 was replaced with a red cylinder. The pose cells were required to maintain multiple pose hypotheses over a period of time until one hypothesis could be strengthened by evidence from a unique landmark. The trajectory of the robot remained bounded in a fashion similar to the tests with unique landmarks, as shown in Fig. 8.15. However, there were two occasions where path integration errors caused the robot to lose localisation for a period of time. The plot in Fig. 8.15a shows the two jumps where the robot recovered from these path integration errors. The system was also continuously re-localising in a less dramatic fashion. The path integration errors leading to the re-localisation jumps at points B and C were caused by wheel slip when the robot clipped an obstacle. In both cases the robot was able to re-localise to the correct pose using visual cues. The tight clustering of the cylinder locations shown in Fig. 8.15b indicates consistent localisation. The additional 'cylinder' that appears in the top left corner came from a patch of sunlight that had a similar appearance to a blue cylinder. Since this sunlight was relatively consistent throughout the experiment, the system was able to use this further ambiguous feature to its advantage, rather than the feature causing any degradation in system function.
Fig. 8.15. Results with ambiguous landmarks. (a) RatSLAM trajectory with non-unique landmarks. Note the two large re-localisation jumps at B and C after the robot built up a significant path integration error due to wheel slip. (b) Mapped cylinder positions using RatSLAM. Note the learnt ‘cylinder’ at the top left of the figure corresponding to a consistent patch of ‘blue’ sunlight.
8.4.2 Experiment 5: SLAM in a Loop Environment

The use of artificial cylinders, while useful for initial development of a SLAM system, also has its disadvantages, one example being the identical appearance of a landmark from all directions. The remainder of the indoor experiments used the image template matching procedure described in Section 8.2.1. This section describes an experiment in which the robot was required to explicitly 'close' the loop.

Experimental Setup

The test environment was a 70 metre long corridor loop at the Queensland Centre for Advanced Technologies (Fig. 8.16). The robot's control scheme used the sonar array to perform autonomous wall following. SLAM was performed live during the actual test. As with earlier experiments, the first test was run using only the RatSLAM path integration module, with no visual association or calibration, to determine the accuracy of the path integration process.
Fig. 8.16. Floor plan of loop environment. The shaded area and arrows indicate the path of the robot.
Results are presented for tests performed during work hours. The environment was not completely static – people were moving about during tests (Fig. 8.17b), doors were being opened or closed, and the lighting conditions varied. Separate corridor sections appeared similar to the robot’s camera, such as those shown in Fig. 8.17c and Fig. 8.17d.
Fig. 8.17. Robot camera captures from loop environment. (a) Dark uniform wall. (b) Bright window and person. (c) Corridor 2. (d) Corridor 1.
The dimensions of the pose cell structure for these experiments were 140 × 100 × 36 cells, initially corresponding to a physical space of 35 m × 25 m × 360°. The path integration module was calibrated so that cells initially encoded approximately a 0.25 × 0.25 m area. The initial angular resolution of each cell was 10 degrees. The only parameter change from the artificial landmark experiments was the use of a larger pose cell structure to accommodate the increased environment size without wrapping.
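Given these parameters, mapping a continuous pose onto wrapped pose cell indices is a simple quantise-and-modulo operation. A sketch under the stated grid parameters; the function and constant names are illustrative, since RatSLAM's actual indexing code is not given in the text:

```python
# Assumed pose cell grid from the loop experiment: 140 x 100 x 36 cells,
# with 0.25 m x 0.25 m x 10 degrees per cell (names are illustrative).
NX, NY, NTH = 140, 100, 36
CELL_M, CELL_DEG = 0.25, 10.0

def pose_to_cell(x_m, y_m, theta_deg):
    """Map a continuous pose to (wrapped) pose cell indices."""
    i = int(x_m / CELL_M) % NX   # x' index, wraps past 35 m
    j = int(y_m / CELL_M) % NY   # y' index, wraps past 25 m
    k = int(theta_deg / CELL_DEG) % NTH  # theta' index, wraps past 360 deg
    return i, j, k

print(pose_to_cell(1.0, 0.3, 95.0))    # (4, 1, 9)
print(pose_to_cell(36.0, 0.0, 370.0))  # wraps in x' and theta': (4, 0, 1)
```

The second call shows why the structure was sized larger than the 70 m loop's bounding area: any pose outside 35 m × 25 m wraps around and would alias onto cells representing a different part of the environment.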
Fig. 8.18. Robot localisation without visual re-localisation. The robot rapidly became lost over three laps when relying only on path integration.
Localisation Using only Path Integration

Without visual re-localisation, small errors in the reported velocities obtained from the wheel encoders accumulated over time, as shown in the trajectory map in Fig. 8.18.

Closing the Loop

When the full RatSLAM system was used, the pose cell trajectory was consistent over repeated laps, as shown in Fig. 8.19. At the start of each new lap the robot was able to repeatedly close the loop. At other stages of the experiment the system was able to recover from path integration errors. RatSLAM did not close the loop at exactly the same point, and occasionally lost localisation because of the jerkiness of the wall following behaviour. The poor sonar performance in these indoor environments meant that the robot followed a slightly zigzagging path when travelling down a corridor. With an effective field of view of only 50 degrees, the robot therefore experienced different visual sequences on different laps. A more sophisticated local movement scheme using a laser range finder was developed for later experiments.
Fig. 8.19. RatSLAM trajectory for loop environment. RatSLAM closed the loop upon starting the second and subsequent laps, and also recovered from path integration errors at other points. Dashed lines indicate the start and finish of re-localising snaps. Each grid square represents 4 × 4 pose cells in the (x′, y′) plane.
Kidnap Recovery

To further test RatSLAM's localisation ability, a number of kidnap experiments were performed. The robot was allowed to complete one or two laps of the environment before it was kidnapped, without any communication to the robot that it was being kidnapped. The initial kidnaps consisted of lifting the robot and rotating it 90 degrees before placing it down again. After these kidnaps, the RatSLAM system re-localised the robot quickly: visual information injected activity into the pose cell matrix close enough to the now incorrect activity packet to 'nudge' it into the correct position.

To increase the severity of the problem, the robot was kidnapped in both position and orientation, once again without informing the robot that it had been kidnapped. Fig. 8.20 shows how the robot was kidnapped from point A to point B and also rotated counter-clockwise by 90°. Immediately after the kidnap, multiple pose hypotheses were introduced into the pose cells. After travelling for half a lap the robot had obtained sufficient visual information to re-localise.

There is no backwards-in-time correction of the RatSLAM map after re-localising snaps. This does not prevent RatSLAM from creating maps that are topologically (although not spatially) usable even when subjected to kidnapping. The author knows of no real-time SLAM methods that have produced spatially consistent maps in experiments where the robot was kidnapped. The multiple representations and discontinuous nature of the pose cell representations are discussed further in Section 8.5.2.

8.4.3 Experiment 6: SLAM in an Office Building

This section describes an experiment in which the robot was required to perform SLAM in a large, complex, and unmodified indoor environment.
Fig. 8.20. Results of a robot kidnap. The robot was kidnapped in both position and orientation from point A to point B. After travelling 'lost' for a while, it was able to eventually re-localise to the correct pose. Each grid square represents 4 × 4 pose cells in the (x′, y′) plane.
Experimental Setup

The test environment for this experiment was the complete floor of a building at the University of Queensland campus, measuring 45 metres by 13 metres, shown in Fig. 8.21. The floor layout of this building is complex and there are multiple possible pathways between many points. To simulate movement of the robot under an exploration and goal memory system in development at the time, the robot was tele-operated. At intersections the robot was driven down a 'random' path. Excursions were made into offices, meeting rooms, laboratories and open plan spaces. Routes were traversed both forwards and backwards.

Fig. 8.21. Complete floor environment. Only large obstacles are shown, although there were a number of small irregular objects such as people, chairs and boxes present. The route traversed by the robot is shown by the thick continuous line.

Fig. 8.22. Robot camera captures from the complete floor environment. (a) Reflective glass (b) Laboratory (c) Corridor (d) Open plan space (e) Person and doorway (f) Featureless wall.

The experiment was performed during work hours. Conditions were similar to those described in Section 8.4.1. Fig. 8.22e shows an example of a transient object in the environment. Fig. 8.22a and Fig. 8.22b show examples of the lighting variation and reflections the robot encountered. The uniform white walls ensured that a large fraction of the environment appeared highly ambiguous to the robot's vision sensors.

For the purposes of comparison, the RatSLAM system was run with and without visual re-localisation enabled. Having already established the typical rate of odometric drift in previous experiments, the RatSLAM path integration trajectory is shown for this experiment, rather than the robot trajectory calculated directly from integration of encoder information. This allows the direct comparison of the RatSLAM trajectories with and without visual re-localisation.

The dimensions of the pose cell structure for these experiments were 200 × 100 × 36 cells, initially corresponding to a physical space of 50 m × 25 m × 360°. The path integration module was calibrated so that cells initially encoded approximately a 0.25 × 0.25 m area. The initial angular resolution of each cell was 10 degrees.
Path Integration only Performance

When relying only on path integration there was no consistency in the RatSLAM trajectories corresponding to repeated paths in the environment, as shown in Fig. 8.23. However, because the robot was driven by hand, its movement was smoother, with less wheel slip. The RatSLAM trajectories did not rotate as quickly as in previous experiments because of the slower rate of odometric drift.
Fig. 8.23. Inconsistent pose cell trajectory without visual re-localisation. Each grid square represents 4 × 4 pose cells in the (x′, y′) plane.
SLAM Results

With visual re-localisation enabled, the RatSLAM system was able to keep the robot localised. Fig. 8.24 and Fig. 8.25 show the RatSLAM trajectories for two experiments using different learning rates. RatSLAM stayed well localised for most of the experiment shown in Fig. 8.24, except for one significant deviation while the robot was in the main corridor section of the environment, which was finally corrected by a re-localisation jump. Two other much smaller re-localisation jumps represent recovery from momentary path integration errors.

The experiment was repeated with a higher learning rate, with the resulting RatSLAM trajectory shown in Fig. 8.25. The increased learning rate and the absence of any major wheel slippage incidents allowed the RatSLAM system to stay well localised at all times, except during one short excursion into a meeting room, after which it re-localised immediately. The pose cell representation has a high degree of topological correspondence to the actual environment, although it is not consistent in a global Cartesian sense.
Fig. 8.24. RatSLAM trajectory for the office environment (η = 0.20). Path integration errors were corrected by re-localising snaps. The corridor at A is not aligned with most of the map, showing the global non-Cartesian characteristics of the pose cell representation. Each grid square represents 4 × 4 pose cells in the (x′, y′) plane.
Fig. 8.25. RatSLAM trajectory for the office environment (η = 0.25). The trajectory was highly consistent with only small re-localisation jumps. Each grid square represents 4 × 4 pose cells in the (x′, y′) plane.
Localisation Consistency

The non-Cartesian nature of the RatSLAM representations makes it difficult to evaluate the absolute accuracy of the system. However, it is possible to make some quantitative observations with respect to the consistency of the representation. The localisation accuracy can be described in terms of consistency on subsequent passes through an area. In the map shown in Fig. 8.25, the robot's perceived position is consistent to within two pose cells (approximately 0.5 metres) with previous passes through the environment for the entire experiment, except for one 33 second period. During this period the pose error (relative to the previous pass) gradually increased to about 3 metres in position and 25 degrees in orientation. The error was then corrected by a re-localising snap.
8.4.4 Experiment 7: SLAM in Outdoor Environments

To test the flexibility of the RatSLAM system, it was implemented on a different robot platform in an outdoor setting. This section describes a SLAM experiment run in a large outdoor environment using a modified ride-on tractor developed by the Commonwealth Scientific and Industrial Research Organisation (CSIRO) (Usher, Ridley et al. 2003).

Experimental Setup

Although the tractor was capable of autonomous movement, in this experiment it was manually controlled due to the safety issues of operating a large robot in a populated environment. An omni-directional camera configuration provided 1024 × 768 YUV colour-space panoramic images of the environment. Wheel encoder information was used to perform path integration. The vision module used the electronic compass to provide relative orientation information. The tractor platform is shown in Fig. 8.26.
Fig. 8.26. CSIRO robot tractor
Tests were performed in a range of outdoor environments; however, for brevity, this section only presents a typical set of results from an experiment run on the University of Queensland campus. The robot was driven approximately 2.8 km around the campus over 30 minutes. During this time it experienced a wide variety of environment types, including footpaths, rainforest areas, roads, a bridge over a lake, and the area under a building. Although the experiment occurred on a sunny day, the illumination levels experienced by the robot varied from bright sunshine to darkness under buildings and in areas of heavy vegetation. The robot visited every point shown in Fig. 8.27 at least twice and in both directions.
Fig. 8.27. Route for the outdoor experiment. The robot started at S and finished at F. An initial traverse of the inner loop SABFS was followed by a clockwise traverse of the outer loop SACABDBFS and then a counter-clockwise traverse of the same outer loop.
The pose cell model parameters were modified to increase the size of the physical environment it could represent. The path integration module was adjusted so that cells initially encoded approximately a 2 × 2 m area, rather than the 0.25 × 0.25 m area used for the indoor experiments. The initial angular resolution of each cell stayed the same at 10 degrees. The pose cell matrix structure measured 200 × 100 × 36 cells, which without wrapping initially corresponded to an area measuring 800 m × 200 m, or a total of 160,000 square metres.

8.4.5 Path Integration only Performance

A 0.4° calibration error caused the path integration system to calculate an erroneous robot trajectory, as shown in Fig. 8.28. The tractor uses front wheel steering rather than the differential drive used on a wheelchair-style robot such as the Pioneer. This makes the system especially sensitive to calibration errors, as revealed by the trajectory in this figure.

Fig. 8.28. Robot localisation using only path integration. A 0.4° offset error in the tractor front wheel's reported steering angle led to a very high degree of odometric drift. The point labels are the same as for Fig. 8.27.

8.4.6 SLAM Results

The RatSLAM system generated consistent trajectories for each path section. Fig. 8.29 shows the trajectory of the most strongly activated pose cell during the experiment in (x′, y′) space. Key points are labelled and correspond to the labelled locations in Fig. 8.27. Although the path is not a globally Cartesian representation of the physical path of the robot, there is a high degree of correspondence between path sections in the real world and in the pose cell representation. Many path segments in the real world were consistently represented by the same paths in the pose cell structure, such as segments AB and BF.

Fig. 8.29. Robot localisation with visual re-localisation. The trajectory shows the path of the most strongly activated pose cell in (x′, y′) space, with wrapping in both directions. The thin lines indicate re-localisation jumps where the RatSLAM system closed the loop. Each grid square represents 4 × 4 pose cells in the (x′, y′) plane.

The system was able to consistently close the loop, although the exact point at which it did so varied. The large central loop in Fig. 8.27 was closed repeatedly, as indicated by the jumps from S to S′.
Other path segments were learned as separate forward and reverse paths in the pose cells. Near point D there are separate pose cell trajectories for the forward and reverse traversals of the small loop, as well as for the path from B to D. On subsequent visits to this area the dominant activity packet switched between the forward and reverse representations. This also occurred in the loop at point C, although a portion of the forward and reverse paths was represented by the same pose cells, with the representation diverging for the rest of the loop.
8.5 Summary and Discussion

This chapter has described the implementation of an extended model of the rodent hippocampus known as RatSLAM, and presented results from SLAM experiments in a range of indoor environments and one outdoor environment. The model is able to repeatedly 'close the loop' and produce consistent representations of the environment. By testing in different environments and on different robot platforms, the model was demonstrated to be flexible in its application.

8.5.1 RatSLAM Requirements

RatSLAM requires two main conditions to be fulfilled in order to successfully perform SLAM. Firstly, the robot platform must possess an external sensor and processing system capable of generating reasonably consistent output for the same environment location and robot orientation. In indoor environments this was achieved by using a forward facing camera to generate greyscale images that were down-sampled and processed by a template matching system. In the outdoor environment, where illumination changes were more significant, a normalised red-green-blue colour space was used. A histogram matching system allowed RatSLAM to make use of the extra orientation information given by the panoramic camera used in these outdoor experiments.

Secondly, RatSLAM requires appropriate sensory input in order to perform path integration. For both the Pioneer robot and the outdoor tractor platform this was achieved by using the wheel encoder velocities to update the robot's pose state. The sensory input need not be highly accurate for the system to function adequately, as was the case in the outdoor experiments, which had very large path integration errors. Beyond a certain size, the magnitude of the localisation error has no influence on the likelihood of RatSLAM re-localising the robot.
This differentiates the RatSLAM system from other SLAM methods such as the Expectation Maximisation algorithm, which requires the path integration error to be within the bounds dictated by the motion model. However, the accuracy of the path integration process does affect how long the robot can maintain localisation in the absence of external sensory input or during long periods of highly ambiguous input.

8.5.2 The Nature of RatSLAM Representations

During these experiments RatSLAM generated stable, consistent maps of an environment that were topologically correct. The maps were stable in that, after sufficient experience in the environment, the robot did not learn any new internal pose cell representations of the environment. The maps were consistent because, after sufficient experience in the environment, the robot consistently re-used its representations of a particular area as it moved through it. The topological connectivity of the pose cells correctly represented the connectivity of the environment.

However, the final representations of space in the pose cell matrix were not continuous. Initially the layout of the pose cells had a high degree of correspondence to the Cartesian layout of the environment; nearby cells encoded adjacent areas, and cells that were distant from each other represented areas that were distant from each other. As the system learned the environment visually as well as spatially, odometric drift and visual re-localisation created discontinuities in the map. The relative location of cells in the pose cell matrix no longer necessarily corresponded to the relative spatial arrangement of the areas they represented.

RatSLAM representations are also subject to two complementary phenomena: collisions and multiple representations. For small environments, where the nominal area encoded by the pose cell matrix is larger than the actual environment area, the RatSLAM representations have a relatively high degree of correspondence to the Cartesian layout of the environment (see Fig. 8.25). However, as environments become larger and more complex, or as path integration performance degrades, a number of phenomena become common. Because re-localisation is not an instantaneous process, the system learns multiple representations: multiple clusters of pose cells become associated with the same physical location, such as the path segment AS in Fig. 8.27, which is represented by multiple segments in Fig. 8.29.
The complementary phenomenon of collisions in the pose cell matrix also becomes increasingly common: clusters of pose cells in the same (x′, y′) area become associated with more than one location in the environment, such as the crossover points in the loop at D in Fig. 8.29, which each represent multiple locations in Fig. 8.27.

Many systems have the capability to perform some form of SLAM. RatSLAM produces maps that become consistent and stable over time, but which contain discontinuities and multiple representations of the same physical place. In the context of creating autonomous mobile robots, a SLAM system is only useful if a robot can use the maps it produces to perform useful tasks. The next chapter describes a pilot study investigating the use of the RatSLAM system to perform goal directed navigation on a mobile robot.
9 Goal Memory: A Pilot Study
SLAM methods vary significantly, which can make direct comparison challenging. However, there is one performance indicator which can be universally applied to all of these methods: can they be used to perform useful tasks such as goal-oriented navigation? Navigating to goals is one of the fundamental purposes of any mapping and navigation system, whether it is one used by a rat foraging for food or by a delivery robot in an office environment. The type and nature of the goal may vary between contexts, but the core problem usually remains the same: how can the animal or robot efficiently get from place A to place B?

This chapter describes the implementation of a method for goal navigation using the RatSLAM system and a cell structure known as goal memory. Section 9.1 describes the process that generates maps suitable for use in route planning and execution. During exploration the robot uses a learning process to store temporal information, which is discussed in Section 9.2. Section 9.3 describes the methods for planning and executing routes to goals. The goal memory system was tested in progressively larger environments, which are described in Sections 9.3.1 and 9.3.3. The chapter concludes with a discussion of the experimental results and the implications for goal recall using RatSLAM in Section 9.4.
9.1 Enabling Goal Recall Using RatSLAM

The task of adding a goal recall capability to the RatSLAM system was undertaken pragmatically, as with the development of the core mapping method. Some biological navigation concepts were considered, such as the idea of activity encoding the desired future location of the robot. However, there was little attempt to model the actual firing properties of biological place cells during task behaviour, with more emphasis given to the computability of the algorithms. The system is intended to be implemented on an actual robot in a typical working environment.

The goal memory system learns the temporal gradient between places in the environment. The temporal gradient refers to the time taken to navigate between places, and reflects both the speed at which the robot travels and the distance the robot is required to cover. The robot learns the gradient during exploration, at the same time as it builds the pose cell representation of the environment. To find a path to a goal, the gradient is integrated from the goal location. The integral builds quickly along long or difficult paths, and slowly along more direct paths that the robot can navigate effectively. The robot chooses the direction with the least integral, where the integral corresponds to the temporal distance to the goal.

M.J. Milford: Robot Navigation from Nature, STAR 41, pp. 117–128, 2008. © Springer-Verlag Berlin Heidelberg 2008, springerlink.com
9.2 Learning

The goal memory structure consists of a two-dimensional array of cells that are a projection of the three-dimensional pose cell structure along the θ′ axis; they represent the summation of pose cell activity along the θ′ axis (see Fig. 9.1). Cells in the goal memory system are associated with places in the environment through the pose cells, and consequently inherit the non-Cartesian spatial characteristics of the pose cell structure. The activity level of each goal memory cell, $G_{x'y'}$, is obtained by summing the activity in all the pose cells representing a certain (x′, y′) location:

$$G_{x'y'} = \sum_{\theta'=0}^{n_\theta} P_{x'y'\theta'} \qquad (9.1)$$
where P is the pose cell activity level.

Links form between goal memory cells that are currently active and the goal memory cells that were active during the last system iteration. The links convey information about the time to travel from place to place in the environment. Large values for connections correspond to long travel times, while small values indicate short travel times. The connection strengths between cells are updated based on three factors:

• the time duration between iterations, τ,
• the activity level product of the linked goal memory cells, χ, and
• the cell distance between the goal memory cells, δ.

Link updates are dominated by τ, the temporal difference between the last and current iteration, which is given by:

$$\tau = \eta_\tau \left(t_k - t_{k-1}\right) \qquad (9.2)$$

Fig. 9.1. Active goal cells are linked forwards and backwards in time
where $t_k$ is the time that the kth iteration occurred and $\eta_\tau$ is a significance factor (as are the following $\eta_\chi$ and $\eta_\delta$) that determines this parameter's contribution to the link update. This temporal difference contribution forms the basis of the goal memory cell connection strengths. The activity level product, χ, is given by:

$$\chi = \eta_\chi \left[ \frac{\left(G^k_{\max}\right)^2}{G^{k-1}_i \times G^k_j} - 1.0 \right] \qquad (9.3)$$

where $G^k_{\max}$ is the activity of the current maximally activated goal memory cell, $G^{k-1}_i$ is the activity level of cell i during the last iteration, $G^k_j$ is the activity level of cell j during the current iteration and $\eta_\chi$ is the significance parameter. Pose cell activity packets are most strongly activated at their centres, with weaker activation levels towards their boundaries. The χ parameter leads to links between two weakly activated goal memory cells being given larger values (and hence a slower transition association) than links between strongly activated cells. This ensures the goal memory system is less likely to use these connections when determining the fastest route to the goal.

The cell distance parameter δ makes link strengths somewhat dependent on the spacing between the two cells. The distance parameter is assigned a low significance factor, $\eta_\delta$. Its primary role is to differentiate between the near side and far side of the activity packet, rather than to provide a spatial measure. δ is given by:
δ = η_δ √( (x_i^{k−1} − x_j^k)² + (y_i^{k−1} − y_j^k)² )   (9.4)

where (x_i^{k−1}, y_i^{k−1}) are the cell co-ordinates from the last iteration and (x_j^k, y_j^k) are the co-ordinates of the cell from the current iteration. The updated link value, μ_k, is calculated using a moving average:

μ_k = (μ_{k−1} + τ + χ + δ) / 2   (9.5)
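Taken together, Eqs. 9.2–9.5 define a single link update. A minimal sketch of one such update (an illustrative reading of the equations, not the author's implementation; the function name and argument layout are assumed):

```python
import math

def update_link(mu_prev, t_k, t_k1, g_max, g_i, g_j, xy_i, xy_j,
                eta_tau=1.0, eta_chi=0.25, eta_delta=0.1):
    """One goal memory link update combining Eqs. 9.2-9.5."""
    # Eq. 9.2: temporal difference between iterations dominates the update.
    tau = eta_tau * (t_k - t_k1)
    # Eq. 9.3: weakly activated cell pairs yield a larger (slower) link value.
    chi = eta_chi * (g_max ** 2 / (g_i * g_j) - 1.0)
    # Eq. 9.4: low-significance spatial term separating the near and far
    # sides of the activity packet.
    delta = eta_delta * math.hypot(xy_i[0] - xy_j[0], xy_i[1] - xy_j[1])
    # Eq. 9.5: moving average of the old link value and the new contributions.
    return (mu_prev + tau + chi + delta) / 2.0
```

The default significance factors are the values reported for Experiment 8.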
Goal memory cells are linked both forwards and backwards in time; that is, μ_ij = μ_ji, where μ_ij is the forwards-in-time link and μ_ji is the backwards-in-time link.
9.3 Recall

To find the best path to a goal location, the goal memory system forms a temporal map which is based on the integral of the temporal gradient connections in the goal memory. The temporal map is an array of cells, T, that represents the time to the designated goal; the array has the same dimensionality as the goal memory array. To form the integral, the value of each cell in the temporal map is set to a large value, except for the cell which corresponds to the goal location, which is set to zero. The temporal map is then updated until it reaches a stable state. The updated activity level of a temporal cell, T, is given by:
T_i = min[ T_i , min_{j=1…N_l} (T_j + μ_ji) ]   (9.6)

where N_l is the number of links from other cells to cell i and μ_ji is the connection
strength from cell j to cell i. This update is performed for every cell during each iteration of the temporal map creation process. The temporal map is considered stable when the activity level is not decreasing in any cell. In this stable state, cell activity levels are lowest at the goal and increase outwards, representing locations further in time from the goal cell. To calculate the direction in which the robot should move, the system searches the temporal map cells near the current pose activity packet for the direction of travel that will yield the fastest decrease in cell activity. This direction specifies the quickest route to the goal, and is used continuously to perform goal-directed navigation. As with the pose cell model, learning and recall happen continuously and concurrently. By updating the goal memory as the robot attempts to achieve the current goal, the system can potentially find better paths as new information is learnt.

Experiments were performed in progressively larger and more complex environments with increasing numbers of goals and longer experiment durations. The purpose of this escalation was to test the applicability of the goal memory system to large-scale environments.

9.3.1 Experiment 8: Small Environment Goal Recall

The testing environment for the initial goal recall experiments was part of a building floor at our university campus. Fig. 9.2 shows the floor plan and robot trajectory through the environment. The experiments were conducted during normal office hours with occupants moving around and about the robot. Experiments consisted of an exploration stage and a goal navigation stage. At various times during exploration the experimenter designated the robot's position as a future goal by pressing a key on a laptop wirelessly connected to the robot. Another button press switched the robot from exploration to goal navigation mode, with the robot seeking out each goal in order.
The only user input was to choose goal locations and tell the robot to start the goal navigation process. The task of exploring, performing SLAM and navigating to each goal was left entirely to the RatSLAM system. The visual and goal memory learning processes were active during both the exploration and the goal navigation phases. The dimensions of the pose cell structure for these experiments were 140 × 80 × 36 cells, initially corresponding to a physical space of 35 m × 20 m × 360° (a significantly larger area than the test environment so that the map could be learned without wrapping). The goal memory connections were formed based on the three parameters, ητ, ηχ, and ηδ as described in Section 9.2. Parameter values of 1.0, 0.25 and 0.1 respectively were used because they were empirically found to produce sensible temporal maps. The goal recall performance was insensitive to small changes in these values.
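The temporal map integration described above behaves like a shortest-path relaxation over the goal memory link graph. A minimal sketch under that reading (hypothetical link structure and function names; not the author's implementation):

```python
def build_temporal_map(links, num_cells, goal, big=1e9):
    """Relax temporal map cell values until stable (Eq. 9.6).
    links[i] lists (j, mu_ji) pairs: incoming connections to cell i."""
    T = [big] * num_cells
    T[goal] = 0.0          # only the goal cell starts at zero
    changed = True
    while changed:         # stable when no cell value decreases
        changed = False
        for i in range(num_cells):
            for j, mu in links[i]:
                cand = T[j] + mu
                if cand < T[i]:
                    T[i] = cand
                    changed = True
    return T

def best_direction(T, neighbours, current):
    """Pick the neighbouring cell giving the fastest decrease in T."""
    return min(neighbours[current], key=lambda j: T[j])
```

Following `best_direction` repeatedly reproduces the gradient-descent-to-goal behaviour described in the text.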
Fig. 9.2. Floor plan and robot trajectory for Experiment 8. The numerical labels indicate the two goal locations. A, B, and C correspond to places of interest that are discussed in Section 9.3.2.
In order to perform these experiments autonomously, the robot was given a behaviour-based exploration strategy driven by its sonar sensors. The behaviour alternated between left-hand wall following and a turn-back-and-look behaviour; both were subsumed by an obstacle avoidance behaviour. The turn-back-and-look behaviour would attempt to backtrack along the recent path and then return to the forward position. It was triggered by either travelling a distance greater than three metres or turning through an angle greater than 30 degrees. The purpose of this behaviour was to view each region from different directions.

9.3.2 Goal Recall Results

The robot spent 13 minutes exploring the environment. During this time it simultaneously learned both the vision–pose associations and the goal memory relationships. Fig. 9.3 shows the RatSLAM trajectory just before the robot was instructed to start navigating to the goals. The small size of the environment and the absence of any significant wheel slippage ensured that there were never any large path integration errors. Consequently, when the system did re-localise, it was through nudging of the dominant pose cell activity packet, rather than large jumps from one packet to another.

When the robot was located at point A it was instructed to seek out the goals in the order they were designated. To navigate to the first goal, the goal memory system initiated the recall process, integrating the goal memory weights to form the temporal map. The temporal map's stable state is shown in Fig. 9.4. The darkly shaded regions are cells with low activity levels, indicating they are close to the goal location in time and distance. By moving in directions which resulted in the maximum decrease in the temporal map cell activity levels, the robot was able to follow the fastest route to the first goal, as shown in Fig. 9.5.
The system stopped navigation when the most probable robot position (the peak of the largest activity packet in the pose cells) was within one cell of the goal location. Navigation of the 16.5 metre path took a little over 96 seconds. Upon reaching the first goal, the recall process recalculated the temporal map for navigation to the second goal. The stable state distribution of activity is shown in Fig. 9.6. The robot navigated to the second goal by again moving in directions that resulted in the maximum decrease in temporal map cell activity, following the trajectory shown in Fig. 9.7. Navigation of the 21 metre path took about 140 seconds.

Fig. 9.3. RatSLAM trajectory just before goal navigation commenced. The goal locations are labelled with numbers. ‘A’ marks the location of the dominant activity packet at this time. Each grid square represents 4 × 4 pose cells in the (x′, y′) plane.

Fig. 9.4. The temporal map cells after recall of the first goal. Darker areas correspond to lower cell activity levels and hence locations close in time to the goal.
Fig. 9.5. The path the robot followed to reach the first goal. Each grid square represents 4 × 4 pose cells in the (x′, y′) plane.
Fig. 9.6. The temporal map cells after recall of the second goal. At B the only movement direction that results in a decrease in goal cell activity is directly to the left.
At point B the robot appeared to have a choice of two routes to the goal. Closer examination of the temporal map reveals that the robot in fact had only one choice: in order to reach the goal via the lower route, the robot would need to first move temporally away from the goal, as revealed by the higher temporal values below point B (shown by lighter shading in Fig. 9.6). There is a plateau at point C in the temporal map where it is possible for the goal memory movement vectors to be split evenly between taking the upper and lower paths. The integral operation in Eq. 9.6 generally precludes the robot ever being directed towards a plateau region in the temporal map, so in this situation the robot would not be directed towards point C. However, there are pathological situations where it could occur. In such a condition the robot's choice of path is arbitrary, and from the data gathered it is assumed to be equally split; a decision could be forced using a random function or a hysteresis mechanism.

Fig. 9.7. The path the robot travelled to reach the second goal. Each grid square represents 4 × 4 pose cells in the (x′, y′) plane.

9.3.3 Experiment 9: Large Environment Goal Recall

The second set of experiments was conducted using most of the building floor area shown in Fig. 9.8. This larger, more complex environment led to longer experiment durations, since the robot had much more ground to cover and more route possibilities. Wrapping in the pose cell structure became a consideration, since the nominal area encoded by the pose cells was smaller than the physical area of the environment. To test the overall system functionality with pose cell wrapping, a small 40 × 20 × 36 pose cell matrix was used with a nominal no-wrap area representation of 10 × 5 m, much smaller than the actual 28 × 13 m environment area.

Fig. 9.8. Floor plan of large indoor environment. The robot’s path is shown by the thick line.

9.3.4 Goal Recall Results

The pose cell trajectory for the hour-long experiment superficially bears little resemblance to the actual trajectory the robot followed through the environment, as shown in Fig. 9.9. There are many loop closures (shown by straight dashed lines), collisions, and multiple representations. However, many sections of the pose cells were consistently reused to represent repeated visits to certain areas of the environment. Over time the pose cell map became stable and did not learn any new pose cell representations.
Fig. 9.9. Dominant packet path for a 40 × 20 × 36 pose cell matrix. The path is projected onto the (x′, y′) plane, with each grid square representing 4 × 4 pose cells. ‘Start’ and ‘End’ mark the initial and final location of the dominant activity packet.
Fig. 9.10 shows the temporal map created for this environment by the goal memory system, for a goal located at the ‘Start’ location in Fig. 9.9. Because the goal memory system produces a temporal map by using a copy of the pose cell matrix, it inherits the discontinuities and multiple representations of the pose cell matrix. The multiple representations and collisions ensured that just about every cell was connected to many other cells, which resulted in very short predicted times to get between any two places in the map. The maximum predicted travel time was five seconds, compared with 120 and 150 seconds for Experiment 8. The robot attempted to use this map when it was instructed to navigate from ‘A’ to the ‘Start’ location shown in Fig. 9.8. This resulted in the robot driving a few metres in the wrong direction to the entrance of the main corridor, before getting stuck in an endless conflict loop between its goal memory module and local obstacle avoidance routine. The actual robot repeatedly tried to drive through the upper corridor wall, with the local obstacle routine overriding it whenever it got too close.
Fig. 9.10. Temporal map for the large indoor environment
9.4 Summary and Discussion

This chapter described the implementation and testing of a goal navigation algorithm that directly uses the pose cell representations. In small environments with small path integration errors the robot was able to navigate successfully to goals. However, in a larger environment, the goal memory system failed to calculate appropriate routes to the goals.

The goal memory system failed because it implicitly interprets the structural arrangement of the pose cells as corresponding to the physical arrangement of the places associated with them. These experiments showed that in large environments the pose cell structure loses spatial relevance, through a number of phenomena related to wrapping of the pose cells and visual re-localisation. Discontinuities, collisions, and multiple representations remove any useful spatial information encoded by the structural arrangement of the pose cells. Instead the pose cells develop a characteristic shared by biological place cells in the rodent hippocampus – the arrangement of the cells bears no correspondence to the spatial structure of the environment. This characteristic means it is not possible to directly extract spatial information from the pose cell structure for tasks such as navigating to goals.

In addition to the lack of spatial information, collisions in the pose cells mean that, in certain pose cell activity states, the robot has multiple hypotheses about its location. This is an undesirable characteristic for goal navigation – in such a situation the robot would have to navigate based on one hypothesis and then backtrack if it was the wrong hypothesis. A hysteresis mechanism could be used to allow the robot to ‘shoot through’ such collisions, but goal navigation would still fail for large enough collision areas in the pose cells.
Multiple representations also cause goal specification problems – for a human operator to specify a desired goal location all the pose cell representations of the goal location must be marked. The following sections discuss the concepts introduced in Section 8.5.2 and their implications for the usability of the maps in goal recall.
9.4.1 Creating Maps Suited to Goal Recall

Two approaches were investigated to address the phenomena of collisions and multiple representations in the pose cell maps. The first approach involved creating a competitive attractor network that dynamically allocated pose cells and inter-pose cell links during an experiment. Excitatory links were created between cells while the robot explored, with the link strength encoding the physical distances between the places represented by the cells. Global inhibition and normalisation were also applied as in the original model. The network was able to form interconnected cell structures when tested in simulation. However, the dynamic cell creation process resulted in the cell representation density varying for different parts of the environment. While this occurs naturally as a result of visual input in the RatSLAM pose cell network, the hard-coded excitatory and inhibitory links maintain reasonable firing profiles. In the new network the dynamically allocated links, despite their strengths being modulated by distance, did not generate stable firing profiles. When the robot was stationary, activity packets could drift quite rapidly. The path integration process was also unstable. A number of algorithms and compensatory thresholds were implemented in an attempt to solve these problems, but with no success. The approach was finally abandoned as impractical. This work did, however, provide some insight into the challenge of modelling spatial encoding in a cell network.
It is clearly a difficult task to create a system that:

• creates its cell structure and inter-cell links through a path integration process, rather than using a hard-coded structure and link system,
• encodes pose rather than place and orientation, and hence can sensibly maintain multiple pose hypotheses,
• has attractor network dynamics that allow appropriate long-term propagation of activity packets under ideothetic input but without allothetic input, as well as providing stable firing profiles in the absence of external input,
• re-localises through appropriate competition between current cell activity and activity injected through allothetic input, and
• can perform all these tasks in a large complex environment with ambiguous, non-continuous allothetic input and significant odometric drift.

A subset of these capabilities can be achieved in a system. For instance, the RatSLAM system performs all but the first task. Some self-organising models (Stringer, Rolls et al. 2002; Stringer, Trappenberg et al. 2002) create their own cell structures and inter-cell links, but require a distance-sensitive vision processing module to set up the cell firing fields and provide no means of updating the cell structure after a large re-localisation. In addition, these models self-organise a two-dimensional rather than three-dimensional attractor network. A solution was finally found not by changing the core RatSLAM system, but by extending it. In order to maintain the last four capabilities listed above, which are all desirable for a robot mapping and navigation system, while also addressing the three issues of discontinuities, collisions, and multiple representations, a new algorithm
known as experience mapping was developed. This algorithm maps directly from the core representations of robot pose and environment appearance stored in the pose and local view cell networks to create an experience map that stores spatial, visual, temporal, behavioural, and transitional information about the environment. These maps inherit the desirable characteristics of the core RatSLAM representations but remove the phenomena of discontinuities, collisions, and multiple representations through the algorithm’s map correction mechanism. Chapters 10 and 11 present the experience mapping algorithm, and show how its combination with the RatSLAM model results in a complete system for autonomous robot exploration, SLAM, goal navigation, and adaptation to environment change.
10 Extending RatSLAM: The Experience Mapping Algorithm
In environments that are larger than the nominal area encoded by the pose cells, the RatSLAM model reuses the pose cell matrix multiple times in creating a map of the environment. This leads to several phenomena in the pose cell matrix: discontinuities where adjacent physical areas are represented by distally separate cells; multiple representations where several clusters of cells represent the same physical area; and collisions where several physical areas are represented by the same set of cells. Although the representations are topologically correct, consistent and stable, these phenomena remove the spatial relevance of the pose structure layout and hence any possibility of using it directly for tasks such as goal navigation. One approach to overcoming this problem is to create an algorithm that can maintain the inter-cell spatial information while retaining the topological correctness of the pose cell maps. The experience mapping algorithm was developed to pursue this approach. This chapter describes the development and implementation of the algorithm and its integration with the RatSLAM system. Section 10.1 starts by describing the concept of an experience, the experience map, and the relationship of experiences to the RatSLAM model. This is followed by a description in Section 10.2 of how transitions are formed between experiences. The spatial information stored in transitions is used to perform map correction: the process is described in Section 10.3. Section 10.4 describes methods that were developed to allow the maps to adapt to changing environments and to self-prune during long-term continuous experiments. Sections 10.5 and 10.6 describe experiments testing the experience mapping algorithm in indoor and outdoor environments. The chapter concludes by discussing the algorithm’s place within the research literature in Section 10.7.
10.1 A Map Made of Experiences

Activity in the pose cells and local view cells drives the creation of experiences. Each experience represents a snapshot of the activity within the pose cells and local view cells at a certain time. When the set of existing experiences is insufficient for describing the activity state of the pose and local view cells, a new experience is created.

M.J. Milford: Robot Navigation from Nature, STAR 41, pp. 129–143, 2008. © Springer-Verlag Berlin Heidelberg 2008 springerlink.com

Fig. 10.1
shows an experience and how it is associated with certain pose and local view cells. x′, y′, and θ′ describe the location of the cells within the pose cell matrix associated with the experience, and V describes the local view cell associated with the experience.
Fig. 10.1. Experience map co-ordinate space. An experience is associated with certain pose and local view cells, but exists within the experience map's own (x, y, θ) co-ordinate space.
Each experience also has its own (x, y, θ) state, which describes its location within the co-ordinate space of the experience map. This co-ordinate space is completely separate from the pose and local view cell co-ordinate spaces. The first experience learned is initialised with an arbitrary (0, 0, 0) position within the experience map. Subsequent experiences are assigned a position based on the last experience's position and the robot movement that has occurred since. Experiences have an activation level that depends on how close the energy peaks in the pose and local view networks are to the cells associated with each experience. Each experience has a zone of association within the pose cells and local view cells; when the energy peaks in each network are within these zones, the experience is activated. Within the pose cells the zones are continuous, whereas within the local view cells the zone is discrete. The component of an experience's energy level determined by the current pose cell activity, E_{x′y′θ′}, is calculated by:
E_{x′y′θ′} = { 0, if r′_r > 1;  0, if θ′_r > 1;  2 − r′_r − θ′_r, otherwise }   (10.1)
r′_r = √( (x′_pc − x′_i)² + (y′_pc − y′_i)² ) / r_a   (10.2)

θ′_r = |θ′_pc − θ′_i| / θ_a   (10.3)

where
x′_pc, y′_pc, and θ′_pc are the co-ordinates of the maximally active pose cell, x′_i, y′_i, and θ′_i are the co-ordinates of the pose cells associated with the experience, r_a is the zone constant for the (x′, y′) plane, and θ_a is the zone constant for the θ′ dimension. The visual scene V acts like a switch for the experience, turning it on or off. The total energy level of the ith experience, E_i, is given by:
E_i = { 0, if V_curr ≠ V_i;  E_{x′y′θ′}, if V_curr = V_i }   (10.4)
where V_curr is the current visual scene, and V_i is the visual scene associated with experience i.
RatSLAM continuously monitors the energy levels of the experiences. If no experiences have a positive energy level, this indicates that no existing experiences sufficiently represent the current pose and local view cell activity. In such a situation a new experience is generated to represent the current activity state of the pose and local view cells.
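The energy computation of Eqs. 10.1–10.4 and the experience creation test can be sketched as follows (an illustrative reading; the function names and argument layout are assumptions, not the author's implementation):

```python
import math

def experience_energy(pc_peak, exp_cells, v_curr, v_exp, r_a, theta_a):
    """Energy of one experience given the pose cell peak (x', y', th')
    and the experience's associated cells and local view cell."""
    xp, yp, tp = pc_peak
    xi, yi, ti = exp_cells
    r_r = math.hypot(xp - xi, yp - yi) / r_a       # Eq. 10.2
    th_r = abs(tp - ti) / theta_a                  # Eq. 10.3
    if r_r > 1.0 or th_r > 1.0:                    # outside the zone: Eq. 10.1
        return 0.0
    e = 2.0 - r_r - th_r
    return e if v_curr == v_exp else 0.0           # visual switch: Eq. 10.4

def needs_new_experience(energies):
    """A new experience is created when no experience has positive energy."""
    return all(e <= 0.0 for e in energies)
```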
10.2 Linking Experiences: Spatially, Temporally, Behaviourally

As the robot moves around the environment, the experience mapping algorithm also learns experience transitions. Transitions store information about the physical movement of the robot between one experience and another, as well as the movement behaviour used during the transition and the time duration of the transition. This information is used to perform experience map correction and is also used by the goal recall process. Fig. 10.2 shows a transition from experience i to experience j. The robot's change in pose state between experiences, dp_ij, is calculated by comparing the odometric pose of the robot before and after the transition:
dp_ij = p_j − p_i = [θ_j, x_j, y_j]ᵀ − [θ_i, x_i, y_i]ᵀ = [dθ_ij, dx_ij, dy_ij]ᵀ   (10.5)
Fig. 10.2. A transition between two experiences. Shaded circles and bold arrows show the actual pose of the experiences.
where p_i and p_j are vectors describing the pose of the robot according to odometry before and after the transition, and d_ij is the odometric distance between the two experiences. This equation is only applicable when experience j is first created, with its (x_j, y_j, θ_j) pose determined by the robot pose associated with experience i and the robot's movement during the transition.
As the robot navigates familiar parts of the environment, old experience transitions may be repeated, but with different odometric information about the transition. The new information about the transition, dp_ij^new, is incorporated through an averaging process:

dp_ij^new = A·dp_ij^old + B·dp_ij^curr   (10.6)

where

A = [ 1/2, 0, 0;  0, d_s cos dθ, −d_s sin dθ;  0, d_s sin dθ, d_s cos dθ ],

B = [ 1/2, 0, 0;  0, 0, 0;  0, 0, 0 ],

dθ = ½ [ tan⁻¹(dy_ij^curr / dx_ij^curr) − tan⁻¹(dy_ij^old / dx_ij^old) ],

and

d_s = (d_ij^curr + d_ij^old) / (2 d_ij^old).
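One reading of the averaging process in Eq. 10.6 can be sketched as follows (illustrative only; `atan2` is substituted for the tan⁻¹ terms to handle quadrants, and the function name and tuple layout are assumptions):

```python
import math

def update_transition(dp_old, dp_curr, d_old, d_curr):
    """Average repeated transition odometry per Eq. 10.6.
    dp vectors are (dtheta, dx, dy); d_old, d_curr are odometric distances."""
    # Half the angular disagreement between old and current headings.
    dth = 0.5 * (math.atan2(dp_curr[2], dp_curr[1])
                 - math.atan2(dp_old[2], dp_old[1]))
    # Scale factor from the ratio of travelled distances.
    d_s = (d_curr + d_old) / (2.0 * d_old)
    # Theta row: simple average of old and current (1/2 entries of A and B).
    new_dth = 0.5 * dp_old[0] + 0.5 * dp_curr[0]
    # (dx, dy): the old vector rotated by dth and scaled by d_s (matrix A).
    new_dx = d_s * (math.cos(dth) * dp_old[1] - math.sin(dth) * dp_old[2])
    new_dy = d_s * (math.sin(dth) * dp_old[1] + math.cos(dth) * dp_old[2])
    return (new_dth, new_dx, new_dy)
```

With identical old and current measurements the transition is unchanged, as expected of an averaging scheme.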
Two other variables are used to store temporal and behavioural information. t_ij stores the time taken for the robot to transition between the two experiences; this link variable is used to form the temporal map described in Section 11.2.1. β_ij stores the local movement behaviour that was used during the transition. The behavioural link information is used to update the experience map when the environment changes using the method described in Section 10.4, and is also used to aid exploration of the environment using the method described in Section 11.1.

Fig. 10.3 shows an example of the behavioural and temporal information learned through the experience mapping algorithm. By using the algorithm, the robot is, in effect, learning routes through the environment in a robot-friendly format. From the robot's perspective, a route is a sequence of experiences that can be achieved in order by using the appropriate movement behaviours between experiences. For example, at the fork, the route that the robot follows depends on whether it uses the left wall following (LWF) or right wall following (RWF) local movement behaviour. The robot also has a concept of how long a route typically takes to traverse, as encoded by the temporal transition information between experiences. Sections of a route where the robot is forced to turn may take longer to traverse than straight sections.
Fig. 10.3. An example of the behavioural and temporal information. The numbered circles represent experiences, and CF, LWF, and RWF are movement behaviours: CF – Centreline Following; LWF – Left Wall Following; RWF – Right Wall Following.
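The route encoding of Fig. 10.3 might be represented as a table of transitions, each carrying the behaviour used and its duration (behaviour names are from the text; the data layout and timing values are invented for illustration):

```python
# Hypothetical encoding of a route fork: each transition stores the
# movement behaviour used and the traversal time t_ij in seconds.
transitions = {
    (1, 2): {"behaviour": "CF", "t": 2.0},    # centreline following
    (2, 3): {"behaviour": "LWF", "t": 3.5},   # left branch of the fork
    (2, 4): {"behaviour": "RWF", "t": 3.0},   # right branch of the fork
}

def route_duration(route, transitions):
    """Expected traversal time of a route, a sequence of experience ids."""
    return sum(transitions[(a, b)]["t"] for a, b in zip(route, route[1:]))
```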
10.3 Map Correction

When the robot sees familiar visual scenes after spending time in a novel part of the environment, visual activity is injected into the pose cell matrix, causing the robot to re-localise its perceived pose. This causes the robot's associated location within the
experience map (given by the maximally active experience) to jump from the new experience it has most recently learned to a previously learnt experience. The experience mapping algorithm learns this new transition, which contains a large discrepancy between the transition's relative spatial information and the difference between the two experiences' (x, y, θ) co-ordinates in the experience map. For example, two experiences may be positioned several metres apart in the experience map but be linked by transitions encoding much smaller distances.

By minimising the discrepancies between relative experience poses and inter-experience spatial transition information, the experience map can become locally representative of the environment's spatial arrangement. This can be achieved by consolidating the expected pose of each experience based on relative odometric information and their current pose. Under this correction process, the change in experience pose, Δp_i, is given by:

Δp_i = α [ Σ_{j=1}^{N_f} (p_j − p_i − dp_ij) + Σ_{k=1}^{N_t} (p_k − p_i + dp_ki) ]   (10.7)

where α is a learning rate constant, N_f is the number of links from experience i to other experiences, and N_t is the number of links from other experiences to experience i. The application of Eq. 10.7 to all the experiences in the experience map is dubbed the experience map correction process. The map correction process is subject to the same constraints as any network-style learning system – appropriate learning rates must be used to balance rapid convergence with instability. Experimentation determined that a learning rate of α = 0.5 results in the map rapidly converging to a stable state.

When the orientation of an experience is changed through the map correction process, the (dx, dy) component of its transitional information must also be updated to account for the rotation. The updated transitional information, dp_ij^new, is given by:

dp_ij^new = [ 1, 0, 0;  0, cos Δθ_i, −sin Δθ_i;  0, sin Δθ_i, cos Δθ_i ] dp_ij^curr   (10.8)

where Δθ_i is the change in the orientation of the experience.
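One correction iteration in the spirit of Eq. 10.7 can be sketched for the (x, y) components as follows (an illustrative reading with an assumed data layout, not the author's implementation):

```python
def correct_map(poses, links, alpha=0.5):
    """One experience map correction iteration ((x, y) part only).
    poses: list of (x, y) experience positions.
    links: list of (i, j, dx, dy) transitions from experience i to j."""
    deltas = [[0.0, 0.0] for _ in poses]
    for i, j, dx, dy in links:
        # Discrepancy: experience j 'should' sit at p_i + dp_ij.
        ex = poses[j][0] - poses[i][0] - dx
        ey = poses[j][1] - poses[i][1] - dy
        deltas[i][0] += ex; deltas[i][1] += ey   # outgoing link pulls i forward
        deltas[j][0] -= ex; deltas[j][1] -= ey   # incoming link pulls j back
    return [(x + alpha * dx_, y + alpha * dy_)
            for (x, y), (dx_, dy_) in zip(poses, deltas)]
```

With the learning rate α = 0.5 from the text, two experiences 2 m apart but linked by a 1 m transition converge towards a 1 m separation in a single step.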
10.4 Map Adaptation and Long Term Maintenance

The experience maps represent changes in the environment through modification of the inter-experience transition information. As well as learning experience transitions, the system also monitors transition ‘failures’. Transition failures occur when the robot's current experience switches to an experience other than the one expected given
the robot's current movement behaviour. If enough failures occur for a particular transition, indicated by the confidence level dropping below a certain threshold, then that link is deleted from the experience map. The confidence level, c_ij, is given by:

c_ij = n_ij / n_{iβ_ij}   (10.9)

where n_ij is the number of times the transition between experiences i and j has occurred, and n_{iβ_ij} is the number of times a transition from experience i to any experi-
ence using the behaviour β_ij has occurred. This process has some similarity to the recency weighted averaging scheme described by Yamauchi and Beer (1996) and is discussed further in Section 10.7.

In a fixed-size environment it is desirable that the experience mapping algorithm limit the number of experiences in the map. The number of experiences needs to remain limited to ensure that the map correction process is computable in real time. To give this issue a numerical aspect, during an 80 minute experiment in a typical indoor office environment the algorithm may learn around 6000 experiences. On a 1.1 GHz Pentium 3 computer running RatSLAM and the experience mapping algorithm, this number of experiences slows the map correction process to an average of two or three map correction iterations per system cycle (with RatSLAM system cycles happening seven times a second). Fig. 10.4 displays a graph showing how the map correction process slows down as the experience map grows larger. The rightmost part of the graph represents the approximate experience map size limit on this computing hardware – if the map were any larger the correction process would not perform sufficient iterations to close large cyclic loops. This could be solved by stopping the robot for a moment and dedicating the computational resources to experience map correction, but the system would no longer be considered an online system.

Fig. 10.4. Experience map growth over time. As the number of experiences in the map increases, the computational load per map correction iteration increases. This reduces the number of map correction iterations that can be performed for each iteration of the RatSLAM system.

In order to limit the size of the experience map, it is necessary to prune redundant experiences. The RatSLAM system in combination with the experience mapping algorithm has no global Cartesian sense of space, so it is difficult to identify redundant map representations. For instance, differentiating between redundancy and dynamic change in the environment is a very challenging task. The development of an experience map pruning technique involves a number of heuristic decisions:

• Should the system unlearn parts of the environment it can no longer access, and how can it identify such situations?
• What is the difference between a valid representation of a currently inaccessible location and a redundant representation of an existing location?
• In general, how should the system determine experience importance in order to prioritise the retention of experiences?

For the RatSLAM system and experience mapping algorithm there are two possible approaches – prune through spatial analysis of the experience map structure, or prune through weeding of once-off experiences. The second approach was taken in the initial implementation of a map maintenance module.
When the experience map reaches the allocated maximum size, the pruning algorithm starts removing experiences. The algorithm searches for experiences that have only been experienced once by the robot during the experiment, and selects the experience from this group with the oldest activation time stamp. By pruning these experiences, the robot forgets parts of the map which have never been verified by a second pass, but keeps representations that have been reinforced by multiple passes. The temporal selectivity ensures that older once-off experiences will be unlearned rather than those the robot has just learned exploring a newly discovered room. Due to battery life constraints, this pruning algorithm was never fully tested on a robot, although it was found to work well in tests of up to 2 hours duration.
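A minimal sketch of this pruning rule follows. The dictionary fields for the visit count and last activation time are illustrative assumptions, not the original implementation's data structures.

```python
def prune_once_off(experiences, max_size):
    """Remove the oldest once-off experiences until the map fits max_size.

    Each experience is a dict with assumed 'visits' and 'last_activated'
    fields; the real system stores much richer per-experience state.
    """
    while len(experiences) > max_size:
        # Candidates: experiences the robot has only activated once.
        once_off = [e for e in experiences if e["visits"] == 1]
        if not once_off:
            break  # every experience has been reinforced; nothing to prune
        # Prune the candidate with the oldest activation time stamp, so
        # recently learned once-off experiences (e.g. in a newly
        # discovered room) are retained over stale ones.
        experiences.remove(min(once_off, key=lambda e: e["last_activated"]))
    return experiences
```

The temporal tie-break is what preserves a freshly explored room: its experiences are once-off too, but their activation time stamps are recent.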
10.5 Indoor Experience Mapping Results

A number of indoor experiments were performed to test the experience mapping algorithm. The experiments were conducted in the environment shown in Fig. 10.5, the same environment where the original goal memory system was shown to fail. The RatSLAM system and experience mapping algorithm were both run in an online manner.
This section presents the results of two experiments performed under different pose cell conditions. The first experiment used a 100 × 40 × 36 pose cell matrix with an initial correspondence to a 25 × 10 metre rectangular environment. To stress the pose cell representation, the second experiment used a 40 × 20 × 36 pose cell matrix, corresponding to a 10 × 5 metre environment. Each experiment ran for approximately 80 minutes, during which the robot travelled approximately 1.2 km.

10.5.1 Experiment 10: Large Pose Cell Representation

Although the pose cell matrix was almost large enough to represent the environment without any wrapping, odometric error led to significant wrapping and collisions in the pose cell representation. Fig. 10.6 shows the trajectory of the dominant activity
Fig. 10.5. Floor plan for the experience mapping experiments
Fig. 10.6. Path of dominant activity packet (100 × 40 × 36 cell matrix). The path is projected onto the ( x ' , y ' ) plane. Each grid square represents 8 × 8 pose cells in the ( x ' , y ' ) plane.
Fig. 10.7. Snapshots of the experience map at various times. The robot’s location is shown by the small circle.
packet through the ( x ' , y ' ) plane of the pose cell matrix. Visual input caused 54 loop closures, shown by the long straight dashed lines in the figure. It is possible to discern the correspondence between some sections of the map and the actual environment, but the map lacks global correspondence. Fig. 10.7 shows a series of six snapshots of the experience map at different times during the course of the experiment. The times have been chosen to show critical events such as when the robot closed the loop. For instance, Fig. 10.7a and Fig. 10.7b show the map just before and after the robot closed a loop of the entire environment; the loop was 122 metres long and took eight minutes to complete. After only a few
seconds the map correction process had already drawn together the two corridor representations somewhat, as shown by Fig. 10.7b. The map correction continued until the corridor representations closely overlapped as shown in Fig. 10.7c. Fig. 10.7c and Fig. 10.7d show the experience map just before and after the robot completed a reverse lap of the left room. The last two subfigures show the state of the experience map at two later stages in the experiment when it had become mostly stable, although some minor corrections were still occurring. By the end of the experiment the experience map contained none of the discontinuities visible in the pose cell representation and all the multiple representations were grouped into overlapping areas, as shown by Fig. 10.8. Approximately 5900 experiences were created during the experiment. The entire map rotated slightly in a clockwise direction – this occurred because none of the experiences were anchored to any absolute spatial co-ordinate.
Fig. 10.8. Experience map produced using a 100 × 40 × 36 cell matrix
10.5.2 Experiment 11: Small Pose Cell Representation

When the experiment was repeated using the smaller pose cell matrix, wrapping in the pose cell matrix occurred much more frequently, resulting in an increased number of collisions and multiple representations. Significant wrapping occurred in both the x' and y' directions as shown in Fig. 10.9, compared with Fig. 10.6, in which most wrapping occurs in the y' direction only. Visual input drove 43 significant loop closures, a slight decrease from the 54 that occurred with the larger pose cell matrix. Almost every part of the pose cell matrix was used, with some areas representing many parts of the environment, such as the right section of Fig. 10.9. The experience mapping algorithm was able to produce a coherent experience map as the robot learned this complex pose cell representation. Fig. 10.10 shows the experience map: the discontinuities visible in the pose cell matrix plots are gone, and multiple representations have been grouped into overlapping areas of the map. In one section of the map at the bottom right the map correction algorithm failed to overlap
two multiple representations. The left representation was only experienced once during the entire experiment, whereas the right representation was reinforced through repeated traverses of the route. This once-off representation most probably represents a different visual configuration of that part of the environment, which changed on subsequent visits. Although battery life did not permit it, if the experiment had continued it is likely the pruning mechanism described in Section 10.4 would have removed the redundant left representation path.
Fig. 10.9. Path of dominant activity packet (40 × 20 × 36 cell matrix). The path is projected onto the ( x ' , y ' ) plane. Each grid square represents 4 × 4 pose cells in the ( x ' , y ' ) plane.
Fig. 10.10. Experience map produced using a 40 × 20 × 36 cell matrix
10.6 Experiment 12: Outdoor Experience Mapping

To test the performance of the experience mapping algorithm in a different environment and on a different robot platform, it was run with the RatSLAM system on the
data sets obtained using the CSIRO robot tractor. The algorithm was modified slightly to make use of the advantages offered by the omni-directional camera on the outdoor robot. This modification was implemented in collaboration with a fellow researcher.

Experiences representing the same location but different orientations should overlap exactly. This map characteristic was achieved by using non-directional spatial links to connect experiences representing the same location on forward and reverse traverses of a path. When the robot experienced a location a second time at a different orientation, it learned a link between the two experiences with zero length and a 180 degree relative orientation. Through the map correction process, this extra link pulled the two experiences together to produce a spatially consistent map, with the 180 degree relative orientation specification ensuring the appropriate relative directional relationship. By actively linking forward and backward traverses of a path, this modification made the algorithm superior to the indoor version, which links forward and backward paths only through indirect odometric information.

This modified experience mapping algorithm was applied to the University of Queensland campus data set. The data set was obtained by driving the robot tractor approximately 2.8 km through a variety of environment types including footpaths, rainforest areas, roads, a bridge over a lake, and the area under a building. The pose cell matrix size was 200 × 100 × 36 cells, with an approximate initial correspondence without wrapping to an area measuring 800 × 200 metres. Further details about the data set can be found in Section 8.4.4. Running the modified experience mapping algorithm on the campus data set produced the experience map shown in Fig. 10.11. During the half hour experiment the algorithm learned 1218 experiences.
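The pull exerted by a zero-length, 180-degree link during map correction can be sketched as a single relaxation step. The pose representation, the update fraction alpha, and moving only one endpoint are simplifying assumptions, not the book's exact correction equations.

```python
import math

def apply_link_correction(exp_a, exp_b, link_dist, link_theta, alpha=0.5):
    """One relaxation step for a single inter-experience link (sketch).

    exp_a, exp_b: [x, y, theta] poses in experience map space; only exp_b
    is moved here, whereas the full algorithm adjusts both endpoints.
    A zero-length link with a pi-radian relative orientation (as learned
    for forward/reverse traverses of the same place) pulls the two
    experiences onto the same point, facing opposite directions.
    """
    # Pose the link predicts for exp_b, given exp_a's current pose.
    pred_x = exp_a[0] + link_dist * math.cos(exp_a[2])
    pred_y = exp_a[1] + link_dist * math.sin(exp_a[2])
    pred_theta = exp_a[2] + link_theta
    # Move exp_b a fraction alpha of the way toward the predicted pose,
    # wrapping the orientation difference into [-pi, pi).
    exp_b[0] += alpha * (pred_x - exp_b[0])
    exp_b[1] += alpha * (pred_y - exp_b[1])
    dtheta = (pred_theta - exp_b[2] + math.pi) % (2 * math.pi) - math.pi
    exp_b[2] += alpha * dtheta
```

Repeated application drives the linked experience onto the same map point with an orientation pi radians away, which is what makes forward and reverse representations overlap exactly.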
Most of the multiple pose cell representations of the same area in the environment were grouped into overlapping representations in the experience map. The discontinuities normally present in the pose cell representations, such as those shown in Fig. 8.29, were also removed by the experience mapping algorithm. However the experience map representations of forward and reverse paths did not always overlap perfectly. For example, the representations of the two loops at the upper left and upper right of the environment were slightly misaligned. This misalignment was caused by a failure of the primary mechanism by which forward and reverse paths are linked together. The modified experience mapping algorithm has two mechanisms for drawing together multiple representations of the same place. The primary mechanism draws together multiple representations through the zero-length, 180 degree relative orientation links that are formed between forwards and backwards paths. This is the stronger mechanism and results in well-grouped multiple representations. The weaker mechanism by which multiple representations are grouped together is through the general map correction process based on relative spatial information between experiences. This is a more indirect process because it relies on the local accuracy of odometric information. Due to insufficient visual information, no direct spatial links were learned between the two sets of experiences representing the forward and backward traverses of the upper left loop in Fig. 10.11. However, the forward and backward representations of the path leading up to the loop were linked directly together and hence overlapped perfectly in the experience map. The map correction process attempted to use the common origin point of both loop representations and the inter-experience spatial
information from that point onwards to move the two loop representations towards each other. However, since odometry is only accurate over the short-term, this resulted in the representations being slightly misaligned. Despite the slight misalignments between some of the forward and reverse path representations, the map has a high degree of correspondence to the actual environment. A different visual processing method might have directly linked all the forward and backward representations resulting in a perfectly aligned map. However the misalignments are minor and would probably not impede the performance of the goal navigation process described in Chapter 11.
Fig. 10.11. Experience map for outdoor environment. The map was produced using a 200 × 100 × 36 pose cell matrix and contained 1218 experiences.
10.7 Summary and Discussion

The experience mapping algorithm uses the representations stored in the core RatSLAM pose and local view cell networks to produce maps that are spatially continuous. Multiple representations of the environment are grouped together into overlapping sections of the map. The four-dimensional state of an experience – three pose states and a vision state – means that experience maps are not subject to the collision phenomenon that occurs in the pose cell matrix.

The experience mapping algorithm shares some similarities with the ELDEN system of Yamauchi and Beer (1996). Like ELDEN, the experience mapping algorithm uses confidence links to represent the robot's certainty that a link can be traversed. However, the experience mapping algorithm also encodes behavioural, spatial, and temporal information in its links. The spatial information is relative rather than absolute and more comprehensive than in the ELDEN system; each link contains
relative angular, rotational, and displacement information, rather than just absolute orientation information. Both approaches are similar in that they do not actively map transient objects, instead relying on their local movement schemes to deal with them. For lasting objects, however, their link confidence update schemes differ. ELDEN's update scheme depends only on the current link state and a learning rate: the time it takes to adapt to a new obstacle on a path depends mostly on the robot's recent success in traversing that path. The experience map's link update scheme uses the link's entire transition history, which slows the link adaptation speed in proportion to the time the robot has spent in the environment. The two adaptation mechanisms are discussed further in Section 12.2.4.

Recently there has been much work on graphical SLAM techniques, which use some form of graph representation of the environment as the main 'map' (Lu and Milios 1997; Folkesson and Christensen 2004; Olson, Leonard et al. 2006; Thrun and Montemerlo 2006). Many graphical techniques, unlike filtering techniques such as EKFs, retain all the data obtained during navigation and postpone the mapping process until afterwards. The aim with graphical techniques such as 'Graphical SLAM' is to retain all the information about the environment and use this information exactly for as long as is practical (Folkesson and Christensen 2004). Graphs generally consist of nodes and edges: nodes represent robot poses or features, and edges represent constraints between nodes. Implementations often have several types of edges or nodes; GraphSLAM (Thrun and Montemerlo 2006), for example, uses motion edges, which link consecutive robot poses, and measurement edges, which link robot poses to the features measured at those locations. In GraphSLAM, the most likely map is calculated by minimising the functional sum of the various constraints in the graph.
These techniques have been used to successfully map large outdoor (Thrun and Montemerlo 2006) and indoor environments (Folkesson and Christensen 2004). Depending on the complexity of the environment and the assumptions made, they range from offline techniques to very fast online techniques. The primary difference between the experience mapping algorithm and most graphical SLAM techniques is their intended functional role: GraphSLAM and similar algorithms form the entire mapping system, while the experience mapping algorithm sits on top of the RatSLAM model. The experience mapping algorithm converts the existing spatial representation in the pose cells into a hybrid representation that facilitates active navigation; part of this involves 'correcting' the map, but it also serves other significant purposes, such as the incorporation of behavioural information. However, the map correction process itself is similar in concept to aspects of map management used in graphical SLAM techniques, and would most likely benefit from some of the ideas presented in the literature.

Chapter 11 describes the development and implementation of exploration and goal navigation methods that use the experience maps. The experience mapping algorithm is also tested to see how the map adaptation methods described in Section 10.4 perform in real world tests; the robot's ability to adapt to environment changes can be demonstrated more explicitly while the robot is attempting to perform purposeful tasks. In the final experiments, the RatSLAM system and experience mapping algorithm combine to create a complete robot mapping and navigation system.
11 Exploration, Goal Recall, and Adapting to Change
The experience mapping algorithm uses a process of map correction to minimise the discrepancies between the locations of experiences in the map and the spatial information stored in the inter-experience links. Links between experiences also contain temporal and behavioural information. When combined with the spatial information, this temporal and behavioural information can be used to explore an environment, navigate to goals, and adapt to changes in the environment. This chapter describes methods for performing exploration and navigation tasks and their evaluation in experiments. Section 11.1 describes the implementation and testing of an exploration method using the spatial and behavioural experience map information. The process by which experience maps are used to plan and execute routes to goals is described in Section 11.2. Section 11.3 describes experiments in a large indoor environment that tested the goal navigation algorithms. Section 11.4 presents the results of an experience map adaptation experiment in a changing environment. The chapter concludes in Section 11.5 with a discussion of the exploration, goal navigation, and adaptation methods.
11.1 Exploring Efficiently

When the environment is unknown, the robot explores using a random behaviour from its set of local movement behaviours. As the environment becomes more familiar, it is desirable for the robot to choose new paths through the environment to maximise its knowledge. This can be achieved by choosing movement behaviours that have not yet been used at behaviour intersections in the environment, where more than one movement is possible. For instance, when at a binary fork, it is desirable that the robot choose Behaviour 2 if only Behaviour 1 has been used previously. This is similar to the approach adopted by Kuipers and Byun (1991). Past behaviour usage can be detected using the experience map. A recursive tree search function is used to find routes from the currently active experience through the experience map using only transitions of a specific behaviour type. If a sufficiently long continuous path is found through the experience map for a certain behaviour,

M.J. Milford: Robot Navigation from Nature, STAR 41, pp. 145–161, 2008. © Springer-Verlag Berlin Heidelberg 2008 springerlink.com
that behaviour is tagged as having already been followed from the current environment location. During exploration the robot attempts to use previously unused movement behaviours to experience new routes through the environment.

The complexities of real environments and robot movement mean that the search must be robust to single-transition interruptions of a route that is otherwise traversable using a constant movement behaviour. When a robot passes through an intersection such as the fork shown in Fig. 10.3, the movement behaviour may rapidly alternate during the change from centreline following to left or right wall following. The search algorithm must be able to propagate through such periods of instability. This can be achieved by making the algorithm search for routes that have been mostly traversed using one movement behaviour. Each time the algorithm searches along the transition between two experiences, it updates a behaviour use variable η_si, which represents the fraction of the total current search route i that can be traversed using the search behaviour β_s. η_si is calculated by:

η_si^(r+1) = k·η_si^r + (1 − k),   if β_ab = β_s
η_si^(r+1) = k·η_si^r,             if β_ab ≠ β_s          (11.1)

where r is the search iteration, β_ab is the behaviour used in the transition between experiences a and b at the current search location, and k is a weighting variable determined by the experience rank:

k = rank / (rank + 1)          (11.2)

The rank refers to how far the experience for which the algorithm is searching is from the search origin point. The rank starts from zero and gets progressively larger as the search moves away from this origin point. Through the weighting variable k, the ranking system assigns more importance to behaviours used near the current robot position than to those used further away from the robot.

Searches progress only a certain temporal distance t_s from the current experience. Once this distance is reached, the algorithm evaluates the behavioural consistency of the route found by the search process. The robot is deemed to have previously followed a route using behaviour β_s if the final η_si for that route exceeds a threshold value. The number of previously followed routes, N_s, is given by:

N_s = Σ_i λ_si          (11.3)

λ_si = 1,   if η_si ≥ η_min
λ_si = 0,   if η_si < η_min          (11.4)
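Equations 11.1–11.4 can be applied to a candidate route as in the following sketch. The real system evaluates routes via a recursive tree search over experience transitions; here a route is flattened to a list of behaviours for clarity, with a transition's rank taken to be its list index.

```python
def route_behaviour_use(transitions, beta_s):
    """Final eta value for one route (Eq. 11.1). 'transitions' lists the
    behaviour used at each transition, ordered outwards from the search
    origin, so that a transition's rank is its index."""
    eta = 0.0
    for rank, beta_ab in enumerate(transitions):
        k = rank / (rank + 1.0)  # Eq. 11.2: weight grows with rank
        # Eq. 11.1: decay eta, and reinforce it when the behaviour matches.
        eta = k * eta + ((1.0 - k) if beta_ab == beta_s else 0.0)
    return eta

def previously_followed_count(routes, beta_s, eta_min=0.05):
    """N_s, the number of routes already followed using behaviour beta_s
    (Eqs. 11.3 and 11.4)."""
    return sum(1 for r in routes if route_behaviour_use(r, beta_s) >= eta_min)
```

A route traversed entirely with the search behaviour yields eta = 1, a route never using it yields 0, and brief behaviour alternations at intersections only partially reduce eta, which is what lets the search propagate through unstable transitions.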
11.1.1 Experimental Evaluation

The exploration algorithm was tested during the indoor experience mapping experiments described in Section 10.5, with η_min set to 0.05. The experiments allowed the algorithm to be evaluated in a complex indoor environment with a variety of intersection configurations. Two aspects of the algorithm's performance were of particular interest: on a global scale, the algorithm needed to provide route decisions that would lead to the robot thoroughly exploring the entire environment; on a local level, it needed to provide a sequence of route choices that would ensure the robot experienced the entire set of paths through each intersection.

The robot successfully used the exploration algorithm to explore the entire environment shown in Fig. 10.5. The robot repeatedly traversed all possible pathways, as shown by the experience map trajectories in Fig. 10.8 and Fig. 10.10, which correspond to the path of the robot during each experiment. Different areas of the environment, such as corridors, rooms, and open plan space, were all thoroughly explored. The robot did not spend excessive time in well known areas of the environment while unknown areas remained to explore.

The performance of the local route choice mechanism was analysed by examining the movement behaviour of the robot at intersections. The intersection at the entrance to the room containing goal number two was of particular interest (see Fig. 11.4), as the robot entered or exited it a total of 27 times, as shown in Fig. 11.1. The first three times the robot entered or left the intersection it used a left wall following behaviour (Fig. 11.1a); the next three times it recognised the previous use of that behaviour and used a new right wall following behaviour (Fig. 11.1b). After that point the movement behaviour became progressively more complex.
Rather than maintaining a single behaviour throughout, the robot started switching behaviours to try to create new routes. For instance, the first half of route number seven used a right wall following behaviour before switching to a left wall following behaviour for the second half. More complex behavioural sequences resulted in loops, such as for routes 17, 20, and 26. If the search algorithm detects that the robot has already trialled all the possible behaviours from a point, the robot maintains its current behaviour, which leads to it repeating previous movement behaviours later in an experiment. When the robot entered the intersection from the left, there were only three movements the robot could execute using its behaviour scheme: go straight to the right (routes 1, 15, 19, 23, 25), enter the room and move downwards (routes 4, 12), and enter the room and move to the right (routes 7, 21). After initially trying all three (routes 1, 4, 7), the robot repeated them throughout the rest of the experiment.

11.1.2 Discussion

By attempting novel movement behaviours at critical points in the environment such as forks and intersections, the robot experiences as much of the environment as is possible using its set of local movement behaviours. This introduces a dependence on suitable movement behaviours; without suitable behaviours the robot may not learn
Fig. 11.1. Robot movement at an intersection over 80 minutes. The labels indicate the order in which the paths were traversed.
all the possible routes through the environment. However, with an appropriate set of local movement behaviours, the robot is able to learn all the possible routes through the environment. This knowledge of routes and how to follow them provides a solid foundation on which to build a goal directed navigation system.
11.2 Recalling Routes Using a Temporal Map The experience maps provide the spatial, temporal, and behavioural information used to perform goal directed navigation. However, in order to use this information to plan and execute routes to a goal, a number of processing steps must be performed. Ideally a goal recall system should plan the ‘best’ route to the goal – in this case the best route is defined as being the fastest route. As such, some means of determining the temporal length of a route is required – this is achieved by forming a temporal map. Gradient climbing techniques are then used to find the optimal route. 11.2.1 Temporal Map Creation To create the temporal map, the current active experience (representing the robot’s current location) is seeded with a zero time stamp value, and all other experiences are assigned a ‘very large’ value. Time stamp values are then assigned to linked experiences based on the current experience's time stamp and the temporal link information. This process is iterated, with any experience A only updating experience B if they are linked and if experience B's time stamp is larger. The updated time stamp value, is given by:
(
t kj +1 = min τ , t kj
τ j = ti + tij
)
t kj +1 ,
(11.5) (11.6)
where ti is the time stamp value of experience i, tij is the temporal link from experience i to experience j, τj is the resultant proposed time stamp value of experience j, τ is the set of proposed time stamp values for experience j, and k is the iteration number. The map creation process stops when the time stamp values of all the experiences reach a stable state. By propagating time stamps from the robot’s current location and only updating experience time stamps to lower values, the temporal map creation process always produces one increasing path of experience time stamp values from the robot location to the goal. The path is guaranteed to represent the shortest time duration route to the goal, by virtue of the final map stability condition, which indicates no experience can be reached any more quickly in time than as indicated by its current time stamp value. If there are two routes to the goal that are exactly equal in temporal length, the temporal map will have two increasing time stamp paths; however, in practice, this is unlikely to occur. In this situation the robot can use either route to navigate successfully to the goal.
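The relaxation in Equations 11.5 and 11.6, together with the gradient-descending route extraction it enables, can be sketched as follows. The graph encoding is an assumption: experiences are plain ids and temporal links are seconds.

```python
def build_temporal_map(links, start):
    """Iterate Eqs. 11.5-11.6 until every time stamp is stable.

    links: experience id -> list of (linked experience id, link time).
    The current experience is seeded with zero; all others start
    'very large'.
    """
    ids = set(links) | {j for out in links.values() for j, _ in out}
    t = {i: float("inf") for i in ids}
    t[start] = 0.0
    changed = True
    while changed:  # stop when the map reaches a stable state
        changed = False
        for i, out in links.items():
            for j, t_ij in out:
                tau = t[i] + t_ij  # Eq. 11.6: proposed time stamp
                if tau < t[j]:     # Eq. 11.5: keep the minimum
                    t[j] = tau
                    changed = True
    return t

def trace_route(t, links, goal):
    """Follow the decreasing temporal gradient from the goal back to the
    zero-stamped robot location, then reverse to obtain the route."""
    route = [goal]
    while t[route[-1]] > 0.0:
        # The predecessor on the shortest route is the experience whose
        # time stamp plus link time produced the current time stamp.
        route.append(next(i for i, out in links.items()
                          for j, t_ij in out
                          if j == route[-1]
                          and abs(t[i] + t_ij - t[route[-1]]) < 1e-9))
    route.reverse()
    return route
```

The stability condition is exactly the Bellman relaxation fixed point, which is why the traced path is the temporally shortest route.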
The experience map is not necessarily completely interconnected, so a temporal map may not include all of its experiences. Once the robot starts navigating to the goal it may come across and activate experiences not involved in the original temporal map. In such a situation the robot forms a new temporal map that may not include the original route, and may have a different shortest route from the original. The route following mechanism is structured so that if the new route is longer, the robot remembers and continues to follow the old, shorter route for a fixed period of time before switching to the new route. If the new route is shorter, the robot switches to it immediately and hence uses the new information to reach the goal more rapidly.

11.2.2 Route Planning

A gradient climbing procedure is used to find the shortest route to a goal, starting at the goal location. This technique finds the temporally shortest path from the goal to the current robot location by searching for experiences with smaller time stamp values. The route is stored as a sequence of nodes, with each node corresponding to an experience. Each node contains its (x, y) location within the experience map co-ordinate space as well as its time stamp value. The time stamp values represent the expected absolute times at which the robot will reach each node along the route to the goal. Fig. 11.2 shows an example of a route calculated using a temporal map. This particular route to the goal contains about 50 nodes corresponding directly to experiences, numbered sequentially from the robot's current 'Start' location. The route has been calculated by starting at the goal location and following the decreasing temporal gradient to the robot's current location. There is only one possible search direction
Fig. 11.2. An example of a route calculated using a temporal map. The dots are experiences stored in the route and the numbers indicate the index of the experience, starting at one at the robot’s location. The temporal distance is in seconds.
from the goal location, as all other directions (for instance to the left or downwards) result in an increase in temporal value. In fact, at all stages along the route there is only one direction of decreasing temporal gradient, due to the integral nature of the temporal map creation process.

To increase the robustness of the route planning process, a spread of experiences centred on the goal location is used as the goals. The route planning process is used to find the shortest route back to the robot's location from each of these experiences. By using a distribution of goal experiences rather than a single goal experience, the path planning process is able to choose the best route to the goal from a number of possible routes. The set of experiences used as goal candidates, G, is determined by each experience's distance from the goal location, and is given by:

G = { E : ‖p_j − p_goal‖ < Δp_max, n < n_max }          (11.7)

where E is the complete set of experiences, p_j is the (x, y) location of experience j within the experience map, p_goal is the goal location within the experience map, Δp_max is the maximum allowable experience–goal separation, and n and n_max are the number and maximum number of candidate experiences.

11.2.3 Behaviour Arbitration

Spatial, temporal, and behavioural information can be used to follow the planned route to the goal. The behavioural information tells the robot which movement behaviour is required to move between experiences along the route. The spatial information tells the robot what physical movement is required to move between the two experiences. Temporal information is used to update stored routes when the robot is unable to actively update the route to the goal.

One method of following a route involves using the movement behaviour stored for the transition from the robot's current experience to the next experience in the route. This approach was used in the initial route following module and was found to function well except in certain situations. The robot struggled to correctly navigate through complex intersections where behaviours could rapidly alternate. At these intersections it is critical that the correct movement behaviour be used at the right time. Navigation performance was found to be very sensitive to slight variations in the timing of the experience activations. If an experience was activated slightly earlier or later than normal, the robot could switch behaviours too early or too late and miss a critical turn. An example of sensitivity to behaviour switching is shown in Fig. 11.1, where route number seven involves a switch from right wall following behaviour to left wall following behaviour. The timing of the switch is crucial: switch too early and the robot misses the left turn and continues down the corridor; switch too late and it might fail to take the left turn after entering the room.
During experiments the sensitivity hampered the system’s ability to follow the correct path at critical junctions and increased the average time the robot took to attain each goal. As a result further
development of the route following process focused on a combined spatial-behavioural approach to the problem. The spatial transition information stores the relative movement of the robot from one experience to the next in terms of the translation distance, the change in robot orientation, and the relative angle between the robot’s initial orientation and the angle to its final location. Although the experience map correction process does not necessarily produce a globally Cartesian map, it does give local sections of the experience map a high degree of spatial correspondence to the environment. This information can be used to spatially compare the current section of the planned route to the goal with the local paths suggested by the robot movement behaviours.

In Fig. 11.3 the robot is located at a fork and has two reactive movement behaviour options: left wall following or right wall following. The robot can predict the local path it would take using each behaviour, shown by the dashed lines in the figure. The local section of the stored goal route can be overlaid on top of these paths, as shown by the solid line. To navigate to the goal, the robot uses the behaviour which results in the closest path to the stored goal path. The overall closeness of the goal route to a local movement path, C_i, can be calculated as follows:
C_i = \sum_{t=0}^{t_{max}} \min \left\| \mathbf{p}_i - \mathbf{p}_t^{*} \right\| \qquad (11.8)

where p_i is the vector of points along the local behaviour path, p_t* is the point at time t along the stored goal route, and t_max is how far ahead in time the path comparison is performed. The robot chooses the behaviour which has the closest local movement path to the goal route, such as the left wall following behaviour in Fig. 11.3.
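To make the arbitration concrete, here is a minimal sketch of Eq. (11.8) and the resulting behaviour choice. It assumes paths and routes are plain lists of (x, y) points sampled at the comparison time steps; the function and variable names are illustrative, not from the book:

```python
import math

def path_closeness(behaviour_path, goal_route, t_max):
    """Eq. (11.8): sum, over the route points p*_t for t = 0..t_max, of the
    distance from p*_t to the nearest point on the predicted behaviour path."""
    total = 0.0
    for t in range(min(t_max + 1, len(goal_route))):
        gx, gy = goal_route[t]
        total += min(math.hypot(gx - px, gy - py) for px, py in behaviour_path)
    return total

def choose_behaviour(behaviour_paths, goal_route, t_max=10):
    """Pick the behaviour whose predicted local path lies closest to the
    stored goal route (e.g. left vs. right wall following at a fork)."""
    return min(behaviour_paths,
               key=lambda b: path_closeness(behaviour_paths[b], goal_route, t_max))
```

At the fork of Fig. 11.3 the robot would evaluate both wall following options this way and select the one with the smaller C_i.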
Fig. 11.3. Behaviour arbitration during goal navigation. In this situation the robot would choose to go left, as that local movement path more closely matches the proposed route.
11.2.4 Route Loss Recovery

As the robot navigates toward the goal, the temporal map is constantly updated. After each map update the path planning process finds the shortest route to the goal. However, because the environment is not static and the robot’s path is not completely repeatable, new visual scenes can trigger the learning of new experiences even when the robot is traversing a known path. Learning new experiences allows the robot to more completely represent a route, but also results in the temporary loss of a functional temporal map, since time stamps cannot be propagated from a brand new experience that is not yet linked to any others. The robot may also move into a separately connected section of the experience map and generate a different temporal map with a longer planned route to the goal, as discussed at the end of Section 11.2.1. Both of these situations result in the robot losing its ‘best’ route to the goal.

When the planned route to the goal degrades or is lost, the robot uses the last ‘best’ route to the goal to navigate for a certain period of time. However, the robot has no active feedback on how far along this route it has progressed, since it is no longer activating any of the experiences associated with the route. In order to maintain its perception of where it is along the remembered route, the robot discards sections of the route with time stamp values older than the current time, as these sections should already have been used.
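The discarding step can be sketched as a one-line prune. The representation is an assumption for illustration, not the book’s actual data structure: the remembered route is held as (time stamp, experience) pairs ordered from the robot’s last known position toward the goal:

```python
def prune_remembered_route(route, current_time):
    """Drop the sections of the remembered route whose time stamps are
    older than the current time; those sections should already have been
    traversed, so only the remainder is kept for route following."""
    return [(stamp, experience) for (stamp, experience) in route
            if stamp >= current_time]
```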
This update mechanism becomes less and less valid as time passes, but keeps the robot moving sensibly until one of three possible outcomes occurs:

• The robot passes through the novel aspects of the environment back into familiar territory, regenerates the temporal map, and continues along the updated original route;
• the robot enters a novel part of the environment and reverts to exploration to try to find a familiar part of the map again and re-establish a route to the goal; or
• the robot remains in the alternative section of the experience map and switches to the longer route suggested by the new temporal map.

In the case of the first outcome, being able to follow a stored route for a short period of time ensures that the robot can maintain goal navigation while learning some new visual aspects of the environment.
11.3 SLAM and Navigation in a Static Environment

This book has now presented all the necessary components for full SLAM and goal navigation experiments: a core SLAM system, an experience mapping algorithm that creates directly usable maps, and exploration and goal navigation methods that make use of the experience maps. This section describes the experimental evaluation of the goal navigation algorithm in a large indoor environment.

Experiments were run in the environment shown in Fig. 11.4. This is the environment in which the initial goal memory system failed, as discussed in Section 9.4. Each experiment consisted of an hour of exploration, followed by sequential navigation to six goal locations (goal navigation commenced with the robot navigating from the top left corner to goal number one). Goals were specified by using the mouse to click on
the desired goal locations in the experience map. As with the experience mapping experiments discussed in Section 10.5, two different pose cell matrix sizes were used. For the first experiment, a 100 × 40 × 36 (x′, y′, θ′) matrix was used, corresponding to approximately a 25 × 10 metre area. The second experiment used a much smaller 40 × 20 × 36 matrix, corresponding to a 10 × 5 metre area. This second experiment was intended to test the goal navigation performance of the system with significant collisions, discontinuities, and multiple representations in the pose cells.
Fig. 11.4. Floor plan showing the robot’s path and goal locations
11.3.1 Experiment 13: Goal Recall with Minor Pose Collisions

The presentation and analysis of the experimental results for each experiment are split into two sections. The first section covers the routes the robot calculated to each goal. The second section covers the routes the robot actually followed to reach each goal.

Calculating Routes to the Goal

Three of the six routes the robot planned to the goals were close to optimal – the routes to the second, fourth, and sixth goals, which are shown in Fig. 11.5. The first and fifth goal routes were significantly suboptimal, and the third goal route was extremely indirect. One possible cause of the long roundabout route to goal three is that the robot may not have learned it could turn right into the corridor upon exiting from the left hand side of the room containing goal two. Examination of the robot’s movement at this intersection during the experiment reveals that the robot in fact never did experience this transition. The closest it came was learning that it could turn right from the room after exiting from the right hand side of the room, which it did during transitions five and seventeen (see Fig. 11.1). Having never experienced this particular transition, the robot did not have the knowledge necessary to plan a route that involved exiting from the left hand side of the room and turning right into the corridor.
Fig. 11.5. Temporal maps, planned and actual routes for experiment 13. Temporal maps and planned routes, with the temporal distance measured in seconds (a, c, e, g, i, k). Actual routes executed by the robot (b, d, f, h, j, l). Four of the executed routes were close to optimal or somewhat suboptimal (b, d, h, j). Two routes involved major movement errors (f, l). Overlapping experiences were rendered in order of their temporal value, with lower value layers rendered above higher value layers.
Following Routes to the Goal

The robot followed the planned route to the goal for goals one and two, as shown in Fig. 11.5. The robot followed routes that were quite similar to the planned routes for goals four and five, but with some minor variations. For the routes to goals three and six the robot deviated very significantly from the initial planned routes. From observing the experiments it was clear that most of the mistakes made by the route following algorithm resulted from using the incorrect local movement behaviour at critical intersections. It was also clear that once the robot took a wrong turn, the route planning algorithm immediately started to plan new ‘shortest’ routes to the goal. After mistakenly turning into the room containing goal two while navigating to goal six, the robot attempted to exit back into the corridor, as shown by the loop at the entrance to the room. It was not able to time its switching of local movement behaviours correctly and ended up doing a complete loop of the room.

11.3.2 Experiment 14: Goal Recall with Major Pose Collisions

The second experiment tested the system’s goal navigation performance with a much smaller pose cell representation. Fig. 11.6 shows the temporal maps, planned routes, and actual routes taken by the robot to each goal during the experiment.

Calculating Routes to the Goal

The path planning process calculated nearly optimal routes for four of the goals – goals one, two, four, and six. The planned routes to goals three and five were significantly suboptimal. Unlike the previous experiment, however, the robot was able to plan routes involving a right turn into the corridor after leaving from the left side of the room containing goal two, resulting in improved path planning performance.

Following Routes to the Goal

The robot followed the planned routes to goals one, two, and six exactly. In following the planned route to goal three the robot made one minor deviation from the planned path.
The robot’s performance in following the planned route to goal four was very poor – the robot made significant deviations from the planned route at two points, resulting in long detours into two side rooms. Halfway along the original planned route to goal five the robot calculated a new route that was slightly shorter than the planned route. The robot switched to this new route and followed it to the goal. It is likely the robot encountered an alternative route not directly available from the robot’s starting location in the experience map.

11.3.3 Discussion

Although analysis of the system’s behaviour is useful and allows further refinement of the methods, from a practical perspective it is the system’s actual performance that is most important. In practical terms, the goal recall system performed reasonably but has significant room for improvement. In all twelve trials the robot planned a route to the goal, navigated to the goal, and registered that it had been attained. In half of these trials the final route that the robot took to the goal was close to optimal. Three routes
Fig. 11.6. Temporal maps, planned and actual routes for experiment 14. Temporal maps and planned routes (a, c, e, g, i, k). The actual routes executed by the robot (b, d, f, h, j, l). Five of the executed routes were optimal or somewhat suboptimal (b, d, f, j, l). One route involved major errors (h).
were significantly suboptimal but still reasonable paths to the goal, while the remaining three routes were very indirect and involved significant navigational mistakes. In the context of evaluating the usefulness of the experience maps for performing goal recall, these results are encouraging. The goal recall process had a 75% success rate at guiding the robot to each goal along either an optimal or a suboptimal but reasonable route.

Most of the navigation failures were due to one of two causes. The first relates to the route planning process only being able to calculate routes that consist of path segments the robot has actually experienced. If the local movement behaviours and exploration algorithm did not take the robot along all the possible paths through the environment during exploration, the route finding algorithm was significantly handicapped. The second cause of navigation failure was the robot’s poor determination of the required movement behaviour at intersections during goal recall. This resulted in the robot occasionally making incorrect movement behaviour decisions at critical locations when attempting to follow a route. The problem was compounded by the robot control system using only purely reactive movement behaviours. After making a wrong turn the robot had no way of turning around to correct its mistake unless the correction could be achieved in a purely reactive manner. For example, the robot had no way of turning around when following the wall in the room containing goal two. It could only get back to the corridor by doing a complete lap of the room.
11.4 Adapting to Environment Change

To investigate the adaptation capabilities of the experience map, an experiment was performed in which a single obstacle was added and then removed from the environment while the robot explored and navigated to goals. This experiment extended the experiments described in the previous section by introducing the requirement that the robot adapt to changes in the environment.

11.4.1 Experiment 15: Indoor Map Adaptation

Fig. 11.7 shows the testing environment and the location where an obstacle (a cardboard box) was added and then removed. During the experiment the robot was
Fig. 11.7. Floor plan for the adaptation experiment. The obstacle was a cardboard box that was added to and removed from the location indicated in the figure.
instructed to navigate from the start location to the goal location three times. The first instruction came just after the obstacle was placed in the environment, after the robot had spent about twenty minutes exploring the unmodified (no obstacle) environment. The second command occurred six minutes later, with the obstacle still in the environment. The obstacle was then removed, after which the robot was allowed to explore for six minutes before the third and final goal navigation trial was started.

11.4.2 Results

Navigating to the goal for the first time, the robot was unaware of the obstacle, as revealed by the planned path to the goal shown in Fig. 11.8a. After multiple attempts to get to the goal via this path, the robot unlearned the inter-experience links near the obstacle’s location. The robot then planned a new route and followed it to reach the goal. The second time the robot was told to navigate to the goal, it immediately planned the longer route, having learned during the first trial that the top route was blocked (Fig. 11.9). The robot successfully followed this route to the goal. During the period of exploration after reaching the goal a second time (and after the removal
Fig. 11.8. Trial 1 goal navigation results. (a) The temporal map and planned route to the goal. (b) The actual route the robot followed. The temporal distance is measured in seconds.
Fig. 11.9. Trial 2 goal navigation results. (a) The temporal map and planned route to the goal. (b) The actual route the robot followed.
Fig. 11.10. Trial 3 goal navigation results. (a) The temporal map and planned route to the goal. (b) The actual route the robot followed.
of the obstacle), the robot learned new links connecting the experiences along the top corridor. When told to navigate to the goal a third time, the system planned a route along the top, now unblocked, path. The robot successfully navigated to the goal along this route (Fig. 11.10).

Although the confidence level threshold can be tuned to reduce the time taken to learn an obstacle, at its current level the robot avoids learning shorter-term obstacles such as people. This results in the robot requiring multiple failed navigation attempts to unlearn a route, as shown by Fig. 11.8b. The lack of a turn-around behaviour once again affected the goal navigation performance, as shown by the start of the route in Fig. 11.9b.
11.5 Discussion

The robot’s exploration method requires a set of movement behaviours that produce repeatable trajectories through an environment, making it particularly suitable for environments with defined pathways such as building interiors or outdoor pathways and roads. Using left wall, right wall, and centreline following behaviours, it performed well in the indoor experiments, allowing the robot to rapidly explore many routes through the environment. Results such as those shown in Fig. 11.1 demonstrated that the robot does consistently pick novel movement behaviours at intersections in the environment. The exploration method is less appropriate for robots operating in environments where unconstrained movement is a possibility, such as submersible or aerial robots. In these environments it is difficult to create repeatable trajectories using only reactive movement behaviours.

In the indoor experiments described in this chapter, the goal recall process was able to calculate and follow routes between places that were either close to optimal or a reasonable solution in the majority of cases. However, the process often made incorrect decisions at both the planning stage and the execution stage. Suboptimal routes at the planning stage were usually the result of the robot not fully exploring all the possible routes through the environment. Incorrect decisions at the execution stage were usually caused by slight localisation errors. At many intersections the robot had a window of less than a second in which to
make crucial movement decisions. The route following process only compared the route to the goal from the currently active experience with the projected local movement paths, and hence was very sensitive to even small localisation errors. Given the wireless network lag and the relatively low RatSLAM system processing rate of 7 Hz, it was not surprising that the robot missed some critical turns while navigating to the goals. The route following process could be made more robust by also considering the routes to the goal from experiences that were recently active or soon to be active. The robot could then choose the movement behaviour that would produce the appropriate local movement path for the largest number of these experiences.

The problem of changing environments was only addressed near the end of the work described in this book. Consequently the development and testing of the map adaptation mechanism was of a preliminary nature. As discussed in Section 3.3.3, the problem of functioning in a fully dynamic environment is very challenging; it is the result of compounding several difficult problems together. This work focused on simple lasting changes but ignored the effects of transient change. By using a small plain obstacle the change was also mostly topological, although it did change the environment’s appearance enough to also be considered a minor perceptual change. Under these assumptions the system demonstrated its ability to adapt to both the addition and removal of an obstacle. When the shortest route to the goal was blocked, the robot was able to learn, after several attempts to get to the goal, that the route was blocked. The adaptation process is intentionally not instantaneous – the environment ambiguity, appearance-based vision system, and lack of a global Cartesian map mean the robot can only learn an obstacle gradually.
The strength of the adaptation mechanism is that it does not need to explicitly recognise or track an obstacle. There is no need for any context interpretation; the robot only needs to know that it has failed repeatedly to traverse a pathway it has previously used. When an obstacle was removed from a pathway, the robot relearned the pathway by moving along it during exploration. This process was much quicker: the pathway was relearned in one pass, since relearning only required the formation of new links between the previously disconnected experiences. This result demonstrated that the robot was able to relearn and use a shorter pathway when an obstacle was removed from the environment.

11.5.1 Conclusion

This chapter has shown how the experience maps can be used to equip a robot with exploration, goal recall, and simple adaptation capabilities. By using the behavioural movement information stored in the experience map links to search for novel routes through the environment, the robot can quickly explore all the paths through the environment that are possible using its set of movement behaviours. The temporal information stored in the experience map links can be used to create a temporal map, which in turn can be used to find the fastest route between two points. The robot can use the spatial and behavioural information stored in the experience map links to actually follow this route. Inter-experience link confidence values allow the robot to learn that a route has become blocked, and also that a previously blocked route has become passable.
12 Discussion
This book has described the implementation of a vision-based simultaneous localisation and mapping system using extended models of the rodent hippocampus, and an experience mapping algorithm that integrates with this system to provide exploration, goal recall and adaptation capabilities. The purpose of this chapter is to summarise the work presented, discuss the research contributions, and suggest potential future work in this field. The chapter starts by summarising the work presented in each chapter and highlighting the key findings that led to the next stage of the work. Section 12.2 discusses the major research contributions made by this work and their place within the mapping and navigation field. Section 12.3 outlines important issues that future work should address. The recent exciting discovery of grid cells in the rodent entorhinal cortex and its relevance for future biologically inspired robot navigation research is discussed in Section 12.4. The book ends in Section 12.5 with some final conclusions.
M.J. Milford: Robot Navigation from Nature, STAR 41, pp. 163–171, 2008. © Springer-Verlag Berlin Heidelberg 2008. springerlink.com

12.1 Book Summary

This section provides a summary of the work presented in each chapter and the key findings that led to the work in subsequent chapters.

12.1.1 Mapping and Navigation

The first task performed in Chapter 2 was to define the mapping and navigation problem. In order to perform more complex tasks than can be achieved using only reactive movement behaviours, a mobile robot must be able to create a map of its environment while using that map to locate where it is. This is known as the simultaneous localisation and mapping problem and is arguably the single most significant problem faced by mobile robots. However, it is not the only mapping and navigation problem. Section 2.1 also discussed the problems of exploration, navigating to goals, and adapting to environment change.

Chapters 3 to 5 covered three different approaches to solving the mapping and navigation problem: contemporary robotic solutions, biological solutions, and solutions using computational models of the biological systems. Contemporary solutions
involve probabilistic algorithms and metric or topological mapping methods. Biological solutions are those employed by animals in their everyday lives, such as rats, bees, and primates. Computational model solutions involve attempts to emulate the biological systems using theories of the neural processes animals use to perform mapping and navigation. Chapter 6 discussed the strengths and weaknesses of these approaches and proposed that computational models of the rodent hippocampus offered the potential for a robust, complete mapping and navigation system.

12.1.2 Pilot Study of a Hippocampal Model

Chapter 7 described the implementation of a head direction – place model of the rodent hippocampus. This study was performed to test the mapping abilities of such a system in an ambiguous real world environment. Although there were already existing artificial models of the hippocampus, they had only been tested in very small artificial environments, with a focus on reproducing results from rat experiments rather than on testing practical performance and scalability to larger environments. One-dimensional and two-dimensional competitive attractor networks were used to represent the head direction and place cells. A three-dimensional cell matrix was used to encode the colour, distance, and bearing of artificial landmarks detected using the robot’s camera. The model was tested on a Pioneer robot in a small environment scattered with artificial landmarks.

The key finding of the pilot study was that a separate head direction – place representation system could not (without a binding mechanism) maintain multiple hypotheses of the robot’s pose over time. This finding led to the development of the combined pose representation described in Chapter 8.
12.1.3 RatSLAM: An Extended Hippocampal Model

Chapter 8 described the merging of the head direction and place cell networks to form the pose cell network, a single three-dimensional competitive attractor network encoding the complete robot pose. This combination solved the multiple hypothesis problem and other related issues. Multiple hypotheses could now be appropriately maintained over time based on sensory input until they were resolved by the competitive attractor network dynamics. Path integration and visual association processes were integrated into the pose cell model to provide the robot with the ability to update its perceived state. A number of vision processing methods were introduced to provide the pose model with distinct local views in different parts of the environment.

Section 8.4 presented the RatSLAM system in action, showing it could form stable, consistent representations of indoor environments. To demonstrate the flexibility of the system it was also tested on a robotic tractor in a large and varied outdoor environment. Section 8.4.4 presented results showing the model could build a consistent representation during an experiment in which the robot travelled 2.8 km in a 400 × 200 metre outdoor environment. These SLAM results led to the development of an initial goal recall system, which was the subject of the next chapter.
12.1.4 Goal Memory: A Pilot Study

Chapter 9 presented an attempt to provide the robot with a goal navigation capability using an extension of the RatSLAM model known as the goal memory system. Goal memory involved a pragmatic approach to exploiting the existing RatSLAM model to perform goal directed navigation. The system learned temporal links between goal memory cells that were formed by a two-dimensional summation of activity in the pose cell matrix along the θ′ dimension. During goal recall these links were used to create a temporal map, from which the quickest route to a goal was extracted using a gradient climbing procedure. This system was found to function effectively in small environments with minimal odometric drift. However, as the environments became larger, the discontinuities, collisions, and multiple representations in the pose cell representation rendered the goal memory system ineffective. Although the core RatSLAM representations of the world were consistent and topologically correct, they were not suitable for direct use in goal navigation. This prompted further work to create an algorithm that could use the RatSLAM representations to create a map that was usable for goal navigation.

12.1.5 Extending RatSLAM: Experience Mapping

Chapter 10 presented the experience mapping algorithm, which creates experience maps from the world representations stored in the pose cell and local view networks. Experiences encode the robot’s memory of a certain place and its appearance, and are connected by links storing spatial, temporal, and behavioural information. The latter part of this chapter presented the results of applying the experience mapping algorithm in a complex indoor environment, using progressively smaller pose cell maps. The algorithm produced a coherent experience map of the environment with a high degree of correspondence to the layout of the environment. The algorithm was also successfully applied to data obtained from an outdoor experiment.
12.1.6 Exploring, Goal Recall, and Adapting to Change

Chapter 11 described how the RatSLAM model and experience mapping algorithm could combine to produce a functional robot mapping and navigation system. The first part of the chapter described the implementation of a simple greedy exploration algorithm that attempted to follow new routes through the environment using the behavioural information in the experience map. However, the major focus of the chapter was on how the temporal, spatial, and behavioural aspects of the experience maps could combine to produce efficient goal directed navigation.

Section 11.2 started with a description of how temporal link information is used to create a temporal map. A gradient climbing technique is used to find the quickest route to the goal by searching in this temporal map from the goal location to the robot’s current position. This route is followed using a combination of the spatial and behavioural information. The loss of a viable temporal map results in the system using and updating stored routes until a temporal map is regained or exploration is resumed.
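The temporal map search recapped here can be sketched as a greedy descent over link-connected experiences. The graph and temporal-distance structures below are assumptions for illustration, not the book’s data structures, and a well-formed temporal map (values decreasing toward the robot) is assumed:

```python
def quickest_route(neighbours, temporal, goal, start):
    """From the goal, repeatedly step to the linked experience with the
    smallest temporal distance from the robot until the robot's current
    experience is reached.  `neighbours[e]` lists experiences linked to e;
    `temporal[e]` is e's temporal distance from the robot in seconds."""
    route = [goal]
    current = goal
    while current != start:
        current = min(neighbours[current],
                      key=lambda e: temporal.get(e, float('inf')))
        route.append(current)
    route.reverse()  # order the route from the robot toward the goal
    return route
```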
Results were presented for a complex indoor environment showing the robot’s ability to rapidly explore, perform SLAM, and carry out efficient goal navigation. Goals were often achieved in a nearly optimal or reasonable manner, but the goal recall process did make several mistakes at both the planning and execution stages. Two main causes for the mistakes and suboptimal performance were highlighted: the failure of the movement scheme to generate appropriate pathways at critical junctions during exploration, and the sensitivity of the route following process to slight localisation errors.

The last section of Chapter 11 presented some preliminary results demonstrating the experience maps adapting to simple changes in the environment. The experience link confidence values were able to update based on the presence or absence of an obstacle in a critical pathway. When the obstacle was present, the confidence values lowered to the point of link removal, signifying the pathway was no longer traversable. When the obstacle was removed and the robot traversed the pathway, the links were relearned. Goal recall trials performed at all stages of the experiment showed the system could adapt to new environment changes encountered by the robot.
12.2 Contributions

Although much research has been carried out in the field of mapping and navigation, the work described in this book has addressed some aspects of the field which have received little attention. This section discusses these contributions and their place in the overall field.

12.2.1 Comparative Review of Robotic and Biological Systems

The first contribution made by this work to the field of mapping and navigation research was the comparative review of the robotic, biological, and biologically inspired mapping and navigation systems. There is a large amount of literature reviewing research into each of these system types in isolation (Trullier, Wiener et al. 1997; Franz and Mallot 2000; Thrun 2002; Durrant-Whyte and Bailey 2006), but relatively little research addressing all three at the same time and in a comparative manner. This is partly due to the difficulty of comparing systems that vary so much in their mechanisms and functionality. Limited knowledge or understanding of the systems also hinders comparison – biological systems are only partially understood, and attempts to model them have all involved some degree of speculation.

To overcome this problem, the different systems were discussed and compared using a number of performance-based concepts: map robustness and accuracy, map friendliness versus usability, sensory usage, and mapping and navigation capability in real world environments. This was a different approach to that taken by most biological research, which focuses on agreement between model behaviour and the behaviour of the actual biological agents (Franz and Mallot 2000). The focus on map usability also differed from the literature on robotic mapping methods, which generally focuses on the core mapping algorithms and performance (Thrun 2002).
By comparing biological and robotic approaches to the overall mapping and navigation problem using these concepts, this work established an overall view of the mapping and navigation field as it stands today.
12.2.2 Performance Evaluation of Hippocampal Models

Researchers have modelled many different biological mapping and navigation systems. The focus of these studies has been almost entirely on testing the validity of the biological theories rather than on producing the best practical mapping and navigation performance on a robot (Franz and Mallot 2000). Of the research that has targeted practical performance, the majority has concerned lower level biological navigation mechanisms, such as corridor following based on optical flow, path integration, and local guidance strategies (Chahl and Srinivasan 1996; Franz, Scholkopf et al. 1998; Low and Wyeth 2005). Research using models with higher level mapping and navigation capabilities has at best focused equally on their biological faithfulness and practical performance (Arleo 2000). In comparison, the pilot study implementation of a hippocampal model focused almost entirely on its mapping and navigation ability in an increasingly challenging sequence of real world experiments. The study was able to identify a key theoretical limitation in the standard head direction–place representation used by many hippocampal models, as well as other limitations. By focusing on practical performance and allowing deviations from the conventional biological theories, this study was able to propose one way these models might be extended to become usable robotic SLAM systems.

12.2.3 Implementation of an Extended Hippocampal Model

One of the major contributions of this work is the development of the RatSLAM system, an extended navigation model inspired by the mapping and navigation processes in the rodent hippocampus, and its evaluation in a mobile robot mapping context. By testing RatSLAM in a range of indoor and outdoor experiments, this research demonstrated that it is possible to use the theories of biological mapping and navigation as the basis for a practical robot SLAM system.
This was an important outcome that brought a biologically inspired system into the realm of state-of-the-art robot mapping and navigation systems.

12.2.4 An Experience Mapping Algorithm

One of the major concepts highlighted in the mapping and navigation systems review was that of map usability. Usability in this context referred to the ability of a robot, not a human, to use the map to perform important tasks, such as exploring new areas, navigating to goals, and adapting to changes in the environment. The experience mapping algorithm developed in this research produces maps that are highly usable by a behaviour-based robot. The combination of spatial, temporal, behavioural, and traversability information stored in these maps facilitates exploration, goal navigation, and adaptation processes. This work developed such methods and demonstrated their performance in a range of robot experiments. In isolation, the experience mapping algorithm is similar in some respects to other algorithms such as ELDEN (Yamauchi and Beer 1996). These two methods are most
alike in the way they deal with environment change, with both using the notion of link traversability between nodes in their topological maps. In this particular aspect ELDEN has a more thoroughly developed adaptation method that appears more robust than the one implemented in this work. However, other aspects of the maps differ: ELDEN maps are globally spatial and topological and contain traversability information, while experience maps are topological and contain behavioural, temporal, traversability, and relative spatial information. The methods also have different backgrounds: experience mapping was developed as an extension of a biologically inspired SLAM system with mechanisms for dealing with ambiguity and significant localisation error, while ELDEN is a stand-alone system with limited localisation and mapping performance. Consequently ELDEN is a specific method for adaptation in dynamic environments, whereas RatSLAM and the experience mapping algorithm provide a broader integrated solution to the mapping and navigation problem.

12.2.5 An Integrated Approach to Mapping and Navigation

There have been very few attempts to cohesively solve all the challenges that make up the robot mapping and navigation problem. This is no doubt partly due to the relative newness of SLAM systems with robust practical performance. It is only in the last few years that robotic mapping systems have been able to provide good SLAM performance in complex real world environments. Consequently, solutions to the higher level navigation problems have mostly been restricted to simulation (Niederberger and Gross 2004), or remained noticeably absent from systems that would be prime candidates for solving this problem (Montemerlo, Thrun et al. 2002; Kuipers, Modayil et al. 2004). Together, the RatSLAM model and experience mapping algorithm provide an integrated solution to the mapping and navigation problem.
The core RatSLAM model networks provide the robot with the ability to perform simultaneous localisation and mapping. The experience mapping algorithm uses the maps produced by RatSLAM to provide the robot with exploration, goal navigation, and adaptation abilities. The system functions autonomously, with processes running in real-time on modest computational resources. The system’s practical performance is sufficient to demonstrate the validity of taking a biologically inspired approach to producing a robotic mapping and navigation system.
12.3 Future Mapping and Navigation Research

This body of work has shown that models of biological mapping and navigation processes are a valid basis for the development of practical robot mapping and navigation systems. This has implications for future research in the robot mapping and navigation field, as well as for future biological research into the neuronal mechanisms by which animals map and navigate. Work in both these areas will need to address several major issues that will shape how these research fields evolve in the future. There are also several specific areas of potential future research pertaining to RatSLAM, including a more sophisticated vision system and its development into a more biologically plausible
system. However, those specific areas aside, issues of general significance that may be studied in future work include:

• Development of theoretical analysis techniques for biological models: The RatSLAM model and experience mapping algorithm have been evaluated through practical experimentation. Other computational models such as those of Arleo (2000) or Stringer, Rolls et al. (2002) were evaluated in terms of their practical performance and their replication of biological observations. A set of theoretical analysis tools is required to complement the practical analysis of these systems. Being able to theoretically prove (or disprove) properties of these biologically inspired models such as convergence and long-term stability, as in the work of Cheung, Zhang et al. (2007), would facilitate their further development as practical robot mapping and navigation systems.

• Formal analysis of the interaction between mapping and map exploitation: This book has touched on the issue of map usability and has argued that representations such as occupancy grid maps are not optimal for integrated mapping and navigation systems. Animals navigate quite well without apparently forming any high resolution occupancy grid map. Future work on both probabilistic and biologically inspired mapping and navigation systems will benefit from a detailed study of the properties of a map that make it usable.

• Long term or perpetual real robot experiments in changing environments: Very few mapping and navigation systems have been tested in continuous experiments lasting more than a few hours, with a few exceptions such as Minerva (Thrun 2000). If these systems are ever to be applied on a new generation of intelligent mobile robots, they must be capable of perpetual robust performance. Apart from testing the stability of the representations that a system builds over time, such experiments will reveal more about the nature of the dynamic environment problem.

• Formalisation of the relationship between sensor characteristics and mapping methods: For instance, why is it that a large number of probabilistic methods rely on accurate laser or sonar range sensors, whereas many animals appear to rely primarily on other sensor types such as eyes, in some cases extracting approximate range information from what they see? There may be fundamental reasons why animals have not developed high precision range sensing abilities. Conversely, perhaps certain aspects of the animal solution to the mapping and navigation problem are lacking. Future work should address the implications of different sensor characteristics within the entire mapping and navigation problem context.

• Formalisation of the dynamic environment problem: Researchers have heuristically classified types of change based on timescale and impact on the robot’s perception and movement. These concepts have been sufficient for the relatively simple approaches taken so far to solving aspects of the overall problem. However, if the problem is to be comprehensively solved in the future, it is likely that a more thorough analysis will be required. This analysis should address major issues such as whether change should be examined on a continuous timescale or separated into temporal types.
12.4 Grid Cells

During the course of this work, a new type of spatially responsive cell, known as a grid cell, was discovered in the rodent entorhinal cortex (Fyhn, Molden et al. 2004; Hafting, Fyhn et al. 2005). This discovery has radically altered the view of spatial representation in the rodent brain, previously believed to reside primarily in the head direction and place cells. Unlike place cells, which fire only at one location within an arena, grid cells fire at regular locations which, when plotted over the environment, form the vertices of a tessellating hexagonal grid. These grids have three characteristics: the spacing between firing fields, the orientation of the grid relative to a reference axis, and the spatial phase of the grid. Neighbouring cells have similar spacing and orientation, but different spatial phase. Consequently, the firing of only a few co-localized cells is sufficient to cover the entire environment with firing fields. Furthermore, the spacing of grids increases monotonically as the cell recording location moves from dorsal to lateral locations in the medial entorhinal cortex (MEC). Grid cells also vary in characteristics between different layers of MEC. Cells in layer II respond only at regular locations in the environment, whereas a significant proportion of cells in layers III, V, and VI (Sargolini, Fyhn et al. 2006), known as conjunctive grid cells, respond to both location and orientation. These cells are of particular relevance to the RatSLAM navigation model due to their conjunctive properties. When the original pilot study found that the separation of state into location and orientation in place and head direction cells could cause navigation problems, conjunctive cells were devised as an engineering solution to the problem. While place cells were known to become directional in narrow environments (McNaughton, Barnes et al. 1983), there were no apparent cells with inherent conjunctive properties.
Furthermore, the reuse of cells through the wrapping connectivity of the pose cell matrix was not supported by observations of single firing fields from place cells. However, the observation that grid cells fire at multiple locations in the environment suggests that reusing cells to encode multiple environment locations is a reality, although there are multiple theories on the neuronal mechanisms required to create their grid-like firing fields (Guanella and Verschure 2006; Rolls, Stringer et al. 2006; Burgess, Barry et al. 2007). Grid cells offer an exciting new source of biological inspiration for creating robot navigation systems. At the time of publication, despite a large number of computational modelling approaches (Guanella and Verschure 2006; Rolls, Stringer et al. 2006; Solstad, Moser et al. 2006; Burgess, Barry et al. 2007), there had been no comprehensive investigation into their functional usefulness as part of a robot navigation system. Pursuing a pragmatic, robotics-based investigation into grid cells and their utility in navigation may lead to improved biologically inspired robot navigation performance, and to a further understanding of the role they play in rodent navigation.
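As an illustration of the grid representation described above, one common idealisation (in the spirit of the computational models cited, though not any specific one) expresses a grid cell’s firing field as the sum of three plane waves oriented 60° apart; spacing, orientation, and spatial phase then appear directly as parameters. The function name and default values below are illustrative assumptions.

```python
import math

# Sketch of an idealised grid cell firing field as a sum of three plane waves
# 60 degrees apart. Parameter names and defaults are illustrative assumptions.

def grid_firing(x, y, spacing=0.5, orientation=0.0, phase=(0.0, 0.0)):
    """Firing rate at (x, y) for a grid with given spacing, orientation, and spatial phase."""
    k = 4 * math.pi / (math.sqrt(3) * spacing)       # wave number for the chosen field spacing
    rate = 0.0
    for j in range(3):                               # three wave vectors 60 degrees apart
        theta = orientation + j * math.pi / 3
        kx, ky = k * math.cos(theta), k * math.sin(theta)
        rate += math.cos(kx * (x - phase[0]) + ky * (y - phase[1]))
    return (rate + 1.5) / 4.5                        # normalise the sum of cosines to [0, 1]
```

The firing rate peaks at the vertices of a hexagonal lattice; shifting `phase` slides the whole lattice, which is how a few co-localised cells with different phases can tile an environment.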
12.5 Conclusion

This book has presented a number of navigation models and algorithms for use by a mobile robot in real world environments. The work has followed a theme of practical adaptation and extension of current models of hippocampal mapping and navigation, with the aim of producing the best possible robot navigation performance, rather
than maintaining biological plausibility. The research has focused on producing an integrated online solution to the entire robot mapping and navigation problem, by providing a robot with the capability for exploration, SLAM, navigation to goals, and adaptation to simple environment changes.

The pilot study implemented an attractor network hippocampal model on a mobile robot for vision-based SLAM. The model was able to perform SLAM in one dimension but failed in two dimensions. The subsequent analysis concluded that conventional place cell–head direction computational models of the rodent hippocampus are, without some form of memory binding, fundamentally unable to maintain and propagate multiple hypotheses about the robot’s pose. An extended model of the rodent hippocampus was developed that combined the place cell and head direction representations into a single model of pose to overcome this limitation. Known as the RatSLAM system, this model enables a robot to produce consistent and stable representations of complex environments. RatSLAM performed SLAM in a range of indoor and outdoor environments on two different robot platforms. These experiments demonstrated that a navigation model inspired by the rodent hippocampus is capable of performing SLAM in real-time on robots in large complex environments.

A goal memory model was developed to provide the robot with the ability to navigate to goals. The model was found to fail in large complex environments because of a number of non-spatial phenomena in the pose cell network. Through an analysis of these phenomena it was concluded that the RatSLAM maps were not directly usable for higher level tasks such as goal navigation due to a lack of spatial information. The second major part of this work involved the development and implementation of an experience mapping algorithm, which uses the core RatSLAM representations to produce maps containing visual, spatial, temporal, behavioural, and traversability information.
An exploration technique was developed that uses the behavioural information stored in the experience maps to explore environments. A goal recall method was also developed, which uses the spatial and temporal information stored in the experience map to plan and execute paths to goals. These methods enabled a robot to explore and navigate to goals in a large indoor environment, which demonstrated the usability of the experience maps. An adaptation mechanism was added to the experience mapping algorithm, which uses experience map link confidence values to modify the map when the environment changes. The final experiments tested the complete range of capabilities of the integrated system, comprising the RatSLAM model and experience mapping algorithm. A robot used the combined system to autonomously:

• explore an indoor environment,
• perform vision-based SLAM,
• navigate to goals, and
• learn and adapt to simple environment changes.
In conclusion, the models and algorithms that were developed and the results they have produced have demonstrated that it is possible to create a biologically inspired system that can solve many of the key mapping and navigation problems faced by a mobile robot.
Appendix A – The Movement Behaviours
The robot uses a set of three reactive movement behaviours to move autonomously in the indoor environments. The behaviours generate consistent robot trajectories over multiple passes through the environment, which facilitates re-localisation by the vision-based RatSLAM system. The exploration algorithm attempts to use novel behaviours whenever possible in order to increase the robot’s knowledge of the environment. During goal recall the route-following process uses the movement behaviours to follow the planned route to the goal. The local movement behaviours are generated through a three-step process, which is described in the following sections.

Search of Local Space

A tree search algorithm is used to identify valid movement trajectories through the local environment in front of the robot. The search is performed within a Cartesian obstacle map created using the laser range finder mounted on the Pioneer. Range-to-obstacle readings are measured every degree over a 180 degree arc in front of the robot, and converted into points in the obstacle map. Searches are performed to a depth of N_l, with each fork containing n_b branches of length l_seg varying in orientation by θ_sep. Search branches are only propagated when the minimum clearance at the end of the branch exceeds r_min (see Fig. A.1). For the indoor environments described in this book, the parameters displayed in Table A.1 were found to provide good coverage of the local space. Fig. A.2 shows an obstacle map and the branches resulting from a tree search through the local space. Some search paths have been cut short by the minimum clearance constraint before reaching the full search depth. In the second stage of behaviour generation, only search paths that have reached the maximum depth are considered.

Pathway Identification

The second stage of the behaviour generation process involves grouping the paths found by the tree search. The search paths are first ordered by the relative bearing of their endpoints from the robot. The ordered list is then classified into groups containing endpoint

M.J. Milford: Robot Navigation from Nature, STAR 41, pp. 173–177, 2008. © Springer-Verlag Berlin Heidelberg 2008. springerlink.com
Fig. A.1. Search tree node. Each node in the search tree is used to create further search branches. Segments are l_seg long and vary in orientation by θ_sep. The ends of each branch must be more than r_min from the nearest obstacle.

Table A.1. Space search parameters

Parameter | Description | Value
N_l | Depth of search in terms of number of branch levels | 4
n_b | Maximum number of branches per node in the tree search | 5
l_seg | Length of each branch segment | 0.3 m
θ_sep | Spread in orientation between adjacent branches from a search node | 17.2°
r_min | Minimum clearance of branch ends from obstacles | 0.35 m
locations that are within a certain angular distance (relative to the robot) of at least one other member of the group. Grouping is achieved by searching through the ordered endpoint locations. When the gap between two adjacent endpoint locations exceeds an angle β_max, the existing group is deemed to have ended. Fig. A.3 shows the grouping process for two different search trees.
Fig. A.2. Search tree structure. The robot’s laser range finder is used to create a local obstacle map, with obstacles shown by small dots. A tree search algorithm searches for possible local paths through the obstacle map with a minimum clearance from obstacles. n_b = 3 in this example.
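The tree search of the local space can be sketched as follows, using the Table A.1 parameter values. The geometry helpers and the simplified point-based clearance test are assumptions; the book does not specify the implementation beyond the parameters.

```python
import math

# Sketch of the local-space tree search using the Table A.1 parameters.
# The recursive structure and the point-based clearance test are assumptions.

N_L, N_B = 4, 5                  # search depth and branches per node (Table A.1)
L_SEG = 0.3                      # branch segment length [m]
THETA_SEP = math.radians(17.2)   # orientation spread between adjacent branches
R_MIN = 0.35                     # minimum clearance from obstacles [m]

def clearance(point, obstacles):
    """Distance from a branch endpoint to the nearest obstacle point."""
    return min(math.dist(point, o) for o in obstacles) if obstacles else float("inf")

def search(x, y, heading, obstacles, depth=0):
    """Return the endpoints of all branches that reach the full search depth."""
    if depth == N_L:
        return [(x, y)]
    endpoints = []
    for i in range(N_B):
        theta = heading + (i - (N_B - 1) / 2) * THETA_SEP
        nx, ny = x + L_SEG * math.cos(theta), y + L_SEG * math.sin(theta)
        if clearance((nx, ny), obstacles) > R_MIN:   # prune branches too close to obstacles
            endpoints += search(nx, ny, theta, obstacles, depth + 1)
    return endpoints
```

With no obstacles the search expands the full n_b^N_l = 5^4 branches; in practice the clearance constraint prunes most of them, which is why only full-depth paths are carried into the second stage.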
For groups with a small range in bearing relative to the robot, the midpoint of the group becomes a candidate path point. Groups that exceed an angular width of α_max produce two candidate path points a distance of α_max/2 in from each edge of the group. The candidate path points are shown in Fig. A.3 by short thick arrows. The number of candidate path points determines what movement behaviours are available to the robot. Having only one candidate path point means the robot can only use a centreline following movement behaviour. With two or more candidate path points, the leftmost and rightmost candidate path points can be used for left wall following and right wall following. For example, in Fig. A.3 (a) and Fig. A.3 (b), the robot has access to both left wall following and right wall following movement behaviours, but not centreline following. During exploration and goal navigation, the robot performs movement behaviours by picking the appropriate candidate path points. For example, during early exploration in novel areas of the environment, the robot can perform continual left wall following by always choosing the leftmost candidate path point. When there is only one candidate path point, the robot switches to centreline following behaviour.
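The grouping and candidate-point extraction steps can be sketched as follows. The values of β_max and α_max are not listed in Table A.1, so the ones used here are illustrative assumptions.

```python
import math

# Sketch of endpoint grouping and candidate path point extraction.
# BETA_MAX and ALPHA_MAX values are illustrative assumptions.

BETA_MAX = math.radians(10)    # assumed gap in bearing that splits two groups
ALPHA_MAX = math.radians(30)   # assumed width beyond which a group yields two points

def group_endpoints(bearings):
    """Split sorted endpoint bearings into groups wherever the gap exceeds BETA_MAX."""
    bearings = sorted(bearings)
    groups, current = [], [bearings[0]]
    for b in bearings[1:]:
        if b - current[-1] > BETA_MAX:     # gap too large: the existing group has ended
            groups.append(current)
            current = []
        current.append(b)
    groups.append(current)
    return groups

def candidate_points(groups):
    """Wide groups give two candidates ALPHA_MAX/2 in from each edge; narrow ones give their midpoint."""
    points = []
    for g in groups:
        lo, hi = g[0], g[-1]
        if hi - lo > ALPHA_MAX:
            points += [lo + ALPHA_MAX / 2, hi - ALPHA_MAX / 2]
        else:
            points.append((lo + hi) / 2)
    return points
```

Behaviour selection then reduces to picking from this list: the leftmost candidate for left wall following, the rightmost for right wall following, and the sole candidate for centreline following.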
Fig. A.3. Path extraction process. The grouping process gathers the endpoints of completed search branches into groups based on their proximity in bearing from the robot. (a) Large groups produce pairs of candidate path points α_max/2 in from their ends. (b) Groups that are less than α_max wide produce single candidate path points at their centre (shown by short thick arrows).
Fig. A.4. Velocity dependence on angle. Command velocities depend on φ, the difference between the robot’s orientation and the relative bearing of the candidate path point from the robot.

Velocity Commands

The translational and rotational velocities required to reach a candidate path point are calculated based on the difference between the robot’s orientation and the relative
bearing of the candidate path point from the robot. The desired rotational velocity, ω, is given by:

ω = (φ / |φ|) min(|φ|, ω_max)    (A.1)

where ω_max is the maximum allowable rotational velocity, and φ is the angular difference between the robot’s orientation and the relative bearing of the candidate path point from the robot, shown in Fig. A.4. The desired translational velocity, v, is calculated by:

v = max[0, v_max(1 − |φ| / ω_max)]    (A.2)

where v_max is the maximum translational velocity. After extensive experimentation in the indoor environments, parameter values of ω_max = 40 °/s and v_max = 0.3 m/s were found to produce smooth robot movement.
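Equations (A.1) and (A.2) with the tuned parameter values translate directly into code; this sketch assumes φ is supplied in radians.

```python
import math

# Direct transcription of Eqs. (A.1) and (A.2) with the tuned parameter values.

OMEGA_MAX = math.radians(40)   # maximum rotational velocity [rad/s]
V_MAX = 0.3                    # maximum translational velocity [m/s]

def velocity_commands(phi):
    """phi: angular difference between robot heading and candidate path point bearing [rad]."""
    omega = math.copysign(min(abs(phi), OMEGA_MAX), phi)   # Eq. (A.1): turn toward the point
    v = max(0.0, V_MAX * (1 - abs(phi) / OMEGA_MAX))       # Eq. (A.2): slow down for large turns
    return v, omega
```

The coupling means the robot drives at full speed only when already aligned with the candidate path point, and turns on the spot when the angular error reaches ω_max.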
References
Abbott, L.F.: Decoding Neuronal Firing and Modeling Neural Networks. Quarterly Review Biophysics 27, 291–331 (1994)
Amaral, D., Witter, M.: Hippocampal Formation. In: Paxinos, G. (ed.) The Rat Nervous System, pp. 443–493. Academic Press, London (1995)
Arleo, A.: Spatial Learning and Navigation in Neuro-mimetic Systems: Modeling the Rat Hippocampus. Verlag-dissertation, Germany (2000)
Arleo, A., Smeraldi, F., et al.: Place Cells and Spatial Navigation based on Vision, Path Integration, and Reinforcement Learning. Advances in Neural Information Processing Systems 13 (2001)
Balakrishnan, K., Bhatt, R., et al.: A Computational Model of Rodent Spatial Learning and Some Behavioral Experiments. Cognitive Science Society, Mahwah, New Jersey (1998)
Balch, T.R., Arkin, R.C.: Avoiding the Past: A Simple but Effective Strategy for Reactive Navigation. In: International Conference on Robotics and Automation, Atlanta, USA (1993)
Biswas, R., Limketkai, B., et al.: Towards Object Mapping in Dynamic Environments With Mobile Robots. In: International Conference on Intelligent Robots and Systems, Lausanne, Switzerland (2002)
Blair, H.T., Lipscomb, B.W., et al.: Anticipatory Time Intervals of Head-Direction Cells in the Anterior Thalamus of the Rat: Implications for Path Integration in the Head-Direction Circuit. The Journal of Neurophysiology 78(1), 145–159 (1997)
Blair, H.T., Sharp, P.E.: Anticipatory head direction signals in anterior thalamus: evidence for a thalamocortical circuit that integrates angular head motion to compute head direction. The Journal of Neuroscience 15(9), 6260–6270 (1995)
Blum, K.I., Abbott, L.F.: A Model of Spatial Map Formation in the Hippocampus of the Rat. Neural Computation 8, 85–93 (1996)
Borenstein, J.: Internal Correction of Dead-reckoning Errors With the Smart Encoder Trailer. In: International Conference on Intelligent Robots and Systems, Munich, Germany (1994)
Bostock, E., Muller, R., et al.: Experience-dependent modifications of hippocampal place cell firing. Hippocampus 1(2), 193–205 (1991)
Browning, B.: Biologically Plausible Spatial Navigation for a Mobile Robot. Ph.D thesis, University of Queensland (2000)
Buluswar, S.D., Draper, B.A.: Color models for outdoor machine vision. Computer Vision and Image Understanding 85(2), 71–99 (2002)
Burgess, N., Barry, C., et al.: An oscillatory interference model of grid cell firing. Hippocampus (in press, 2007)
Burgess, N., Recce, M., et al.: A Model of Hippocampal Function. Neural Networks 7, 1065–1081 (1994)
Carlson, N.: Foundations of physiological psychology. Allyn and Bacon, Boston (1999)
Chahl, J.S., Srinivasan, M.V.: Visual computation of egomotion using an image interpolation technique. Biological Cybernetics 74(5), 405–411 (1996)
Chavarriaga, R.A., Sauser, E., et al.: Modelling directional firing properties of place cells. Computational Neuroscience Meeting, Alicante, Spain (2003)
Cheung, A., Zhang, S., et al.: Animal navigation: the difficulty of moving in a straight line. Biological Cybernetics 97(1), 47–61 (2007)
Cheung, T.H., Cardinal, R.N.: Hippocampal lesions facilitate instrumental learning with delayed reinforcement but induce impulsive choice in rats. BMC Neuroscience 6(36) (2005)
Collett, T.S., Collett, M., et al.: The Guidance of Desert Ants by Extended Landmarks. The Journal of Experimental Biology 204(9), 1635–1639 (2001)
Crane III, C., Armstrong Jr., D., et al.: Autonomous Ground Vehicle Technologies Applied to the DARPA Grand Challenge. In: International Conference on Control, Automation, and Systems, Bangkok, Thailand (2004)
Cuperlier, N., Quoy, M., et al.: Navigation and planning in an unknown environment using vision and a cognitive map. In: International Joint Conference on Artificial Intelligence, Edinburgh, Scotland (2005)
Davison, A.J., Reid, I.D., et al.: MonoSLAM: Real-Time Single Camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(6), 1052–1067 (2007)
Dempster, A.P., Laird, A.N., et al.: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society 39(1), 1–38 (1977)
Dissanayake, G., Durrant-Whyte, H., et al.: A computationally efficient solution to the simultaneous localisation and map building (SLAM) problem. In: International Conference on Robotics and Automation, San Francisco, USA (2000)
Dissanayake, G., Newman, P.M., et al.: A solution to the simultaneous localisation and map building (SLAM) problem. IEEE Transactions on Robotics and Automation 17(3), 229–241 (2001)
Duckett, T., Nehmzow, U.: Self-Localisation and Autonomous Navigation by a Mobile Robot. In: Towards Intelligent Mobile Robots Conference, Bristol, UK (1999)
Durrant-Whyte, H., Bailey, T.: Simultaneous localization and mapping: part I. Robotics & Automation Magazine 13(2), 99–108 (2006)
Edlinger, T., von Puttkamer, E.: Exploration of an indoor-environment by an autonomous mobile robot. In: International Conference on Intelligent Robots and Systems, Munich, Germany (1994)
Eichenbaum, H., Dudchenko, P., et al.: The hippocampus, memory, and place cells: is it spatial memory or a memory space? Neuron 23, 209–226 (1999)
Endo, Y., Arkin, R.C.: Anticipatory Robot Navigation by Simultaneously Localizing and Building a Cognitive Map. In: International Conference on Intelligent Robots and Systems, Las Vegas (2003)
Esch, H.E., Burns, J.E.: Distance estimation by foraging honeybees. The Journal of Experimental Biology 199, 155–162 (1996)
Etienne, A.S., Maurer, R., et al.: Path integration in mammals and its interaction with visual landmarks. Journal of Experimental Biology 199(1), 201–209 (1996)
Eustice, R., Walter, M., et al.: Sparse Extended Information Filters: Insights into Sparsification. In: International Conference on Intelligent Robots and Systems, Edmonton, Canada (2005)
Fabrizi, E., Oriolo, G., et al.: A KF-based localization algorithm for nonholonomic mobile robots. In: Mediterranean Conference on Control and Automation, Alghero, Italy (1998)
Folkesson, J., Christensen, H.: Graphical SLAM – a self-correcting map. In: International Conference on Robotics and Automation, New Orleans, United States. IEEE, Los Alamitos (2004)
Fox, D., Burgard, W., et al.: Markov Localization for Mobile Robots in Dynamic Environments. Journal of Artificial Intelligence Research 11, 391–427 (1999)
Franz, M.O., Mallot, H.A.: Biomimetic robot navigation. Robotics and Autonomous Systems 30, 133–153 (2000)
Franz, M.O., Scholkopf, B., et al.: Where did I take that snapshot? Scene-based homing by image matching. Biological Cybernetics 79(3), 191–202 (1998)
Franz, M.O., Scholkopf, P.G., et al.: Learning View Graphs for Robot Navigation. Autonomous Robots 5, 111–125 (1998)
Frisch, K.v.: The Dance Language and Orientation of Bees. Harvard University Press, Cambridge (1967)
Fyhn, M., Molden, S., et al.: Spatial Representation in the Entorhinal Cortex. Science 305, 1258–1264 (2004)
Gaussier, P., Revel, A., et al.: From view cells and place cells to cognitive map learning: processing stages of the hippocampal system. Biological Cybernetics 86(1), 15–28 (2002)
Gaussier, P., Zrehen, S.: Navigating With an Animal Brain: A Neural Network for Landmark Identification and Navigation. In: Intelligent Vehicles, pp. 399–404 (1994)
Gerstner, W., Abbott, L.F.: Learning Navigational Maps Through Potentiation And Modulation Of Hippocampal Place Cells. Journal of Computational Neuroscience 4, 79–94 (1997)
Gonzalez-Banos, H.H., Latombe, J.C.: Navigation strategies for exploring indoor environments. International Journal of Robotics Research 21(10-11), 829–848 (2002)
Goodridge, J., Touretzky, D.: Modeling Attractor Deformation in the Rodent Head-Direction System. Journal of Neurophysiology 83(6), 3402–3410 (2000)
Gorchetchnikov, A., Hasselmo, M.E.: A model of hippocampal circuitry mediating goal-driven navigation in a familiar environment. Neurocomputing 44–46, 423–427 (2002)
Gothard, K.M., Skaggs, W., et al.: Dynamics of Mismatch Correction in the Hippocampal Ensemble Code for Space: Interaction between Path Integration and Environmental Cues. Journal of Neuroscience 16(24), 8027–8040 (1996)
Gould, J.L.: The locale map of honeybees: Do insects have cognitive maps? Science 232(4752), 861–863 (1986)
Gould, J.L.: Timing of Landmark Learning by Honey Bees. Journal of Insect Behavior 1(4) (1988)
Gray, C.M., Konig, P., et al.: Oscillatory responses in cat visual cortex exhibit inter-columnar synchronization which reflects global stimulus properties. Nature 338, 334–337 (1989)
Grisetti, G., Stachniss, C., et al.: Improving Grid Based SLAM with Rao-Blackwellized Particle Filters by Adaptive Proposals and Selective Resampling. In: International Conference on Robotics and Automation, Barcelona, Spain (2005)
Guanella, A., Verschure, P.F.M.J.: A Model of Grid Cells Based on a Path Integration Mechanism. In: Kollias, S., Stafylopatis, A., Duch, W., Oja, E. (eds.) ICANN 2006. LNCS, vol. 4131, pp. 740–749. Springer, Heidelberg (2006)
Guo, Y., Parker, L.E., et al.: Performance-Based Rough Terrain Navigation for Nonholonomic Mobile Robots. In: Industrial Electronics Conference, Roanoke, USA (2003)
Hafner, V.V.: Cognitive Maps in Rats and Robots. Journal of Adaptive Behavior 13(2), 87–96 (2005)
Hafting, T., Fyhn, M., et al.: Microstructure of a spatial map in the entorhinal cortex. Nature 436, 801–806 (2005)
182
References
Hahnloser, R.H.R., Xie, X., et al.: A Theory of Integration in the Head-Direction System. In: Neural Information Processing Systems Conference, Vancouver, Canada (2001)
Hampson, R.E., Simeral, J.D., et al.: Distribution of spatial and nonspatial information in dorsal hippocampus. Nature 402, 610–614 (1999)
Heran, H.: Ein Beitrag zur Frage nach der Wahrnehmungsgrundlage der Entfernungsweisung der Bienen. Zeitschrift für vergleichende Physiologie 38, 168–218 (1956)
Future Horizons: Key Market Drivers Report. Future Horizons, Kent (2004)
Hu, T.C., Kahng, A.B., et al.: Optimal Robust Path Planning in General Environments. Transactions on Robotics and Automation 9(6), 775–784 (1993)
Jefferies, M.E., Yeap, W.K.: The Utility of Global Representations in a Cognitive Map. In: Conference on Spatial Information Theory, Morro Bay, USA (2000)
Kanellos, M.: From medicine to military, machines finally arrive. CNET News.com (2004)
Knierim, J.: Dynamic Interactions between Local Surface Cues, Distal Landmarks, and Intrinsic Circuitry in Hippocampal Place Cells. Journal of Neuroscience 22(14), 6254–6264 (2002)
Knierim, J., Kudrimoti, H., et al.: Place cells, head direction cells, and the learning of landmark stability. Journal of Neuroscience 15(3), 1648–1659 (1995)
Kobayashi, T., Tran, A.H., et al.: Contribution of hippocampal place cell activity to learning and formation of goal-directed navigation in rats. Neuroscience 117(4), 1025–1035 (2003)
Koene, R.A., Gorchetchnikov, A., et al.: Modeling goal-directed spatial navigation in the rat based on physiological data from the hippocampal formation. Neural Networks 16, 577–584 (2003)
Koenig, S., Tovey, C., et al.: Greedy mapping of terrain. In: International Conference on Robotics and Automation, Seoul, Korea (2001)
Krichmar, J.L., Nitz, D.A., et al.: Characterizing functional hippocampal pathways in a brain-based device as it solves a spatial memory task. Proceedings of the National Academy of Sciences USA 102, 2111–2116 (2005)
Kuipers, B., Byun, Y.T.: A Robot Exploration and Mapping Strategy Based on a Semantic Hierarchy of Spatial Representations. Robotics and Autonomous Systems 8(1), 47–63 (1991)
Kuipers, B., Modayil, J., et al.: Local Metrical and Global Topological Maps in the Hybrid Spatial Semantic Hierarchy. In: International Conference on Robotics and Automation, New Orleans, United States of America (2004)
Latombe, J.-C.: Robot Motion Planning. Kluwer Academic Publishers, Boston (1991)
Leonard, J.J., Newman, P.M., et al.: Towards Robust Data Association and Feature Modeling for Concurrent Mapping and Localization. In: International Symposium on Robotics Research, Lorne, Australia (2001)
Lepetic, M., Klancar, G., et al.: Time optimal path planning considering acceleration limits. Robotics and Autonomous Systems 45, 199–210 (2003)
Lever, C., Wills, T.J., et al.: Long-term plasticity in hippocampal place-cell representation of environmental geometry. Nature 416, 90–94 (2002)
Low, T., Wyeth, G.: Obstacle Detection using Optical Flow. In: Australasian Conference on Robotics and Automation, Sydney, Australia (2005)
Lu, F., Milios, E.: Globally consistent range scan alignment for environment mapping. Autonomous Robots 4, 333–349 (1997)
Maguire, E.A., Burgess, N., et al.: Knowing Where and Getting There: A Human Navigation Network. Science 280(5365), 921–924 (1998)
Maguire, E.A., Frackowiak, R.S.J., et al.: Recalling Routes around London: Activation of the Right Hippocampus in Taxi Drivers. The Journal of Neuroscience 17(18), 7103–7110 (1997)
Markus, E.J., Qin, Y.-L., et al.: Interactions between Location and Task Affect the Spatial and Directional Firing of Hippocampal Neurons. The Journal of Neuroscience 15(11), 7079–7094 (1995)
Marzouqi, M., Jarvis, R.A.: Covert Robotics: Covert Path Planning in Unknown Environments. In: Australasian Conference on Robotics and Automation, Canberra, Australia (2004)
Matarić, M.J.: A Distributed Model for Mobile Robot Environment-Learning and Navigation. Ph.D thesis, Massachusetts Institute of Technology (1990)
McNaughton, B., Barnes, C.A., et al.: The contributions of position, direction, and velocity to single unit activity in the hippocampus of freely-moving rats. Experimental Brain Research 52, 41–49 (1983)
Michelsen, A.: The transfer of information in the dance language of honeybees: progress and problems. Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology 173(2), 135–141 (1993)
Mizumori, S.J., Williams, J.D.: Directionally selective mnemonic properties of neurons in the lateral dorsal nucleus of the thalamus of rats. Journal of Neuroscience 13, 4015–4028 (1993)
Montemerlo, M., Thrun, S., et al.: FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem. In: AAAI National Conference on Artificial Intelligence, Edmonton, Canada (2002)
Montemerlo, M., Thrun, S., et al.: FastSLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In: International Joint Conference on Artificial Intelligence, Acapulco, Mexico (2003)
Montemerlo, M., Thrun, S., et al.: Conditional Particle Filters for Simultaneous Mobile Robot Localization and People-Tracking. In: International Conference on Robotics and Automation, Washington DC, USA (2002)
Moorehead, S., Simmons, R., et al.: Autonomous Exploration Using Multiple Sources of Information. In: International Conference on Robotics and Automation, Seoul, Korea (2001)
Moravec, H.P., Elfes, A.E.: High Resolution Maps from Wide Angle Sonar. In: International Conference on Robotics and Automation, St Louis, USA (1985)
Muller, M., Wehner, R.: Path integration in desert ants, Cataglyphis fortis. Proceedings of the National Academy of Sciences 85, 5287–5290 (1988)
Muller, R.U., Bostock, E., et al.: On the directional firing properties of hippocampal place cells. The Journal of Neuroscience 14(12), 7235–7251 (1994)
Muller, R.U., Kubie, J.L.: The effects of changes in the environment on the spatial firing of hippocampal complex-spike cells. The Journal of Neuroscience 7(7), 1951–1968 (1987)
Muller, R.U., Kubie, J.L., et al.: Spatial firing patterns of hippocampal complex-spike cells in a fixed environment. The Journal of Neuroscience 7(7), 1935–1950 (1987)
Murphy, K., Russell, S.: Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks. In: Sequential Monte Carlo Methods in Practice, pp. 176–183 (2001)
Newman, P.M., Leonard, J.J.: Consistent, Convergent, and Constant-time SLAM. In: International Joint Conference on Artificial Intelligence, Acapulco, Mexico (2003)
Niederberger, C., Gross, M.: Generic Path Planning for Real-Time Applications. In: Computer Graphics International Conference, Heraklion, Greece (2004)
Nieto, J., Guivant, J., et al.: Real Time Data Association for FastSLAM. In: International Conference on Robotics and Automation, Taipei, Taiwan (2003)
O’Keefe, J., Conway, D.H.: Hippocampal place units in the freely moving rat: Why they fire where they fire. Experimental Brain Research 31(4), 573–590 (1978)
O’Keefe, J., Dostrovsky, J.: The hippocampus as a spatial map: preliminary evidence from unit activity in the freely moving rat. Brain Research 34, 171–175 (1971)
O’Keefe, J., Recce, M.L.: Phase relationship between hippocampal place units and the EEG theta rhythm. Hippocampus 3(3), 317–330 (1993)
O’Mara, S.M., Rolls, E.T., et al.: Neurons responding to whole-body motion in the primate hippocampus. The Journal of Neuroscience 14(11), 6511–6523 (1994)
Olson, E., Leonard, J., et al.: Fast iterative alignment of pose graphs with poor initial estimates. In: International Conference on Robotics and Automation, Orlando, United States, IEEE, Los Alamitos (2006)
Oriolo, G., Vendittelli, M., et al.: The SRT Method: Randomized Strategies for Exploration. In: International Conference on Robotics and Automation, New Orleans, Louisiana (2004)
Ozguner, U., Redmill, K.A., et al.: Team TerraMax and the DARPA grand challenge: a general overview. In: Intelligent Vehicles Symposium, Parma, Italy (2004)
Prasser, D., Milford, M., et al.: Outdoor Simultaneous Localisation and Mapping using RatSLAM. In: International Conference on Field and Service Robotics, Port Douglas, Australia, Springer, Heidelberg (2005)
Prasser, D., Wyeth, G.: Probabilistic Visual Recognition of Artificial Landmarks for Simultaneous Localization and Mapping. In: International Conference on Robotics and Automation, Taipei (2003)
Prasser, D., Wyeth, G., et al.: Biologically Inspired Visual Landmark Processing for Simultaneous Localization and Mapping. In: International Conference on Intelligent Robots and Systems, Sendai, Japan (2004)
Quirk, G.J., Muller, R.U., et al.: The Firing of Hippocampal Place Cells in the Dark Depends on the Rat’s Recent Experience. The Journal of Neuroscience 10(6), 2008–2017 (1990)
Ranck Jr., J.B.: Head direction cells in the deep cell layer of dorsal presubiculum in freely moving rats. Society for Neuroscience Abstracts 10, 599 (1984)
Ratnieks, F.L.W.: How far do honey bees forage? Bee Improvement 6, 10–11 (2000)
Recce, M., Harris, K.D.: Memory for places: A navigational model in support of Marr’s theory of hippocampal function. Hippocampus 6(6), 735–748 (1996)
Redish, D.: Beyond the Cognitive Map. MIT Press, Cambridge (1999)
Redish, D., Elga, A., et al.: A coupled attractor model of the rodent head direction system. Network: Computation in Neural Systems 7, 671–685 (1996)
Robertson, R.G., Rolls, E.T., et al.: Head Direction Cells in the Primate Pre-Subiculum. Hippocampus 9(3), 206–219 (1999)
Robertson, R.G., Rolls, E.T., et al.: Spatial view cells in the primate hippocampus: effects of removal of view details. Journal of Neurophysiology 79(3), 1145–1156 (1998)
Rolls, E.T.: Head direction and spatial view cells in primates, and brain mechanisms for path integration. In: Wiener, S.I., Taube, J. (eds.) Head direction cells and the neural mechanisms underlying directional orientation, pp. 299–318. MIT Press, Cambridge (2005)
Rolls, E.T., Stringer, S.M., et al.: Entorhinal cortex grid cells can map to hippocampal place cells by competitive learning. Network: Computation in Neural Systems 17, 447–465 (2006)
Roy, N., Burgard, W., et al.: Coastal Navigation - Mobile Robot Navigation with Uncertainty in Dynamic Environments. In: International Conference on Robotics and Automation, Detroit, USA (1999)
Rybski, P., Zacharias, F., et al.: Using visual features to build topological maps of indoor environments. In: International Conference on Robotics and Automation, Taipei, Taiwan (2003)
Samsonovich, A., McNaughton, B.: Path Integration and Cognitive Mapping in a Continuous Attractor Neural Network Model. Journal of Neuroscience 17(15), 5900–5920 (1997)
Sargolini, F., Fyhn, M., et al.: Conjunctive Representation of Position, Direction, and Velocity in Entorhinal Cortex. Science 312(5774), 758–762 (2006)
Save, E., Cressant, A., et al.: Spatial Firing of Hippocampal Place Cells in Blind Rats. Journal of Neuroscience 18(5), 1818–1826 (1998)
Schofield, M.: Service robots - the end of the beginning? Industrial Robot 26(6), 456–459 (1999)
Sharp, P.E.: Computer simulation of hippocampal place cells. Psychobiology 19(2), 103–115 (1991)
Sim, R., Elinas, P., et al.: Vision-based SLAM using the Rao-Blackwellised Particle Filter. In: International Joint Conference on Artificial Intelligence, Edinburgh, Scotland (2005)
Skaggs, W., Knierim, J., et al.: A model of the neural basis of the rat’s sense of direction. Advances in Neural Information Processing Systems 7, 173–180 (1995)
Skaggs, W., McNaughton, B.: Spatial Firing Properties of Hippocampal CA1 Populations in an Environment Containing Two Visually Identical Regions. Journal of Neuroscience 18, 8455–8466 (1998)
Smith, R., Self, M., et al.: Estimating Uncertain Spatial Relationships in Robotics. In: Autonomous Robot Vehicles, pp. 167–193 (1990)
Solstad, T., Moser, E., et al.: From grid cells to place cells: a mathematical model. Hippocampus 16, 1026–1031 (2006)
Srinivasan, M.V., Zhang, S., et al.: Honeybee Navigation: Nature and Calibration of the “Odometer”. Science 287(5454), 851–853 (2000)
Stachniss, C., Burgard, W.: Mapping and Exploration with Mobile Robots using Coverage Maps. In: International Conference on Intelligent Robots and Systems, Las Vegas, USA (2003)
Stringer, S.M., Rolls, E.T., et al.: Self-organizing continuous attractor networks and path integration: two-dimensional models of place cells. Network: Computation in Neural Systems 13, 429–446 (2002)
Stringer, S.M., Trappenberg, T.P., et al.: Self-organizing continuous attractor networks and path integration: one-dimensional models of head direction cells. Network: Computation in Neural Systems 13, 217–242 (2002)
Taube, J.S.: Head direction cells recorded in the anterior thalamic nuclei of freely moving rats. Journal of Neuroscience 15(1), 70–86 (1995)
Taube, J.S., Muller, R.U., et al.: Head-direction cells recorded from the postsubiculum in freely moving rats. I. Description and quantitative analysis. Journal of Neuroscience 10(2), 420–435 (1990)
Taube, J.S., Muller, R.U., et al.: Head-direction cells recorded from the postsubiculum in freely moving rats. II. Effects of environmental manipulations. Journal of Neuroscience 10(2), 436–447 (1990)
Tews, A., Robert, J., et al.: Is the Sun Too Bright in Queensland? An Approach to Robust Outdoor Colour Beacon Detection. In: Australasian Conference on Robotics and Automation, Sydney, Australia (2005)
Tews, A.D.: Achieving Multi-robot Cooperation in Highly Dynamic Environments. Ph.D thesis, University of Queensland (2003)
Thiele, A., Stoner, G.: Neuronal synchrony does not correlate with motion coherence in cortical area MT. Nature 421, 366–370 (2003)
Thrun, S.: Learning Maps for Indoor Mobile Robot Navigation. Artificial Intelligence 99(1), 21–71 (1998)
Thrun, S.: Probabilistic Algorithms and the Interactive Museum Tour-Guide Robot Minerva. International Journal of Robotics Research 19(11), 972–999 (2000)
Thrun, S.: Probabilistic Algorithms in Robotics. AI Magazine 21(4), 93–109 (2000)
Thrun, S.: Robotic Mapping: A Survey. In: Exploring Artificial Intelligence in the New Millennium. Morgan Kaufmann, San Francisco (2002)
Thrun, S., Fox, D., et al.: Robust Monte Carlo Localization for Mobile Robots. Artificial Intelligence 128(1-2), 99–141 (2000)
Thrun, S., Koller, D., et al.: Simultaneous Mapping and Localization With Sparse Extended Information Filters: Theory and Initial Results. In: Fifth International Workshop on Algorithmic Foundations of Robotics, Nice, France (2002)
Thrun, S., Montemerlo, M.: The GraphSLAM Algorithm with Applications to Large-Scale Mapping of Urban Structures. The International Journal of Robotics Research 25(5-6), 403–429 (2006)
Tolman, E.C.: Cognitive maps in rats and men. The Psychological Review 55(4), 189–209 (1948)
Touretzky, D., Redish, D.: Landmark Arrays and the Hippocampal Cognitive Map. In: Swedish Conference on Connectionism, Skövde, Sweden (1995)
Touretzky, D., Wan, H.S., et al.: Neural Representation of Space in Rats and Robots. In: World Congress on Computational Intelligence (1994)
Trullier, O., Wiener, S.I., et al.: Biologically based artificial navigation systems: review and prospects. Progress in Neurobiology 51, 483–544 (1997)
UNECE: 2005 World Robotics survey. UNECE, Geneva (2005), http://www.unece.org/press/pr2005/05stat_p03e.pdf
Usher, K., Ridley, P., et al.: Visual servoing of a car-like vehicle - an application of omnidirectional vision. In: International Conference on Robotics and Automation (2003)
Vassallo, R.F., Santos-Victor, J., et al.: Using Motor Representations for Topological Mapping and Navigation. In: International Conference on Intelligent Robots and Systems, EPFL, Lausanne, Switzerland (2002)
Wehner, R., Bleuler, S., et al.: Bees Navigate by Using Vectors and Routes Rather than Maps. Naturwissenschaften 77, 479–482 (1990)
Wehner, R., Gallizzi, K., et al.: Calibration processes in desert ant navigation: vector courses and systematic search. Journal of Comparative Physiology A: Sensory, Neural, and Behavioral Physiology 188(9), 683–693 (2002)
Wei, S., Zefran, M.: Smooth Path Planning and Control for Mobile Robots. In: International Conference on Networking, Sensing and Control, Tucson, USA (2005)
Wills, T.J., Lever, C., et al.: Attractor Dynamics in the Hippocampal Representation of the Local Environment. Science 308(5723), 873–876 (2005)
Wood, E.R., Dudchenko, P.A., et al.: Hippocampal Neurons Encode Information about Different Types of Memory Episodes Occurring in the Same Location. Neuron 27, 623–633 (2000)
Wullschleger, F.H., Arras, K.O., et al.: A Flexible Exploration Framework for Map Building. In: Third European Workshop on Advanced Mobile Robots, Zurich, Switzerland (1999)
Yahja, A., Singh, S., et al.: Recent Results in Path Planning for Mobile Robots Operating in Vast Outdoor Environments. In: Symposium on Image, Speech, Signal Processing and Robotics, Hong Kong, China (1998)
Yamauchi, B.: A Frontier Based Approach for Autonomous Exploration. In: International Symposium on Computational Intelligence in Robotics and Automation, Monterey, USA (1997)
Yamauchi, B., Beer, R.: Spatial Learning for Navigation in Dynamic Environments. Transactions on Systems, Man and Cybernetics 26(3), 496–505 (1996)
Zhang, K.: Representation of spatial orientation by the intrinsic dynamics of the head-direction cell ensemble: A theory. Journal of Neuroscience 16, 2112–2126 (1996)
List of Reproduced Figures
A number of figures in this book have been reprinted with the permission of their copyright owners. The permission is acknowledged in the figure captions, and the full details regarding the figures are given in this section.

Chapter 3
Figure 3.1 – Figure reprinted with permission from Elsevier. This figure was published in Exploring Artificial Intelligence in the New Millennium, Thrun, S., Robotic Mapping: A Survey, Copyright Elsevier (2002).
Figure 3.2 – Figure reprinted with permission from Elsevier. This figure was published in Artificial Intelligence, Vol 128, Thrun et al, Robust Monte Carlo Localization for Mobile Robots, pp 99–141, Copyright Elsevier (2001).
Figure 3.3 – Figure reprinted with permission from Sage Publications. This figure was published in The International Journal of Robotics Research, Vol 21 (10-11), Gonzalez-Banos, H.H. and Latombe, J., Navigation Strategies for Exploring Indoor Environments, pp 829–848, Copyright Sage Publications (2002).
Figure 3.4 – Figure reprinted with permission from Elsevier. This figure was published in Robotics and Autonomous Systems, Vol 8, Kuipers et al, A Robot Exploration and Mapping Strategy Based on a Semantic Hierarchy of Spatial Representations, pp 47–63, Copyright Elsevier (1991).

Chapter 4
Figure 4.1 – Figure reprinted with permission from Elsevier. This figure was published in The Rat Nervous System, Hippocampal Formation, ed. G. Paxinos, pp 443–493, Copyright Elsevier (1995).
Figure 4.2 – Figure reprinted with permission from the Journal of Neuroscience. This figure was published in The Journal of Neuroscience, Vol 10 (2), Taube, J.S. et al, Head-direction cells recorded from the postsubiculum in freely moving rats, pp 420–435, Copyright the Society for Neuroscience (1990).
Chapter 5
Figure 5.1 – Figure reprinted with permission from The MIT Press. This figure was published in Advances in Neural Information Processing Systems 7, ed. Tesauro, G., Touretzky, D. S., Leen, T. K., A Model of the Neural Basis of the Rat's Sense of Direction, Skaggs, W.E. et al, pp 173–180, Copyright The MIT Press (1994).
Figure 5.2 – Figure reprinted with permission from Informa Healthcare. This figure was published in Network: Computation in Neural Systems, Vol 7, Redish, D. et al, A coupled attractor model of the rodent head direction system, pp 671–685, Copyright Informa Healthcare (1996).
Figure 5.3 – Figure reprinted with permission from Verlag Dissertation and Arleo, A. This figure was published in Spatial learning and navigation in neuro-mimetic systems, Arleo, A., Copyright Verlag Dissertation (2001).
Figure 5.4 – Figure reprinted with permission from Elsevier. This figure was published in Neural Networks, Vol 7 (6-7), Burgess, N. et al, A model of hippocampal function, pp 1065–1081, Copyright Elsevier (1994).
Figure 5.5 – Figure reprinted with kind permission from Springer Science and Business Media. This figure was published in Biological Cybernetics, Vol 86 (1), Gaussier, P. et al, From view cells and place cells to cognitive map learning: processing stages of the hippocampal system, pp 15–28, Copyright Springer Science and Business Media (2002 / 2004).
Figure 5.6 – Figure reprinted with permission from Verlag Dissertation and Arleo, A. This figure was published in Spatial learning and navigation in neuro-mimetic systems, Arleo, A., Copyright Verlag Dissertation (2001).
Figure 5.7 – Figure reprinted with permission from Verlag Dissertation and Arleo, A. This figure was published in Spatial learning and navigation in neuro-mimetic systems, Arleo, A., Copyright Verlag Dissertation (2001).
Figure 5.8 – Figure reprinted with permission from Informa Healthcare. This figure was published in Network: Computation in Neural Systems, Vol 13, Stringer, S. et al, Self-organizing continuous attractor networks and path integration: two-dimensional models of place cells, pp 429–446, Copyright Informa Healthcare (2002).
Figure 5.9 – Figure reprinted with permission from Browning, B. This figure was published in Biologically plausible spatial navigation for a mobile robot, Browning, B., Ph.D thesis, The University of Queensland (2000).
Figure 5.10 – Figure reprinted with permission from Elsevier. This figure was published in Neural Networks, Vol 7 (6-7), Burgess, N. et al, A model of hippocampal function, pp 1065–1081, Copyright Elsevier (1994).
Figure 5.11 – Figure reprinted with permission from The MIT Press. This figure was published in Neural Computation, Vol 8 (1), Blum, K. et al, A model of spatial map formation in the hippocampus of the rat, pp 85–93, Copyright The MIT Press (1996).

Chapter 6
Figure 6.1 – Figures reprinted with permission from IEEE. These figures were published in the proceedings of the 2005 International Conference on Robotics and Automation, Grisetti, G. et al, Improving grid-based SLAM with rao-blackwellized particle filters by adaptive proposals and selective resampling, pp 2432–2437, Copyright IEEE (2005).

Chapter 8
Figure 8.10 – Figure reprinted with permission from IEEE. This figure was published in the proceedings of the 2004 International Conference on Robotics and Automation, Kuipers, B. et al, Local metrical and global topological maps in the hybrid spatial semantic hierarchy, pp 4845–4851, Copyright IEEE (2004).
Index
ant, 36
  cue navigation, 36
  home vector, 36
  systematic search, 36, 56
bee
  cognitive map, 35
  local rule concept, 35
  odometer, 35
  round dance, 34
  waggle dance, 34
computational model, 41-53
  allothetic correction, 49
  attractor network, 41-42
  head direction cell, 42-46, 68-70
  Hebbian learning, 49
  minimalist LV-PI, 50
  Morris water maze, 53
  multiple goal, 53
  path integration, 42-43
  peak deformation, 53
  PI calibration, 70-71
  place cell, 46-51, 74-79
  Q-learning, 51
  theta rhythm, 51-52
concurrent mapping and localisation. See SLAM
DARPA Grand Challenge, 59
dynamic environments, 13, 26-27
  implicit mapping, 28
  types of change, 13
Expectation Maximisation, 4, 15, 21
  algorithm, 17
  closing the loop, 17
  computability, 18
  correspondence problem, 17
experience mapping, 129-131
  behaviour arbitration, 151-153
  changing environment, 156-160
  experiences, 129-131
  exploration, 145-147
  indoor tests, 136-140, 153-157
  linking experiences, 131-133
  maintenance, 136
  map correction, 133-134
  outdoor test, 140-142
  route loss recovery, 153-154
  route planning, 150-151
  temporal map creation, 149-150
exploration, 11-12, 22-24
  coverage map, 23
  frontier-based, 22
  greedy, 22
  Next-Best-View, 22-23
  relation to mapping, 12
  Spatial Semantic Hierarchy, 24
goal memory
  improvements, 127-128
  learning, 118-119
  limitations, 126-127
  recall, 119-120
  testing, 120-125
grid cell, 31, 170
head direction cell, 31-32
  computational models, 41-44
human, 38-39
  taxi driver, 38
  virtual world, 38
Kalman Filter, 4, 15, 16-17
  Lu/Milios algorithm, 17
  sparse extended information filter, 17
  standard, 17
local view, 64
  learning allothetic cues, 66
  relocalisation, 68-69
localisation, 10
map type
  biological vs robotic, 55-56
  graph, 21
  human-friendly, 57
  occupancy grid, 20
mapping, 10
  biological vs robotic, 58-59
  cognitive map, 29
  hybrid topological, 22
  in 1D, 71-74
  in 2D, 79-81
  multiple pose hypotheses, 81-82
  probabilistic, 15-21
  rodent, 28-33
  topological, 21-22
  transitions, 22-27
  uncertainty, 11
  vision-based topological, 21
  visualising, 99-101
mobile robot
  Minerva, 25
  Pioneer, 62
  tractor, 112
Monte Carlo Localisation. See particle filter
navigation, 5, 12
  ant, 36
  bee, 33, 35
  biological models, 6
  Dijkstra's algorithm, 25
  Exploration and Learning in Dynamic ENvironments, 25-26
  human, 38-39
  Partially Observable Markov Decision Processes, 25
  purpose, 12
  to goals, 25-26
particle filter, 4, 15, 21
  FastSLAM, 20
  FastSLAM 2.0, 20
  probability density, 19
  Rao-Blackwellized filters, 20
  scaling, 19
  update equation, 18
place cell, 31, 32, 33
  during purposeful behaviour, 34
  multiple pose hypotheses, 83-84
  theta rhythm, 33-34
  T-Maze, 33
primate, 36-38
  episodic memory, 37
  head direction cell, 37
  passive motion, 38
  view cell, 37
RatSLAM, 87-94
  artificial landmarks, 101-109
  biological basis, 88-89
  functional requirements, 115
  image histograms, 96-99
  indoor vision templates, 94-96
  kidnapping, 107
  localisation consistency, 101
  loop closure, 106
  map characteristics, 115-116
  model structure, 88
  office building, 107-111
  outdoors, 4, 115
  path integration, 93-94
  pose representation, 89-90
  relocalisation, 92-93
  system dynamics, 91-92
sensing, 58-59
  artificial landmarks, 62
  biological vs robotic, 57-58
SLAM, 1
  biologically inspired solutions, 5, 9
  engineering solutions, 3, 4
  math solutions, 4
  problem, 3, 11
  robotic solutions, 9
  sensing issues, 3, 4
  visual, 4
visual processing
  sum of squared differences, 21
Springer Tracts in Advanced Robotics
Edited by B. Siciliano, O. Khatib and F. Groen

Vol. 41: Milford, M.J. Robot Navigation from Nature 194 p. 2008 [978-3-540-77519-5]
Vol. 40: Birglen, L.; Laliberté, T.; Gosselin, C. Underactuated Robotic Hands 241 p. 2008 [978-3-540-77458-7]
Vol. 39: Khatib, O.; Kumar, V.; Rus, D. (Eds.) Experimental Robotics 563 p. 2008 [978-3-540-77456-3]
Vol. 38: Jefferies, M.E.; Yeap, W.-K. (Eds.) Robotics and Cognitive Approaches to Spatial Mapping 328 p. 2008 [978-3-540-75386-5]
Vol. 37: Ollero, A.; Maza, I. (Eds.) Multiple Heterogeneous Unmanned Aerial Vehicles 233 p. 2007 [978-3-540-73957-9]
Vol. 36: Buehler, M.; Iagnemma, K.; Singh, S. (Eds.) The 2005 DARPA Grand Challenge – The Great Robot Race 520 p. 2007 [978-3-540-73428-4]
Vol. 35: Laugier, C.; Chatila, R. (Eds.) Autonomous Navigation in Dynamic Environments 169 p. 2007 [978-3-540-73421-5]
Vol. 34: Wisse, M.; van der Linde, R.Q. Delft Pneumatic Bipeds 136 p. 2007 [978-3-540-72807-8]
Vol. 29: Secchi, C.; Stramigioli, S.; Fantuzzi, C. Control of Interactive Robotic Interfaces – A Port-Hamiltonian Approach 225 p. 2007 [978-3-540-49712-7]
Vol. 28: Thrun, S.; Brooks, R.; Durrant-Whyte, H. (Eds.) Robotics Research – Results of the 12th International Symposium ISRR 602 p. 2007 [978-3-540-48110-2]
Vol. 27: Montemerlo, M.; Thrun, S. FastSLAM – A Scalable Method for the Simultaneous Localization and Mapping Problem in Robotics 120 p. 2007 [978-3-540-46399-3]
Vol. 26: Taylor, G.; Kleeman, L. Visual Perception and Robotic Manipulation – 3D Object Recognition, Tracking and Hand-Eye Coordination 218 p. 2007 [978-3-540-33454-5]
Vol. 25: Corke, P.; Sukkarieh, S. (Eds.) Field and Service Robotics – Results of the 5th International Conference 580 p. 2006 [978-3-540-33452-1]
Vol. 24: Yuta, S.; Asama, H.; Thrun, S.; Prassler, E.; Tsubouchi, T. (Eds.) Field and Service Robotics – Recent Advances in Research and Applications 550 p. 2006 [978-3-540-32801-8]
Vol. 33: Kong, X.; Gosselin, C. Type Synthesis of Parallel Mechanisms 272 p. 2007 [978-3-540-71989-2]
Vol. 23: Andrade-Cetto, J,; Sanfeliu, A. Environment Learning for Indoor Mobile Robots – A Stochastic State Estimation Approach to Simultaneous Localization and Map Building 130 p. 2006 [978-3-540-32795-0]
Vol. 32: Milutinovi´c, D.; Lima, P. Cells and Robots – Modeling and Control of Large-Size Agent Populations 130 p. 2007 [978-3-540-71981-6]
Vol. 22: Christensen, H.I. (Ed.) European Robotics Symposium 2006 209 p. 2006 [978-3-540-32688-5]
Vol. 31: Ferre, M.; Buss, M.; Aracil, R.; Melchiorri, C.; Balaguer C. (Eds.) Advances in Telerobotics 500 p. 2007 [978-3-540-71363-0]
Vol. 21: Ang Jr., H.; Khatib, O. (Eds.) Experimental Robotics IX – The 9th International Symposium on Experimental Robotics 618 p. 2006 [978-3-540-28816-9]
Vol. 30: Brugali, D. (Ed.) Software Engineering for Experimental Robotics 490 p. 2007 [978-3-540-68949-2]
Vol. 20: Xu, Y.; Ou, Y. Control of Single Wheel Robots 188 p. 2005 [978-3-540-28184-9]
Vol. 19: Lefebvre, T.; Bruyninckx, H.; De Schutter, J. Nonlinear Kalman Filtering for Force-Controlled Robot Tasks 280 p. 2005 [978-3-540-28023-1]
Vol. 9: Yamane, K. Simulating and Generating Motions of Human Figures 176 p. 2004 [978-3-540-20317-9]
Vol. 18: Barbagli, F.; Prattichizzo, D.; Salisbury, K. (Eds.) Multi-point Interaction with Real and Virtual Objects 281 p. 2005 [978-3-540-26036-3]
Vol. 8: Baeten, J.; De Schutter, J. Integrated Visual Servoing and Force Control – The Task Frame Approach 198 p. 2004 [978-3-540-40475-0]
Vol. 17: Erdmann, M.; Hsu, D.; Overmars, M.; van der Stappen, F.A (Eds.) Algorithmic Foundations of Robotics VI 472 p. 2005 [978-3-540-25728-8]
Vol. 7: Boissonnat, J.-D.; Burdick, J.; Goldberg, K.; Hutchinson, S. (Eds.) Algorithmic Foundations of Robotics V 577 p. 2004 [978-3-540-40476-7]
Vol. 16: Cuesta, F.; Ollero, A. Intelligent Mobile Robot Navigation 224 p. 2005 [978-3-540-23956-7]
Vol. 6: Jarvis, R.A.; Zelinsky, A. (Eds.) Robotics Research – The Tenth International Symposium 580 p. 2003 [978-3-540-00550-6]
Vol. 15: Dario, P.; Chatila R. (Eds.) Robotics Research – The Eleventh International Symposium 595 p. 2005 [978-3-540-23214-8]
Vol. 5: Siciliano, B.; Dario, P. (Eds.) Experimental Robotics VIII – Proceedings of the 8th International Symposium ISER02 685 p. 2003 [978-3-540-00305-2]
Vol. 14: Prassler, E.; Lawitzky, G.; Stopp, A.; Grunwald, G.; Hägele, M.; Dillmann, R.; Iossifidis. I. (Eds.) Advances in Human-Robot Interaction 414 p. 2005 [978-3-540-23211-7]
Vol. 4: Bicchi, A.; Christensen, H.I.; Prattichizzo, D. (Eds.) Control Problems in Robotics 296 p. 2003 [978-3-540-00251-2]
Vol. 13: Chung, W. Nonholonomic Manipulators 115 p. 2004 [978-3-540-22108-1]
Vol. 12: Iagnemma K.; Dubowsky, S. Mobile Robots in Rough Terrain – Estimation, Motion Planning, and Control with Application to Planetary Rovers 123 p. 2004 [978-3-540-21968-2]
Vol. 11: Kim, J.-H.; Kim, D.-H.; Kim, Y.-J.; Seow, K.-T. Soccer Robotics 353 p. 2004 [978-3-540-21859-3]
Vol. 10: Siciliano, B.; De Luca, A.; Melchiorri, C.; Casalino, G. (Eds.) Advances in Control of Articulated and Mobile Robots 259 p. 2004 [978-3-540-20783-2]
Vol. 3: Natale, C. Interaction Control of Robot Manipulators – Six-degrees-of-freedom Tasks 120 p. 2003 [978-3-540-00159-1]
Vol. 2: Antonelli, G. Underwater Robots – Motion and Force Control of Vehicle-Manipulator Systems 268 p. 2006 [978-3-540-31752-4]
Vol. 1: Caccavale, F.; Villani, L. (Eds.) Fault Diagnosis and Fault Tolerance for Mechatronic Systems – Recent Advances 191 p. 2003 [978-3-540-44159-5]