Lecture Notes in Computer Science
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Germany
Madhu Sudan, Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max-Planck Institute of Computer Science, Saarbruecken, Germany
6079
Cristian S. Calude Masami Hagiya Kenichi Morita Grzegorz Rozenberg Jon Timmis (Eds.)
Unconventional Computation 9th International Conference, UC 2010 Tokyo, Japan, June 21-25, 2010 Proceedings
Volume Editors

Cristian S. Calude
The University of Auckland, Department of Computer Science
Science Centre, 38 Princes Street, Auckland 1142, New Zealand
E-mail: [email protected]

Masami Hagiya
University of Tokyo, Graduate School of Information Science and Technology
Department of Computer Science
7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
E-mail: [email protected]

Kenichi Morita
Hiroshima University, Graduate School of Engineering
Department of Information Engineering
Higashi-Hiroshima 739-8527, Japan
E-mail: [email protected]

Grzegorz Rozenberg
Leiden University, Leiden Institute of Advanced Computer Science (LIACS)
Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
E-mail: [email protected]

Jon Timmis
University of York, Department of Computer Science and Department of Electronics
Heslington, York YO10 5DD, UK
E-mail: [email protected]
Library of Congress Control Number: 2010927630
CR Subject Classification (1998): F.1, F.2, I.1, C.1.3, C.1, J.2
LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues
ISSN: 0302-9743
ISBN-10: 3-642-13522-6 Springer Berlin Heidelberg New York
ISBN-13: 978-3-642-13522-4 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2010 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper 06/3180
Preface
The 9th International Conference on Unconventional Computation, UC 2010, was organized under the auspices of EATCS and Academia Europaea, by the University of Tokyo (Tokyo, Japan) and the Center for Discrete Mathematics and Theoretical Computer Science (Auckland, New Zealand). It was held in Tokyo during June 21–25, 2010 (see http://arn.local.frs.riken.jp/UC10/). The venue was the Sanjo (Hilltop) Conference Hall on the Hongo Campus of the University of Tokyo. Hongo Campus was formerly the residence of the Maeda family, one of the richest feudal lords in the Edo period of Japan. The Japanese garden of the residence is partially preserved, including the pond and the hill on which the conference hall is located. Within walking distance from Hongo Campus are Ueno Park with its many museums, the Akihabara area, which is now the center of Japanese pop culture, and the Korakuen amusement park/baseball stadium.

The International Conference on Unconventional Computation (UC) series (see http://www.cs.auckland.ac.nz/CDMTCS/conferences/uc/) is devoted to all aspects of unconventional computation — theory as well as experiments and applications. Typical, but not exclusive, topics are: natural computing, including quantum, cellular, molecular, membrane, neural, and evolutionary computing, as well as chaos and dynamical-system-based computing, and various proposals for computational mechanisms that go beyond the Turing model.

The first venue of the Unconventional Computation Conference (formerly called Unconventional Models of Computation) was Auckland, New Zealand, in 1998. Subsequent sites of the conference were Brussels, Belgium in 2000; Kobe, Japan in 2002; Seville, Spain in 2005; York, UK in 2006; Kingston, Canada in 2007; Vienna, Austria in 2008; and Ponta Delgada, Portugal in 2009. The proceedings of the previous UC conferences appeared as follows:

1. Calude, C. S., Casti, J., Dinneen, M. J. (eds.): Unconventional Models of Computation, Springer, Singapore (1998)
2. Antoniou, I., Calude, C. S., Dinneen, M. J. (eds.): Unconventional Models of Computation, UMC 2K: Proceedings of the Second International Conference, Springer, London (2001)
3. Calude, C. S., Dinneen, M. J., Peper, F. (eds.): UMC 2002. LNCS 2509, Springer, Heidelberg (2002)
4. Calude, C. S., Dinneen, M. J., Păun, G., Pérez-Jiménez, M. J., Rozenberg, G. (eds.): UC 2005. LNCS 3699, Springer, Heidelberg (2005)
5. Calude, C. S., Dinneen, M. J., Păun, G., Rozenberg, G., Stepney, S. (eds.): UC 2006. LNCS 4135, Springer, Heidelberg (2006)
6. Akl, S. G., Calude, C. S., Dinneen, M. J., Rozenberg, G., Wareham, H. T. (eds.): UC 2007. LNCS 4618, Springer, Heidelberg (2007)
7. Calude, C. S., Costa, J. F., Freund, R., Oswald, M., Rozenberg, G. (eds.): UC 2008. LNCS 5204, Springer, Heidelberg (2008)
8. Calude, C. S., Costa, J. F., Dershowitz, N., Freire, E., Rozenberg, G. (eds.): UC 2009. LNCS 5715, Springer, Heidelberg (2009)

The four keynote speakers at the 2010 conference were:

– Shun-ichi Amari (Brain Science Institute, RIKEN, Japan): "Computations Inspired from the Brain"
– Luca Cardelli (Microsoft, UK): "Algebras and Languages for Molecular Programming"
– Françoise Chatelin (Université Toulouse and Cerfacs, France): "A Computational Journey into Nonlinearity"
– José Félix Costa (IST, Technical University of Lisbon, Portugal): "Computable Scientists, Uncomputable World"

We would like to thank the keynote speakers for taking the time to come to the conference and for a stimulating series of lectures.

In addition to the main UC 2010 conference, three workshops were also hosted: one on "Hypercomputation," organized by Mike Stannett (University of Sheffield, UK); one on "Computing with Spatio-Temporal Dynamics 2010," organized by So Tsuda (University of the West of England, UK) and Masashi Aono (RIKEN Advanced Science Institute, Japan); and one on "DNA Nanotechnology Toward Molecular Robotics," organized by Satoshi Murata (Research Group on Molecular Robotics, SICE/Tohoku University).

The Program Committee selected 15 papers (out of 26 submissions) for presentation as full-length talks, accepted 4 posters (out of 4 submitted), and converted 6 further papers to posters. In this volume, 4 (extended) abstracts of invited talks, 15 regular papers, and 8 abstracts of posters are included.

The Program Committee is very grateful to the extra reviewers for the help they provided in improving the papers for this volume. These experts are: A. Alhazov, T. Haruna, K. Imai, I. Kawamata, N. Nagy, M. Olah, X. Piao, P. Rothemund, L. Staiger, and F. Tanaka.

We extend our thanks to all members of the Local Organizing Committee, particularly to M. Aono, M. Hagiya, S. Murata, F. Peper, and F. Tanaka, for their invaluable organizational work. The conference was supported by the University of Tokyo, the Mitsubishi Foundation, and the Support Center for Advanced Telecommunications Technology Research, Foundation (SCAT). We extend to all of them our deep gratitude.

We would like to acknowledge the developers of the EasyChair system, and the excellent cooperation from the Lecture Notes in Computer Science team of Springer, for their help in making possible the production of this volume in time for the conference.
Finally, we would like to thank all the authors for their high-quality contributions and hope that they found the conference enjoyable and stimulating. April 2010
Cristian S. Calude Masami Hagiya Kenichi Morita Grzegorz Rozenberg Jonathan Timmis
Conference Organization
Steering Chairs
Cristian S. Calude, Auckland, New Zealand
Grzegorz Rozenberg, Leiden, The Netherlands, and Boulder, Colorado, USA
Steering Committee
Thomas Bäck, Leiden, The Netherlands
Lov K. Grover, Murray Hill, NJ, USA
Jarkko Kari, Turku, Finland
Lila Kari, London, Ontario, Canada
Jan van Leeuwen, Utrecht, The Netherlands
Seth Lloyd, Cambridge, MA, USA
Gheorghe Păun, Seville, Spain, and Bucharest, Romania
Tommaso Toffoli, Boston, MA, USA
Carme Torras, Barcelona, Spain
Arto Salomaa, Turku, Finland
Program Chairs
Kenichi Morita, Hiroshima, Japan
Jonathan Timmis, York, UK
Program Committee
Andrew Adamatzky, Bristol, UK
Selim Akl, Kingston, Canada
Masashi Aono, Wako, Japan
Olivier Bournez, Paris, France
Cristian S. Calude, Auckland, New Zealand
Luca Cardelli, Cambridge, UK
David Corne, Edinburgh, UK
Nachum Dershowitz, Tel Aviv, Israel
Michael Dinneen, Auckland, New Zealand
Marco Dorigo, Brussels, Belgium
Masami Hagiya, Tokyo, Japan
Emma Hart, Edinburgh, UK
Gregg Jaeger, Boston, USA
Natasha Jonoska, Tampa, USA
Jarkko Kari, Turku, Finland
Viv Kendon, Leeds, UK
Vincenzo Manca, Verona, Italy
Jonathan Mills, Bloomington, USA
Ferdinand Peper, Kobe, Japan
Kai Salomaa, Kingston, Canada
Hava Siegelmann, Amherst, USA
Mike Stannett, Sheffield, UK
Darko Stefanovic, Albuquerque, USA
Susan Stepney, York, UK
Christof Teuscher, Portland, USA
Hiroshi Umeo, Osaka, Japan
Damien Woods, Seville, Spain
Xin Yao, Birmingham, UK
Local Organization Chair
Masami Hagiya, Tokyo, Japan
Local Organization
Masashi Aono, Wako, Japan
Satoshi Murata, Tokyo, Japan
Ferdinand Peper, Kobe, Japan
Fumiaki Tanaka, Tokyo, Japan
Referees
Andrew Adamatzky, Selim Akl, Artiom Alhazov, Masashi Aono, Olivier Bournez, Cristian Calude, Luca Cardelli, David Corne, Nachum Dershowitz, Michael Dinneen, Marco Dorigo, Masami Hagiya, Taichi Haruna, Katsunobu Imai, Gregg Jaeger, Natasha Jonoska, Jarkko Kari, Ibuki Kawamata, Viv Kendon, Vincenzo Manca, Kenichi Morita, Naya Nagy, Mark Olah, Ferdinand Peper, Xiaoxue Piao, Paul Rothemund, Kai Salomaa, Ludwig Staiger, Mike Stannett, Darko Stefanovic, Susan Stepney, Fumiaki Tanaka, Christof Teuscher, Jon Timmis, Hiroshi Umeo, Damien Woods, Xin Yao
Table of Contents
Invited Talks

Computations Inspired from the Brain (Shun-ichi Amari) ..... 1
Algebras and Languages for Molecular Programming (Luca Cardelli) ..... 2
A Computational Journey into Nonlinearity (Françoise Chatelin) ..... 3
Computable Scientists, Uncomputable World (Abstract) (José Félix Costa) ..... 6

Regular Contributions

Finite State Transducers with Intuition (Ruben Agadzanyan and Rūsiņš Freivalds) ..... 11
Reversibility and Determinism in Sequential Multiset Rewriting (Artiom Alhazov, Rudolf Freund, and Kenichi Morita) ..... 21
Synchronization in P Modules (Michael J. Dinneen, Yun-Bum Kim, and Radu Nicolescu) ..... 32
On Universality of Radius 1/2 Number-Conserving Cellular Automata (Katsunobu Imai and Artiom Alhazov) ..... 45
DNA Origami as Self-assembling Circuit Boards (Kyoung Nan Kim, Koshala Sarveswaran, Lesli Mark, and Marya Lieberman) ..... 56
Tug-of-War Model for Multi-armed Bandit Problem (Song-Ju Kim, Masashi Aono, and Masahiko Hara) ..... 69
Characterising Enzymes for Information Processing: Towards an Artificial Experimenter (Chris Lovell, Gareth Jones, Steve R. Gunn, and Klaus-Peter Zauner) ..... 81
Majority Adder Implementation by Competing Patterns in Life-Like Rule B2/S2345 (Genaro J. Martínez, Kenichi Morita, Andrew Adamatzky, and Maurice Margenstern) ..... 93
Solving Partial Differential Equation via Stochastic Process (Jun Ohkubo) ..... 105
Postselection Finite Quantum Automata (Oksana Scegulnaja-Dubrovska, Lelde Lāce, and Rūsiņš Freivalds) ..... 115
A New Representation of Chaitin Ω Number Based on Compressible Strings (Kohtaro Tadaki) ..... 127
Quantum Query Algorithms for Conjunctions (Alina Vasilieva and Taisia Mischenko-Slatenkova) ..... 140
Universal Continuous Variable Quantum Computation in the Micromaser (Rob C. Wagner, Mark S. Everitt, Viv M. Kendon, and Martin L. Jones) ..... 152
Quantum Computation with Devices Whose Contents Are Never Read (Abuzer Yakaryılmaz, Rūsiņš Freivalds, A.C. Cem Say, and Ruben Agadzanyan) ..... 164
The Extended Glider-Eater Machine in the Spiral Rule (Liang Zhang) ..... 175

Posters

Formalizing the Behavior of Biological Processes with Mobility (Bogdan Aman and Gabriel Ciobanu) ..... 187
Quantum Finite State Automata over Infinite Words (Ilze Dzelme-Bērziņa) ..... 188
A Geometrical Allosteric DNA Switch (Anthony J. Genot, Jon Bath, and Andrew J. Turberfield) ..... 189
Properties of "Planar Binary (Butchi Number)" (Yuuki Iwabuchi and Junichi Akita) ..... 190
Characterising Enzymes for Information Processing: Microfluidics for Autonomous Experimentation (Gareth Jones, Chris Lovell, Hywel Morgan, and Klaus-Peter Zauner) ..... 191
Inference with DNA Molecules (Alfonso Rodríguez-Patón, José María Larrea, and Iñaki Sainz de Murieta) ..... 192
A Network-Based Computational Model with Learning (Hideaki Suzuki, Hiroyuki Ohsaki, and Hidefumi Sawai) ..... 193
Image Processing with Neuron-Like Branching Elements (POSTER) (Hisako Takigawa-Imamura and Ikuko N. Motoike) ..... 194

Author Index ..... 195
Computations Inspired from the Brain

Shun-ichi Amari
RIKEN Brain Science Institute
Hirosawa 2-1, Wako, Saitama 351-0198, Japan
Abstract. The brain is a highly complex system composed of a vast number of neurons. It uses various levels of hierarchy: the molecular level, the cellular level, the network level, the system level, and the more abstract level of mind. Its principles of computation are largely different from those of conventional computation, where symbols are processed by logical calculations. Information is represented by spatio-temporal patterns in a distributed manner, and calculations are emergent, stochastic, and cooperative, rather than exact logic. The brain also relies on learning and memory. We present miscellaneous topics of computation related to neural information processing. The first is statistical neurodynamics, which treats computation by randomly connected neurons. Using a simple stochastic law, we establish the robustness and stability of this computation. The analysis also reveals that neural computation is shallow, converging quickly to stable states, where the small-world phenomenon is observed. The cortices can be regarded as two-dimensional layered neural fields, where computation may be realized by the dynamics of spatio-temporal pattern formation and synchronization. We show simple examples of pattern formation and collisions of patterns in neural fields. Learning and memory constitute another peculiar computation of a neural system, and we show some mechanisms of self-organization. An associative memory model shows fundamentally different aspects of memory from the conventional one: memory patterns are not recalled from a stock but generated anew each time, and the basin of attraction has a fractal structure. We finally touch upon social computation, using the prisoner's dilemma and ultimatum games.
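As a hedged illustration of the associative-memory idea sketched in the abstract (this is editorial, not code from the talk), a minimal Hopfield-style network stores a pattern in Hebbian weights and completes a corrupted cue by attractor dynamics. The toy pattern and all names are invented for the sketch.

```python
# Minimal Hopfield-style associative memory: the stored pattern becomes
# an attractor of the sign dynamics, so a corrupted cue is completed
# rather than looked up in a table.

def hebb_weights(patterns):
    """Hebbian outer-product weights with zero diagonal."""
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j] / n
    return W

def recall(W, x, steps=10):
    """Synchronous sign updates until (in practice) a fixed point."""
    n = len(x)
    for _ in range(steps):
        x = [1 if sum(W[i][j] * x[j] for j in range(n)) >= 0 else -1
             for i in range(n)]
    return x

pattern = [1, 1, -1, -1]           # a single stored memory (toy size)
W = hebb_weights([pattern])
noisy = [1, 1, -1, 1]              # cue with the last bit flipped
print(recall(W, noisy))            # -> [1, 1, -1, -1]
```

The stored pattern is a fixed point of the dynamics, and the one-bit-corrupted cue falls into its basin of attraction, which is the "generated, not looked up" flavor of memory the abstract contrasts with conventional storage.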
C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, p. 1, 2010. c Springer-Verlag Berlin Heidelberg 2010
Algebras and Languages for Molecular Programming

Luca Cardelli
Microsoft Research
Nucleic acids (DNA/RNA) encode information digitally, and are currently the only truly 'user-programmable' entities at the molecular scale. They can be used to manufacture nano-scale structures, to produce physical forces, to act as sensors and actuators, and to do computation in between. Eventually we will be able to interface them with biological machinery to detect and cure diseases at the cellular level under program control. The basic technology to create and manipulate these devices has existed for many years, but the imagination necessary to exploit them has been evolving slowly. Recently, some very simple computational schemes have been developed that are autonomous (run on their own once started) and involve only short (easily synthesizable) DNA strands with no other complex molecules. We now need programming abstractions and tools that are suitable for molecular programming, and this requires a whole hierarchy of concepts to come together. Low-level molecular design is required to produce molecules that interact in the desired controllable ways. On that basis, we can then design various kinds of 'logic gates' and 'computational architectures', where much imagination is currently needed. We also need programming languages to organize complex designs both at the level of gate design, and at the level of circuit design. Since DNA computation is massively concurrent, some tricky and yet familiar programming issues arise: the need to formally verify circuit designs to avoid subtle deadlocks and race conditions, and the need to design high-level languages that exploit concurrency and stochasticity.
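This is not Cardelli's formalism, but as an illustrative sketch of the concurrency and stochasticity mentioned in the abstract, here is a toy stochastic simulation of a three-reaction chemical reaction network (the "approximate majority" motif, a scheme of the kind that has been realized with DNA strand displacement). The reaction set and molecule counts are invented for the example.

```python
import random

def approximate_majority(x, y, seed=1, max_steps=100_000):
    """Stochastic simulation of the CRN
         X + Y -> 2B,   X + B -> 2X,   Y + B -> 2Y,
    which drives a mixed population to all-X or all-Y consensus."""
    rng = random.Random(seed)
    b = 0
    for _ in range(max_steps):
        a1, a2, a3 = x * y, x * b, y * b    # mass-action propensities
        total = a1 + a2 + a3
        if total == 0:
            break                           # consensus: one species left
        r = rng.uniform(0, total)           # pick next reaction by weight
        if r < a1:
            x, y, b = x - 1, y - 1, b + 2
        elif r < a1 + a2:
            x, b = x + 1, b - 1
        else:
            y, b = y + 1, b - 1
    return x, y, b

print(approximate_majority(40, 20))   # the initial majority usually wins
```

Each run is a different interleaving of reactions, which is exactly the massive concurrency that makes formal verification of molecular circuit designs necessary.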
A Computational Journey into Nonlinearity

Françoise Chatelin
Université Toulouse and Cerfacs, France
The talk to be presented is about the domain of mathematical computation which extends beyond modern calculus and classical analysis when numbers are not restricted to belong to a commutative field. It describes the dynamics of complexification, resulting in an endless remorphing of the computational landscape. Nonlinear computation weaves a colourful tapestry always in a state of becoming. In the process, some meta-principles emerge which guide the autonomous evolution of mathematical computation. These organic principles are essential keys to analyze very large numerical simulations of unstable phenomena: they lie at the heart of the new theory of Qualitative Computing. What is Qualitative Computing? It is the newly developed branch of mathematical analysis which looks specifically at how the laws of classical analysis (Euler-Cauchy-Riemann) are modified when mathematical computation does not take place in a commutative field. Most analysis textbooks do not consider numbers beyond R or C, with respective dimensions 1 and 2. However, there are important practical domains where such an approach is too limited. For example, the quaternions, which form a noncommutative field H of numbers with 4 real dimensions, are the language of Maxwell's electromagnetism and of special relativity. In the booming field of numerical linear algebra, the basic "numbers" are often taken to be square matrices, which belong to a noncommutative associative algebra (over R or C). This is an essential key to the successes of modern numerical software packages like LAPACK and ScaLAPACK, used worldwide for intensive computer simulations in high-tech industries. The general consensus among mathematicians and physicists at the end of the 19th century was that complex numbers – C is the algebraic closure of R – were good enough for everyday science.
Scientists feared that one could only lose computing power by dropping such properties of multiplication as commutativity or associativity, which were then viewed as essential. F. Klein and Lord Kelvin fiercely attacked Hamilton's quaternions. But theoretical physics clearly vindicated Hamilton's noncommutative field in the 20th century by adopting Clifford algebras C_k, k ≥ 3 (with C_2 = H). However, such algebras, heavily used in physics and algebraic geometry, cannot exploit the power of multiplication to its fullest, for intrinsic reasons related to their being associative. Therefore one wonders: does a family of multiplicative algebras A_k exist which does not hinder the computing capabilities of multiplication? Amazingly enough, the answer is yes. It consists of the little-known Dickson algebras A_k of dimension 2^k, k ≥ 0 (with A_k = C_k for k ≤ 2), where the multiplication is defined recursively, being nonassociative for k ≥ 3. At the dawn of the 20th century, vectors of dimension 2^k in A_k, k ≥ 2, have been called
hypercomplex numbers (Hurwitz, Dickson). Accordingly, computation in A_k, k ≥ 2, was called hypercomputation. The talk will show the extent to which hypercomputation in A_k, k ≥ 3, is unconventional, plagued/blessed as it is by computing paradoxes signaling a clash between local (linear) and global (nonlinear) computation. An important source of paradoxes is found in the act of measurement. Consider the multiplication map defined by a ≠ 0, that is, L_a : x → a × x, which is a linear map in A_k. For k ≤ 3, L_a has as its unique singular value the euclidean norm ||a|| > 0; but for k ≥ 4, there can exist 2^(k-3) distinct singular values ≥ 0 which differ from ||a||. Moreover, the results of the Singular Value Decomposition (if computed inductively) may depend on the computational route, and may even be hypercomplex and uncountable! This is one of the surprises that the Fundamental Theorem of Algebra keeps in store when set in noncommutative algebras. The internal clockwork of hypercomputation is guided in part by such measures, which modify the local 3D geometry defined at a. This results in an expanded logic which provides an arithmetic basis for the emergence of simplexity in life's complex processes, and in highly unstable phenomena. The computational journey into nonlinearity in the framework of Dickson algebras is endless. At every level k ≥ 4 one gets new vistas, each richer than before. We offer glimpses of the ever-changing territory. New computational principles emerge at each level k ≥ 2 which may supersede others valid at a lower level k' < k. For example, if we drop commutativity in H (k = 2), then the discrete can emerge from the continuous by exponentiation (a generalization of e^(niπ/2) = i^n). Without associativity (k ≥ 3), there are several different ways to compute the multiplicative measures of vectors, which may agree only partially with each other. This creates paradoxes and new options as well.
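The recursive Dickson (Cayley-Dickson) doubling can be made concrete in a few lines. The following editorial sketch uses one common sign convention (conventions differ across references, so this is an illustration, not the definitive product) and represents an element of A_k as a list of 2^k real coordinates:

```python
# Cayley-Dickson doubling: an element of A_k is a pair (a, b) of
# elements of A_{k-1}, stored here as the two halves of a flat list.

def conj(x):
    """Dickson conjugation: conj(a, b) = (conj(a), -b)."""
    if len(x) == 1:
        return x[:]
    n = len(x) // 2
    return conj(x[:n]) + [-v for v in x[n:]]

def add(x, y):
    return [u + v for u, v in zip(x, y)]

def sub(x, y):
    return [u - v for u, v in zip(x, y)]

def mul(x, y):
    """One convention: (a,b)(c,d) = (ac - conj(d)b, da + b conj(c))."""
    if len(x) == 1:
        return [x[0] * y[0]]
    n = len(x) // 2
    a, b = x[:n], x[n:]
    c, d = y[:n], y[n:]
    return sub(mul(a, c), mul(conj(d), b)) + add(mul(d, a), mul(b, conj(c)))

def e(k, i):
    """i-th basis vector of the 2**k-dimensional algebra A_k."""
    v = [0.0] * (2 ** k)
    v[i] = 1.0
    return v

# k = 2 gives the quaternions H: multiplication is noncommutative.
i_, j_ = e(2, 1), e(2, 2)
assert mul(i_, j_) == [0.0, 0.0, 0.0, 1.0]     # i * j = k
assert mul(j_, i_) == [0.0, 0.0, 0.0, -1.0]    # j * i = -k

# k = 3 gives the octonions: multiplication is no longer associative.
a, b, c = e(3, 1), e(3, 2), e(3, 4)
assert mul(mul(a, b), c) != mul(a, mul(b, c))
```

The same four-line recursion thus yields R, C, H, the octonions, and all higher A_k, losing commutativity at k = 2 and associativity at k = 3, exactly the staircase of lost properties the abstract describes.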
As a rule, the emergence of paradoxes goes hand in hand with an increase in the freedom of choice. This freedom of choice provides a rational basis for the many fuzzy phenomena encountered in experimental sciences at a small scale: they are currently attributed to randomness, as in statistical physics, quantum mechanics, or genetic mutation. However, the proverbial God (i.e. the computing spirit) does not play dice in mathematical computation, but rather offers an ever richer variety of computational options to choose from. Hypercomputation supports the old adage: “Variety is the spice of life.” Caveat. The words “hypercomputation”, “computability” and “complexity, complexification” are used in their classical mathematical sense. They should not be confused with the same words used in Computer Science. In this specific context, the words applied to programs for Turing machines acquire a meaning which differs greatly from the mathematical one. The emergence of new mathematical concepts under the evolution pressure of mathematical computing is a recurring phenomenon since Antiquity. For example, irrational numbers, zero and its inverse ∞, negative numbers and complex numbers were finally accepted by our ancestors only after much anguish, inner turmoil and heated debate. Qualitative Computing has been the driving force
behind the evolution of mathematical logic from the beginning, when Pythagoras and Euclid presented the first known incompleteness result: the proof of the irrationality of √2. It is a fact of experience that the classical logic of Aristotle is too limited to capture the dynamics of nonlinear computation. Mathematics provides us with the missing tool, an organic logic (based on {R, C, ∞}) which is tailored to the dynamics of nonlinearity. This organic logic can tame the computing paradoxes stemming from measurements in the absence of associativity; it represents the internal clockwork of computation. It makes full use of the computing potential of rings of numbers with 1, 2, 4, and 8 dimensions. One salient feature is that the cooperation of results by Fermat, Euler, Riemann, and Sierpiński explains the autonomous complex dynamics of the Picard iteration to solve x = rf(x), where f : R → R is continuous and r is a real parameter. The necessity to limit the frame of interpretation to at most 3 dimensions brings to light some mechanisms by which computation turns the complex into the simple without reduction. A detailed technical presentation is provided in the speaker's book "Qualitative Computing: A Computational Journey into Nonlinearity," currently in press at World Scientific, Singapore.
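The Picard iteration x_{n+1} = r f(x_n) can be explored numerically. In the following sketch the choice f(x) = x(1 - x), the logistic family, is an editorial assumption, picked because its r-dependent dynamics (fixed point, cycles, chaos) illustrate the "autonomous complex dynamics" the abstract refers to:

```python
def picard_orbit(r, x0=0.2, burn=500, keep=8):
    """Iterate x -> r*f(x) with f(x) = x*(1 - x); discard a transient,
    then return a few post-transient values rounded for comparison."""
    x = x0
    for _ in range(burn):
        x = r * x * (1 - x)
    orbit = []
    for _ in range(keep):
        x = r * x * (1 - x)
        orbit.append(round(x, 6))
    return orbit

print(picard_orbit(2.0))   # settles to the fixed point 1 - 1/r = 0.5
print(picard_orbit(3.2))   # settles to a 2-cycle: two values alternate
```

Raising r further produces period doubling and eventually chaotic orbits, so the long-term behavior of the same one-line recurrence changes qualitatively with the parameter.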
Computable Scientists, Uncomputable World (Abstract)

José Félix Costa
1 Departamento de Matemática, Instituto Superior Técnico, Universidade Técnica de Lisboa — [email protected]
2 Centro de Matemática e Aplicações Fundamentais do Complexo Interdisciplinar, Universidade de Lisboa
3 Centro de Filosofia das Ciências da Universidade de Lisboa
(This talk is joint work of Edwin Beggs, José Félix Costa, and John V. Tucker.)

Consider the classical model of a Turing machine with an oracle. The classical oracle is a one-step external consultation device. The oracle may contain either non-computable information, or computable information provided just to speed up the computations of the Turing machine. In this talk we will consider the abstract experimenter (e.g. the experimental physicist) as a Turing machine, and the abstract experiment of measuring a physical quantity (using a specified physical apparatus) as an oracle to the Turing machine. The algorithm running in the machine abstracts the experimental method of measurement (encoding the recursive structure of experimental actions) chosen by the experimenter. It is standard to consider that to measure a real number¹ μ, e.g. the value of a physical quantity, the experimenter (now the Turing machine) should proceed by approximations. Thus, besides the value of μ, we will consider dyadic rational approximations (denoted by finite binary strings), and a procedure to measure μ proved to be universal. What do we intend to measure? It can be a distance between two points, or an electric charge in a field, or the mass of a particle, etc. Measurable numbers were first considered as a scientific enterprise by Geroch and Hartle in their famous paper [11], where they introduce the concept:

  We propose, in parallel with the notion of a computable number in mathematics, that of a measurable number in a physical theory. The question of whether there exists an algorithm for implementing a theory may then be formulated more precisely as the question of whether the measurable numbers of the theory are computable.
Measurement is a scientific activity supported by a full theory, developed since the beginning of the last century as a chapter of mathematical logic (see [9,10,12,14,4]), which is unexpectedly similar to oracle consultation but exhibits new features in complexity theory. On the other side, scientific activity seen as an algorithm running in a Turing machine is also not new in learning theory (see [13]). A concrete example now follows, in which the measurement of inertial mass is considered (see [6] for first attempts): if we project a particle of known mass

¹ Real numbers are part of the general setting in measurement theory.
towards a particle of unknown mass, then the first will be reflected if its mass is less than the unknown mass, and it is projected forward together with the particle of unknown mass if its mass is greater than the unknown mass. Using binary search we are allowed, in principle, to read bit by bit the value of the unknown mass.² But we find a novelty: if we want to read the bits of μ using such a method, then the time needed for a single experiment is

  Δt ∼ 1 / |m − μ|,

where m is the mass of the proof particle in that single experiment. This means that the time needed for a single experiment to read the bit i of the mass μ, using a proof particle of mass m of size i (number of its bits), is in the best case exponential in i. This ideal experiment tells us that, if the abstract oracle to a Turing machine is to be replaced by an abstract physical measurement, then the time needed to consult the oracle is no longer a single step of computation but a number of time steps that depends on the size of the query. Provided with such mathematical constructions, the main complexity classes involved in such computations, e.g. for the polynomial-time case, change and deserve to be studied (see [6,3]). New interesting classes emerge, namely those involved in the study of the complexity of hybrid systems and analogue-digital systems such as mirror systems and neural nets (see [8,15]). In the physical world, it is not conceivable that a proof particle of mass m can be set with infinite precision. If we consider that precision is not infinite but unbounded, i.e., as big as we need, then we can continue reading the bits of μ. The same complexity classes are defined. But suppose that we reject unbounded precision to favor the most common and realistic a priori fixed precision criterion. Then we prove that, using stochastic methods, we are still able to read the bits of μ.
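The bisection protocol and its time cost can be sketched as follows. This is a toy model, not the authors' formal machinery: the cost rule Δt ∼ 1/|m − μ| is taken from the text, while the function and parameter names are invented for the illustration.

```python
def measure_bits(mu, n_bits):
    """Binary-search measurement of an unknown mass mu in (0, 1).

    Each 'experiment' projects a proof particle of mass m at the unknown
    particle; the outcome (reflected, or carried forward) tells us whether
    m < mu, and costs physical time ~ 1/|m - mu|.
    """
    lo, hi = 0.0, 1.0
    bits, total_time = [], 0.0
    for _ in range(n_bits):
        m = (lo + hi) / 2                   # proof mass for this experiment
        total_time += 1.0 / abs(m - mu)     # Delta t ~ 1/|m - mu|
        if m < mu:                          # proof particle reflected
            bits.append(1)
            lo = m
        else:                               # proof particle carried forward
            bits.append(0)
            hi = m
    return bits, total_time

bits, t = measure_bits(1 / 3, 8)
print(bits)   # -> [0, 1, 0, 1, 0, 1, 0, 1], the binary digits of 1/3
```

As the search interval halves, m approaches μ, so the per-experiment cost 1/|m − μ| roughly doubles with each bit; the cumulative time to reach bit i grows exponentially in i, which is the point of the complexity discussion above.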
To make our claim rigorous, we say that the lack of precision in measurement does not constitute an obstacle to the reading of the bits of μ. Oracles should be regarded as information, with possible error, that takes time to consult (see [1,2,5]). However, the Turing machine imposes limitations on what is effectively accessible to physical observation. For example, not all masses are measurable: not due to the limitations of measurements in which experimental errors occur, not because quantum phenomena put a limit on measurements, but because of a more essential internal limitation of physicists conceived as Turing machines. The mathematics of computation theory does not allow the reading of bits of physical quantities beyond a certain limit: even if they could be measured with infinite precision by physics, they could not be measured, for physical-mathematical reasons.

² Note that this activity of measuring inertial mass is not that much different from the activity of measuring mass using a balance scale and a toolbox of standard weights. It is only simpler to describe from the point of view of the dynamics involved in finding the unknown value.
8
J.F. Costa
The reaction of the reader towards the gedankenexperiment of measuring mass considered above might well be one of discomfort: such devices cannot be built. But the reader should notice that this reaction is a consequence of a diffuse philosophy that considers the Turing machine an object of a different kind: both the abstract physical machine and the Turing machine are non-realizable objects. For the implementation of the Turing machine the engineer would need either unbounded space and a physical support structure, or unbounded precision in some finite interval to code the contents of the tape; each time the size of the written word on the working tape increases by one symbol, the precision needed increases as well. The experiment can be set up to some precision in the same way that the Turing machine can be implemented up to some accuracy. Note that the apparently counterintuitive theoretical gedankenexperiment of increasing accuracy towards infinite precision is part of measurement theory in physics in the context of a physical theory. In this respect, Geroch and Hartle (eminent physicists of the very large and the very small) write in [11]:

The notion “measurable” involves a mix of natural phenomena and the theory by which we describe those phenomena. Imagine that one had access to experiments in the physical world, but lacked any physical theory whatsoever. Then no number w could be shown to be measurable, for, to demonstrate experimentally that a given instruction set shows w measurable would require repeating the experiment an infinite number of times, for a succession of εs approaching zero. One could not even demonstrate that a given instruction set shows measurability of any number at all, for it could turn out that, as ε is made smaller, the resulting sequence of experimentally determined rationals simply fails to converge. It is only a theory that can guarantee otherwise.
Knowing that both objects, the Turing machine and the measurement device, are of the same ideal nature, the reader may wonder what the purpose of such an experiment is from the computational point of view. The physical experiment exhibits the character of an oracle, an external device to the Turing machine. It gives the concept of an oracle a new epistemology: the oracle is no longer an abstract entity, but an abstract physical entity; the oracle is no longer a one-step transition of the Turing machine, but a device that needs time to be consulted; the oracle is no longer a relativization mechanism, but has physical content: it can only be consulted up to some accuracy; moreover, the degrees of accuracy in the consultation of the oracle can be studied. For some, this setting can be seen as that of a computer connected to an analogue device³. As an emergent result, we are led to the conclusion that infinite precision and unbounded precision are of the same ontological nature, as the computational process has taught us for decades. Different experiments imply a conclusion similar to that of the work on neural nets in the nineties ([15]): to compute up to time t⁴, only O(f(t)) bits of the unknown are needed, where f is a function depending on the experiment at hand. As we will see, this result is more about the nature of numbers and arithmetic than about physics or neurodynamics.

³ A kind of hybrid system.
⁴ E.g., to simulate the physical system up to time t.
Computable Scientists, Uncomputable World
9
Of relevance is the objective of such a construction, a Turing machine connected with an abstract physical device (that cannot be built), from the point of view of the (physical) sciences. The idea is the same as the role of the Turing machine in computer science: to be able to establish limiting results and negative results. The limiting results are obtained in the perfect Platonic world. In the same way, limiting results about computers are not about our computers, but about the limits of any computer. The same happens with physical oracles. The experiments allow us to study limiting results on measurement. We can state, as Bohm wrote in [7] for quantum mechanics, that even if the physically significant variables actually existed with sharply defined values (as demanded by classical mechanics) we could not access them. Notice that now we state, differently, that we could not access them either with infinite precision, or with unbounded precision, or with arbitrary but fixed precision, as a consequence of the limiting character of computable functions. Finally, if reality is discrete, i.e., physical quantities exist only in quantized form, then our research program provides evidence that even if they were not, they would be perceived and experienced as if they were.
References 1. Beggs, E., Costa, J.F., Loff, B., Tucker, J.V.: Computational complexity with experiments as oracles. Proceedings of the Royal Society, Series A (Mathematical, Physical and Engineering Sciences) 464(2098), 2777–2801 (2008) 2. Beggs, E., Costa, J.F., Loff, B., Tucker, J.V.: Computational complexity with experiments as oracles II. Upper bounds. Proceedings of the Royal Society, Series A (Mathematical, Physical and Engineering Sciences) 465(2105), 1453–1465 (2009) 3. Beggs, E., Costa, J.F., Tucker, J.: Physical oracles: the Turing machine and the Wheatstone bridge. In: Studia Logica, Trends in Logic VI: The Contributions of Logic to the Foundations of Physics, 22 p. (to appear 2010) 4. Beggs, E., Costa, J.F., Tucker, J.V.: Computational Models of Measurement and Hempel’s Axiomatization. In: Carsetti, A. (ed.) Causality, Meaningful Complexity and Knowledge Construction. Theory and Decision Library A, vol. 46, pp. 155–184. Springer, Heidelberg (2009) 5. Beggs, E., Costa, J.F., Tucker, J.V.: Physical experiments as oracles. Bulletin of the European Association for Theoretical Computer Science 97, 137–151 (2009); An invited paper for the Natural Computing Column 6. Beggs, E., Costa, J.F., Tucker, J.V.: Limits to measurement in experiments governed by algorithms, 33 p. (submitted 2010) 7. Bohm, D.: Wholeness and the Implicate Order. Routledge, New York (1996) 8. Bournez, O., Cosnard, M.: On the computational power of dynamical systems and hybrid systems. Theoretical Computer Science 168(2), 417–459 (1996) 9. Campbell, N.R.: Foundations of Science, The Philosophy of Theory and Experiment. Dover, New York (1957) 10. Carnap, R.: Philosophical Foundations of Physics. Basic Books, New York (1966) 11. Geroch, R., Hartle, J.B.: Computability and Physical Theories. Foundations of Physics 16(6), 533–550 (1986)
12. Hempel, C.G.: Fundamentals of concept formation in empirical science. International Encyclopedia of Unified Science 2(7) (1952)
13. Jain, S., Osherson, D., Royer, J.S., Sharma, A.: Systems That Learn: An Introduction to Learning Theory. The MIT Press, Cambridge (1999)
14. Krantz, D.H., Suppes, P., Luce, R.D., Tversky, A.: Foundations of Measurement. Dover, New York (2009)
15. Siegelmann, H.T.: Neural Networks and Analog Computation: Beyond the Turing Limit. Birkhäuser, Basel (1999)
Finite State Transducers with Intuition

Ruben Agadzanyan and Rūsiņš Freivalds

Institute of Mathematics and Computer Science, University of Latvia, Raiņa bulvāris 29, Riga, LV-1459, Latvia
Abstract. Finite automata that take advice have been studied from the point of view of the amount of advice needed to recognize nonregular languages. It turns out that there can be at least two different types of advice. In this paper we concentrate on cases where the given advice contains zero information about the input word and the language to be recognized. Nonetheless some nonregular languages can be recognized in this way. The help-word is merely a sufficiently long word with nearly maximum Kolmogorov complexity. Moreover, any sufficiently long word with nearly maximum Kolmogorov complexity can serve as a help-word. Finite automata with such help can recognize languages not recognizable by nondeterministic or probabilistic automata. We hope that mechanisms like the one considered in this paper may be useful for constructing a mathematical model for human intuition.
1 Introduction
The main difficulty in all attempts to construct a mathematical model for human intuition has been the failure to explain the nature of the “outside help from nowhere” that characterizes our understanding of this (maybe nonexistent) phenomenon. If this help contains a piece of information about my problem to be solved, then it is unbelievable that somebody sends such help to me. If this help does not contain information, then why can this message help me? We propose one possible mechanism of such help, based on finite automata that take advice and other nonconstructive methods of computation. The help indeed contains zero information about the problem, but nonetheless it helps. Moreover, it turns out that this mechanism is different from probabilistic, nondeterministic and quantum computation. The use of nonconstructive methods of proof in mathematics has a long and dramatic history. In 1888 a young German mathematician, David Hilbert, presented to his colleagues three short papers on invariant theory. Invariant theory was the highly esteemed achievement of Paul Gordan, who had produced highly complicated constructive proofs but left several important open problems. The young David Hilbert solved all these problems and did much, much more. Paul Gordan was furious. He was not ready to accept the new solutions
The research was supported by Grant No. 09.1570 from the Latvian Council of Science and by Project 2009/0216/1DP/1.1.2.1.2/09/IPIA/VIA/004 from the European Social Fund.
C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 11–20, 2010.
© Springer-Verlag Berlin Heidelberg 2010
because they provided no explicit constructions. Hilbert merely proved that the solutions cannot fail to exist. Gordan refused to accept this as mathematics. He even used the term “theology” and categorically objected to the publication of these papers. Nonetheless the papers were published, first in Göttingen Nachrichten and later, in final form, in [15]. In the nineteen-forties the situation, however, changed. In spite of all philosophical battles the nonconstructive methods found their way even into discrete mathematics. This was particularly surprising because here all the objects were finite, and it seemed that no kind of distinction between actual infinity and potential infinity could influence these proofs, while most of the discussions between intuitionists and classicists were around these notions. Paul Erdős produced many nice nonconstructive proofs, the first paper of this kind being [7]. Many such proofs are considered in a survey paper by Joel Spencer [24] and a recent monograph by Noga Alon and Joel H. Spencer [2]. R. Karp and R. Lipton introduced in [16] the notion of a Turing machine that takes advice, which is in fact the use of nonconstructive help from outside in a process of computation. Later C. Damm and M. Holzer [5] adapted this notion of advice for finite automata. The adaptation was performed in the most straightforward way (which is quite natural) and was later extensively used by T. Yamakami and his coauthors [27,21,26]. Another version of the notion of a finite automaton that takes advice was introduced in [13] under the name nonconstructive finite automaton. These notions are equivalent for large amounts of nonconstructivity (or large amounts of advice) but, for the notion introduced in [5], languages recognizable with polynomial advice are the same as the languages recognizable with constant advice.
The notion of the amount of nonconstructivity in [13] is such that the most interesting results concern the smallest possible amounts of nonconstructivity. There was a similar situation in the nineteen-sixties with the space complexity of Turing machines. At first, space complexity was considered for one-tape off-line Turing machines, and it turned out that space complexity is never less than linear. However, it is difficult to prove such lower bounds. Then the seminal paper by R.E. Stearns, J. Hartmanis and P.M. Lewis [25] was published, and many-tape Turing machines became the standard tool to study sublinear space complexity.
2 Old Definitions
The essence of nonconstructive methods is as follows. An algorithm is presented in a situation where (seemingly) no algorithm is possible. However, this algorithm has an additional input where a special help is fed in. If this help is correct, the algorithm works correctly. On the other hand, this help on the additional input does not just provide the answer. There still remains much work for the algorithm. Is this nonconstructivism merely a version of nondeterminism? Not at all. Nondeterministic finite automata (both with 1-way and 2-way inputs) recognize only regular languages while nonconstructive finite automata (as defined in
[13,5]) can recognize some nonregular and even nonrecursive languages. We will see below that this notion is also different from probabilistic finite automata.

Definition 1. We say that an automaton A recognizes the language L nonconstructively if the automaton A has an input tape where a word x is read and an additional input tape for nonconstructive help y with the following property. For an arbitrary natural number n there is a word y such that for all words x whose length does not exceed n the automaton A on the pair (x, y) produces the result 1 if x ∈ L, and A produces the result 0 if x ∉ L. Technically, the word y can be a tuple of several words and may be placed on separate additional input tapes.

Definition 2. We say that an automaton A recognizes the language L nonconstructively with nonconstructivity d(n) if the automaton A has an input tape where a word x is read and an additional input tape for nonconstructive help y with the following property. For an arbitrary natural number n there is a word y of length not exceeding d(n) such that for all words x whose length does not exceed n the automaton A on the pair (x, y) produces the result 1 if x ∈ L, and A produces the result 0 if x ∉ L. Technically, the word y can be a tuple of several words and may be placed on separate additional input tapes. In this case, d(n) is the upper bound for the total of the lengths of these words.

The automaton A in these definitions can be a finite automaton, a Turing machine or any other type of automaton or machine. In this paper we restrict ourselves to deterministic finite automata with 2-way behavior on each of the tapes. It turns out that for some languages the nonconstructive help can bring zero information about the input word's being or not being in the language considered. In this paper we try to understand how much help can be provided by sending merely a random sequence of bits to a finite automaton.
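The quantifier order in Definition 2 (one help-word y per length bound n, serving every shorter input) is easy to mimic in a toy simulation. The sketch below illustrates only the definition, not any automaton construction from the paper: for a unary language L, advice of length d(n) = n, namely the characteristic string of L up to n, decides membership, even when L itself is undecidable. All function names here are ours.

```python
def advice_for(L_membership, n):
    # Help-word for all input lengths up to n: the characteristic string
    # of L, where y[k-1] == '1' iff the unary word 1^k is in L.
    # (Nonconstructive: no algorithm for L_membership is needed to
    # assert that such a y *exists*.)
    return ''.join('1' if L_membership(k) else '0' for k in range(1, n + 1))

def automaton(x, y):
    # A trivial "machine": measure |x|, then look up position |x| of the
    # help-word and copy that bit as the answer.
    return y[len(x) - 1] == '1'

# Any unary language works, including undecidable ones; here a toy stand-in:
L = lambda k: k in {1, 4, 9, 16, 25}     # 1^k in L iff k is a perfect square
n = 30
y = advice_for(L, n)
assert all(automaton('1' * k, y) == L(k) for k in range(1, n + 1))
```

This is the "first type" of advice: y clearly carries information about L. The paper's concern is the opposite extreme, where y carries zero information.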
Is it equivalent to the automaton being a probabilistic automaton? (Theorem 1 below shows that it is not.) Martin-Löf's original definition of a random sequence was in terms of constructive null covers; he defined a sequence to be random if it is not contained in any such cover. Leonid Levin and Claus-Peter Schnorr proved a characterization in terms of Kolmogorov complexity: a sequence is random if there is a uniform bound on the compressibility of its initial segments. An infinite sequence S is Martin-Löf random if and only if there is a constant c such that all of S's finite prefixes are c-incompressible. Schnorr gave a third equivalent definition in terms of martingales (a type of betting strategy). M. Li and P. Vitányi's book [20] is an excellent introduction to these ideas.
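Kolmogorov complexity itself is incomputable, but any real compressor yields an upper bound on it, which is enough to see the dichotomy the Levin–Schnorr characterization rests on. A small Python illustration (zlib is only a crude proxy for K, used here purely for intuition):

```python
import os
import zlib

def compressed_size(s: bytes) -> int:
    # An upper bound related to Kolmogorov complexity: whatever length
    # zlib achieves, a shortest description of s can only be shorter.
    return len(zlib.compress(s, 9))

structured = b'01' * 4096          # highly regular: far from random
random_ish = os.urandom(8192)      # incompressible with overwhelming probability

assert compressed_size(structured) < 100     # collapses to a tiny description
assert compressed_size(random_ish) > 8000    # stays essentially full-length
```

A help-word of "nearly maximum Kolmogorov complexity" behaves like the second string: no compressor, and hence no finite automaton, can extract problem-specific structure from it.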
3 Results
C. Dwork and L. Stockmeyer prove in [6] a theorem useful for us:

Theorem A. [6] Let L ⊆ Σ*. Suppose there is an infinite set I of positive integers and, for each m ∈ I, an integer N(m) and sets Wm = {w1, w2, ..., wN(m)}, Um = {u1, u2, ..., uN(m)} and Vm = {v1, v2, ..., vN(m)} of words such that
1. |w| ≤ m for all w ∈ Wm,
2. for every integer k there is an mk such that N(m) ≥ mk for all m ∈ I with m ≥ mk, and
3. for all 1 ≤ i, j ≤ N(m), uj wi vj ∈ L iff i = j.

Then L ∉ AM(2pfa). We use this result to prove

Theorem 1. [14] (1) The language L = {x2x | x ∈ {0,1}*} cannot be recognized with a bounded error by a probabilistic 2-way finite automaton. (2) The language L = {x2x | x ∈ {0,1}*} can be recognized by a deterministic non-writing 2-tape finite automaton one tape of which contains the input word, and the other tape contains an infinite Martin-Löf random sequence; the automaton is 2-way on every tape, and it stops, producing the correct result in a finite number of steps, for an arbitrary input word.

Proof. (1) Let m be an arbitrary integer. For arbitrary i ∈ {0, 1, 2, ..., 2^m − 1} we define the word xi(m) as word number i in the lexicographical ordering of all binary words of length m. We define the words ui, wi, vi in our usage of Theorem A as (ε, xi(m), 2xi(m)), where ε is the empty word.

(2) Let the input word be x(r)2z(s), where r and s are the lengths of the corresponding words. At first, the 2-tape automaton finds a fragment 01111··· which has length at least r and uses it as a counter to test whether r = s. Then the automaton searches for another help-word. If the help-word turns out to be y, then the automaton tests whether x(r) = y and whether z(s) = y.

The definition used in the second item of Theorem 1 is our first (but not final) attempt to formalize the main idea of the notion of help from outside bringing zero information about the problem to be solved. Unfortunately, this definition allows something that was not intended. Such automata can easily simulate a counter, and 2-way automata with a counter, of course, can recognize nonregular languages. On the other hand, the language L in our Theorem 1 cannot be recognized by a finite automaton with one counter.
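The part-(2) strategy, comparing both halves of the input against one and the same fragment of the random tape rather than against each other, can be mimicked directly. Below is a toy Python simulation; a pseudorandom string stands in for the Martin-Löf random sequence, and Python's random-access string search stands in for the 2-way head movements, so this illustrates only the idea, not the finite-automaton construction:

```python
import random

def recognize(w, tape):
    """Accept w iff w = x + '2' + z with x = z, without ever comparing
    x and z to each other: each half is compared to the same tape fragment y."""
    if w.count('2') != 1:
        return False
    x, z = w.split('2')
    if len(x) != len(z):              # the paper does this via a 0111...1 counter fragment
        return False
    i = tape.find('0' + x + '0')      # locate a delimited help-word y with y = x
    if i == -1:
        return False                  # cannot happen on a truly random (hence rich) tape
    y = tape[i + 1 : i + 1 + len(z)]
    return x == y and z == y          # two separate comparisons against y

rng = random.Random(1)
tape = ''.join(rng.choice('01') for _ in range(1 << 20))
assert recognize('0110' + '2' + '0110', tape)
assert not recognize('0110' + '2' + '0111', tape)
```

Since every finite word occurs in a Martin-Löf random sequence, the search for the delimited copy of x always succeeds on the idealized tape; the finite automaton only ever compares the input against that one fragment.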
Hence we try to present a more complicated definition of help from outside bringing zero information, to avoid the possibility of simulating a counter.

Definition 3. A bi-infinite sequence of bits is a sequence {ai} where i ∈ (−∞, ∞) and all ai ∈ {0, 1}.

Definition 4. We say that a bi-infinite sequence of bits is Martin-Löf random if for arbitrary i ∈ (−∞, ∞) the sequence {bn} where bn = ai+n for all n ∈ N is Martin-Löf random, and the sequence {cn} where cn = ai−n for all n ∈ N is Martin-Löf random.

Definition 5. A deterministic finite automaton with intuition is a deterministic non-writing 2-tape finite automaton one tape of which contains the input
word, and the other tape contains a bi-infinite Martin-Löf random sequence; the automaton is 2-way on every tape, and it stops, producing the correct result in a finite number of steps, for an arbitrary input word. Additionally, it is demanded that the head of the automaton never goes beyond the markers showing the beginning and the end of the input word.

Definition 6. A deterministic finite-state transducer with intuition is a deterministic 3-tape finite automaton one non-writing tape of which contains the input word, and another non-writing tape of which contains a bi-infinite Martin-Löf random sequence; the automaton is 2-way on these tapes, and has a third writing, non-reading, one-way tape for output. The automaton produces the correct result in a finite number of steps for all input words. Additionally, it is demanded that the head of the automaton never goes beyond the markers showing the beginning and the end of the input word.

Nondeterministic, probabilistic, alternating, etc. automata with intuition differ from deterministic ones only in the nature of the automata but not in the usage of tapes or Martin-Löf random sequences.

Definition 7. We say that a language L is recognizable by a deterministic finite automaton A with intuition if A, for an arbitrary bi-infinite Martin-Löf random sequence, accepts every input word x ∈ L and rejects every input word x ∉ L.

Definition 8. We say that a language L is enumerable by a deterministic finite automaton A with intuition if A, for an arbitrary bi-infinite Martin-Löf random sequence, accepts every input word x ∈ L and does not accept any input word x ∉ L.

Definition 9. A deterministic finite automaton with intuition on unbounded input is a deterministic non-writing 2-tape finite automaton one tape of which contains the input word, and the other tape contains a bi-infinite Martin-Löf random sequence; the automaton is 2-way on every tape, and it produces the correct result in a finite number of steps for an arbitrary input word.
It is not demanded that the head of the automaton always remains between the markers showing the beginning and the end of the input word.

Recognition and enumeration of languages by deterministic finite automata with intuition on unbounded input is not particularly interesting because of the following two theorems.

Theorem 2. A language L is enumerable by a deterministic finite automaton with intuition on unbounded input if and only if it is recursively enumerable.

Proof. J. Bārzdiņš [4] proved that an arbitrary one-tape deterministic Turing machine can be simulated by a 2-way finite deterministic automaton with 3 counters directly, and by a 2-way finite deterministic automaton with 2 counters using a simple coding of the input word. (Later essentially the same result was re-discovered by other authors.) Hence there exists a 2-way finite deterministic automaton with 3 counters accepting every word in L and only words in L.
Let x be an arbitrary word in L. To describe the processing of x by the 3-counter automaton we denote the content of counter i (i ∈ {1, 2, 3}) at moment t by d(i, t). The word

0000 0101^{d(1,0)} 0101^{d(2,0)} 0101^{d(3,0)} 00 0101^{d(1,1)} 0101^{d(2,1)} 0101^{d(3,1)} 00 ··· 00 0101^{d(1,s)} 0101^{d(2,s)} 0101^{d(3,s)} 0000,

where s is the halting moment, is a complete description of the processing of x by the automaton. Our automaton with intuition tries to find a fragment of the bi-infinite Martin-Löf random sequence on the help-tape such that:

1. it starts and ends with 0000,
2. the initial fragment 0000 0101^{d(1,0)} 0101^{d(2,0)} 0101^{d(3,0)} is exactly 0000010010010 (i.e., all 3 counters are empty),
3. for arbitrary t the fragment 0101^{d(1,t)} 0101^{d(2,t)} 0101^{d(3,t)} 00 0101^{d(1,t+1)} 0101^{d(2,t+1)} 0101^{d(3,t+1)} corresponds to a legal instruction of the automaton with the counters.

Since the bi-infinite sequence is Martin-Löf random, such a fragment definitely occurs in the sequence infinitely many times. The correctness of the fragment can be tested using the 3 auxiliary constructions below.

Construction 1. Assume that wk ∈ {0, 1}* and wm ∈ {0, 1}* are two subwords of the input word x such that:

1. they are immediately preceded and immediately followed by symbols other than {0, 1},
2. a deterministic finite 1-tape 2-way automaton has no difficulty moving from wk to wm and back, clearly identifying these subwords.

Then there is a deterministic finite automaton with intuition recognizing whether or not wk = wm.

Proof. As in Theorem 1.
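The encoding of a computation history and the adjacency check are easy to prototype. In the Python sketch below, the instruction set and the per-step validity test are simplified stand-ins (the actual automaton verifies legality using the three constructions, and each step must also name the instruction applied); the encoding itself follows the 0101^{d(i,t)} pattern above:

```python
def encode_config(c1, c2, c3):
    # One configuration: token 0101^d per counter (d = counter value),
    # i.e. '010' followed by d ones, then the separator 00.
    return ''.join('010' + '1' * d for d in (c1, c2, c3)) + '00'

def encode_run(configs):
    # The complete history word: 0000, the configuration blocks, final 00.
    return '0000' + ''.join(encode_config(*c) for c in configs) + '00'

def legal_step(a, b, deltas):
    # Adjacent configurations must differ by one of the machine's instructions.
    return tuple(y - x for x, y in zip(a, b)) in deltas

# Toy 2-instruction machine (our invention, not from the paper):
# INC counter 1, or move one unit from counter 1 to counter 2.
deltas = {(1, 0, 0), (-1, 1, 0)}
run = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (1, 1, 0), (0, 2, 0)]
assert all(legal_step(a, b, deltas) for a, b in zip(run, run[1:]))

word = encode_run(run)
assert word.startswith('0000010010010')   # empty-counters prefix, as in the proof
```

The automaton with intuition never builds such a word; it merely recognizes a fragment of this shape on the random tape, which is guaranteed to occur somewhere.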
Construction 2. Assume that 1^k and 1^m are two subwords of the help-word y such that:
1. they are immediately preceded and immediately followed by symbols other than {0, 1},
Finite State Transducers with Intuition
17
2. a deterministic finite 1-tape 2-way automaton has no difficulty moving from 1^k to 1^m and back, clearly identifying these subwords,
3. both k and m are integers not exceeding the length of the input word.

Then there is a deterministic finite automaton with intuition recognizing whether or not k = m.

Proof. Similar to the proof of Construction 1.

Construction 3. Assume that 1^{k1}, 1^{k2}, ..., 1^{ks} and 1^{m1}, 1^{m2}, ..., 1^{mt} are subwords of the help-word y such that:

1. they are immediately preceded and immediately followed by symbols other than 1,
2. a deterministic finite 1-tape 2-way automaton has no difficulty moving from one subword to another and back, clearly identifying these subwords,
3. both k1 + k2 + ··· + ks and m1 + m2 + ··· + mt are integers not exceeding the length of the input word.

Then there is a deterministic finite automaton with intuition recognizing whether or not k1 + k2 + ··· + ks = m1 + m2 + ··· + mt.

Proof. Similar to the proof of Construction 2.
Corollary of Theorem 2. A language L is recognizable by a deterministic finite automaton with intuition on unbounded input if and only if it is recursive.

Theorem 2 and its corollary show that the standard definition of the automaton with intuition should exclude the possibility of using the input tape outside the markers. However, even our standard definition allows recognition and enumeration of nontrivial languages. The proof of Theorem 1 can be easily modified to prove

Theorem 3. [14] 1. The language L = {x2x | x ∈ {0,1}*} cannot be recognized with a bounded error by a probabilistic 2-way finite automaton. 2. The language L = {x2x | x ∈ {0,1}*} can be recognized by a deterministic finite automaton with intuition.

Theorem 4. [14] The unary language PERFECT SQUARES = {1^n | (∃m)(n = m²)} can be recognized by a deterministic finite automaton with intuition.

Proof. It is well known that 1 + 3 + 5 + ··· + (2n − 1) = n². The deterministic automaton with intuition searches for a help-word (a fragment of the given bi-infinite Martin-Löf random sequence) of the form 00 1 0 1^3 0 1^5 0 ··· 0 1^{2n−1} 00.
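The counting identity behind this help-word can be checked mechanically. A small Python sanity check (the function name and the exact delimiter conventions are ours, approximating the paper's):

```python
def square_help_word(n):
    # 00 1 0 1^3 0 1^5 ... 0 1^(2n-1) 00 : the k-th block carries 2k-1 ones.
    return '00' + ''.join('1' * (2 * k - 1) + '0' for k in range(1, n + 1)) + '0'

for n in range(1, 20):
    w = square_help_word(n)
    assert w.count('1') == n * n          # 1 + 3 + ... + (2n-1) = n^2
    # Consecutive 1-blocks grow by exactly 2: the local property
    # the automaton can verify with a finite head.
    blocks = [b for b in w.split('0') if b]
    assert [len(b) for b in blocks] == [2 * k - 1 for k in range(1, n + 1)]
```

The key point is that each condition the automaton must check (adjacent blocks differ by 2, total 1-count equals the input length) is local or reducible to the constructions above, so no counter is needed.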
At first, the input word is used as a counter to test whether each substring of 1's is exactly 2 symbols longer than the preceding one. Then the help-word is used to test whether the length of the input word coincides with the number of 1's in the help-word.

The proof of this theorem can be modified to prove that square roots can be extracted by deterministic finite-state transducers with intuition.

Theorem 5. The relation SQUARE ROOTS = {(1^n, 1^m) | n = m²} can be computed by a deterministic finite-state transducer with intuition.

Proof. The output can easily be obtained from the help-word in the preceding proof.

Theorem 6. The relation CUBE ROOTS = {(1^n, 1^m) | n = m³} can be computed by a deterministic finite-state transducer with intuition.

Proof. In a similar manner, the formula 1 + 3(n − 1) + 3(n − 1)² = n³ − (n − 1)³ suggests a help-word

000[1]00[101110111]00[101111110111111111111]00 ··· 00[1 0 1^{3(n−1)} 0 1^{3(n−1)²}]000,

where the symbols [ and ] are invisible. At first, the input word is used as a counter to test whether the help-word is correct, but not whether its length is sufficient. Then the help-word is used to test whether the length of the input word coincides with the number of 1's in the help-word.

Theorem 7. The relation

FACTORISATION = {(1^n, 1^m) | (m divides n ∧ m ≠ 1) ∨ (m = 1 if n is prime)}
can be computed by a deterministic finite-state transducer with intuition.

We define a relation UNARY 3-SATISFIABILITY as follows. The term term1 = xk is coded as [term1] being 21^k, the term term2 = ¬xk is coded as [term2] being 31^k, and the subformula f being (term1 ∨ term2 ∨ term3) is coded as [f] being [term1] ∨ [term2] ∨ [term3]. The CNF f1 ∧ f2 ∧ ··· ∧ fm is coded as [f1] ∧ [f2] ∧ ··· ∧ [fm]. The string of values x1 = a1, x2 = a2, ..., xn = an is coded as a1a2···an. The relation UNARY 3-SATISFIABILITY consists of all pairs (CNF, a1a2···an) such that the given CNF with these values of the arguments takes the value TRUE.

Theorem 8. The relation UNARY 3-SATISFIABILITY can be computed by a deterministic finite-state transducer with intuition.
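The coding and its evaluation semantics can be made concrete in a few lines of Python (with ASCII v and ^ standing in for ∨ and ∧; this only checks the coding, the transducer of Theorem 8 is of course a finite-state device):

```python
def encode_term(lit):
    # literal +k (x_k) -> '2' + '1'*k ; literal -k (not x_k) -> '3' + '1'*k
    return ('2' if lit > 0 else '3') + '1' * abs(lit)

def encode_cnf(clauses):
    # clause (t1 v t2 v t3) -> [t1]v[t2]v[t3]; the CNF joins clauses with '^'
    return '^'.join('v'.join(encode_term(l) for l in c) for c in clauses)

def evaluate(coded_cnf, assignment):
    # assignment: the string a1 a2 ... an of truth values
    def term_true(t):
        k = len(t) - 1                       # unary index of the variable
        val = assignment[k - 1] == '1'
        return val if t[0] == '2' else not val
    return all(any(term_true(t) for t in clause.split('v'))
               for clause in coded_cnf.split('^'))

cnf = [(1, -2, 3), (-1, 2, 3)]      # (x1 v not x2 v x3) ^ (not x1 v x2 v x3)
coded = encode_cnf(cnf)
assert evaluate(coded, '110')       # x1 = x2 = 1, x3 = 0 satisfies both clauses
assert not evaluate(coded, '100')
```

Everything the evaluator does, matching a unary index 1^k against position k of the assignment string, is exactly the kind of comparison Constructions 1-3 provide to a finite automaton.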
References 1. Ablayev, F.M., Freivalds, R.: Why Sometimes Probabilistic Algorithms Can Be More Effective. LNCS, vol. 233, pp. 1–14. Springer, Heidelberg (1986) 2. Alon, N., Spencer, J.H.: The Probabilistic Method. John Wiley & Sons, Chichester (2000) 3. Bach, E., Shallit, J.: Algorithmic Number Theory, vol. 1. MIT Press, Cambridge (1996) 4. B¯ arzdi¸ nˇs, J. (Barzdin, J.M.): On a Class of Turing Machines (Minsky Machines). Algebra i Logika 3(1) (1963) (Russian); Review in The Journal of Symbolic Logic 32(4), 523–524 (1967) 5. Damm, C., Holzer, M.: Automata that take advice. In: H´ ajek, P., Wiedermann, J. (eds.) MFCS 1995. LNCS, vol. 969, pp. 565–613. Springer, Heidelberg (1995) 6. Dwork, C., Stockmeyer, L.: Finite state verifiers I: the power of interaction. Journal of the Association for Computing Machinery 39(4), 800–828 (1992) 7. Erd¨ os, P.: Some remarks on the theory of graphs. Bulletin of the American Mathematical Society 53(4), 292–294 (1947) 8. Fagin, R.: Generalized First-Order Spectra and Polynomial-Time Recognizable Sets. In: Karp, R. (ed.) SIAM-AMS Proceedings of Complexity of Computation, vol. 7, pp. 27–41 (1974) 9. Freivalds, R. (Freivald, R.V.): Recognition of languages with high probability on different classes of automata. Dolady Akademii Nauk SSSR 239(1), 60–62 (1978) (Russian) 10. Freivalds, R.: Projections of Languages Recognizable by Probabilistic and Alternating Finite Multitape Automata. Information Processing Letters 13(4/5), 195–198 (1981) 11. Freivalds, R.: Complexity of Probabilistic Versus Deterministic Automata. In: Barzdins, J., Bjorner, D. (eds.) Baltic Computer Science. LNCS, vol. 502, pp. 565–613. Springer, Heidelberg (1991) 12. Freivalds, R.: Non-Constructive Methods for Finite Probabilistic Automata. International Journal of Foundations of Computer Science 19(3), 565–580 (2008) 13. Freivalds, R.: Amount of nonconstructivity in finite automata. In: Maneth, S. (ed.) CIAA 2009. LNCS, vol. 5642, pp. 227–236. 
Springer, Heidelberg (2009)
14. Freivalds, R.: Multiple usage of random bits by finite automata. Unpublished manuscript (2010)
15. Hilbert, D.: Über die Theorie der algebraischen Formen. Mathematische Annalen 36, 473–534 (1890)
16. Karp, R.M., Lipton, R.: Turing machines that take advice. L'Enseignement Mathématique 28, 191–209 (1982)
17. Kolmogorov, A.N.: Three approaches to the quantitative definition of information. Problems in Information Transmission 1, 1–7 (1965)
18. Levin, L.A.: On the notion of a random sequence. Soviet Mathematics Doklady 14, 1413–1416 (1973)
19. Martin-Löf, P.: The definition of random sequences. Information and Control 9(6), 602–619 (1966)
20. Li, M., Vitányi, P.M.B.: An Introduction to Kolmogorov Complexity and its Applications, 2nd edn. Springer, Heidelberg (1997)
21. Nishimura, H., Yamakami, T.: Polynomial time quantum computation with advice. Information Processing Letters 90(4), 195–204 (2004)
22. Schnorr, C.-P.: A unified approach to the definition of random sequences. Mathematical Systems Theory 5(3), 246–258 (1971)
23. Schnorr, C.-P.: Process Complexity and Effective Random Tests. Journal of Computer and System Sciences 7(4), 376–388 (1973)
24. Spencer, J.: Nonconstructive methods in discrete mathematics. In: Rota, G.-C. (ed.) Studies in Mathematics, MAA, vol. 17, pp. 142–178 (1978)
25. Stearns, R.E., Hartmanis, J., Lewis II, P.M.: Hierarchies of memory limited computations. In: Proceedings of FOCS, pp. 179–190 (1965)
26. Tadaki, K., Yamakami, T., Lin, J.C.H.: Theory of One Tape Linear Time Turing Machines. In: van Emde Boas, P., Pokorný, J., Bieliková, M., Štuller, J. (eds.) SOFSEM 2004. LNCS, vol. 2932, pp. 335–348. Springer, Heidelberg (2004)
27. Yamakami, T.: Swapping lemmas for regular and context-free languages with advice. The Computing Research Repository (CoRR), abs/0808.4122 (2008)
Reversibility and Determinism in Sequential Multiset Rewriting

Artiom Alhazov¹,², Rudolf Freund³, and Kenichi Morita¹

¹ IEC, Department of Information Engineering, Graduate School of Engineering, Hiroshima University, Higashi-Hiroshima 739-8527, Japan
[email protected]
² Institute of Mathematics and Computer Science, Academy of Sciences of Moldova, Academiei 5, Chişinău MD-2028, Moldova
[email protected]
³ Faculty of Informatics, Vienna University of Technology, Favoritenstr. 9, 1040 Vienna, Austria
[email protected]
Abstract. We study reversibility and determinism aspects of sequential multiset processing systems, as well as the strong versions of these properties. Syntactic criteria are established for both strong determinism and strong reversibility. It is also shown that without control all four classes – deterministic, strongly deterministic, reversible, strongly reversible – are not universal, while with priorities or inhibitors the first and the third classes become universal. Moreover, strongly deterministic multiset rewriting systems with priorities are also universal.
1 Introduction
The object of study of this paper is the computational model called (sequential) multiset processing. It can be interpreted as scattered context grammars, in the sense that the positions of symbols are irrelevant. The two further ingredients we can use are inhibitors (also known as forbidding context) and priorities; we say that systems are without control if none of these ingredients is used. The reader can find the definition of these grammars and these controls, e.g., in [5]. Multiset processing is also closely related to membrane systems, see [16], working within only one membrane in a sequential way (and not with maximal parallelism). If a fixed enumeration of the elements of the alphabet is assumed, then multisets are isomorphic to vectors. In that sense, multiset processing corresponds to vector addition systems (see, e.g., [7]). Alternatively, adding and removing symbols can be viewed as incrementing and decrementing counters, i.e., vector addition systems may be viewed as a variant of stateless counter machines (as multitape non-writing Turing machines), where for every instruction it is specified for each counter which integer is to be added to it, not restricted to −1, 0 or 1. Such a variant is also equivalent to multiset processing systems (in this case, testing for zero corresponds to using inhibitors).

C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 21–31, 2010. © Springer-Verlag Berlin Heidelberg 2010
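The counter-machine view sketched above can be made concrete. The following minimal Python sketch (the alphabet and helper names are our own assumptions, not notation from the paper) treats a rule u → v as a guarded counter update; note that the applicability test w ≥ u is exactly what a pure addition vector v − u would lose, while a genuine zero test corresponds to using inhibitors, as noted in the text.

```python
from collections import Counter

# A sketch of multiset rewriting as a stateless counter machine.
ALPHABET = ("a", "b", "c")  # an assumed fixed enumeration of O

def to_vec(w):
    """Map a multiset, written as a string, to its vector of multiplicities."""
    c = Counter(w)
    return tuple(c[x] for x in ALPHABET)

def step(w, u, v):
    """Apply rule u -> v to configuration w (all vectors), or return None.

    Applicability requires w >= u componentwise; the result is w - u + v."""
    if all(wi >= ui for wi, ui in zip(w, u)):
        return tuple(wi - ui + vi for wi, ui, vi in zip(w, u, v))
    return None

w = to_vec("aab")                         # (2, 1, 0)
w = step(w, to_vec("ab"), to_vec("ac"))   # rule ab -> ac
print(w)                                  # (2, 0, 1)
print(step(w, to_vec("b"), to_vec("c")))  # None: no b left
```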
The aim of this paper is to consider such properties of multiset rewriting as reversibility, determinism and their strong versions. Reversibility is an important property of computational systems. It has been well studied for circuits of logical elements ([6]), circuits of memory elements ([12]), cellular automata ([13]), Turing machines ([3], [15]) and register machines ([11]). Reversibility as a syntactical property is closely related to microscopic physical reversibility, and thus it promises better miniaturization possibilities for potential implementations. Moreover, reversibility essentially is backward determinism. Reversible P systems have been considered in [9], in the energy-based model, simulating the Fredkin gate and thus reversible circuits. The maximally parallel case was considered in [2]; such systems are universal with priorities or inhibitors, and it follows that reversibility is undecidable. It is interesting that the description of some computational systems includes the initial configuration (e.g., membrane systems), while this is not the case for many others (cellular automata, Turing machines). In [2] one considers parallel multiset rewriting and defines strong reversibility by extending the reversibility requirement such that it does not depend on the initial configuration; a characterization of strongly reversible systems without control is given. It has been shown in [8] that strong reversibility is decidable for parallel multiset rewriting. Also in [2] one defines strong determinism, gives a syntactic criterion for it and shows that the power of strongly deterministic systems is weaker than that of deterministic systems. It is natural to ask similar questions for sequential multiset rewriting systems.
2 Definitions
Consider a finite alphabet O. A multiset over O is a mapping M : O → N. In this paper we represent a multiset M by any string w ∈ O∗ such that |w|a = M(a) for a ∈ O, keeping in mind that the order of symbols is not important. We apply the notations of set inclusion and set difference to multisets, meaning ≤ and max(difference, 0) for every symbol, respectively. A multiset rewriting rule is given by r : u → v, where u ∈ O+, v ∈ O∗. This rule can be applied to a multiset w if |w|a ≥ |u|a for a ∈ O, and the result is w′, denoted w ⇒ w′, where |wv|a = |w′u|a for a ∈ O. If a rule has a promoter a, we write it as u → v|a. If a rule has an inhibitor a, we write it as u → v|¬a. The priority relationship is denoted by >. We also refer to u and v as lhs(r) and rhs(r) if r : u → v (inhibited or not). Applicability of rules with additional control depends on the corresponding condition (presence of symbol a, absence of symbol a, or inapplicability of rules with higher priority, respectively). In the following we will not speak about promoters, since in the sequential case the rules u → v|a and ua → va are equivalent. The relation ⇒ is defined as the union of the relations corresponding to the application of all rules. A multiset rewriting system is a tuple (O, T, w0, R), where O is the alphabet, T is the terminal subalphabet, w0 is the starting multiset and R is a finite
set of rules. In the accepting case, T is replaced by the input subalphabet Σ, and the computation starts when an arbitrary multiset over Σ is added to w0. Consider a multiset rewriting system Π with alphabet O. A configuration is a multiset u. The space C of configurations (i.e., of multisets over O) is essentially an |O|-dimensional space with non-negative integer coordinates. The relation ⇒ induces an infinite graph on C. The halting configurations (and only these) have out-degree zero. Throughout this paper, by reachable we mean reachable from the initial configuration. We now define two properties; extending the requirement from reachable configurations to all configurations, we obtain their strong variants (in the case of accepting systems, the initial configurations are obtained by adding to a fixed multiset arbitrary multisets over a fixed subalphabet; the extension is natural).

Definition 1. We call Π strongly reversible if every configuration has in-degree at most one; Π is called reversible if every reachable configuration has in-degree at most one. We call Π strongly deterministic if every configuration has out-degree at most one; (as is common in membrane computing) Π is called deterministic if every reachable configuration has out-degree at most one.

Note: it is crucial that in-degree is the number of all preimages, not just the reachable ones; otherwise the concept of reversibility becomes trivial. The non-strong properties refer to the actual computation of the system, but the strong ones do not depend on the initial configuration. We now proceed by defining a computation as a sequence of transitions, starting in the initial configuration, and ending in some halting configuration if it is finite. The result of a halting computation is the number of terminal objects inside the system when it halts (or the number of input objects when the system starts, in the accepting case).
The set N(Π) of numbers generated by a multiset processing system Π is the set of results of all its computations. The family of number sets generated by reversible multiset rewriting systems with features α is denoted by N RM R(α)T, where α ⊆ {coo, inh, P ri} and the braces of the set notation are omitted, with coo, inh, and P ri meaning that we use cooperative rules, inhibitors, and priorities on the rules, respectively. Subscript T means that only terminal objects contribute to the result of computations; if T = O, we omit specifying it in the description and we then also omit the subscript T in the notation. In the case of accepting systems, we write Na instead of N, and subscript T has no meaning. For strongly reversible systems, we replace R by Rs. For deterministic (strongly deterministic) systems, we replace R by D (Ds, respectively). We denote the family of recursively enumerable sets of non-negative integers by N RE, and call a class of systems generating or accepting it universal. We denote the family of regular sets of non-negative integers by N REG. A linear set of non-negative integers is a set S that can be defined by numbers p0, p1, . . . , pk as S = {p0 + n1 p1 + · · · + nk pk | ni ≥ 0, 1 ≤ i ≤ k}. Linear sets are a subclass of N REG. We call a class of sets sublinear if it is a proper subclass of the linear sets.
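The rule semantics of the preceding definitions (applicability, inhibitors, priorities, and the step relation ⇒) can be sketched in Python. The encodings below (a rule as a triple, the priority relation as a list of pairs) are assumptions of this illustration, not notation from the paper.

```python
from collections import Counter

# A sketch of sequential rule application: a rule is a triple (u, v, inhibitor),
# multisets are Counters, and '>' is given as a list of pairs (r1, r2), r1 > r2.

def leq(u, w):
    """Multiset inclusion: u is a submultiset of w."""
    return all(w[a] >= n for a, n in u.items())

def enabled(w, rule):
    """lhs present and inhibitor (if any) absent."""
    u, v, inh = rule
    return leq(u, w) and (inh is None or w[inh] == 0)

def applicable(w, rule, rules, higher):
    """rule applies to w if it is enabled and no enabled rule has priority over it."""
    return enabled(w, rule) and not any(
        enabled(w, r) and (r, rule) in higher for r in rules)

def step(w, rule):
    """w => w': remove u, add v (so that |wv|_a = |w'u|_a for all a)."""
    u, v, _ = rule
    return w - u + v

r1 = (Counter("a"), Counter("ab"), None)   # a -> ab
r2 = (Counter("a"), Counter("c"), "b")     # a -> c|¬b
w = Counter("a")
print(applicable(w, r2, [r1, r2], []))     # True: no b present
w = step(w, r1)                            # a => ab
print(applicable(w, r2, [r1, r2], []))     # False: inhibited by b
```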
2.1 Register Machines
In this paper we consider register machines with increment, unconditional decrement and test instructions [11], see also [10]. An n-register machine is defined by a tuple M = (n, Q, q0, qf, I), where I is a set of instructions bijectively labeled by elements of Q, q0 ∈ Q is the initial label, and qf ∈ Q is the final label. The allowed instructions are:

– (q : i?, q′, q′′) – jump to instruction q′ if the contents of register i is zero, otherwise proceed to instruction q′′;
– (q : i+, q′, q′′) – add one to the contents of register i and proceed to either instruction q′ or q′′, non-deterministically;
– (q : i−, q′, q′′) – subtract one from the contents of register i and proceed to either instruction q′ or q′′, non-deterministically;
– (qf : halt) – terminate; it is the unique instruction with label qf.

As for subtract instructions, the computation is blocked if the contents of the corresponding register is zero. Without restricting generality, we assume that a test of a register always precedes its subtraction. (In a popular model with addition and conditional subtraction instructions, reversibility is more difficult to describe.) A configuration of a register machine is defined by the current instruction and the contents of all registers, which are non-negative integers. If q′ = q′′ for every instruction (q : i+, q′, q′′) and for every instruction (q : i−, q′, q′′), then the machine is called deterministic. Clearly, this is necessary and sufficient for the global transition (partial) mapping not to be multi-valued. A register machine is called reversible if, in the case that there is more than one instruction leading to some instruction q, then exactly two exist, they test the same register, one leads to q if the register is zero and the other one leads to q if the register is positive. It is not difficult to check that this requirement is a necessary and sufficient condition for the global transition mapping to be injective.
Let us formally state the reversibility of a register machine: for any two different instructions (q1 : i1α1, q1′, q1′′) and (q2 : i2α2, q2′, q2′′), it holds that q1′ ≠ q2′ and q1′′ ≠ q2′′. Moreover, if q1′ = q2′′ or q1′′ = q2′, then α1 = α2 = ? and i1 = i2. It has been shown in [11] that reversible register machines are universal (a straightforward simulation of, e.g., reversible Turing machines [3] would not be reversible). It follows that non-deterministic reversible register machines can generate any recursively enumerable set of non-negative integers as the value of the first register by all their possible computations starting from all registers having zero value.
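This syntactic condition admits a direct check. The sketch below uses an assumed instruction encoding (label, register, op, q′, q′′) and is only an illustration of the stated condition, not code from the paper.

```python
# A sketch of the syntactic reversibility check for register machines:
# an instruction is (label, register, op, q1, q2) with op in {"?", "+", "-"};
# the halt instruction is omitted, since it has no successor labels.

def is_reversible(instructions):
    for a in instructions:
        for b in instructions:
            if a is b:
                continue
            _, ia, opa, a1, a2 = a
            _, ib, opb, b1, b2 = b
            # No two instructions may share a same-position target.
            if a1 == b1 or a2 == b2:
                return False
            # Cross-position sharing is allowed only for tests of the same
            # register (zero branch of one, positive branch of the other).
            if (a1 == b2 or a2 == b1) and not (opa == opb == "?" and ia == ib):
                return False
    return True

# q0 tests register 1; the add instruction q1 rejoins at a fresh label q3,
# so no label has two incompatible predecessors.
M = [("q0", 1, "?", "q1", "q2"),
     ("q1", 1, "+", "q3", "q3")]
print(is_reversible(M))  # True
```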
3 Examples and Computational Power
We now present a few examples to illustrate the definitions.
Example 1. Any multiset rewriting system Π1 = (O, w0, {u → v}) with only one rule is strongly reversible: to obtain a preimage, remove v and add u. Clearly, Π1 is also strongly deterministic: there is no choice. If both w0 and v contain u, then no halting configuration is reachable; otherwise a singleton is generated (any singleton can be generated in that way). Therefore, {n} ∈ N Rs M R(coo) for n ∈ N. Moreover, the system (O, λ, {u → v}) generates the empty set, i.e., ∅ ∈ N Rs M R(coo).

Example 2. Consider a system Π2 = ({a, b, c}, a, {a → ab, a → c}). It generates the set of positive integers, since the reachable halting configurations are cb∗, and it is reversible (for the preimage, replace c with a or ab with a), but not strongly reversible (e.g., aab ⇒ cab and ca ⇒ cab). Hence, N+ ∈ N RM R(coo).

Example 3. Any multiset rewriting system containing some erasing rule u → λ is not reversible, unless other rules are never applicable.

Example 4. Any system containing rules x1 → y, x2 → y that may apply at least one of them in some computation is not reversible.

Reversible multiset rewriting systems with either inhibitors or priorities are universal. The proof method is essentially the same as in the parallel case ([2]).

Theorem 1. N RM R(coo, P ri)T = N RM R(coo, inh)T = N RE.

Proof. We reduce the theorem statement to the claim that such multiset rewriting systems simulate the work of any reversible register machine M = (n, Q, q0, qf, I). Consider a multiset rewriting system Π = (O, {r1}, q0, R), where

O = {ri | 1 ≤ i ≤ n} ∪ Q,
R = {q → q′ri, q → q′′ri | (q : i+, q′, q′′) ∈ I}
  ∪ {qri → q′, qri → q′′ | (q : i−, q′, q′′) ∈ I} ∪ Rt,
Rt = {q → q′|¬ri, qri → q′′ri | (q : i?, q′, q′′) ∈ I}.

Inhibitors can be replaced by priorities by redefining Rt as follows:

Rt = {qri → q′′ri > q → q′ | (q : i?, q′, q′′) ∈ I}.
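A small sketch of this construction (with hypothetical encodings: an instruction is (q, i, op, q′, q′′), a rule is (lhs, rhs, inhibitor), and multisets are written as strings, the order of symbols being irrelevant):

```python
# A sketch of the rule set R built in the proof of Theorem 1 (inhibitor version).

def rules_from_machine(instructions):
    R = []
    for q, i, op, q1, q2 in instructions:
        r = f"r{i}"                          # one object r_i per register i
        if op == "+":
            R.append((q, q1 + r, None))      # q -> q' r_i
            R.append((q, q2 + r, None))      # q -> q'' r_i
        elif op == "-":
            R.append((q + r, q1, None))      # q r_i -> q'
            R.append((q + r, q2, None))      # q r_i -> q''
        elif op == "?":
            R.append((q, q1, r))             # q -> q'|¬r_i    (register i empty)
            R.append((q + r, q2 + r, None))  # q r_i -> q'' r_i (non-empty)
    return R

for rule in rules_from_machine([("p", 1, "?", "s", "t")]):
    print(rule)
# ('p', 's', 'r1')
# ('pr1', 'tr1', None)
```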
There is a bijection between the configurations of Π containing one symbol from Q and the configurations of M , so reversibility of Π follows from the correctness of the simulation, the reversibility of M and because the number of symbols from Q is preserved by transitions of Π. The universality in Theorem 1 leads to the following undecidability. Corollary 1. It is undecidable whether a system from the class of multiset rewriting systems with either inhibitors or priorities is reversible.
Proof. Recall that the halting problem for register machines is undecidable. Add the rules qf → F1, qf → F2, F1 → F, F2 → F to the construction presented above (F1, F2, F are new objects); the system is now reversible if and only if no configuration containing F is reachable, i.e., if and only if the underlying register machine does not halt, which is undecidable.

Deciding strong reversibility is much easier: it is necessary and sufficient that no two different rules are reversely applicable to any configuration. Without restricting generality we only consider systems without useless rules, i.e., no rule is inhibited by some of its own reactants. Consider the case that inhibitors may be used, and let r1 : x1 → y1v, r2 : x2 → y2v be two rules of the system (possibly controlled by inhibitors), where v is the largest common submultiset of the right sides of r1, r2. Then both rules can be reversely applied to some configuration C if and only if it contains y1v and y2v and none of these two rules is inhibited. Now writing C as y1y2vw, we get the possible transitions x1y2w ⇒ y1vy2w and x2y1w ⇒ y2vy1w. The inhibitors should in particular forbid the case w = λ (from that, the general case for w follows immediately). Therefore, either r1 should be inhibited by some object from y2, or r2 should be inhibited by some object from y1. Clearly, this criterion is also sufficient, since for any two rules the competition for backward applicability is solved by the inhibitors. We have just proved the following statement.

Theorem 2. A sequential multiset rewriting system with inhibitors is strongly reversible if for any two rules r1, r2, either r1 is inhibited by rhs(r2) \ rhs(r1) or r2 is inhibited by rhs(r1) \ rhs(r2).

The case of priorities is slightly more involved. Let r1 : x1 → y1v, r2 : x2 → y2v be two rules of the system, where v is the largest common submultiset of the right sides of r1, r2.
Then both rules can be reversely applied to some configuration C if and only if it contains y1y2v and none of these two rules is made inapplicable by higher priority rules. Writing C as y1y2vw, we get the possible transitions x1y2w ⇒ y1vy2w and x2y1w ⇒ y2vy1w. The priorities should in particular forbid the case w = λ (from that, the general case for w follows immediately). Therefore, either r1 should be made inapplicable by some rule r > r1 with left hand side contained in x1y2, or r2 should be made inapplicable by some rule r > r2 with left hand side contained in x2y1. Clearly, this criterion is also sufficient, since the competition for reverse applicability between any two rules is eliminated. We have just proved the following result.

Theorem 3. A sequential multiset rewriting system with priorities is strongly reversible if for any two rules r1, r2 there exists a rule r such that either r > r1 and lhs(r) ⊆ lhs(r1)(rhs(r2) \ rhs(r1)) or r > r2 and lhs(r) ⊆ lhs(r2)(rhs(r1) \ rhs(r2)).

It is not difficult to see that the criterion of strong reversibility for systems that may use both inhibitors and priorities is obtained as a disjunction of the requirements from Theorem 2 and Theorem 3 for any two rules r1, r2.
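The criterion of Theorem 2 is directly testable. Below is a sketch under an assumed rule encoding (lhs, rhs, set of inhibiting objects); Counter subtraction computes the multiset difference rhs(r1) \ rhs(r2).

```python
from collections import Counter

# A sketch of the syntactic test of Theorem 2 (inhibitor case).

def strongly_reversible_inh(rules):
    for i, r1 in enumerate(rules):
        for r2 in rules[i + 1:]:
            diff21 = Counter(r2[1]) - Counter(r1[1])  # rhs(r2) \ rhs(r1)
            diff12 = Counter(r1[1]) - Counter(r2[1])  # rhs(r1) \ rhs(r2)
            if not (any(a in r1[2] for a in diff21) or
                    any(a in r2[2] for a in diff12)):
                return False
    return True

# a -> b|¬c and a -> c|¬b: each rule is inhibited by the other's
# distinguishing output symbol, so no configuration gets two preimages.
print(strongly_reversible_inh([("a", "b", {"c"}), ("a", "c", {"b"})]))  # True
print(strongly_reversible_inh([("a", "b", set()), ("a", "c", set())]))  # False
```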
Corollary 2. A sequential multiset rewriting system with inhibitors and priorities is strongly reversible if for any two rules r1, r2 at least one of the following conditions holds:

– r1 is inhibited by rhs(r2) \ rhs(r1),
– r2 is inhibited by rhs(r1) \ rhs(r2),
– there exists a rule r > r1 such that lhs(r) ⊆ lhs(r1)(rhs(r2) \ rhs(r1)),
– there exists a rule r > r2 such that lhs(r) ⊆ lhs(r2)(rhs(r1) \ rhs(r2)).
Corollary 3. A multiset rewriting system without control is strongly reversible if and only if it has only one rule. N Rs M R(coo)T = {∅} ∪ {{n} | n ∈ N}.

It is known (see, e.g., [7]) that the generative power of sequential multiset rewriting systems equals P sM AT, even without requiring additional properties like reversibility.

Corollary 4. Reversible multiset rewriting systems without priorities and without inhibitors are not universal.

It is an open problem to specify the exact generative power of this class.
4 Determinism
The concept of determinism common to multiset rewriting systems as considered in the area of membrane computing essentially means that such a system, starting from the fixed configuration, has a unique computation. As will become obvious later, this property is often undecidable. Of course, this section only deals with accepting systems. First, it is well known that deterministic multiset rewriting systems with either priorities or inhibitors are universal, by simulation of any (deterministic accepting) register machine M. In fact, in this case the construction of Theorem 1 is both deterministic and reversible.

Corollary 5. N Ds M R(coo, P ri)T = N RE.

Proof. To get a strongly deterministic system, it suffices to take the construction with priorities in Theorem 1 (simulating any deterministic accepting register machine) and extend the priorities to a total order relation. The total order relation can be defined as the relation specified by Rt united with the relation induced by an arbitrary fixed (e.g., lexicographical) order on the objects from Q as they appear in the left sides of the rules. This also holds for the parallel case; this problem was left open in [2].

In general, if a certain class of non-deterministic systems is universal even in a deterministic way, then determinism is undecidable for that class. This applies to multiset rewriting systems, similarly to Corollary 1.
Corollary 6. It is undecidable whether a system from the class of multiset rewriting systems with either inhibitors or priorities is deterministic.

Proof. For an arbitrary register machine M, there is a deterministic multiset rewriting system Π with either inhibitors or priorities simulating M. Without restricting generality we assume that an object qf appears in a configuration of Π if and only if M halts. Add the rules qf → F1 and qf → F2 to the set of rules (F1, F2 are new objects); the system is now deterministic if and only if no configuration with qf is reachable, i.e., if and only if M does not halt, which is undecidable.

In contrast, the strong determinism we now consider means that a system has no choice of transitions from any configuration. We now claim that it is a syntactic property.

Theorem 4. A multiset rewriting system is strongly deterministic if and only if any two rules are either mutually excluded by an inhibitor (of one rule appearing on the left side of the other rule), or are in a priority relation.

Proof. Any one-rule multiset rewriting system is strongly deterministic. The forward implication of the theorem holds because mutually excluding reactant/inhibitor conditions eliminate all competing rules except one, and so does the priority relation. As a result, for any configuration at most one rule is applicable. Conversely, assume that rules p, p′ of the system are not in a priority relation, and are not mutually excluded by the reactant/inhibitor conditions. Let x, x′ be the multisets of objects consumed by the rules p, p′, respectively. Then, from the configuration C = xx′, it is possible to apply either rule, contradicting strong determinism.

Corollary 7. A multiset rewriting system without inhibitors and without priorities is not strongly deterministic, unless it has only one rule.
We now characterize the power of strongly deterministic multiset rewriting systems without additional control: any such system without inhibitors or priorities accepts either the set of all non-negative integers, or a finite set of all numbers bounded by some number.

Theorem 5. Na Ds M R(coo) = {∅, N} ∪ {{k | 0 ≤ k ≤ n} | n ∈ N}.

Proof. Any strongly deterministic system is of the form (O, Σ, w0, {u → v}). Its computation starting from a configuration C is non-accepting if and only if u is contained in both C and v. If u is not contained in v, then the system accepts all numbers. Otherwise, the system accepts all numbers k such that there exists x ∈ Σ^k such that w0x does not contain the multiset u. The latter condition is downward monotone w.r.t. k (removing symbols from x cannot create an occurrence of u), i.e., it is either never satisfied, or always satisfied, or there exists a number n ∈ N such that it holds if and only if k ≤ n.
– The system ({a}, {a}, a, {a → a}) accepts ∅, because of the infinite loop in its computation;
– the system ({a}, {a}, a, {a → λ}) accepts N, i.e., anything, because it always halts, erasing one object in each step; and
– for any n ∈ N there is a system ({a}, {a}, λ, {a^{n+1} → a^{n+1}}) accepting {k | 0 ≤ k ≤ n}, because the system starts in a final configuration if and only if the input does not exceed n, and enters an infinite loop otherwise.

These examples show the converse implication of the theorem.
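The behaviour of these one-rule accepting systems can be checked by direct simulation. The step bound below is an assumption of this sketch; it suffices for these examples because once u ⊆ v, an applicable rule remains applicable forever, and otherwise the runs here halt quickly.

```python
from collections import Counter

# A simulation sketch of the one-rule accepting systems of Theorem 5:
# the system (O, Sigma, w0, {u -> v}) accepts input x iff the computation
# starting from w0 + x halts.

def accepts(w0, u, v, x, bound=1000):
    w = Counter(w0) + Counter(x)
    for _ in range(bound):
        if any(w[a] < n for a, n in Counter(u).items()):
            return True            # rule not applicable: the system halts
        w = w - Counter(u) + Counter(v)
    return False                   # still running: treated as an infinite loop

# Third example with n = 2: the rule aaa -> aaa loops iff at least 3 a's arrive.
print([k for k in range(6) if accepts("", "aaa", "aaa", "a" * k)])  # [0, 1, 2]
print(accepts("a", "a", "", "a"))  # True: a -> λ erases everything, then halts
```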
Theorem 5 shows that the computational power of strongly deterministic sequential multiset rewriting systems without additional control is, in a certain sense, degenerate (it is sublinear). We now strengthen Theorem 5 from strongly deterministic systems to all deterministic ones.

Corollary 8. Na DM R(coo) = {∅, N} ∪ {{k | 0 ≤ k ≤ n} | n ∈ N}.

Proof. As in Theorem 5, the system accepts all numbers k for which there exists x ∈ Σ^k such that the computation starting from w0x halts. It suffices to recall that if C′ ⊆ C and the computation starting with C halts, then the computation starting with C′ also halts. Indeed, since the computation starting from C′ is deterministic, if the computation starting from C applies rule r in step s, then the computation starting from C′ cannot apply any other rule in the same step. The corresponding configurations of the two computations preserve any ⊆ relation of the initial configurations.
5 Conclusions
We outlined the concepts of reversibility, strong reversibility, and strong determinism for multiset rewriting systems, see Table 1, left.

Table 1. The power of sequential multiset rewriting systems with different properties, depending on the features (left), and the parallel case from [2] (right). U – universal, N – non-universal, L – sublinear, ? – open, C – conjectured to be non-universal.

Sequential (this paper):
  Property | pure       | Pri        | inh
  D(acc)   | L (Cor. 8) | U (Th. 1)  | U (Th. 1)
  Ds(acc)  | L (Th. 5)  | U (Cor. 5) | ?
  R(gen)   | N (Cor. 4) | U (Th. 1)  | U (Th. 1)
  Rs(gen)  | L (Cor. 3) | ?          | ?

Parallel [2]:
  Property | pure | Pri        | inh | pro, inh
  D(acc)   | U    | U          | U   | U
  Ds(acc)  | L    | U (Cor. 5) | ?   | U
  R(gen)   | C    | U          | U   | U
  Rs(gen)  | L    | C          | C   | C
We showed that reversible multiset rewriting systems with control are universal, whereas it is known that this result does not hold without control. Moreover, the strongly reversible multiset rewriting systems without control do not halt unless the starting configuration is halting, but this is no longer true with inhibitors.
For multiset rewriting systems with inhibitors or priorities, strong reversibility has also been characterized syntactically. Moreover, we gave a syntactic characterization for the property of strong determinism. For systems without control, the power of deterministic systems coincides with that of strongly deterministic systems: a corresponding system without control either accepts all natural numbers, or a finite set of numbers. With the help of priorities, both deterministic and strongly deterministic systems become universal. With inhibitors, deterministic systems are universal, but it is open whether the same holds for strongly deterministic systems. In Table 1 (right) we list the results obtained in the parallel multiset rewriting case for comparison. We repeat some interesting differences of the sequential case with respect to the parallel case:

– promoters become useless;
– the criterion of strong determinism is different;
– the criterion of strong reversibility is not just decidable, but has a concrete description and is thus easily testable;
– reversible uncontrolled systems are likely to be more limited in power (they are non-universal – parallelism cannot do appearance checking);
– strong determinism via inhibitors is likely to be more limited in power (inhibitors would need to be heavily used just to ensure the property);
– the power of deterministic uncontrolled systems is radically different: sublinear instead of universal.

Acknowledgments. Artiom Alhazov gratefully acknowledges the support of the Japan Society for the Promotion of Science and the Grant-in-Aid for Scientific Research, project 20·08364. He also acknowledges the support by the Science and Technology Center in Ukraine, project 4032.
References

1. Agrigoroaiei, O., Ciobanu, G.: Dual P Systems. In: Corne, D.W., Frisco, P., Păun, G., Rozenberg, G., Salomaa, A. (eds.) WMC 2008. LNCS, vol. 5391, pp. 95–107. Springer, Heidelberg (2009)
2. Alhazov, A., Morita, K.: On Reversibility and Determinism in P Systems. In: Păun, G., Pérez-Jiménez, M.J., Riscos-Núñez, A. (eds.) Preproc. of Membrane Computing – 10th International Workshop, pp. 129–139
3. Bennett, C.H.: Logical Reversibility of Computation. IBM Journal of Research and Development 17, 525–532 (1973)
4. Calude, C., Păun, G.: Bio-steps beyond Turing. BioSystems 77, 175–194 (2004)
5. Dassow, J., Păun, G.: Regulated Rewriting in Formal Language Theory. EATCS Monographs in Theoretical Computer Science, vol. 18. Springer, Heidelberg (1989)
6. Fredkin, E., Toffoli, T.: Conservative Logic. Int. J. Theoret. Phys. 21, 219–253 (1982)
7. Freund, R., Ibarra, O.H., Păun, G., Yen, H.-C.: Matrix languages, register machines, vector addition systems. In: Proc. Third Brainstorming Week on Membrane Computing, Sevilla, pp. 155–168 (2005)
8. Ibarra, O.H.: On Strong Reversibility in P Systems and Related Problems (manuscript)
9. Leporati, A., Zandron, C., Mauri, G.: Reversible P Systems to Simulate Fredkin Circuits. Fundam. Inform. 74(4), 529–548 (2006)
10. Minsky, M.L.: Computation: Finite and Infinite Machines. Prentice-Hall, Englewood Cliffs (1967)
11. Morita, K.: Universality of a Reversible Two-Counter Machine. Theoret. Comput. Sci. 168, 303–320 (1996)
12. Morita, K.: A Simple Reversible Logic Element and Cellular Automata for Reversible Computing. In: Margenstern, M., Rogozhin, Y. (eds.) MCU 2001. LNCS, vol. 2055, p. 102. Springer, Heidelberg (2001)
13. Morita, K.: Simple Universal One-Dimensional Reversible Cellular Automata. J. Cellular Automata 2, 159–165 (2007)
14. Morita, K., Nishihara, N., Yamamoto, Y., Zhang, Z.: A Hierarchy of Uniquely Parsable Grammar Classes and Deterministic Acceptors. Acta Inf. 34(5), 389–410 (1997)
15. Morita, K., Yamaguchi, Y.: A Universal Reversible Turing Machine. In: Durand-Lose, J., Margenstern, M. (eds.) MCU 2007. LNCS, vol. 4664, pp. 90–98. Springer, Heidelberg (2007)
16. Păun, G.: Membrane Computing. An Introduction. Springer, Berlin (2002)
17. P systems webpage, http://ppage.psystems.eu/
Synchronization in P Modules Michael J. Dinneen, Yun-Bum Kim, and Radu Nicolescu Department of Computer Science, University of Auckland, Private Bag 92019, Auckland, New Zealand {mjd,yun,radu}@cs.auckland.ac.nz
Abstract. In the field of molecular computing, in particular P systems, synchronization is an essential requirement for composing or sequentially linking together congenial P system activities. We provide an improved deterministic algorithm based on static structures and traditional rules, which runs in 4e + 13 steps, where e is the eccentricity of the initiating cell. Using the same model, extended with the support of cell IDs, we provide another deterministic algorithm, which runs in 3e + 13 steps. Our algorithms use a convenient framework, called P modules, which embraces the essential features of many popular types of P systems. Keywords: P systems, P modules, synchronization, cellular automata.
1 Introduction
The Firing Squad Synchronization Problem (FSSP) [17,11,8,15,9,18] is one of the best studied problems for cellular automata. The problem involves finding a cellular automaton, such that, after a command is given, all the cells, after some finite time, enter a designated firing state simultaneously and for the first time. Several variants of FSSP have been proposed and studied [15,17]. Studies of these variations mainly focus on finding a solution with as few states as possible and possibly running in optimum time. There are several applications that require synchronization. We list just a few here. At the biological level, cell synchronization is a process by which cells at different stages of the cell cycle (division, duplication, replication) in a culture are brought to the same phase. There are several biological methods used to synchronize cells at specific cell phases [7]. Once synchronized, monitoring the progression from one phase to another allows us to calculate the timing of specific cells' phases. A second example relates to operating systems [16], where process synchronization is the coordination of simultaneous threads or processes to complete a task without race conditions. In distributed computing, in particular consensus problems, such as the Byzantine agreement problem, processes need to be in a common state at certain fixed times, see Lynch [10]. Finally, in telecommunication networks [6], we often want to synchronize computers to the same time, i.e., primary reference clocks should be used to avoid clock offsets. The synchronization problem has recently been studied in the framework of P systems. Using tree-based P systems, Bernardini et al. [2] provided a nondeterministic solution with time complexity 3h and a deterministic solution with
C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 32–44, 2010. © Springer-Verlag Berlin Heidelberg 2010
time complexity 4n + 2h, where h is the height of the tree structure underlying the P system and n is the number of membranes of the P system. The deterministic solution requires membrane polarization techniques and uses a depth-first search. More recently, Alhazov et al. [1] described an improved deterministic algorithm for tree-based P systems, that runs in 3h + 3 steps. This solution requires conditional rules (promoters and inhibitors) and combines a breadth-first search, a broadcast and a convergecast. We continued the study of FSSP for digraph-based P systems [4]. We proposed two uniform deterministic solutions to a variant of FSSP [17], in which there is a single commander, at an arbitrary position, for hyperdag P systems [12] and for neural P systems [13] with symmetric communication channels. We further generalized this problem by synchronizing a subset of cells (or membranes) of the considered hyperdag or neural P system. Our first solution dynamically extends P systems with mobile channels. Our second solution is solely based on classical rules and static channels. In contrast to the previous FSSP solutions for tree-based P systems, our solutions do not require membrane polarizations or conditional rules, but require states, as typically used in hyperdag and neural P systems. These solutions take ec + 5 and 6ec + 7 steps, respectively, where ec is the eccentricity of the commander cell of the digraph underlying these P systems. We continue our work on FSSP for digraph-based P systems (neural and hyperdag P systems), using a single framework, called P modules [5], which embraces the computational functionality of many popular types of P systems. In Section 3 of this paper, we improve our previous results for a natural four-phase FSSP algorithm (classical rules and static structures) by reducing the multiplicative factor from 6 to 4, such that the overall running time is 4ec + 13.
Further, in Section 4, by exploiting the notion of cell IDs, we provide a slightly faster algorithm of running time 3ec + 13. In Section 2, we first define a virtual structure for a given P system structure, based on a neighboring relationship from a commanding node and formally define a P module. Finally, in Section 5, we summarize our results and conclude with some open problems.
2 Preliminary
We assume that the reader is familiar with the basic terminology and notations, such as relations, graphs, nodes (vertices), arcs, directed graphs (digraphs), directed acyclic graphs (dags), trees, alphabets, strings and multisets [12]. For a given tree, connected dag or (weakly) connected digraph, we define the eccentricity of a node c, ec, as the maximum length of a shortest path between c and any other node in the corresponding underlying undirected structure. For a tree, the set of neighbors of a node x, Neighbor(x), is the union of x's parent and x's children. For a dag (X, δ) and a node x ∈ X, we define Neighbor(x) = δ(x) ∪ δ⁻¹(x) ∪ (δ(δ⁻¹(x)) \ {x}), if we want to include the siblings, or Neighbor(x) = δ(x) ∪ δ⁻¹(x), otherwise. For a graph (X, A) and a node x ∈ X, we define Neighbor(x) = A(x) = {y | (x, y) ∈ A}. Note that, as defined, Neighbor is always a symmetric relation.
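These graph notions are easy to compute mechanically. As an illustration only (the helper names are ours, not part of the paper's formalism), the following Python sketch builds the Neighbor relation for a dag given by its arc set δ and computes levels and eccentricity by breadth-first search:

```python
from collections import deque

def neighbors(arcs, with_siblings=False):
    """Build the symmetric Neighbor relation from a set of arcs (x, y).
    With with_siblings=True, also connect each node to the other children
    of its parents, i.e. delta(delta^{-1}(x)) \\ {x}."""
    nodes = {v for a in arcs for v in a}
    succ = {v: set() for v in nodes}   # delta(v)
    pred = {v: set() for v in nodes}   # delta^{-1}(v)
    for x, y in arcs:
        succ[x].add(y)
        pred[y].add(x)
    nb = {}
    for x in nodes:
        n = set(succ[x]) | pred[x]
        if with_siblings:
            for p in pred[x]:
                n |= succ[p]           # children of x's parents
            n.discard(x)
        nb[x] = n
    return nb

def levels_and_eccentricity(nb, c):
    """BFS from the commander c over the Neighbor relation;
    returns (level_c, e_c)."""
    level = {c: 0}
    queue = deque([c])
    while queue:
        x = queue.popleft()
        for y in nb[x]:
            if y not in level:
                level[y] = level[x] + 1
                queue.append(y)
    return level, max(level.values())
```

For instance, on the path 1 → 2 → 3 → 4 with commander 1, every node x has levelc(x) = x − 1 and e1 = 3.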
M.J. Dinneen, Y.-B. Kim, and R. Nicolescu
A special node c of a structure will be designated as the commander. For a given commander c, we define the level of a node x, levelc(x) ∈ ℕ, as the length of a shortest path between c and x, over the Neighbor relation. For a given tree, dag or graph and commander c, for nodes x and y, if y ∈ Neighbor(x) and levelc(y) = levelc(x) + 1, then x is a predecessor of y and y is a successor of x. Similarly, a node z is a peer of a node x, if z ∈ Neighbor(x) and levelc(z) = levelc(x). Note that, for a given node x, the set of peers and the set of successors are disjoint. A node without a successor will be referred to as a terminal. The eccentricity of a node c is ec = max{levelc(x) | x ∈ X}. A level-preserving path from c to a node y is a sequence x0, ..., xk, such that x0 = c, xk = y, xi ∈ Neighbor(xi−1) and levelc(xi) = i, for 1 ≤ i ≤ k. We define countc(y) as the number of distinct level-preserving paths from c to y. We also define spanc(y) = max{levelc(z) | z is in a level-preserving path that contains y}.

Definition 1 (P module [5]). A P module is a system Π = (O, K, δ, P), where:
1. O is a finite non-empty alphabet of objects;
2. K is a finite set of cells and each cell σ ∈ K is represented as σ = (Q, s, w, R), where:
   • Q is a finite set of states for the cell;
   • s ∈ Q is the cell's current state;
   • w ∈ O* is the cell's current multiset of objects;
   • R is the cell's finite ordered set of multiset rewriting rules, of the form s x →α s′ x′ (u)βγ, where s, s′ ∈ Q, x, x′, u ∈ O*, α ∈ {min, max} is the rewriting operator and β ∈ {↑, ↓, ↕}, γ ∈ {one, spread, repl} ∪ K are the transfer operators (see below for details). If u = λ (the empty multiset), this rule can be abbreviated as s x →α s′ x′.
3. δ is a binary relation on K, i.e., a set of structural arcs, representing duplex or simplex communication channels between cells;
4. P is a subset of K, indicating the port cells, i.e., the only cells that can be connected to other modules.
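The quantities countc and spanc defined above can be computed by dynamic programming over levels, processing nodes in increasing level order for the path counts and in decreasing order for the spans. A sketch under these definitions (helper names are ours):

```python
def count_and_span(nb, level):
    """count_c(y): number of distinct level-preserving paths from the
    commander to y; span_c(y): deepest level on any level-preserving
    path through y.  `nb` maps each node to its Neighbor set; `level`
    maps each node to its BFS level from the commander."""
    by_level = sorted(nb, key=lambda v: level[v])
    count = {}
    for y in by_level:
        preds = [x for x in nb[y] if level[x] == level[y] - 1]
        # the commander (the only node at level 0) counts one trivial path
        count[y] = sum(count[x] for x in preds) if preds else 1
    span = {}
    for y in reversed(by_level):
        succs = [z for z in nb[y] if level[z] == level[y] + 1]
        # a terminal's span is its own level; otherwise, the deepest successor
        span[y] = max((span[z] for z in succs), default=level[y])
    return count, span
```

On the diamond c → a, c → b, a → d, b → d there are two level-preserving paths from c to d, so countc(d) = 2, and every node has span 2.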
The rules given by the ordered set(s) R are applied in the weak priority order [14]. For a cell σ = (Q, t, w, R), a rule s x →α s′ x′ (u)βγ ∈ R is applicable if t = s and x ⊆ w. Additionally, if s x →α s′ x′ (u)βγ is the first applicable rule, then each subsequent applicable rule's target state (i.e., the state indicated in its right-hand side) must be s′. The rewriting operator α = max indicates that an applicable rewriting rule is applied as many times as possible. The transfer operators β = ↕ and γ = repl (presented as (u)↕repl in a rewriting rule) indicate that the multiset u is replicated and sent to all neighbor cells. The other rewriting and (non-deterministic) transfer operators are not used in this paper and are described in [5] for those interested.
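The weak-priority semantics can be made concrete with a small sketch. This is an illustrative simplification (ours, not the paper's): only the max operator is modeled and transferred objects are discarded; rules fire in order, and once the first applicable rule fixes the target state, later applicable rules with a different target are skipped:

```python
from collections import Counter

def apply_rules_max(state, contents, rules):
    """One evolution step of a cell under weak priority, `max` only.
    Each rule is a tuple (s, lhs, s2, rhs): in state s, consume the
    multiset lhs and produce rhs (multisets written as strings).
    Transfer to other cells is ignored in this sketch."""
    produced = Counter()
    target = None
    for s, lhs, s2, rhs in rules:
        if s != state:
            continue
        if target is not None and s2 != target:
            continue  # weak priority: must agree with the first target state
        need = Counter(lhs)
        # `max` operator: apply as many times as the contents allow
        times = min((contents[o] // n for o, n in need.items()), default=0)
        if times == 0:
            continue
        if target is None:
            target = s2
        for o, n in need.items():
            contents[o] -= n * times
        produced.update({o: n * times for o, n in Counter(rhs).items()})
    return (target if target is not None else state), contents + produced
```

Running this on the Phase 4 rules of Section 3 reproduces their intended behavior: a squad cell holding a and f fires (state s10), while a non-squad cell holding only a returns to s0; the rule targeting s0 blocks the later s10 rule, and vice versa.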
3 Deterministic FSSP Solution
In the FSSP, the commander sends an order to all (firing) squad cells, which will prompt them to synchronize by entering the designated firing state. However,
in general, the commander does not have direct communication channels to all squad cells. Relaying the order through intermediate cells results in some squad cells receiving the order before other squad cells. In this case, to ensure that all squad cells enter the firing state simultaneously, each squad cell needs to wait until all other squad cells receive the order.

For convenience and ease of understanding, we present the algorithmic steps in four conceptual phases. The details of these phases are given first for the solution with a finite set of alphabet objects and later, in Section 4, for the solution with cell ID objects.

The initial configuration of a P module Π with n > 1 cells, as qualified by δ below, for our FSSP solution with a finite set of alphabet objects is as follows:
1. O = {a, b, c, d, e, f, g, h, u, v, w}.
2. K = {σ1, ..., σn} is the set of all cells, σc ∈ K is the commander and F ⊆ K is the set of squad cells. For each cell σi = (Qi, si, wi, Ri) ∈ {σ1, ..., σn}:
   • Qi = {s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10}, where s10 is the firing state.
   • si = s0.
   • wi = {a}, if σi = σc and σi ∉ F; {a, f}, if σi = σc and σi ∈ F; {f}, if σi ∈ F \ {σc}; ∅, if σi ∈ K \ (F ∪ {σc}).
   • Ri is given in Phases 1, 2, 3 and 4.
3. δ is either a symmetric digraph with simplex channels or a (weakly) connected digraph with duplex channels.
4. P = ∅.

For the proofs below, we define, for a cell σi, φi ∈ ℕ as the total number of objects a in all of σi's peers (i.e., the number of distinct level-preserving paths from σc to all of σi's peers), and τi ∈ ℕ as the total number of objects a in all of σi's successors (i.e., the number of distinct level-preserving paths from σc to all of σi's successors).

Phase 1 (FSSP-I: First broadcast from the commander)
Overview: The commander σc initiates a broadcast, which relays a broadcast message from the predecessors to their successors.
The commander starts a counter, which is incremented by one in each step. Each cell relays the broadcast message as follows. If a cell σi receives a broadcast message from its predecessors, σi becomes a do-broadcast cell and sends a broadcast message to all its successors. All do-broadcast cells ignore any subsequently received broadcast messages.
Precondition: The initial configuration of the P module Π.
Postcondition: Each cell σi ends in state s4 and σi has: countc(i) copies of a; countc(i) copies of b; τi copies of d; four copies of e, if σi = σc; one copy of f, if σi ∈ F; φi copies of v.
Number of steps: For each cell σi, this phase takes levelc(i) + 4 steps.
Rules:
0. For state s0:
   1) s0 a →max s1 abe (d)↕repl
   2) s0 d →max s1 ab (d)↕repl
1. For state s1:
   1) s1 be →max s2 bee
   2) s1 a →max s2 a (u)↕repl
   3) s1 u →max s2
   4) s1 d →max s2
2. For state s2:
   1) s2 be →max s3 bee
   2) s2 a →max s3 a
   3) s2 u →max s3 v
3. For state s3:
   1) s3 be →max s4 bee
   2) s3 a →max s4 a
   3) s3 u →max s4
Proof. Consider a cell σi at level levelc(i). Note that an object d (the broadcast message) needs levelc(i) steps to reach a cell at level levelc(i).
• At step levelc(i) + 1: if σi is the commander, it transits to state s1, broadcasts one d to each of its neighbors and produces one b. If σi is not the commander, σi transits to state s1, broadcasts countc(i) copies of d to each of its neighbors and produces countc(i) copies of a and b, by accumulating one copy of a and b for each sent d.
• At step levelc(i) + 2: σi transits to state s2, broadcasts countc(i) copies of u, receives φi copies of u from its peers and receives τi copies of d from its successors. At the same time, σi eliminates all superfluous copies of u and d.
• At step levelc(i) + 3: σi transits to state s3 and converts τi copies of u into τi copies of v.
• At step levelc(i) + 4: σi transits to state s4 and eliminates all superfluous copies of u.
In this phase, each cell σi is idle in steps 0, ..., levelc(i) and is active in steps levelc(i) + 1, ..., levelc(i) + 4, as stated above. Thus, each σi takes levelc(i) + 4 steps. The commander σc initially has one a and produces four copies of e, by accumulating one e in each of the steps levelc(i) + 1, ..., levelc(i) + 4. In this phase, f is not produced or consumed; thus, each σi ∈ F keeps its one f. The numbers of objects a, b, d, u and v, for a cell σi, are verified in the steps levelc(i) + 1, ..., levelc(i) + 4.

Phase 2 (FSSP-II: Convergecasts from terminal cells)
Overview: Immediately after the first broadcast (Phase 1), each terminal cell σt initiates a convergecast, which relays a convergecast message from the successors to their predecessors, for all distinct level-preserving paths from σc to σt. Each non-terminal cell relays the convergecast message as follows. If a cell σi receives a convergecast message from all its successors, σi becomes a ready-to-convergecast (RTC) cell and sends an RTC notification to all its peers. If an RTC cell σj receives RTC notifications from all its peers, σj becomes a do-convergecast (DC) cell and sends a convergecast message to all its predecessors. After σc receives convergecast messages from all its successors, σc stops the counter (which was started in Phase 1) and uses this counter to compute its eccentricity, where the counter is the number of steps for σc to send a message to a farthest cell σz and receive a reply message from σz.
Precondition: As described in the postcondition of Phase 1.
Postcondition: This phase ends when the commander enters state s7. Each cell σi ends in state s7 and σi has: countc(i) copies of a; ec + 2 copies of e, if σi = σc; one copy of f, if σi ∈ F.
Number of steps: For each cell σi, this phase takes 3ec − levelc(i) + 4 steps.
Rules:
4. For state s4:
   1) s4 h →max s5
   2) s4 cd →max s4 cd
   3) s4 ad →max s4 ad
   4) s4 a →max s4 ah (w)↕repl
   5) s4 w →max s4
   6) s4 cd →max s5
   7) s4 vw →max s5 v
   8) s4 cv →max s5 cv
   9) s4 av →max s5 av (w)↕repl
   10) s4 a →max s5 ah (c)↕repl
   11) s4 be →max s4 bee
   12) s4 be →max s5 ee
5. For state s5:
   1) s5 h →max s6
   2) s5 vw →max s5 v
   3) s5 cv →max s5 cv
   4) s5 av →max s5 av (w)↕repl
   5) s5 a →max s5 ah (c)↕repl
   6) s5 eee →max s6 e
6. For state s6:
   1) s6 cv →max s6
   2) s6 bc →max s6
   3) s6 b →max s6 b
   4) s6 a →max s7 a
   5) s6 v →max s7
   6) s6 w →max s7
Proof. To establish the correctness of this phase, we categorize each cell σi into one of four groups:
(1) spanc(i) = ec.
(2) spanc(i) < ec and σi has no peers.
(3) spanc(i) < ec and, for each peer σj of σi, spanc(j) ≤ spanc(i).
(4) spanc(i) < ec and σi has a peer σj with spanc(j) > spanc(i).
A cell σi in group (1), (2) or (3) progresses through two convergecast steps, t1 = spanc(i) + 5 + 2(spanc(i) − levelc(i)) = 3spanc(i) − 2levelc(i) + 5 and t2 = 3spanc(i) − 2levelc(i) + 6, to send countc(i) copies of c (the convergecast message).
• At step t1: σi remains in state s4, broadcasts countc(i) copies of w and produces countc(i) copies of h, by accumulating one h for each sent w. At the same time, σi eliminates all superfluous copies of w.
• At step t2: σi transits to state s5 and consumes countc(i) copies of h, φi copies of w, τi copies of c and τi copies of d. At the same time, σi broadcasts countc(i) copies of c and produces countc(i) copies of h, by accumulating one h for each sent c.
Note that, if spanc(i) = ec, then t1 = 3ec − 2levelc(i) + 5 and t2 = 3ec − 2levelc(i) + 6. Hence, each other cell σj at level levelc(i) sends countc(j) copies of c before or at step 3ec − 2levelc(i) + 6.
A cell σi in group (4), where 0 < levelc(i) < ec, progresses through three convergecast steps, t1, t2, t3 ∈ {levelc(i) + 5, ..., 3ec − 2levelc(i) + 6}, where t1 < t2 < t3, to send countc(i) copies of c. Note that, for levelc(i) = 1, steps t1, t2, t3 ∈ {6, ..., 3ec + 4} and, for levelc(i) = ec − 1, steps t1, t2, t3 ∈ {ec + 4, ..., ec + 8}.
• At step t1: σi remains in state s4, broadcasts countc(i) copies of w and produces countc(i) copies of h, by accumulating one h for each sent w. At the same time, σi eliminates all superfluous copies of w.
• At step t2: σi transits to state s5, broadcasts countc(i) copies of w and consumes countc(i) copies of h, τp copies of c and τp copies of d.
• At step t3: σi remains in state s5, broadcasts countc(i) copies of c and produces countc(i) copies of h, by accumulating one h for each sent c.
Each cell progresses through the convergecast steps after all its successors progress through the convergecast steps. Hence, the commander is the last cell to apply the convergecast steps and this phase ends when it enters state s7 (the commander belongs to group (1), hence it transits to state s6 after it progresses through the convergecast steps). Note that the countc(i) copies of c sent from σi are received by σi's neighbors, which include σi's successors. In general, the commander transits from state s6 to s7 in one step and a non-commander cell σk transits from state s6 to s7 after σk receives countc(j) copies of c from each predecessor σj. Thus, when σc transits to state s7, all other cells are already in state s7.
To compute the running time of this phase for each cell σi, let us first compute the step in which the commander transits to state s7. Consider a level-preserving path σc, ..., σt, where levelc(c) = 0 and levelc(t) = ec. The terminal cell σt ends its last phase at step ec + 4 and each cell σj in this path progresses through two steps, 3ec − 2levelc(j) + 5 and 3ec − 2levelc(j) + 6, to send countc(j) copies of c. Further, the commander takes one step to transit from state s6 to s7. Thus, the commander transits to state s7 at step (ec + 5) + 2(ec + 1) + 1 = 3ec + 8, i.e., all cells end this phase at step 3ec + 8. Each cell σi ends its Phase 1 at step levelc(i) + 4; thus, σi takes (3ec + 8) − (levelc(i) + 4) = 3ec − levelc(i) + 4 steps in this phase.
For a cell σi, the rules in this phase do not produce or consume copies of a and f; thus, σi still has countc(i) copies of a and still has one f, if σi ∈ F. Additionally, σi consumes all copies of b, c, d, h, v and w, by rules 4.12, 5.1, 6.1, 6.2, 6.5 and 6.6. In any level-preserving path σc, ..., σt, where levelc(c) = 0 and levelc(t) = ec, the terminal cell σt starts its convergecast steps at step levelc(t) + 5 = ec + 5. After 2(ec + 1) steps, all cells in this path progress through their convergecast steps. In this phase, from step levelc(c) + 5, the commander accumulates one e in each step, until it progresses through all its convergecast steps. The commander initially has four copies of e from Phase 1 and, in this phase, it accumulates (ec + 5) − (levelc(c) + 5) + 2(ec + 1) = 3ec + 2 copies of e. Hence, the commander has 4 + (3ec + 2) = 3(ec + 2) copies of e. The commander σc then consumes two thirds of its copies of e, by rule 5.6; thus, σc ends this phase with ec + 2 copies of e.

Remark 2. The purpose of designating cells as DC cells or RTC cells is to ensure that the commander correctly computes its eccentricity. To highlight this need, consider the case when an RTC cell sends a convergecast message without receiving RTC notifications from all its peers. A scenario for this considered case
assumes the following: cell σi has one successor σi′ and one peer σj; cell σj has one successor σj′ and one peer σi; countc(i) = countc(j); spanc(i) + 2 = spanc(j). When σi receives a convergecast message from σi′, σi sends a convergecast message to its neighbors, which include σi's predecessors, σi's successor σi′ and σi's peer σj. One step later, when σj receives σi's convergecast message, σj sends a convergecast message. In this case, the commander may compute a value which is less than its actual eccentricity.

Phase 3 (FSSP-III: Second broadcast from the commander)
Overview: Note that the commander σc computes the eccentricity ec in Phase 2. In this phase, σc initiates a second broadcast, by sending the broadcast message e^{ec}. This broadcast message is relayed from predecessors to successors. In this phase, each cell relays a broadcast message as follows. If a cell σi receives a broadcast message e^k, where k = ec − levelc(i), from its predecessors, σi becomes a do-broadcast cell and sends a broadcast message e^{k−1} to all its successors. All do-broadcast cells ignore any subsequent broadcast messages.
Precondition: As described in the postcondition of Phase 2.
Postcondition: Each cell σi ends in state s9 and σi has: countc(i) copies of a; one copy of f, if σi ∈ F; (ec − levelc(i) + 1)countc(i) copies of g.
Number of steps: For each cell σi, this phase takes levelc(i) + 3 steps.
Rules:
7. For state s7:
   1) s7 ae →max s8 ah
   2) s7 e →max s8 g (e)↕repl
8. For state s8:
   1) s8 h →max s8
   2) s8 a →max s9 a
   3) s8 e →max s9
Proof. Consider a cell σi. Note that the object e needs levelc(i) steps to propagate down to a cell at level levelc(i); if σi is not the commander, σi receives (ec − levelc(i) + 2)countc(i) copies of e at step 3ec + levelc(i) + 7.
• At step 3ec + levelc(i) + 8: σi transits to state s8, consumes countc(i) copies of e and produces countc(i) copies of h. At the same time, σi broadcasts the remaining (ec − levelc(i) + 1)countc(i) copies of e and produces (ec − levelc(i) + 1)countc(i) copies of g, by accumulating one g for each sent e.
• At step 3ec + levelc(i) + 9: σi remains in state s8 and consumes countc(i) copies of h.
• At step 3ec + levelc(i) + 10: σi transits to state s9 and eliminates all superfluous copies of e.
In this phase, each cell σi is idle in steps 3ec + 7, ..., 3ec + levelc(i) + 7 and is active in steps 3ec + levelc(i) + 8, 3ec + levelc(i) + 9 and 3ec + levelc(i) + 10, as stated above. Thus, each σi takes levelc(i) + 3 steps. The rules in this phase do not produce or consume f; thus, each squad cell still ends with one f. The numbers of objects a, e, g and h, for a cell σi, are verified in the three steps 3ec + levelc(i) + 8, 3ec + levelc(i) + 9 and 3ec + levelc(i) + 10.
Phase 4 (FSSP-IV: Counting down for entering the firing state)
Overview: In this phase, each cell performs a countdown as follows. At each step, a cell σi decrements the counter k of the broadcast message e^k (from Phase 3) by one, until k = 0. If k = 0, σi becomes a ready-to-synchronize cell. A squad ready-to-synchronize cell enters the firing state and a non-squad ready-to-synchronize cell enters the initial state.
Precondition: As described in the postcondition of Phase 3.
Postcondition: Firing squad cells end in state s10 and the other cells end in state s0. All cells are empty.
Number of steps: For each cell σi, this phase takes ec − levelc(i) + 2 steps.
Rules:
9. For state s9:
   1) s9 ag →max s9 a
   2) s9 af →max s10
   3) s9 a →max s0
   4) s9 a →max s10
Proof. For a cell σi, from the postcondition of Phase 3, the number of copies of g is a multiple of the number of copies of a, with multiplier ec − levelc(i) + 1.
• Between step 3ec + levelc(i) + 11 and step 4ec + 12: σi remains in state s9 and consumes countc(i) copies of g at each step.
• At step 4ec + 13: if σi has one f, it transits to state s10 and consumes one a and one f; otherwise, it transits to state s0 and consumes one a.
In this phase, each cell σi progresses through steps 3ec + levelc(i) + 11, ..., 4ec + 12, 4ec + 13, as stated above. Thus, each σi takes ec − levelc(i) + 2 steps. The numbers of objects a, f and g, for a cell σi, are verified in steps 3ec + levelc(i) + 11, ..., 4ec + 13.

Theorem 3. With a finite set of alphabet objects, we can solve the FSSP in 4ec + 13 steps, where ec is the eccentricity of the commander σc.

Proof. The result is obtained by summing the individual running times of the four phases, as given by Phases 1, 2, 3 and 4: (levelc(i) + 4) + (3ec − levelc(i) + 4) + (levelc(i) + 3) + (ec − levelc(i) + 2) = 4ec + 13.

In Table 1 (see the Appendix), we present the trace of the FSSP algorithm for the P module shown in Figure 1. For convenience, the phase boundaries are shaded in Table 1.
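The total in Theorem 3 is independent of the cell's level, since the level-dependent terms cancel; a quick mechanical check (illustrative only) confirms this for a range of eccentricities:

```python
def total_steps(ec, level):
    """Sum the four phase running times of Section 3 for a cell at the
    given level (0 <= level <= ec)."""
    phase1 = level + 4
    phase2 = 3 * ec - level + 4
    phase3 = level + 3
    phase4 = ec - level + 2
    return phase1 + phase2 + phase3 + phase4

# the +level and -level terms cancel, leaving 4*ec + 13 for every cell
assert all(total_steps(ec, lv) == 4 * ec + 13
           for ec in range(1, 16) for lv in range(ec + 1))
```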
4 Improved Deterministic FSSP Solution Using Cell IDs
The difficulties discussed in Remark 2 are naturally resolved if a cell is able to determine the sender of a convergecast message, i.e., to distinguish between its successors and peers. In this section, we provide a revised FSSP solution, which allows each cell to determine the message sender by having the sender's label (cell ID) attached to the received message; e.g., a cell σi sends an object ci. Since a cell σi is now able to distinguish the messages sent from successors and peers, σi does not need to communicate with all its peers during the convergecast phase. Thus, the number of steps σi takes in Phase 2 (convergecasts
from terminal cells) is reduced by ec, which contributes to the improvement of Theorem 4.
The initial configuration of a P module Π′ = (O′, K′, δ′, P′) (with n cells) for this revised FSSP solution is the same as the initial configuration of the previous FSSP solution in Section 3, with the following adjustments: O′ = {a, b, e, f, g, h} ∪ {ci, c̄i, pi, p̄i | i ∈ {1, ..., n}}. For each cell σi, the set of multiset rewriting rules Ri is given in Phases 1′, 2′, 3′ and 4′. This improved FSSP solution is essentially the FSSP solution given in Section 3, with a reduced running time in Phase 2. Using cell IDs, this revised FSSP solution enables each cell to avoid the steps that were needed to distinguish the message sender. All other aspects of the previous FSSP solution remain the same; thus, the overviews and the correctness proofs of all four phases are omitted.

Phase 1′ (FSSP-I: First broadcast from the commander)
Precondition: The initial configuration of the P module Π′.
Postcondition: Each cell σi ends in state s4 and σi has: countc(i) copies of a; one b, if σi = σc; countc(i) copies of c̄j, for each successor σj of σi; three copies of e, if σi = σc; one f, if σi ∈ F.
Number of steps: For each cell σi, this phase takes levelc(i) + 4 steps.
Rules:
0. For state s0:
   1) s0 a →max s1 abe (pi)↕repl
   2) s0 pj →max s1 a p̄j (ci pi)↕repl
1. For state s1:
   1) s1 a →max s2 a
   2) s1 cj →max s2
   3) s1 pj →max s2
   4) s1 be →max s2 bee
2. For state s2:
   1) s2 a →max s3 a
   2) s2 cj →max s3 c̄j
   3) s2 pj →max s3
   4) s2 be →max s3 bee
3. For state s3:
   1) s3 a →max s4 a

Phase 2′ (FSSP-II: Convergecasts from terminal cells)
Precondition: As described in the postcondition of Phase 1′.
Postcondition: This phase ends when the commander enters state s7. Each cell σi ends in state s7 and σi has: countc(i) copies of a; ec + 2 copies of e, if σi = σc; one f, if σi ∈ F.
Number of steps: For each cell σi, this phase takes 2ec − levelc(i) + 4 steps.
Rules:
4. For state s4:
   1) s4 h →max s5
   2) s4 cj c̄j →max s4
   3) s4 a c̄j →max s4 a c̄j
   4) s4 a →max s4 ah (ci)↕repl
   5) s4 be →max s4 bee
   6) s4 be →max s5
   7) s4 ee →max s5 e
5. For state s5:
   1) s5 a →max s6 a
6. For state s6:
   1) s6 cj p̄j →max s6
   2) s6 p̄j →max s6 p̄j
   3) s6 ck →max s6
   4) s6 a →max s7 a

Phase 3′ (FSSP-III: Second broadcast from the commander)
Precondition: As described in the postcondition of Phase 2′.
Postcondition: Each cell σi ends in state s9 and σi has: countc(i) copies of a; one f, if σi ∈ F; (ec − levelc(i) + 1)countc(i) copies of g.
Number of steps: For each cell σi, this phase takes levelc(i) + 3 steps.
Rules:
7. For state s7:
   1) s7 ae →max s8 ah
   2) s7 e →max s8 g (e)↕repl
8. For state s8:
   1) s8 h →max s8
   2) s8 a →max s9 a
   3) s8 e →max s9

Phase 4′ (FSSP-IV: Counting down for entering the firing state)
Precondition: As described in the postcondition of Phase 3′.
Postcondition: Firing squad cells end in state s10 and the other cells end in state s0. All cells are empty.
Number of steps: For each cell σi, this phase takes ec − levelc(i) + 2 steps.
Rules:
9. For state s9:
   1) s9 ag →max s9 a
   2) s9 af →max s10
   3) s9 a →max s0
   4) s9 a →max s10

Theorem 4. Extending the set of alphabet objects with cell IDs, we can solve the FSSP in 3ec + 13 steps, where ec is the eccentricity of the commander σc.

Proof. The result is obtained by summing the individual running times of the four phases, as given by Phases 1′, 2′, 3′ and 4′: (levelc(i) + 4) + (2ec − levelc(i) + 4) + (levelc(i) + 3) + (ec − levelc(i) + 2) = 3ec + 13.
5 Conclusion
We have proposed improved deterministic FSSP solutions in the framework of P systems, expressed using P modules. Both of our solutions are based on static structures and traditional rules. Our first FSSP algorithm runs in 4ec + 13 steps, where ec is the eccentricity of the initiating cell. Our second FSSP algorithm, which is extended with the facility of using cell IDs, runs in 3ec + 13 steps. Note that the former algorithm benefits from using a finite set of alphabet objects, while the latter algorithm requires a linear number of alphabet objects. We end this paper with a couple of open problems. First, are the multipliers of the running times optimal for our two solutions? Note that we can slightly reduce the number of states and run-time steps (additive constants). Another question relates to the type of channels: are there FSSP solutions for an arbitrary P module that uses simplex channels and whose structure is strongly connected?
References
1. Alhazov, A., Margenstern, M., Verlan, S.: Fast synchronization in P systems. In: Corne, D.W., Frisco, P., Păun, G., Rozenberg, G., Salomaa, A. (eds.) WMC 2008. LNCS, vol. 5391, pp. 118–128. Springer, Heidelberg (2009)
2. Bernardini, F., Gheorghe, M., Margenstern, M., Verlan, S.: How to synchronize the activity of all components of a P system? Int. J. Found. Comput. Sci. 19(5), 1183–1198 (2008)
3. Calude, C.S., Dinneen, M.J., Păun, G., Pérez-Jiménez, M.J., Rozenberg, G. (eds.): UC 2005, October 3-7. LNCS, vol. 3699. Springer, Heidelberg (2005)
4. Dinneen, M.J., Kim, Y.-B., Nicolescu, R.: New solutions to the firing squad synchronization problem for neural and hyperdag P systems. In: Membrane Computing and Biologically Inspired Process Calculi, Third Workshop, MeCBIC 2009, Bologna, Italy, September 5, pp. 117–130 (2009)
5. Dinneen, M.J., Kim, Y.-B., Nicolescu, R.: P systems and the Byzantine agreement. The Journal of Logic and Algebraic Programming, 1–31 (to appear). Also see CDMTCS-375 research report, http://www.cs.auckland.ac.nz/CDMTCS/researchreports/375Byzantine.pdf
6. Freeman, R.L.: Fundamentals of Telecommunications, 2nd edn. Wiley, IEEE Press (2005)
7. Humphrey, T.C.: Cell Cycle Control: Mechanisms and Protocols. Humana Press, Totowa (2005)
8. Imai, K., Morita, K., Sako, K.: Firing squad synchronization problem in number-conserving cellular automata. Fundam. Inform. 52(1-3), 133–141 (2002)
9. Kobayashi, K., Goldstein, D.: On formulations of firing squad synchronization problems. In: Calude et al. (eds.) [3], pp. 157–168
10. Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann Publishers Inc., San Francisco (1996)
11. Mazoyer, J.: A six-state minimal time solution to the firing squad synchronization problem. Theor. Comput. Sci. 50, 183–238 (1987)
12. Nicolescu, R., Dinneen, M.J., Kim, Y.-B.: Structured modelling with hyperdag P systems: Part A. In: Martínez-del-Amor, M.A., Orejuela-Pinedo, E.F., Păun, G., Pérez-Hurtado, I., Riscos-Núñez, A. (eds.) Membrane Computing, Seventh Brainstorming Week, BWMC 2009, Sevilla, Spain, February 2-6, vol. 2, pp. 85–107. Universidad de Sevilla (2009)
13. Păun, G.: Membrane Computing: An Introduction. Springer, New York (2002)
14. Păun, G.: Introduction to membrane computing. In: Ciobanu, G., Pérez-Jiménez, M.J., Păun, G. (eds.) Applications of Membrane Computing. Natural Computing Series, pp. 1–42. Springer, Heidelberg (2006)
15. Schmid, H., Worsch, T.: The firing squad synchronization problem with many generals for one-dimensional CA. In: Lévy, J.-J., Mayr, E.W., Mitchell, J.C. (eds.) IFIP TCS, pp. 111–124. Kluwer, Dordrecht (2004)
16. Silberschatz, A., Galvin, P.B., Gagne, G.: Operating System Concepts, 7th edn. Wiley, Chichester (2004)
17. Szwerinski, H.: Time-optimal solution of the firing-squad-synchronization-problem for n-dimensional rectangles with the general at an arbitrary position. Theor. Comput. Sci. 19(3), 305–320 (1982)
18. Umeo, H., Hisaoka, M., Akiguchi, S.: A twelve-state optimum-time synchronization algorithm for two-dimensional rectangular cellular arrays. In: Calude et al. (eds.) [3], pp. 214–223
Appendix

Fig. 1. A digraph structure for a P module with duplex channels (nodes labeled 1–7).
Table 1. The FSSP trace of the P module shown in Figure 1, where c = 1, e1 = 3 and F = {σ1, σ3, σ5, σ7}. (The table lists the configurations of the cells σ1, ..., σ7 at steps 0–25, with the phase boundaries shaded; at step 25 = 4e1 + 13, the squad cells σ1, σ3, σ5, σ7 fire in state s10, while the other cells return to state s0.)
On Universality of Radius 1/2 Number-Conserving Cellular Automata

Katsunobu Imai¹ and Artiom Alhazov¹,²

¹ Department of Information Engineering, Graduate School of Engineering, Hiroshima University, Higashi-Hiroshima 739-8527, Japan
[email protected]
² Institute of Mathematics and Computer Science, Academy of Sciences of Moldova, Academiei 5, Chișinău MD-2028, Moldova
[email protected]
Abstract. A number-conserving cellular automaton (NCCA) is a cellular automaton whose states are integers and whose transition function keeps the sum of all cells constant throughout its evolution. It can be seen as a kind of model of the physical conservation laws of mass or energy. In this paper we show a construction method for radius 1/2 NCCAs. The local transition function is expressed via a single unary function, which can be regarded as a 'flow' of numbers. In spite of this strong constraint, we construct radius 1/2 NCCAs that simulate any radius 1/2 cellular automaton or any radius 1 NCCA. We also consider the state complexity of these non-splitting simulations (4n² + 2n + 1 and 8n² + 12n − 16, respectively). These results also imply the existence of an intrinsically universal radius 1/2 NCCA.

Keywords: Cellular automata, number-conservation, intrinsic universality, one-way automata, state complexity.
1 Introduction
A number-conserving cellular automaton (NCCA) is a cellular automaton (CA) such that all states of the cells are represented by integers and the sum of the numbers (states) of all cells of a global configuration is preserved throughout the computation. It can be thought of as a kind of model of physical phenomena such as, for example, fluid dynamics and highway traffic flow [12], and constitutes an alternative to differential equations. The number-conserving condition for CAs was first discussed by Hattori et al. [5]. Boccara et al. [1] studied number conservation of one-dimensional CAs on circular configurations. Durand et al. [2,3] considered the two-dimensional case and the relations between several boundary conditions. These results are very useful for deciding whether a given CA is number-conserving, but do not help much with the design of NCCAs with complex transition rules.

C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 45–55, 2010.
© Springer-Verlag Berlin Heidelberg 2010
46
K. Imai and A. Alhazov
Number-conservation is a strong constraint for cellular automata. Reversibility is another physics-inspired strong constraint, but even under it, Morita [11] showed that a radius 1/2 reversible CA can be universal. As for number conservation, Moreira et al. [9] proved the universality of one-dimensional NCCAs in the case that their neighborhood size is four or more, and Kasai [8] proved the same in the case of radius 1.

In this paper, we first discuss the properties of radius 1/2 NCCAs from the viewpoint of designing their local functions. We show a construction method for radius 1/2 NCCAs. The local function is expressed via a single unary function which can be regarded as a 'flow' of numbers. In spite of the strong constraint, radius 1/2 NCCAs have high computing efficiency. Employing the scheme, we show that any radius 1/2 CA can be simulated by a radius 1/2 NCCA. Because there is an intrinsically universal radius 1/2 CA, radius 1/2 NCCAs are thereby shown to be intrinsically universal.
2 Definitions

2.1 Cellular Automata
Definition 1. A deterministic one-dimensional cellular automaton is defined by A = (Z, N, Q, f, q), where Z is the set of all integers, the neighborhood N = {n1, · · · , nk} is a finite subset of Z, Q is a non-empty finite set of states of each cell, f : Q^k → Q is a mapping called the local transition function, and q ∈ Q is a quiescent state that satisfies f(q, · · · , q) = q.

A configuration over Q is a mapping α : Z → Q. Then Conf(Q) = {α | α : Z → Q} is the set of all configurations over Q. The global function of A is F : Conf(Q) → Conf(Q) defined as F(α)(i) = f(α(i + n1), · · · , α(i + nk)) for all α ∈ Conf(Q), i ∈ Z.

We call the value r = (max_{1≤i≤k} ni − min_{1≤i≤k} ni)/2 the radius of A. A neighborhood {−1, 0, 1} corresponds to radius 1, while a neighborhood {0, 1} corresponds to radius 1/2. In these cases we replace N by r in the notation of cellular automata.

In what follows we assume Q is a subset of integers. We denote the maximum (minimum) element of a set Q by max(Q) (min(Q)), and the maximum (minimum) value of the range of a function f by max(f) (min(f)). Throughout the paper we write [q1, q2] to denote the range of integers from q1 to q2: {i ∈ Z | q1 ≤ i ≤ q2}.

We now recall the definition of the number-conserving condition, adapted for the case when cells may have negative values.

Definition 2. [3] A cellular automaton A = (Z, r, Q, f, q) is said to be number-conserving iff F(α0) = α0 and

  lim inf_{n→∞} μn(F(α))/μn(α) = lim sup_{n→∞} μn(F(α))/μn(α) = 1  for all α ∈ Conf(Q) \ {α0},

where F is the global function of A, α0 is the zero configuration, i.e., the configuration in which the value of every cell is min Q, and μn(α) = Σ_{i=−n}^{n} (α(i) − min Q).
2.2 Simulation
We briefly discuss the concept of simulation between two CAs. It expresses that if a CA A simulates one step of a CA B in τ steps (denoted by B ≺^τ A), there must exist a correspondence between their configurations. A general way to formally define this would raise many questions (particularly, how to make sure that B is simulated by A and not by the correspondence itself). In what follows we only consider simulations that are linear time, linear space, and such that only a bounded number of cells in A need to be checked to obtain the value of a cell in B. Hence, the value of cell i in step t of a computation of B starting from β can be computed by looking at the radius σ ∈ N neighborhood of cell π(t, i) in step tτ of a computation of A starting from α, where π is a linear function, and the values of α are effectively computable from the values of the cells of β.

Notice that simulating k cells by one cell also fits in the concept of linear space, so π(t, i) might involve rounding to obtain an integer. In this case, π(t, i) may return the same value for different i; hence, the value of a cell i may also depend on ρ(i), the remainder of dividing i by k. In the general case of linear space, we assume that ρ(i) is some periodic function.

It may be important for applications that the values of cells represent certain particles, and one may wish to simulate each particle by a single particle, i.e., to be able to recover the value of each cell of B by looking at just one cell of A. If σ = 0 then we call the simulation non-splitting. A particularly interesting case is when entire configurations of B are obtained from configurations of A via some morphism h.
3 Number-Conserving Conditions
In this section we recall the necessary and sufficient conditions for radius 1 and radius 1/2 CAs to be number-conserving, as special cases of results from [1,5].

Proposition 1. A deterministic one-dimensional CA A = (Z, 1, Q, f, q) is number-conserving iff there exists a function ϕ : Q² → Z such that f(l, c, r) = c − ϕ(l, c) + ϕ(c, r) for all c, l, r ∈ Q.

The function ϕ represents the movement of numbers between two neighboring cells. We call this function the flow function. Although the representation by flow functions is very similar to the motion representation [1], it is more convenient for designing an NCCA. When we construct an NCCA in which special computing processes are embedded, the flow function ϕ̃ may be defined on a partial configuration set and a state set Q̃, by assigning certain values to ϕ̃ on Q̃². But Q̃ does not attain the state set of A: the state set must be closed under the operation of ϕ̃ [7]. The proper state set Q of A is calculated as follows:

  Q = Q̃ ∪ {c + ϕ̃(c, l) − ϕ̃(r, c) | l, c, r ∈ Q̃}.
The flow function ϕ̃ also needs to be extended to ϕ : Q² → Z, e.g., as

  ϕ(a, b) = ϕ̃(a, b) if a, b ∈ Q̃, and ϕ(a, b) = 0 otherwise.

We will construct universal NCCAs using this representation in Section 4. In the case of radius 1/2, the flow function has arity one. We present and prove the "necessary" direction of the following criterion.

Proposition 2. A deterministic one-dimensional CA A = (Z, 1/2, Q, g, q) is number-conserving iff there exists a function ψ : Q → Z such that g(c, r) = c − ψ(c) + ψ(r) for all c, r ∈ Q.

Proof. Define δ(c, r) as g(q, c) + g(c, r) + g(r, q) − q − c − r. With respect to the configuration · · · , q, q, c, r, q, q, · · ·, only four cells change their states in the next step. Then for any c, r ∈ Q, δ(c, r) = 0 must hold to preserve number conservation. It follows that δ(c, r) − δ(q, r) = 0. By expanding this equation and cancelling g(q, q) − q = 0, we obtain g(c, r) = c − g(q, c) + g(q, r). The necessary condition is satisfied by taking ψ(c) ≡ g(q, c).

The meaning of the formula is that a focus cell i sends the value ψ(α(i)) to the left cell (i − 1). Similarly, it receives the value ψ(α(i + 1)) from the right cell (i + 1).

Although this proposition can be used for identifying the function ψ from any given radius 1/2 NCCA A, unlike in the radius 1 case, it is not sufficient to obtain an NCCA corresponding to a given 'partially defined' unary function ψ by just extending its state set and assigning 0 to the extended range of the flow function.

Example 1. For arbitrary integer constants s, t, k (0 < k < s, s ≠ t, s ≠ t + k), let Q̃_ex = {0, s, t} and let ψ_ex be the following function:

  ψ_ex(x) = k if x = s, and ψ_ex(x) = 0 if x ≠ s.

Suppose A_ex = (Z, 1/2, Q_ex, g_ex, 0) is an NCCA corresponding to ψ_ex, i.e., g_ex(a, b) = a − ψ_ex(a) + ψ_ex(b), where the 'closed' set Q_ex extends Q̃_ex. Consider configurations of type · · · , 0, t, s, s, · · · , s, 0, 0, · · ·, where the number of consecutive cells with value s is l. It is easy to see that, starting in such a configuration, after l steps automaton A_ex will have value t + lk in cell 0. Since l can be unboundedly large, so can be the values in Q_ex. Thus A_ex cannot be a cellular automaton.

Thus we need another way of extending a partially defined state set for the case of radius 1/2. To prevent the divergence of states shown in Example 1, if a cell has an excessive value, the excess must be moved to a neighboring cell.

Remark 1. In the case of radius 1 CAs, the value of the focus cell is used in its flow function, so such divergence never occurs. However, in the case of radius 1/2 CAs,
the flow is independent of the focus cell. These phenomena are not limited to the case of radius 1/2. In the case of two-dimensional radius 1, there are two types of flow functions of arity 2, and one of them has a similar property [15].

We now specify when a unary flow function corresponds to a cellular automaton, for which it is necessary and sufficient for the next values of cells to be in the state set. A similar condition has been established in [4].

Proposition 3. Let Q = [q1, q2]. A unary flow function ψ induces a well-defined cellular automaton iff q1 ≤ c − ψ(c) + min(ψ) and q2 ≥ c − ψ(c) + max(ψ) for any c ∈ Q.

This gives us a practical method to extend the set Q and the flow function in such a way that they induce a well-defined cellular automaton.

Proposition 4. Any partially defined unary flow function and its corresponding state set induce a well-defined radius 1/2 number-conserving cellular automaton.

Proof. Consider the state set Q̄ and the flow function ψ̄ defined on Q̄. First, consider the range Q̃ = [min Q̄, max Q̄] and extend the flow function to ψ̃ : Q̃ → Z by defining its values for c ∈ Q̃ \ Q̄. This can be done arbitrarily, but preferably satisfying the conditions min(ψ̄) ≤ ψ̃(c) ≤ max(ψ̄) and max(ψ̄) + c − max Q̄ ≤ ψ̃(c) ≤ min(ψ̄) + c − min Q̄.

The second step is defining the flow on an extended state set, which will be closed with respect to transitions. Define

  q1 = min_{c∈Q̃}(c − ψ̃(c)) + min(ψ̃),
  q2 = max_{c∈Q̃}(c − ψ̃(c)) + max(ψ̃).

Now set Q = [q1, q2] and extend ψ̃ to ψ : Q → Z by ψ(c) = max(ψ̃) if c > max(Q̃) and ψ(c) = min(ψ̃) if c < min(Q̃).

This ψ satisfies the criterion of Proposition 3. Indeed, the values q1 and q2 have been chosen in such a way that the criterion holds for values c ∈ Q̃. The choice of minimal flow for values smaller than min(Q̃) (and of maximal flow for values bigger than max(Q̃)) guarantees that in the next step the value of the corresponding cell will not decrease (will not increase, respectively). The other fact, i.e., that the cells with "small values" ("big values") will have in the next step values not exceeding q2 (not smaller than q1, respectively), follows from the corresponding inequalities, satisfied even by cells with minimal (maximal) flow that are in Q̃.

Example 2. Consider the case of s = 2, t = 3, and k = 1 in Example 1. Q̄ = {0, 2, 3} and ψ̄(0) = 0, ψ̄(2) = 1, ψ̄(3) = 0. Then Q̃ = [0, 3] and

  ψ̃(x) = 0 if x = 1, and ψ̃(x) = ψ̄(x) if x ≠ 1
(ψ̃(1) can be chosen as 0 or 1). Only in the case of c = 3, max_{c∈Q̃}(c − ψ̃(c)) + max(ψ̃) = 4 is not in Q̃. Thus the proper state set is Q = [0, 4] and

  ψ(x) = max(ψ̃) = 1 if x = 4, and ψ(x) = ψ̃(x) if x ≠ 4.
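The extension procedure in the proof of Proposition 4 is mechanical, so it can be checked by machine. The following sketch (our own illustration, not from the paper; function and variable names are hypothetical) applies the two extension steps to the data of Example 2 and verifies the criterion of Proposition 3. Giving the missing state ψ̃(1) zero flow is one admissible choice, as noted in the example.

```python
def extend_flow(psi_bar):
    """Extend a partially defined unary flow function (Proposition 4).

    psi_bar maps each state of Q-bar to its flow value.
    Returns (q1, q2, psi) with psi defined on the closed range [q1, q2].
    """
    lo, hi = min(psi_bar), max(psi_bar)            # min Q-bar, max Q-bar
    # Step 1: extend to Q-tilde = [min Q-bar, max Q-bar]; zero flow for the
    # missing states is one admissible choice (it suits Example 2).
    psi_t = {c: psi_bar.get(c, 0) for c in range(lo, hi + 1)}
    mn, mx = min(psi_t.values()), max(psi_t.values())
    # Step 2: the closed state set Q = [q1, q2] from the proof.
    q1 = min(c - f for c, f in psi_t.items()) + mn
    q2 = max(c - f for c, f in psi_t.items()) + mx
    # Boundary states get minimal flow below Q-tilde, maximal flow above.
    psi = {c: psi_t[c] if c in psi_t else (mx if c > hi else mn)
           for c in range(q1, q2 + 1)}
    return q1, q2, psi

# Example 2: Q-bar = {0, 2, 3}, psi-bar(2) = 1, psi-bar(0) = psi-bar(3) = 0.
q1, q2, psi = extend_flow({0: 0, 2: 1, 3: 0})
assert (q1, q2) == (0, 4) and psi[4] == 1
# Criterion of Proposition 3: every next value stays inside [q1, q2].
mn, mx = min(psi.values()), max(psi.values())
assert all(q1 <= c - psi[c] + mn and c - psi[c] + mx <= q2 for c in psi)
```

Running this reproduces the proper state set Q = [0, 4] with ψ(4) = 1 obtained above.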
In spite of the fact that the neighborhood size of a radius 1/2 NCCA is two, it can only sense a single cell value to determine the amount of movement. In the usual terminology, a radius 1/2 CA is also called one-way. But in the case of a radius 1/2 NCCA, if its flow function has both positive and negative values, the movement of particles is 'bidirectional'. This does not agree with the 'natural' sense of one-way, so we use the word one-way for a radius 1/2 NCCA only when every value of its flow function is non-negative.

Let A be a radius 1/2 cellular automaton whose corresponding flow function is ψ. By replacing ψ(c) with ψ(c) − min(ψ), any radius 1/2 NCCA can be made one-way without changing A. The term min(ψ) can be regarded as a 'constant background flow'. Even if the range of ψ contains negative values, these negative values can also be erased in the following way: subtract min(Q) from each state in Q and from each argument of ψ(x).

Remark 2. Imagine a number-conserving particle-based computing system which employs the scheme of a one-way NCCA. If ψ is replaced by ψ(c) − max(ψ), then every movement of numbers as 'particles' is directed to the right. From the viewpoint of particle movement, the system is 'one-way' to the right. But from the viewpoint of cellular automata, the corresponding NCCA is 'one-way' to the left. Hence, the direction of the particle movement is independent of the 'information flow' of the corresponding NCCA. Even with this strong constraint, radius 1/2 (and strictly one-way) NCCAs can be universal.
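The 'constant background flow' argument can be checked directly: in the local rule g(c, r) = c − ψ(c) + ψ(r), adding a constant to ψ cancels, so the global behavior is unchanged. A small sketch (Python for illustration; the flow table is an arbitrary made-up example, not from the paper) on a circular configuration:

```python
def step(config, psi):
    # One step of a radius 1/2 NCCA on a circular configuration:
    # g(c, r) = c - psi(c) + psi(r), with r the right neighbor.
    n = len(config)
    return [config[i] - psi[config[i]] + psi[config[(i + 1) % n]]
            for i in range(n)]

psi = {0: -2, 1: 1, 2: 3}                # a flow with negative values
# Shift by -min(psi): the 'constant background flow' normalization.
one_way = {c: f - min(psi.values()) for c, f in psi.items()}

config = [0, 2, 1, 1, 2, 0]
a = step(config, psi)
b = step(config, one_way)
assert a == b                            # global behavior is unchanged
assert all(f >= 0 for f in one_way.values())  # the shifted flow is one-way
assert sum(a) == sum(config)             # number conservation on the ring
```

On a ring the sum is conserved by telescoping, which is why the last assertion holds for any flow table.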
4 Simulations by Radius 1/2 Number-Conserving Cellular Automata
Theorem 1. For any radius 1/2 CA A = (Z, 1/2, Q, g, q), there is a radius 1/2 number-conserving CA A_NC such that A ≺² A_NC. The simulation is linear space and non-splitting, and the correspondence can be defined by a morphism.

Proof. Let A = (Z, 1/2, Q_A, g, q) be a radius 1/2 CA to be simulated. The idea of the construction is the following. A cell c0 is represented as a pair of cells c0, 1 − c0, whose sum is constant. To simulate a binary transition function, a cell needs information about the values of two neighboring cells. The value of a cell (multiplied by a factor such that both values are uniquely decodable) is moved to the neighboring cell in the first step. In the second step this value is returned. In addition, the original value of the cell is erased and the new value is written; writing and erasing are done by redistributing values between the cells in a pair.
We construct a radius 1/2 NCCA A_NC = (Z, 1/2, Q_{A_NC}, σ, q_{A_NC}).

Let Q̃_A = [1, |Q_A|] and d = |Q_A|. We use the fact that any positive integer x has a unique representation as s + dt, s, t ∈ Z, 1 ≤ s ≤ d. Indeed, t = ⌊(x − 1)/d⌋ and s = x − dt. Likewise, any non-positive integer y has a unique representation as −s − dt + 1, s, t ∈ Z, 1 ≤ s ≤ d. Indeed, set x = −y + 1 and use the same formulas as above.

Take Q = [−d² − d + 1, d² + d]. For each s, t ∈ Q̃_A, assign

  ψ(s) = −ds,                          (1)
  ψ(1 − s) = 0,                        (2)
  ψ(1 − s − dt) = −s + g(s, t),        (3)
  ψ(t + ds) = ds.                      (4)
The automaton is defined by extending Q and ψ as described in the previous section. In the first step of the simulation, the values of ψ defined in (1), (2) are used. In the second step, the values of ψ defined in (3), (4) are used. Only the values of (4) with t = s are used in the actual simulation; the others are defined this way to decrease the state set in the final result.
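Equations (1)–(4) can be exercised on a small example. The sketch below (our own illustration, not from the paper) builds ψ for a made-up three-state radius 1/2 rule g, runs two NCCA steps on a circular configuration (used here to avoid boundary bookkeeping; the paper works on infinite configurations with quiescent padding), and checks that the even cells decode to one step of the simulated CA while the sum is conserved:

```python
d = 3
g = lambda s, t: (s + t - 2) % d + 1   # an arbitrary g : [1,d]^2 -> [1,d]

# Flow function of the simulating NCCA, per (1)-(4).
psi = {}
for s in range(1, d + 1):
    psi[s] = -d * s                    # (1)
    psi[1 - s] = 0                     # (2)
for s in range(1, d + 1):
    for t in range(1, d + 1):
        psi[1 - s - d * t] = -s + g(s, t)   # (3)
        psi[t + d * s] = d * s              # (4)

def ncca_step(conf):
    # g(c, r) = c - psi(c) + psi(r) on a ring.
    n = len(conf)
    return [conf[i] - psi[conf[i]] + psi[conf[(i + 1) % n]] for i in range(n)]

cells = [1, 3, 2, 2, 3]                      # a configuration of A
enc = [v for c in cells for v in (c, 1 - c)] # each cell c becomes the pair (c, 1-c)

after = ncca_step(ncca_step(enc))            # two NCCA steps = one step of A
decoded = after[0::2]
expected = [g(cells[i], cells[(i + 1) % len(cells)])
            for i in range(len(cells))]
assert decoded == expected                   # even cells hold g(c_i, c_{i+1})
assert after[1::2] == [1 - v for v in decoded]  # odd cells hold 1 - g(...)
assert sum(after) == sum(enc)                # number conservation
```

The unique-representation ranges in the proof guarantee that the four assignment families in (1)–(4) never collide, which is why the dictionary construction above is well defined.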
Fig. 1. Simulating a transition. (The table illustrates the two NCCA steps: at time t the configuration encodes · · · , c0, 1 − c0, c1, 1 − c1, c2, 1 − c2, · · ·; at time t + 1 the cells hold c0 + dc0, 1 − c0 − dc1, c1 + dc1, 1 − c1 − dc2, · · ·; at time t + 2 the configuration is equivalent to · · · , g(c0, c1), 1 − g(c0, c1), g(c1, c2), 1 − g(c1, c2), · · ·.)
The new rows in the figure represent the movement values. Summing up all numbers, it is easily verified that C(2t + 2)(2i) = g(C(2t)(2i), C(2t)(2i + 2)) and C(2t + 2)(2i + 1) = 1 − g(C(2t)(2i), C(2t)(2i + 2)). Therefore, A_NC simulates A by taking π(t, i) = 2i, σ = 0, and the simulating configuration for · · · , c0, c1, c2, · · · as shown in Figure 1. The corresponding morphism preserves positive values and erases all others.

Corollary 1. The state complexity of a non-splitting simulation of radius 1/2 CAs with n states by radius 1/2 NCCAs does not exceed 4n² + 2n + 1.
Proof. The cardinality of the set Q̃ of useful states is 2n² + 2n. Possible flow values are in the range [−n², n²], see (1), (4). The values of the expression c − ψ(c) for c ∈ Q̃ are in the range [−n² − n, n² + n], see (3), (1). Hence, we can take Q = [−2n² − n, 2n² + n].

Corollary 2. There exists an intrinsically universal one-dimensional one-way number-conserving cellular automaton.

Proof. It is well known that an n-state radius 1 CA can be simulated by an n²-state one-way CA. Since there exists a four-state radius 1 intrinsically universal CA [13], there exists a 16-state one-way intrinsically universal CA. The claim follows from Theorem 1.

The state bound following from Corollary 1 is 1057, which probably can be significantly improved. Therefore, there exists a concrete radius 1/2 NCCA that can simulate any CA. However, such a simulation inherits one feature from the intrinsically universal CA, namely, one cell is simulated by many cells. In the remaining part of the paper we continue studying non-splitting simulations, where it makes sense to speak about the state complexity of transformations between different classes of CA despite intrinsic universality.

Corollary 3. Any radius 1 NCCA can be simulated in a non-splitting way by a radius 1/2 NCCA.

Proof. A radius 1 NCCA is a particular case of a radius 1 CA, which can be simulated in a non-splitting way by a radius 1/2 CA, which can in turn be simulated in a non-splitting way by a radius 1/2 NCCA.

However, such an approach does not benefit from the fact that the original automaton is number-conserving, and thus the complexity of this approach is O(n⁴). We now present a transformation preserving number conservation, which yields a better complexity bound.

Theorem 2. For any radius 1 NCCA A = (Z, 1, Q, f, q), there is a radius 1/2 NCCA A_{1/2} such that A ≺² A_{1/2}. The simulation is real space and non-splitting, and the correspondence can be defined by a morphism.

Proof. Let A = (Z, 1, Q_A, f, q) be a radius 1 NCCA to be simulated. The idea of the construction is the following. To simulate a binary flow function, a cell needs information about the values of two neighboring cells. The value of a cell (multiplied by a factor such that both values are uniquely decodable) is moved to the neighboring cell in the first step. In the second step this value is returned. In addition, the original value of the cell is moved to the left together with the value defined by the binary flow function. Since in this way we would not be able to distinguish between step 1 and step 2 if the incoming flow is the same as the outgoing flow, we additionally need to embed the time period, e.g., in a way · · · , 0, 3, 0, 3, · · · ⇒ · · · , 2, 1, 2, 1, · · · ⇒ · · · , 0, 3, 0, 3, · · ·. Similarly to Theorem 1,
multiple numbers can be encoded in a single number with appropriate multipliers. The current value is represented by multiples of 4, and the value of the neighboring cell in A is represented by multiples of 4d.

The flow function ϕ of A can be derived from its transition function f(l, c, r) = c − ϕ(l, c) + ϕ(c, r) in the following way. Let Q̃_A = [1, |Q_A|] and d = |Q_A|. It is known that in order for an automaton to be number-conserving, both the smallest state 1 and the largest state d have to be quiescent; we assume ϕ(d, d) = 0. Then f(c, r, d) = r − ϕ(c, r) + ϕ(r, d), so ϕ(c, r) = r + ϕ(r, d) − f(c, r, d) and ϕ(r, d) = d + ϕ(d, d) − f(r, d, d). After substitution, we obtain ϕ(c, r) = r + d − f(r, d, d) − f(c, r, d) for c, r ∈ Q̃_A. Since r and the values of f are in the range [1, d], it follows that −d + 1 ≤ ϕ(c, r) ≤ 2d − 2 for r, c ∈ Q̃_A.
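The closed-form derivation ϕ(c, r) = r + d − f(r, d, d) − f(c, r, d) can be tested on a concrete radius 1 NCCA. The sketch below uses a two-state traffic rule (a rule-184 analogue on states {1, 2}; this example is ours, not from the paper) and checks both the decomposition of Proposition 1 and number conservation on a ring:

```python
d = 2

def f(l, c, r):
    # Traffic CA on states {1, 2}: a "car" (2) moves right into an empty cell (1).
    return 2 if (c == 2 and r == 2) or (l == 2 and c == 1) else 1

# Both extreme states are quiescent, as required: f(1,1,1) = 1, f(2,2,2) = 2.
assert f(1, 1, 1) == 1 and f(2, 2, 2) == 2

# Derive the flow function: phi(c, r) = r + d - f(r, d, d) - f(c, r, d).
phi = {(c, r): r + d - f(r, d, d) - f(c, r, d)
       for c in (1, 2) for r in (1, 2)}

# The decomposition of Proposition 1 must hold for all triples.
for l in (1, 2):
    for c in (1, 2):
        for r in (1, 2):
            assert f(l, c, r) == c - phi[(l, c)] + phi[(c, r)]

# Number conservation on a circular configuration.
conf = [1, 2, 2, 1, 2, 1, 1, 2]
nxt = [f(conf[i - 1], conf[i], conf[(i + 1) % len(conf)])
       for i in range(len(conf))]
assert sum(nxt) == sum(conf)
```

For this rule the derived flow is ϕ(2, 1) = −1 and 0 elsewhere: exactly one unit moves leftward past each boundary where a car is about to advance, which matches the intuition of ϕ as a movement of numbers between neighboring cells.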
···
··· t+ 1 ··· ··· ···
4c0
4c1 + 3
4c2
4c0 −4dc0 +4dc1 +2
4c1 + 3 −4dc1 +4dc2 −2
4c2 −4dc2 +4dc3 +2
⇓
4c3 + 3
4c4
4c5 + 3
···
4c3 + 3 −4dc3 +4dc4 −2
4c4 −4dc4 +4dc5 +2
4c5 + 3 −4dc5 +4dc6 −2
··· ··· ··· ···
⇓ ··· 4c0 4c1 + 3 4c2 4c3 + 3 · · · −4dc0 −4dc1 −4dc2 −4dc3 +4dc2 +4dc3 +4dc4 · · · +4dc1 ··· +2 −2 +2 −2 · · · +4dc0 +4dc1 +4dc2 +4dc3 −4dc2 −4dc3 −4dc4 t + 2 · · · −4dc1 +1 −1 +1 −1 ··· ··· −4c0 −4c1 −4c2 −4c3 +4c1 +4c2 +4c3 +4c4 ··· · · · +4ϕ(c1 , c2 ) +4ϕ(c2 , c3 ) +4ϕ(c3 , c4 ) +4ϕ(c4 , c5 ) · · · −4ϕ(c0 , c1 ) −4ϕ(c1 , c2 ) −4ϕ(c2 , c3 ) −4ϕ(c3 , c4 ) equivalent to: · · · 4c1 + 3 4c2 4c3 + 3 4c4 · · · +4ϕ(c1 , c2 ) +4ϕ(c2 , c3 ) +4ϕ(c3 , c4 ) +4ϕ(c4 , c5 ) · · · −4ϕ(c0 , c1 ) −4ϕ(c1 , c2 ) −4ϕ(c2 , c3 ) −4ϕ(c3 , c4 )
4c4 4c5 + 3 −4dc4 −4dc5 +4dc5 +4dc6 +2 −2 +4dc4 +4dc5 −4dc5 −4dc6 +1 −1 −4c4 −4c5 +4c5 +4c6 +4ϕ(c5 , c6 ) +4ϕ(c6 , c7 ) −4ϕ(c4 , c5 ) −4ϕ(c5 , c6 )
··· ··· ··· ··· ··· ··· ··· ··· ··· ··· ···
4c5 + 3 4c6 ··· +4ϕ(c5 , c6 ) +4ϕ(c6 , c7 ) · · · −4ϕ(c4 , c5 ) −4ϕ(c5 , c6 ) · · ·
Fig. 2. Simulating a transition
We construct a radius 1/2 NCCA A_{1/2} = (Z, 1/2, Q_{A_{1/2}}, σ, q_{A_{1/2}}).

Take Q = [−4d² + 8d + 1, 4d² − 4d + 6]. For each s, t ∈ Q̃_A, assign

  ψ(4s) = 4ds,                                        (5)
  ψ(4s + 3) = 4ds + 2,                                (6)
  ψ(4dt − 4ds + 4s + 2) = −4ds + 4s + 4ϕ(s, t),       (7)
  ψ(4dt − 4ds + 4s + 1) = −4ds + 4s + 4ϕ(s, t) + 1.   (8)
The automaton is defined by extending Q and ψ as described in the previous section. In the first step of the simulation, the values of ψ defined in (5), (6) are used. In the second step, the values of ψ defined in (7), (8) are used.

The new rows in the figure represent the movement values. Summing up all numbers, it is easily verified that C(2t + 2)(i) = C(2t)(i + 1) − 4ϕ(⌊C(2t)(i)/4⌋, ⌊C(2t)(i + 1)/4⌋) + 4ϕ(⌊C(2t)(i + 1)/4⌋, ⌊C(2t)(i + 2)/4⌋). Therefore, A_{1/2} simulates A by taking π(t, i) = i − t, σ = 0, and the simulating configuration for · · · , c0, c1, c2, · · · as shown in Figure 2. The corresponding morphism divides the values by 4, rounding down.

Corollary 4. The state complexity of a non-splitting simulation of radius 1 NCCAs with state range [1, n] by radius 1/2 NCCAs does not exceed 8n² + 12n − 16.

Proof. It is not difficult to see that in the case n = 1 there exists only one radius 1 NCCA with one state, and it is simulated by the only radius 1/2 NCCA with one state. In the following we assume n ≥ 2.

The cardinality of the set Q̃ of useful states is 8n² − 12n + 6. To improve the complexity estimate, we add 4d² + 4d − 4 to (7) and (8); this constant background flow for the second step does not affect the actual simulation, since each cell both sends and receives this value. Possible flow values are now in the range [4n, 4n² + 8n − 7], see (5), (7), (8). The values of the expression c − ψ(c) for c ∈ Q̃ are in the range [−4n² − 8n + 12, 2], see (7), (8). Hence, we can take Q = [−4n² − 4n + 12, 4n² + 8n − 5].
5 Conclusion
In this paper, we have discussed radius 1/2 NCCAs and presented a method to construct them. Employing this method, we constructed a radius 1/2 NCCA which simulates any n-state one-way CA and a radius 1/2 NCCA which simulates any n-state radius 1 NCCA. We recall that a simulation is non-splitting if the value of each cell in the simulated automaton can be computed from the value of a single cell of the simulating automaton. The state complexities of the non-splitting constructions mentioned above are 4n² + 2n + 1 and 8n² + 12n − 16, respectively. Thus, in spite of the strong constraints, radius 1/2 NCCAs can be universal. Moreover, if the non-splitting requirement is removed, then the state complexity of the simulations is bounded by a fixed number.

Radius 1/2 reversible cellular automata are also known to be universal [11]. What about the case of CAs that are both reversible and number-conserving? Although the behavior of reversible and number-conserving cellular automata, particularly in the radius 1/2 case, seems to be quite predictable [14] for small state sets, the general question is still open.

Acknowledgments. Artiom Alhazov gratefully acknowledges the support of the Japan Society for the Promotion of Science and the Grant-in-Aid for Scientific Research, project 20·08364. He also acknowledges the support of the Science and Technology Center in Ukraine, project 4032.
References

1. Boccara, N., Fukś, H.: Number-conserving cellular automaton rules. Fundamenta Informaticae 52, 1–13 (2003)
2. Durand, B., Formenti, E., Grange, A., Róka, Z.: Number conserving cellular automata: new results on decidability and dynamics. In: Morvan, M., Rémila, É. (eds.) Proceedings of Discrete Models for Complex Systems, DMCS 2003. Discrete Mathematics and Theoretical Computer Science, vol. AB, pp. 129–140 (2003)
3. Durand, B., Formenti, E., Róka, Z.: Number conserving cellular automata I: decidability. Theoretical Computer Science 299(1-3), 523–535 (2003)
4. Fukś, H., Sullivan, K.: Enumeration of number-conserving cellular automata rules with two inputs. Journal of Cellular Automata 2(2), 141–148 (2007)
5. Hattori, T., Takesue, S.: Additive conserved quantities in discrete-time lattice dynamical systems. Physica D 49, 295–322 (1991)
6. Imai, K., Fujita, K., Iwamoto, C., Morita, K.: Embedding a logically universal model and a self-reproducing model into number-conserving cellular automata. In: Calude, C.S., Dinneen, M.J., Peper, F. (eds.) UMC 2002. LNCS, vol. 2509, pp. 164–175. Springer, Heidelberg (2002)
7. Imai, K., Ikazaki, A., Iwamoto, C., Morita, K.: A logically universal number-conserving cellular automaton with a unary table-lookup function. Trans. IEICE E87-D(3), 694–699 (2004)
8. Kasai, Y.: Number-conserving cellular automata with universality under errors. Master's thesis, Hiroshima University (2003)
9. Moreira, A.: Universality and decidability of number-conserving cellular automata. Theoretical Computer Science 292, 711–721 (2003)
10. Moreira, A., Boccara, N., Goles, E.: On conservative and monotone one-dimensional cellular automata and their particle representation. Theoretical Computer Science 325, 285–316 (2004)
11. Morita, K.: Computation-universality of one-dimensional one-way reversible cellular automata. Information Processing Letters 42, 325–329 (1992)
12. Nagel, K., Schreckenberg, M.: A cellular automaton model for freeway traffic. Journal de Physique I 2, 2221–2229 (1992)
13. Ollinger, N., Richard, G.: A particular universal cellular automaton. In: Proc. International Workshop on The Complexity of Simple Programs (CSP 2008), pp. 205–214 (2008)
14. Scharanko, A., Oliveira, P.: Derivation of one-dimensional, reversible, number-conserving cellular automata rules. In: Proc. 15th International Workshop on Cellular Automata and Discrete Complex Systems (Automata 2009), pp. 335–345 (2009)
15. Tanimoto, N., Imai, K.: A characterization of von Neumann neighbor number-conserving cellular automata. Journal of Cellular Automata 4(1), 39–54 (2009)
DNA Origami as Self-assembling Circuit Boards

Kyoung Nan Kim1, Koshala Sarveswaran1, Lesli Mark2, and Marya Lieberman1

1 Dept. of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
2 St. Joseph High School, South Bend, IN 46556, USA
Abstract. DNA origami have the potential to serve as self-assembling circuit boards for nanoelectronic devices. This paper focuses on understanding just one aspect of the hierarchical self-assembly of DNA origami—the oligomerization of individual origami to form chains of aligned and oriented origami. The eventual goal is to place small numbers of nanomagnets in specific locations on the DNA origami in such a way that the self-assembly of the origami causes a magnetic cellular automaton device, such as a wire, to be formed. Four strategies for forming well ordered chains of DNA origami were compared by examination of AFM images of DNA origami chains deposited on mica. Preliminary results of patterned deposition of DNA origami on lithographic patterns are also reported. Keywords: DNA origami, quantum-dot cellular automata, magnetic QCA, self assembly.
1 Background

In lithography, information is written onto the substrate to define the locations where different materials are deposited or removed. In most lithographic processes, the materials that are deposited—silicon dioxide, metals, resists—are relatively uncomplicated. Recently, several groups have shown that more complex materials are capable of creating sub-lithographic features; for example, block-co-polymer thin films.[1] Although these examples demonstrate the startling utility of nanostructured materials and self-assembly for nanoelectronics fabrication, the long-range homogeneity of the materials imposes limitations on their general use. For example, block co-polymer phases cannot be programmed with arbitrary properties at arbitrary sites.

Several groups are now working to exploit one of the most information-dense materials known, DNA origami, to construct magnetic QCA (Quantum-dot Cellular Automata) circuitry. MQCA systems can operate at room temperature, and there is a close physical match between the scale of the nanomagnets and the scale of the DNA origami. Importantly, the self-assembly and surface-binding interactions of the DNA origami could ultimately allow fabrication of sub-10 nm circuits on patterns made by optical lithography.

1.1 Background on QCA and Magnetic Cellular Automata

The QCA computing paradigm represents information by using binary numbers, but instead of representing bits as "on" or "off" states of a transistor, the bits are

C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 56–68, 2010.
© Springer-Verlag Berlin Heidelberg 2010
represented by the charge state of a cell having a bistable charge configuration or, in the case of magnetic QCA, the state of a nanomagnet that has a bi-stable magnetic dipole orientation. One state represents a binary '1', the other a binary '0'.[2,3,4] In the transistor paradigm, the current from one device charges the gate of the next device and turns the device on or off. In the QCA paradigm, the field from the charge configuration of one device alters the charge configuration of the next device. This basic device-device interaction is sufficient to allow for the computation of any Boolean function.[5,6]

Fig. 1. Magnetic QCA wire. Top: SEM image. Bottom: Magnetic force microscopy image.

QCA cells also can be used to construct interconnects between computing structures, which is critical for scaling of devices and architectures. If a clocking potential is added which modulates the energy barrier between charge configurations, general-purpose computing becomes possible with low power dissipation.

Magnetic QCA (MQCA) uses physical interactions between densely-packed nanomagnets to create logic functionality.[7,8] Each magnetic element uses shape anisotropy to hold binary information, and thus also can act as a memory element.[9] MQCA offers several attractive features, principally low power and nonvolatility. Since the state information is represented by the magnetization orientation, there is no stand-by power dissipation. Power is dissipated during switching, and the total power is the sum of the power required for magnetization reversal of the magnetic dots and that needed to generate the magnetic switching fields. It has been estimated that MQCA circuits could outperform CMOS in terms of energy-delay product.[10] MQCA involves patterned magnetic materials, where the neighboring elements are intentionally placed sufficiently close to each other that they can influence each other through their fringing fields.
This is an undesirable feature for data-storage applications, but is exploited here to provide local interconnectivity for logic functionality. Figure 1 shows an experimental demonstration of a magnetic QCA wire; the top image is a scanning electron micrograph of the physical structure of the nanomagnets, which are roughly 100 x 200 nm in size, while the bottom image shows the orientation of the magnetic dipoles recorded via magnetic force microscopy.[11]

In previous work, logic functionality in properly structured arrays of MQCA nanomagnet elements has been demonstrated experimentally at room temperature. Specifically, a group of researchers at Notre Dame have demonstrated operation of a three-input majority-logic gate as observed by MFM (magnetic force microscopy, see Fig. 2) [Imre, 2006].[12] This logic gate is computationally universal since it can be reduced to either a standard Boolean AND or OR gate.

Fig. 2. Nanomagnet majority gate. Each MFM image shows a correct logical output based on the inputs set by the magnet geometry.

With both interconnect and logic
58
K.N. Kim et al.
functions demonstrated experimentally at room temperature, MQCA provides proof-of-principle for all-magnetic information-processing systems. The MQCA scheme is a natural match with self-assembly strategies in that neighboring nanomagnets need only be placed in proximity to each other, i.e., there is no need to connect these elements with wires. The central challenge for nano-scale MQCA is a patterning challenge. Currently, the testbed systems use electron-beam lithographic (EBL) patterning to fabricate nanomagnets that are as small as 30 nm x 60 nm x 80 nm. However, scaling down another order of magnitude would be extremely difficult for EBL techniques, and of course, EBL is a serial technique. For future nanofabrication tasks, much attention is now directed toward self-assembly. Self-assembly is innately parallel, which is an advantage, but it is very difficult to make heterogeneous structures by self-assembly. However, the new technique of DNA origami now offers a potential combination of parallel self-assembly at the sub-100 nm scale and integration with top-down lithography at the 100 nm to micron scale. Many researchers have taken advantage of various capabilities of DNA to assemble materials for electronic functions.[13] Erik Winfree's original studies of DNA tiles, which preceded Paul Rothemund's work on DNA origami, were motivated by a desire to carry out algorithmic computation; this work has since been enhanced by the use of DNA origami as templates to set initial conditions that program tile assembly. Up to 36 different "input" strands can be placed around the perimeter of a DNA origami to recruit and organize other DNA nanostructures.[14] A presentation at the 2009 FNANO meeting[15] described the use of DNA origami to assemble two carbon nanotubes, one on top of the origami and the other below it, into a crossed geometry, which shows that it is possible to attach and orient nanowires on origami.
With the aid of heroic EBL efforts, the group was able to make electrical leads and obtain measurements on one functional device. Metallization of DNA is possible; Chengde Mao's group has deposited silver nanowires on DNA templates, obtaining 20-nm wires with good electrical continuity.[16] The Mao group also constructed regular lattices of DNA and showed that they can be used as masks for some lithographic processes; by stacking several lattices on top of one another, Moiré patterns are formed, which extend the complexity of the mask.[17] Hao Yan's group showed that binding of gold nanoparticles to DNA tile arrays can force them to roll up and form a variety of tubules, some decorated with stripes or spirals of nanoparticles;[18] this work may be useful for photonic systems. We think that MQCA is a promising target for DNA origami-based self-assembly. This paper focuses on understanding just one aspect of the hierarchical self-assembly of DNA origami—the oligomerization of individual origami to form chains of aligned and oriented origami. The eventual goal is to place small numbers of nanomagnets in specific locations on the DNA origami in such a way that the self-assembly of the origami causes an MQCA device, such as a wire, to be formed. Thus, the DNA origami would serve as a sort of self-assembling circuit board.

1.2 DNA Origami

The information content of DNA, expressed in terms of the sequence of A, C, T, and G nucleotides, is enormous. Biological systems use DNA to store genetic information,
DNA Origami as Self-assembling Circuit Boards
59
but it is also possible to take advantage of the information content of DNA to program and assemble nanostructures. In the DNA origami technique,[19,20,21] a long single strand of DNA called the template strand is folded into a desired two- or three-dimensional shape. The physical structure of a DNA origami is relatively homogeneous, but its chemical structure is heterogeneous because the template strand is a non-repetitive sequence (often a plasmid or a viral genome that is many thousands of base pairs in length). Folding is induced by base-pairing interactions between the template strand and hundreds of short synthetic oligonucleotides or "staple strands." The staple strands typically contain segments that are complementary to two widely separated regions on the template strand, and when the single-stranded components are all mixed and annealed together, the staple strand brings these two regions together in space. Binding of each staple strand causes a particular fold of the template strand to form and converts a small region of the template strand from single- to double-stranded DNA. By adding a 5- to 10-fold excess of each of the staple strands, the folding of the template can be driven toward completion. Amazingly, these hundreds of discrete folding interactions take place simultaneously during annealing of the DNA and result in high yields (up to 90-95%) of properly folded DNA origami. The questions addressed in this manuscript focus on building up larger structures from these DNA origami units. An individual DNA origami has a limited amount of surface area for assembling nanoelectronic or nanomagnetic elements. The most basic circuit element is the wire, which in DNA origami form would consist of a long chain of DNA origami with attached nanorod magnets. In the case of 3 x 20 nm nanomagnets, one origami could hold several nanomagnets, forming a 90 nm long segment of an MQCA wire.
Other circuit elements could also be constructed; an MQCA majority decision circuit would fit onto a single origami (see Figure 3B). In order to make nanocircuits with DNA origami, we envision a hierarchical assembly process in which individual DNA origami are assembled and purified, and then assembled further by a combination of self- and directed assembly to form larger circuit elements. Of course, both the orientation and alignment of the origami in these chains would have to be well controlled.
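As a toy illustration of the staple mechanism described above (a staple whose two halves are complementary to two widely separated template regions), the sketch below locates the two template positions that one staple bridges. The sequences and helper names are invented for illustration; real origami use a kilobase-scale template and hundreds of staples.

```python
COMP = str.maketrans("ACGT", "TGCA")

def revcomp(s: str) -> str:
    """Reverse complement of a DNA sequence (antiparallel base pairing)."""
    return s.translate(COMP)[::-1]

def staple_targets(template: str, staple: str, half: int):
    """Return the template positions bound by each half of a staple strand."""
    left, right = staple[:half], staple[half:]
    return template.find(revcomp(left)), template.find(revcomp(right))

# Toy template; the staple's two halves pair with two widely separated
# regions, so binding the staple folds the template back on itself.
template = "ATGGCTTACGATCCGGTTAACCGGATATCGTACGGAATTCCGG"
staple = revcomp(template[2:8]) + revcomp(template[30:36])
print(staple_targets(template, staple, 6))  # → (2, 30)
```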
2 Experimental

Preparation and deposition of DNA origami. DNA origami were prepared by published methods.[21] Typically, 5-10 equivalents of the staple strands were combined with 1 equivalent of template DNA in 1X TAE-Mg buffer (40 mM Tris-HCl (pH 8.0), 2 mM EDTA, 20 mM acetic acid, 12.5 mM Mg2+) to produce a 6 nM concentration of the template strand. After 7 h of annealing from 90˚C to room temperature, DNA origami were used immediately or held in a refrigerator for several days. DNA origami at concentrations of 2-6 nM were deposited on freshly cleaved mica by placing a 10-20 µl drop of the DNA origami solution in TAE-Mg buffer on the surface, where it spreads well. After 10-20 min at room temperature, the mica was washed with flowing water, then blown dry with nitrogen. Samples were imaged within a few hours of preparation using non-contact mode AFM.

Atomic Force Microscopy. The atomic force microscope images shown were obtained using a Nanoscope IIIa Multimode instrument (Digital Instruments, California)
in tapping mode. Silicon cantilevers were chosen from Nanosensors (type PPP-NCH). Based on imaging of 5.7 nm gold nanoparticles with several of these tips, the radius of curvature of the tips is 8-10 nm (nanoparticle apparent lateral diameters were 20-22 nm). Second-order line-by-line flattening was performed on the AFM images before analysis. Image processing was done using the freeware program WSxM 5.0.[22]

Molecular liftoff. Silicon substrates for EBL were patterned with locator designs as described.[23] They were then cleaned as described above for SiO2 substrates. A 60-90 nm thick film of 950,000 g/mol polymethylmethacrylate (PMMA) was spun on and baked at 160°C in an oven for 5 h. A thermal field-emission EBL system ELS-7700 (Elionix, Japan) was used to generate EBL patterns in the PMMA. The acceleration voltage was 75 kV and the beam current was 50 pA. The approximate beam diameter was 2-3 nm, as estimated from the resolution of images of gold particles. Lithographic patterns were written with a 2.5 nm step size, and the dose was controlled by the time spent at each pixel. Exposed PMMA was developed with isopropyl alcohol:methyl isobutyl ketone (3:1) containing 1.5% (v/v) methyl ethyl ketone at room temperature.[24] The samples were dried, immediately soaked in a 0.05% v/v TMAC (N,N,N-trimethylaminopropyltrimethoxysilane chloride) solution in DI water for 5 min, and then dried with nitrogen. The remaining PMMA was lifted off by soaking in a vial of clean CH2Cl2 for 3 × 1 min with sonication.
3 Results for Aligning and Orienting DNA Origami

We compared four techniques for aligning and orienting chains of rectangular DNA origami: pi stacking alone, sticky ends in conjunction with pi stacking, sticky ends in conjunction with T-bumpers, and sticky ends in conjunction with short end destructuring. The metric used to assess alignment was the average offset of one DNA origami from its neighbor; in the cartoon shown in Figure 3A, the origami have no detectable offsets, which would indicate excellent alignment. The metric used to assess orientation was the relative location of a small loop of excess single-stranded DNA, which protrudes from one long side of every DNA rectangle. If the origami are all oriented the same way, the loops all appear on the same side of the origami chain. In Figure 3A, the loops appear on both sides of the chain, which would indicate poor orientation of the origami.
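The two metrics just described can be written down directly; a minimal sketch, assuming the per-neighbor offsets and loop sides have already been read off the AFM images by hand (function names and data are hypothetical):

```python
def mean_offset(offsets_nm):
    """Alignment metric: average |offset| between neighboring origami, in nm."""
    return sum(abs(o) for o in offsets_nm) / len(offsets_nm)

def same_to_opposite_ratio(loop_sides):
    """Orientation metric: same-side / opposite-side neighbor pairs.

    loop_sides labels the loop side ('L' or 'R') of each origami along one
    chain. A ratio of 1.0 is the expectation for random orientation.
    """
    pairs = list(zip(loop_sides, loop_sides[1:]))
    same = sum(1 for a, b in pairs if a == b)
    opposite = len(pairs) - same
    return float("inf") if opposite == 0 else same / opposite

# Hypothetical chain read off an image:
print(mean_offset([0, 5, 12, 3]))        # → 5.0
print(same_to_opposite_ratio("LLRLRR"))  # 2 same-side / 3 opposite-side pairs
```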
Fig. 3. (a) Schematic assembly of DNA origami pentamer. This example demonstrates good alignment (offsets are zero) but bad orientation (loops are not restricted to one side of the pentamer). (b) Majority gate mapped onto DNA origami (each rod represents a 3 x 20 nm nanomagnet).
Pi stacking. The DNA origami rectangles used in this study naturally form chains in which the short ends of the rectangles stick together. Chain formation is caused by pi stacking interactions. The DNA duplexes run along the long side of the rectangle with turns at the short sides. The hydrophobic DNA bases are partially exposed at the turns, and these hydrophobic regions can be better screened from water if the short end of one DNA origami associates with the short end of another origami. Pure pi stacking interactions readily formed long chains, but the chains were poorly aligned (Fig. 4a, 5a) and randomly oriented. The ratio of sites where loop structures were found on the same side of a DNA origami chain to sites where the loops were on opposite sides of the chain was measured as 0.84. For random attachment, this ratio would be 1.0; the difference is probably not statistically significant. Pi stacking plus sticky ends. To create “sticky ends” for aligning and orienting neighboring origami, two staple strands at the short end of each rectangle were modified by the addition of 11 extra nucleotides, corresponding to 1 turn of duplex DNA. The sequences were chosen so that the strands on one short end of the rectangle (A and B) and the strands on the other short end of the rectangle (A’ and B’) were complementary to one another. This alteration could have caused the origami to roll up into cylinders (if A and A’ and B and B’ on one origami base paired) or to form chains (if A and B on one origami and A’ and B’ on another origami base paired). In fact, only flat single origami and chains of origami were seen when the origami were deposited and imaged on mica. The sticky ends improved alignment significantly (Fig. 4b, 5b) but surprisingly, orientation of the DNA origami was only slightly affected. 
The ratio of sites where loop structures were found on the same side of a DNA origami chain to sites where the loops were on opposite sides of the chain was 0.43, indicating a slight preference for alternation of orientations. A variant origami with three sticky ends per short end gave comparable results. Evidently, the multiple pi stacking interactions in this system dominate the strength of the base pairing of the sticky ends.
Fig. 4. Tapping mode AFM in air. a) Pure pi-stacking: DNA origami form chains, but alignment and orientation are poor. b) Pi stacking with sticky ends: Alignment is greatly improved, but orientation is random.
Fig. 5. Histogram analysis showing the number of origami chains with a given measured offset, for pi-stacked origami without sticky ends (a) and pi-stacked origami with sticky ends (b). If the origami were perfectly self-aligned, all offset values would be 0.
T-bumpers. We next compared two methods for reducing pi stacking interactions. First, we tried a strategy used by several other groups, in which staple strands at the short ends of the DNA origami were modified to contain short thymidine loops.[21,25] These T-bumpers are expected to prevent pi stacking by blocking access to the ends of the duplex. A significant decrease in average chain length was apparent whenever the T-bumpers were present. The ratio of sites where loop structures were found on the same side of a DNA origami chain to sites where the loops were on opposite sides of the chain was measured as 1.85, indicating a slight improvement in orientation of the DNA origami. Because the orientation provided by T-bumpers was so weak, it is possible that there was considerable residual pi stacking.
Fig. 6. DNA origami with T-bumpers and sticky ends form short chains that are well aligned but only weakly oriented
We tested several parameters to determine the optimal assembly conditions for these chains of DNA origami. Normally the origami are assembled by mixing the
template strand (final concentration 6 nM) with 10 equivalents of each of the staple strands in annealing buffer, heating to 90˚C, and cooling to room temperature at a linear rate over 70 min. Annealing time was varied from 70 min to 7 h, with little effect on the length, alignment, or orientation of the origami chains (Figure 7). Extended annealing over 28 h gave large populations of damaged or incorrectly assembled origami. The initial concentration of the template strand could be varied from 2-6 nM without affecting the origami chains (Figure 8). However, the presence of excess sticky-end strands (10 eq) during annealing caused nearly complete disaggregation of the DNA origami chains (Figure 9). The reason is that if there is a large excess of sticky end A' in solution, then once sticky end A is properly inserted into a growing DNA origami, a free A' strand is likely to base pair with it during annealing, so that sticky end A is no longer available for base pairing to another origami to form a DNA origami chain. Of course, if there is less than 1 equivalent of sticky end A, many origami will lack this sticky end and so will also be unable to form DNA origami chains. 1-2 equivalents of each sticky end proved optimal for promoting chain assembly.
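The stoichiometry argument above can be caricatured with a toy Monte Carlo model: an origami end carries its sticky end with probability min(1, eq), and any excess of the complementary strand left in solution caps it. The capping law excess/(excess + 1) is an invented saturation form chosen only to reproduce the qualitative trend; nothing here is fit to the data.

```python
import random

def fraction_linkable(eq_sticky, n=10_000, seed=1):
    """Toy model: fraction of origami ends free to link into a chain."""
    rng = random.Random(seed)
    excess = max(0.0, eq_sticky - 1.0)
    p_cap = excess / (excess + 1.0)  # invented saturation law for capping
    linkable = 0
    for _ in range(n):
        has_end = rng.random() < min(1.0, eq_sticky)  # sticky end incorporated
        capped = rng.random() < p_cap                 # capped by free A'
        linkable += has_end and not capped
    return linkable / n

# The trend matches the text: too little strand (<1 eq) leaves ends missing,
# a large excess caps them, and 1-2 eq maximizes the linkable ends.
for eq in (0.5, 1.0, 2.0, 10.0):
    print(eq, round(fraction_linkable(eq), 2))
```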
Fig. 7. Effect of annealing time on DNA origami structure. a) 70 min b) 7 hr c) 28 hr.
Fig. 8. Concentration of template strand does not affect chain length, yield, or defects (at constant staple ratio)
Fig. 9. Excess sticky end strands cause disaggregation. a) 1 eq. sticky end strands b) 10 eq sticky end strands. Both are 5 micron square AFM images on mica; z-scale 0 to 2.6 nm in a, 0 to 5 nm in b.
Fig. 10. a) Origami with destructured short edges do not form chains. b) When sticky ends are added, chains are formed. The orientation and alignment of the origami in these chains is nearly perfect.
Destructured origami ends. One more method to eliminate pi stacking was studied. Simply leaving out the staple strands closest to the short ends of each DNA rectangle destructures these edges of the origami and creates short single-stranded DNA loops, in a manner analogous to the fringe on a Persian rug. This method was found to be quite effective at removing pi-stacking interactions between DNA origami. Figure 10a shows origami with destructured short edges. The origami are well formed, but do not aggregate into chains; pi stacking has been eliminated. Figure 10b shows origami with destructured short edges and sticky ends. These form origami chains that are well aligned and also show nearly 100% orientation. In contrast to the DNA origami chains formed by pi stacking, here there is a gap of about 6-8 nm between each
pair of origami in the chain. This gap is due to the destructured short edges of the origami, which consist of single-stranded DNA and hence appear lower in the AFM image. The three sticky ends form duplex DNA linking neighboring DNA origami, and the 8 nm gap and these three bridges are visible in high-resolution images such as Figure 11. A partially folded origami is visible at the bottom center of this image—it has been oriented and aligned with its 3 neighbors, but has not finished folding.
Fig. 11. Close-up of DNA origami chains formed from origami with destructured ends and three sticky ends (z scale 5 nm).
4 Anchor Pads

For many applications, it would be desirable to place DNA origami onto a semiconductor substrate in order to interface them with CMOS structures. However, almost all studies of origami use mica as the substrate, because DNA origami attach poorly to anionic silicon dioxide. We have developed a method for attaching small DNA nanostructures to lithographically defined locations on silicon using "molecular liftoff" of a cationic self-assembled monolayer (SAM) of aminopropyltriethoxysilane (APTES).[26] We recently found that SAMs formed from trimethylaminopropyl triethoxysilane chloride (TMAC) on silicon appear to provide surfaces comparable to mica for attachment of DNA origami. Here we show preliminary results of patterning TMAC via molecular liftoff and the use of these anchor pads to bind DNA origami in desired locations without interference from stray staple strands (Figure 12). Binding selectivity for the patterned regions is quite good (at least 30:1), and the origami are flat and undistorted, although some of the origami appear to be stacked on top of one another.
Fig. 12. Deposition of DNA origami on lithographically fabricated anchor pads. a) Yellow stripes are 35% TMAC/65% APTES anchor pads (125 nm wide stripes, 5 micron image, z scale 0-4 nm). b) After deposition of DNA origami on the anchor pads (2 micron image, z scale 0-5 nm, APTES stripes indicated by blue arrowheads). c) Structures of APTES and TMAC precursors.
5 Discussion

Four different methods for controlling the alignment and orientation of DNA origami chains were compared. Pi stacking caused strong aggregation of the DNA origami, but it proved to be a real nuisance because it was both strong and unselective. It was not enough to include sticky ends to provide specific base-pairing interactions between neighboring origami, or to use short T-bumpers to fend off pi stacking interactions. The energy of hydrophobic interactions in water depends primarily on the surface area of the hydrophobic functional groups that can be buried away from exposure to the water. The short end of a DNA origami contains the ends of more than 20 duplexes, and although each one exposes only a few square nanometers of hydrophobic area, the combined area is quite large. Eliminating pi stacking required complete destructuring of the short ends of the DNA origami.
6 Conclusions and Prospects for Future Work

Self-assembly does not require fancy equipment, yet it can make very high resolution features. However, it is difficult to extend these features over large areas of real estate on a chip, and particularly difficult to organize non-repetitive features. If chunks of functional MQCA circuitry could be assembled on fairly large (300-500 nm) origami assemblies, which in turn bind to optical-lithography-scale anchor pads, it could be a way to fabricate next-generation circuitry with previous-generation lithography.[27] One of the strengths of MQCA compared with other beyond-roadmap technologies is the seamless blend of interconnects and devices; this feature means that the same fabrication process that can make majority gates can also make the interconnections between individual gates. As a demonstration of this capability, we are attempting to fabricate nanomagnet-bearing origami that can self-assemble to give a complete functional MQCA wire.
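MQCA's logic primitive is the three-input majority gate (Sect. 1.1), which collapses to AND or OR when one input is pinned. That reduction is easy to sanity-check in a short sketch (illustrative code, not part of the original work):

```python
from itertools import product

def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote, the MQCA logic primitive."""
    return 1 if a + b + c >= 2 else 0

# Pinning the third input to 0 turns the gate into AND;
# pinning it to 1 turns it into OR.
for a, b in product((0, 1), repeat=2):
    assert majority(a, b, 0) == (a & b)
    assert majority(a, b, 1) == (a | b)
print("majority reduces to AND (pin 0) and OR (pin 1)")
```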
Acknowledgements. Funding for this work was provided by ONR’s Sub-10 nm Nanofabrication program, grant # N000140910184. We gratefully acknowledge the Notre Dame Radiation Laboratory for access to the AFM.
References

1. Kim, S.O., Solak, H.H., Stoykovich, M.P., Ferrier, N.J., de Pablo, J.J., Nealey, P.F.: Epitaxial self-assembly of block copolymers on lithographically defined nanopatterned substrates. Nature 424(6947), 411–414 (2003)
2. Tougaw, P.D., Lent, C.S.: Logical Devices Implemented Using Quantum Cellular Automata. Journal of Applied Physics 75, 1818 (1994)
3. Lent, C.S., Tougaw, P.D.: A Device Architecture for Computing with Quantum Dots. Proceedings of the IEEE 85, 541 (1997)
4. Lieberman, M., Chellamma, S., Varughese, B., Wang, Y., Lent, C., Bernstein, G.H., Snider, G., Peiris, F.C.: Quantum-dot cellular automata at a molecular scale. Annals of the New York Academy of Sciences 960, 225–239 (2002)
5. Niemier, M.T.: The Effects of a New Technology on the Design, Organization and Architectures of Computing Systems. Ph.D. Dissertation, University of Notre Dame
6. Niemier, M.T., Kogge, P.M.: Exploring and exploiting wire-level pipelining in emerging technologies. In: Proceedings of the 28th Annual International Symposium on Computer Architecture, Göteborg, Sweden, pp. 166–177. ACM Press, New York (2001)
7. Cowburn, R., Welland, M.: Room temperature magnetic quantum cellular automata. Science 287, 1466 (2000)
8. Orlov, A., Imre, A., Ji, L., Csaba, G., Porod, W., Bernstein, G.H.: Magnetic Quantum-dot Cellular Automata: Recent Developments and Prospects. J. Nanoelec. Optoelec. 3(1), 55–68 (2008)
9. Chaudhary, A., Chen, D.Z., Whitton, K., Niemier, M., Ravichandran, R.: In: Proceedings of the 2005 IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, pp. 565–571 (2005)
10. Niemier, M.T., Alam, M.T., Hu, X.S., Bernstein, G.H., Porod, W., Putney, M., DeAngelis, J.: Clocking Structures and Power Analysis for Nanomagnet-based Logic Devices. In: Proceedings of the International Symposium on Low Power Electronics and Design, pp. 26–31 (2007)
11.
Imre, A.: Experimental Study of Nanomagnets for Magnetic QCA Logic Applications. Ph.D. Dissertation, University of Notre Dame
12. Imre, A., et al.: Majority logic gate for Magnetic Quantum-Dot Cellular Automata. Science 311(5758), 205–208 (2006)
13. Winfree, E., Liu, F., Wenzler, L.A., Seeman, N.C.: Nature 394, 539–541 (1998); He, Y., Chen, Y., Liu, H., Ribbe, A.E., Mao, C.: J. Am. Chem. Soc. 127, 12202–12203 (2005)
14. Barish, R.D., Schulman, R., Rothemund, P.W.K., Winfree, E.: An information-bearing seed for nucleating algorithmic self-assembly. PNAS 106, 6054–6059 (2009)
15. Han, S.-P., Maune, H., Barish, R., Bockrath, M., Goddard III, W., Rothemund, P., Winfree, E.: Self-Assembly of Carbon Nanotube Devices Directed by 2D DNA Nanostructures. Presented at Foundations of Nanoscience, Snowbird, Utah (April 2009)
16. Deng, Z.X., Mao, C.D.: DNA-templated fabrication of 1D parallel and 2D crossed metallic nanowire arrays. Nano Lett. 3, 1545–1548 (2003)
17. He, Y., Ko, S.H., Tian, Y., Ribbe, A.E., Mao, C.D.: Complexity emerges from lattice overlapping: Implications for nanopatterning. Small 4, 1329–1331 (2008)
18. Sharma, J., Chhabra, R., Cheng, A., Brownell, J., Liu, Y., Yan, H.: Control of Self-Assembly of DNA Tubules Through Integration of Gold Nanoparticles. Science 323, 112–116 (2009)
19. Yan, H., LaBean, T.H., Feng, L., Reif, J.H.: Directed Nucleation Assembly of Barcode Patterned DNA Lattices. PNAS 100, 8103–8108 (2003)
20. Rothemund, P.W.: In: Proceedings of the 2005 IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, November 6-10, pp. 471–478. IEEE Computer Society, Washington (2005)
21. Rothemund, P.W.K.: Nature 440, 297–302 (2006)
22. Horcas, I., Fernandez, R., Gomez-Rodriguez, J.M., Colchero, J., Gomez-Herrero, J., Baro, A.M.: Review of Scientific Instruments 78, 013705 (2007)
23. Walter, H.: Ultrahigh resolution electron beam lithography for molecular electronics. Ph.D. thesis, Department of Electrical Engineering, University of Notre Dame (2004)
24. Bernstein, G.H., Hill, D.A., Liu, W.P.: New high-contrast developers for PMMA resist. J. Appl. Phys. 71, 4066–4075 (1992)
25. Ke, Y., Lindsay, S., Chang, Y., Liu, Y., Yan, H.: Science 319, 180–183 (2008)
26. Sarveswaran, K., Hu, W., Huber, P.W., Bernstein, G.H., Lieberman, M.: Langmuir 22, 11279–11283 (2006)
27. Niemier, M., Crocker, M., Hu, S., Lieberman, M.: In: Proceedings of the International Conference on CAD (ICCAD), pp. 907–914 (2006)
Tug-of-War Model for Multi-armed Bandit Problem

Song-Ju Kim, Masashi Aono, and Masahiko Hara

Flucto-Order Functions Research Team, RIKEN-HYU Collaboration Research Center, Advanced Science Institute, RIKEN, 2-1 Hirosawa, Wako-shi, Saitama 351-0198, Japan
Fusion Technology Center 5F, Hanyang University, 17 Haengdang-dong, Seongdong-gu, Seoul 133-791, Korea
Abstract. We propose a model – the "tug-of-war (TOW) model" – to conduct unique parallel searches using many nonlocally correlated search agents. The model is based on a property of a single-celled amoeba, the true slime mold Physarum, which maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. The conservation law entails a "nonlocal correlation" among the branches, i.e., a volume increment in one branch is immediately compensated by volume decrement(s) in the other branch(es). This nonlocal correlation was shown to be useful for decision making in the case of a dilemma. The multi-armed bandit problem is to determine the optimal strategy for maximizing the total reward sum under incompatible demands. Our model can efficiently manage this "exploration–exploitation dilemma" and exhibits good performance. The average accuracy rate of our model is higher than those of well-known algorithms such as the modified ε-greedy algorithm and the modified softmax algorithm.

Keywords: Multi-armed bandit problem, reinforcement learning, bio-inspired computation, amoeba-based computing.
1 Introduction
A single-celled multinucleated amoeboid organism, a plasmodium of the true slime mold Physarum polycephalum (Fig. 1A), has been studied actively in recent years to investigate its notable computational capabilities. Nakagaki and co-workers showed that the amoeba is capable of finding the shortest path between food sources [1,2,3] and anticipating periodic events [4]. When the amoeba is placed in a stellate chamber resting on an agar plate (Fig. 1B), it acquires multiple branches and constantly changes its shape by simultaneously expanding or shrinking the branches. The amoeba withdraws its branches when illuminated by light because of its photoavoidance behavior. By applying optical feedback according to a recurrent neural network model (Fig. 1C), Aono and co-workers created a neurocomputer employing the

C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 69–80, 2010.
© Springer-Verlag Berlin Heidelberg 2010
Fig. 1. (A) An individual unicellular amoeba of the true slime mold Physarum polycephalum (scale bar = 7 mm). (B) An Au-coated plastic chamber resting on an agar plate (scale bar = 7 mm). The amoeba restricts itself inside the chamber where the agar is exposed because of its aversion to metal surfaces. (C) Experimental setup. For transmitted light imaging using a video camera (VC), a surface light source (LS) placed beneath the sample amoeba (SM) was used to emit light of a specific wavelength, which did not have a significant effect on the amoeba’s behavior. The recorded image was digitally processed using a personal computer (PC) to visualize the high-intensity monochrome image (visible white light) by using a projector (PJ). (D) Schematic illustration of the amoeba’s body architecture.
amoeba to explore solutions to optimization problems [5,6,7]. To demonstrate the computational capability of this amoeba-based computer, the amoeba's branches were induced to expand or withdraw in the stellate chamber to explore an optimal solution to the Traveling Salesman Problem (TSP). The amoeba-based computer was found to be capable of deriving the optimal solution to the four-city TSP with a high probability. We consider that there must be some crucial differences between biological organisms and digital computers with respect to their information processing, and we expect that biological organisms are good at dealing with certain kinds of problems. In the amoeba's body, a constant amount of intracellular protoplasmic sol shuttles through tubular channels, while its extracellular gel layer (ectoplasm), like a sponge, rhythmically oscillates its contraction tension to squeeze and absorb the sol (Fig. 1D). While the amoeba oscillates its branches to collect environmental information, the volume of the sol flowing through its body remains constant, unless nutrients are provided. We are interested in how this physical conservation law affects the information processing of the amoeba [8,9,10]. To elucidate this issue, we considered the "multi-armed bandit problem," because it is related to the difficulties biological organisms face when adapting to uncertain environments. As an example of the multi-armed bandit problem, we focus on the two-armed bandit problem first. Consider a slot machine that has 2 arms. Both arms have individual reward probabilities PA and PB. At each trial, a player pulls one of
the arms and obtains some reward, for example a coin, with the corresponding probability.¹ The player wants to maximize the total reward sum obtained after a certain number of selections. However, it is supposed that the player does not know these probabilities. The problem is to determine the optimal strategy for selecting the arms that yield maximum rewards by referring to past experiences. In the original form of the problem, the player was allowed to pull only one arm at each trial. However, to explore the advantages of parallel computing, we allowed the player to simultaneously pull both arms.² With this modification, the situation becomes more realistic, as if it were a "two-bandit problem." The new form of the problem considers 2 slot machines A and B, each having only 1 arm. Machines A and B have reward probabilities PA and PB, respectively. We assume that the reward of machine A is the same as that of machine B. In the multi-armed bandit problem and multi-bandit problem, to maximize the total reward sum after a certain number of selections, the player needs to identify the best machine, the one with the highest reward probability, as "correctly" and "quickly" as possible. Therefore, the player has to "explore" many unknown machines to gather enough information to determine the best machine. However, these explorations are risky because the player may lose considerable rewards that could have been "exploited" from the already-known best machine. Thus, there is a trade-off between "exploration" and "exploitation." Living organisms generally encounter this "exploration–exploitation dilemma" because they have to survive in an unknown world. To survive, organisms need to adapt to unknown situations by overcoming this dilemma. We speculate that organisms have developed some efficient methods to overcome this dilemma.

The multi-armed bandit problem was originally described by Robbins in 1952 [11], although essentially the same problem was studied by Thompson in 1933 [12]. A different version of the bandit problem has also been studied, in which the reward distributions are assumed to be "known" to the player; in this version, the optimal strategy is known only for a limited class of problems [13,14]. In the original version, a popular measure of an algorithm's performance is "regret," that is, the expected loss of rewards due to not making the correct selection at all times. Lai and Robbins first showed that regret has to increase at least logarithmically in the number of selections [15]. They defined the condition that an optimal strategy must satisfy asymptotically. However, their algorithm is generally hard to compute because of the Kullback–Leibler divergence. Agrawal proposed algorithms in which the index could be expressed as a simple function of the total reward sum obtained from a machine [16]. These algorithms are considerably easier to compute than those developed by Lai and Robbins; however, the regret retains the logarithmic behavior asymptotically, although with a larger leading constant. Auer et al. proposed a simple algorithm called upper confidence bound 1 (UCB1) that achieved logarithmic regret

¹ In this study, we assume that each pull results in a reward of fixed size with the given probability.
² We are dealing with a simplified variant of the general multi-armed bandit problem: the player can choose how many arms to pull on each trial.
S.-J. Kim, M. Aono, and M. Hara
uniformly over time, rather than only asymptotically [17]. In addition, they proved that a family of modified ε-greedy algorithms also achieves logarithmic regret. Vermorel et al. concluded that the most naive approach, the modified ε-greedy algorithm, is the best [18]. In this report, we propose a model, namely the "tug-of-war (TOW) model," based on the photoavoidance behavior of the amoeba induced by light stimuli. The TOW model is a bio-inspired computing method capable of effectively solving problems, without necessarily being a biological model that reproduces an amoeba's behavior. We showed that the TOW model exhibits good performance. Although "regret" is a popular measure for evaluating the performance of algorithms, we used the "average accuracy rate" in this study. This is because logarithmic regret characterizes long-time behavior, whereas we were more interested in short-time behavior, because constant environments are rare in the natural world. We showed that the average accuracy rate of the TOW model is higher than those of well-known algorithms such as the modified ε-greedy algorithm and the modified softmax algorithm for both the two- and three-armed bandit problems.
2 Models

2.1 Modified ε-greedy Algorithm
Many algorithms have been proposed for solving the two-armed bandit problem. The ε-greedy algorithm is one of the most popular [19]. In this algorithm, a player randomly selects A or B with probability ε, or performs a "greedy" action on the basis of the estimates given by Eqs. (1) and (2) with probability 1 − ε. A greedy action means that the player selects A if QA > QB, or selects B if QA < QB:

ε: random selection,
1 − ε: greedy action based on the estimates,

QA(t) = [number of rewards from A] / [number of A selections],  (1)
QB(t) = [number of rewards from B] / [number of B selections].  (2)

Eqs. (1) and (2) are the estimates of the reward probabilities PA and PB, respectively. In the original ε-greedy algorithm, ε is constant. However, in this study, we used the time-dependent ε(t) given by

ε(t) = 1 / (1 + τ · t),  (3)

where τ is a parameter that determines the decay rate.
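As an illustration, the selection rule of Eqs. (1)–(3) can be sketched in Python. This is a minimal sketch under our own naming and interface choices (the function name, `seed` parameter, and reward bookkeeping are not from the paper):

```python
import random

def modified_epsilon_greedy(probs, steps, tau, seed=0):
    """Modified epsilon-greedy player on fixed-size Bernoulli rewards.

    probs -- true (hidden) reward probabilities of the arms
    tau   -- decay rate of eps(t) = 1 / (1 + tau * t)   (Eq. 3)
    Returns the list of selected arm indices.
    """
    rng = random.Random(seed)
    m = len(probs)
    rewards = [0] * m      # numerators of Eqs. (1), (2)
    pulls = [0] * m        # denominators of Eqs. (1), (2)
    choices = []
    for t in range(steps):
        eps = 1.0 / (1.0 + tau * t)
        if rng.random() < eps:               # random selection
            arm = rng.randrange(m)
        else:                                # greedy action on the estimates
            q = [rewards[k] / pulls[k] if pulls[k] else 0.0 for k in range(m)]
            arm = q.index(max(q))
        pulls[arm] += 1
        if rng.random() < probs[arm]:        # fixed-size reward, prob. P_arm
            rewards[arm] += 1
        choices.append(arm)
    return choices
```

The same sketch covers the m-armed extension of Eq. (4), since the estimates are maintained per arm.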
Tug-of-War Model for Multi-armed Bandit Problem
The modified ε-greedy algorithm can be easily extended to the m-armed bandit problem. A player randomly selects an arm with probability ε or performs a greedy action on the basis of the estimates (Q1, Q2, ..., Qm) with probability 1 − ε. Here, Qk(t) (k = 1, 2, ..., m) is defined as

Qk(t) = [number of rewards from k] / [number of k selections].  (4)

2.2 Modified Softmax Algorithm
The softmax algorithm is another well-known algorithm for the two-armed bandit problem. Some studies have reported that it is the best algorithm for the multi-armed bandit problem in the context of decision making [20,21]. In this algorithm, the probability of selecting A or B, PA(t) or PB(t), is given by the following Boltzmann distributions:

PA(t) = exp[β · QA(t)] / ( exp[β · QA(t)] + exp[β · QB(t)] ),  (5)
PB(t) = exp[β · QB(t)] / ( exp[β · QA(t)] + exp[β · QB(t)] ),  (6)

where QA and QB are given by Eqs. (1) and (2), respectively. As in the previous algorithm, β was modified to a time-dependent form in our study:

β(t) = τ · t.  (7)

β = 0 corresponds to random selection, and β → ∞ corresponds to greedy action. The modified softmax algorithm can also be easily extended to the m-armed bandit problem. In this case, the probability of selecting arm k, Pk(t) (k = 1, 2, ..., m), is given by

Pk(t) = exp[β · Qk(t)] / Σ_{j=1}^{m} exp[β · Qj(t)].  (8)
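The Boltzmann selection of Eq. (8) with the time-dependent β(t) of Eq. (7) can be sketched as follows (the function name and the max-subtraction stabilization are our own additions, not from the paper):

```python
import math
import random

def modified_softmax_select(q_values, t, tau, rng):
    """Pick an arm with the Boltzmann probabilities of Eq. (8),
    using the time-dependent beta(t) = tau * t of Eq. (7).
    beta = 0 reduces to uniform random selection; large beta is greedy."""
    beta = tau * t
    q_max = max(q_values)                   # subtract max for numerical stability
    weights = [math.exp(beta * (q - q_max)) for q in q_values]
    r = rng.random() * sum(weights)
    acc = 0.0
    for k, w in enumerate(weights):
        acc += w
        if r <= acc:
            return k
    return len(weights) - 1
```

The Q-values themselves are maintained exactly as in Eqs. (1), (2) and (4).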
2.3 Tug-of-War Model
On the basis of the photoavoidance behavior of an amoeba, we propose the tug-of-war (TOW) model. Consider an amoeba shaped like a slug, as shown in Fig. 2. The variables XA and XB correspond to the volume increments in branches A and B, respectively. If XA (XB) is greater than 0, we consider that the amoeba selects A (B). Subsequently, light stimuli are applied to branch A (B) with probability 1 − PA (1 − PB) as a "punishment," i.e., an effect opposite to a "reward." In this model, there can be four types of selections at each time step: A, B, both A and B, or no selection.
Fig. 2. TOW model. (Schematic: an amoeba with branches A and B, volume increments XA and XB measured from their origins, total resource V with internal resource deviation S = −(XA + XB), and local biases QA − QB and QB − QA on branches A and B.)
The volume increments XA and XB are determined by the following difference equations:

XA(t + 1) = XA(t) + vA(t),  (9)
XB(t + 1) = XB(t) + vB(t),  (10)
vA(t) = vA(t − 1) + aA(t),  (11)
vB(t) = vB(t − 1) + aB(t).  (12)
Here, vA(t) and vB(t) denote the velocities of the corresponding volume increments, and aA(t) and aB(t) denote the accelerations. The acceleration aA(t) (aB(t)) is determined from Table 1; it depends on the local resource deviation SA(t) (SB(t)) and on the light ON–OFF condition. Therefore, aA(t) and aB(t) are determined stochastically, owing to the stochasticity of the light condition. The variable S(t) is determined by the following equation:

S(t + 1) = S(t) − {vA(t) + vB(t)}.  (13)
We can interpret S(t) as the internal resource deviation from V, the constant total amount of resource. If the initial conditions (XA(0), XB(0), vA(0), vB(0), and S(0)) are set to zero, the value XA(t) + XB(t) + S(t) will always be zero. This implies that XA(t) + XB(t) + S(t) is a conserved quantity, ensuring the conservation of the total resource V.

Table 1. Rule for determining the acceleration aA (aB)

        SA (SB) < 0   SA (SB) = 0   SA (SB) > 0
OFF          0            +1            +1
ON          −1            −1             0

In order to incorporate a learning mechanism into this model, we introduce local biases QA and QB of the internal resource on branches A and B, respectively (see the bottom figure in Fig. 2). We assume that a local bias of the internal resource is formed on each branch by referring to the numbers of selections and of light stimulations. Thus, the local resource deviations SA(t) and SB(t) are given by

SA(t) = S(t) + QA(t − 1) − QB(t − 1),  (14)
SB(t) = S(t) + QB(t − 1) − QA(t − 1).  (15)
This implies that communication between branch A and branch B is realized via resource conservation. Defining Q(t) = QA(t) − QB(t), we obtain the following equations:

SA(t) = S(t) + Q(t − 1),  (16)
SB(t) = S(t) − Q(t − 1).  (17)

At every time t, the numbers of selections and stimulations are accumulated in Q(t) such that

Q(t) = μ · {(NA − 2 · LA) − (NB − 2 · LB)},  (18)

where μ is the learning parameter, NA (NB) is the number of A (B) selections until time t, and LA (LB) is the number of light stimulations on side A (B) until time t. If the amoeba selects A without light stimuli (OFF), the acceleration aA(t) = +1 is added to vA(t), except in the case SA(t) < 0. That is, if the local resource is abundant (SA is zero or positive), the absence of light stimuli (OFF) induces an increase in vA. If the amoeba selects A in the presence of light stimuli (ON), the acceleration aA(t) = −1 is added to vA(t), except in the case SA(t) > 0. That is, if the local resource is scarce (SA is zero or negative), the light stimuli (ON) induce a decrease in vA. In this way, the photoavoidance behavior of the amoeba is implemented in this model.

2.4 Extension of the Tug-of-War Model
The TOW model can be easily extended to the m-armed bandit problem. In this case, the volume increment in branch k, Xk (k = 1, 2, ..., m), is determined by the following difference equations:

Xk(t + 1) = Xk(t) + vk(t),  (19)
vk(t) = vk(t − 1) + ak(t).  (20)

Here, vk(t) and ak(t) denote the velocity and the acceleration of the volume increment in branch k, respectively. The acceleration ak(t) is determined from Table 1 in the same way as aA(t) and aB(t). The internal resource deviation from V, S(t), is given by

S(t + 1) = S(t) − Σ_{j=1}^{m} vj(t).  (21)

As a learning term, we introduce a local bias Q̂k of the internal resource for branch k. Thus, the local resource deviation Sk(t) (k = 1, 2, ..., m) is given by

Sk(t) = S(t) + Q̂k(t − 1),  (22)
Q̂k(t) = Qk(t) − (1 / (m − 1)) Σ_{j=1, j≠k}^{m} Qj(t),  (23)
Qk(t) = μ · (Nk − 2 · Lk),  (24)

where Nk is the number of k selections until time t, and Lk is the number of light stimulations on side k until time t.
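The m-armed TOW dynamics of Eqs. (19)–(24) and Table 1 can be sketched as follows. This is a minimal illustration under our own assumptions about the update ordering within a time step and about when light is applied (a branch k with Xk > 0 counts as selected and is then stimulated with probability 1 − Pk); the function name and parameters are hypothetical:

```python
import random

def tow_bandit(probs, steps, mu, seed=0):
    """Minimal sketch of the m-armed TOW model (Eqs. 19-24, Table 1).

    A branch k with X_k > 0 counts as a selection of machine k; light
    ("punishment") is then applied with probability 1 - probs[k].
    Returns N, the number of selections of each machine.
    """
    rng = random.Random(seed)
    m = len(probs)
    X = [0.0] * m          # volume increments
    v = [0.0] * m          # velocities
    S = 0.0                # internal resource deviation from V
    N = [0] * m            # selections per branch
    L = [0] * m            # light stimulations per branch
    for _ in range(steps):
        Q = [mu * (N[k] - 2 * L[k]) for k in range(m)]               # Eq. (24)
        Qhat = [Q[k] - sum(Q[j] for j in range(m) if j != k) / (m - 1)
                for k in range(m)]                                    # Eq. (23)
        for k in range(m):
            Sk = S + Qhat[k]                                          # Eq. (22)
            selected = X[k] > 0
            light = selected and rng.random() < 1 - probs[k]
            if selected:
                N[k] += 1
                if light:
                    L[k] += 1
            # Table 1: acceleration from the light condition and sign of Sk
            if light:
                a = 0 if Sk > 0 else -1
            else:
                a = 0 if Sk < 0 else 1
            v[k] += a                                                 # Eq. (20)
            X[k] += v[k]                                              # Eq. (19)
        S -= sum(v)                                                   # Eq. (21)
    return N
```

With degenerate reward probabilities (one machine always punished, the other never), the sketch deterministically concentrates its selections on the better machine.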
3 Accuracy Rate
The performance of each model is evaluated in terms of its "accuracy rate," defined as the fraction of correct (higher-probability) selections made until time t. Figure 3 shows the average accuracy rates of the models with PA = 0.4 and PB = 0.6 for the two-armed bandit problem. The horizontal axis denotes the number of selections,³ and the vertical axis denotes the average accuracy rate over 1000 samples for the modified ε-greedy algorithm (dotted line), the modified softmax algorithm (dashed line), and the TOW model (solid line). The parameters of each algorithm were optimized to obtain the highest accuracy rate. The results show that the TOW model is the best for the two-armed bandit problem with PA = 0.4 and PB = 0.6. Figure 4 shows the average accuracy rates for PA = 0.45 and PB = 0.55 (two-armed bandit problem). The TOW model was found to be the best even for this more difficult problem. Figures 5 and 6 show the average accuracy rates of the models with [PA = 0.4, PB = 0.5, PC = 0.7] and [PA = 0.4, PB = 0.5, PC = 0.6] for the three-armed bandit problem, respectively. The results show that the TOW model is also the best for the three-armed bandit problem. Thus, we can conclude that the average accuracy rate of the TOW model is higher than those of well-known algorithms, such as the modified ε-greedy algorithm and the modified softmax algorithm, at least for the two- and three-armed bandit problems.
³ Note that the number of selections is not the same as the number of steps (or time t) in the TOW model, although the two coincide in the other algorithms.
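The average accuracy rate described above can be estimated with a small harness such as the following (names are ours; the uniform-random policy is only a sanity-check baseline, not one of the compared algorithms):

```python
import random

def average_accuracy_rate(policy, probs, selections, samples, seed=0):
    """Fraction of selections that pick the highest-probability arm,
    averaged over many independent runs ("samples").
    `policy(rng, probs, selections)` returns a list of chosen arm indices."""
    rng = random.Random(seed)
    best = probs.index(max(probs))
    correct = 0
    total = 0
    for _ in range(samples):
        choices = policy(rng, probs, selections)
        correct += sum(1 for c in choices if c == best)
        total += len(choices)
    return correct / total

def random_policy(rng, probs, selections):
    """Baseline: uniform random selection, scoring about 1/m."""
    return [rng.randrange(len(probs)) for _ in range(selections)]
```

For two arms, the random baseline should sit near 0.5, so any learning algorithm worth comparing must clear that level.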
Fig. 3. Average accuracy rate of the modified ε-greedy algorithm (dotted line), modified softmax algorithm (dashed line), and TOW model (solid line) for PA = 0.4 and PB = 0.6 (two-armed bandit problem). Optimized parameters are τ = 0.15, τ = 0.3, and μ = 5.0, respectively.
Fig. 4. Average accuracy rate of the modified ε-greedy algorithm (dotted line), modified softmax algorithm (dashed line), and TOW model (solid line) for PA = 0.45 and PB = 0.55 (two-armed bandit problem). Optimized parameters are τ = 0.05, τ = 0.1, and μ = 5.0, respectively.
Fig. 5. Average accuracy rate of the modified ε-greedy algorithm (dotted line), modified softmax algorithm (dashed line), and TOW model (solid line) for PA = 0.4, PB = 0.5, and PC = 0.7 (three-armed bandit problem). Optimized parameters are τ = 0.1, τ = 0.2, and μ = 2.0, respectively.
Fig. 6. Average accuracy rate of the modified ε-greedy algorithm (dotted line), modified softmax algorithm (dashed line), and TOW model (solid line) for PA = 0.4, PB = 0.5, and PC = 0.6 (three-armed bandit problem). Optimized parameters are τ = 0.05, τ = 0.2, and μ = 4.0, respectively.
4 Conclusions and Discussion
The tug-of-war (TOW) model proposed in this study is a unique method for parallel searches inspired by the photoavoidance behavior of the slime mold amoeba. In this model, many branches of the single-celled amoeba act as search agents to collect information on light stimulations while conserving the total sum of their resources. Owing to this conservation law, there is a “nonlocal correlation” among the branches, because the information on resource increment
in a branch is conveyed instantaneously to the other branches, so that they can immediately decrease their resources to compensate for the increment. In our previous reports, we showed that the nonlocal correlation via resource conservation can be advantageous for managing the "exploration–exploitation dilemma" when solving the two-armed bandit problem [9,10]. It was observed that the performance is sensitive to the strength of the nonlocal correlation between branches. In this study, we showed that the TOW model exhibits better performance than well-known algorithms even for the three-armed bandit problem. It is also worth noting that the computational effort needed by the algorithm to make each selection is comparable to that of the well-known algorithms. We used only the "average accuracy rate" as a performance measure, although "regret" is a more popular measure. However, we have already confirmed that the TOW model has the optimal logarithmic regret that satisfies Lai's condition [15]. The results for the long-time behavior will be reported elsewhere. We are planning an experiment that uses an actual amoeba of the true slime mold Physarum. It is an interesting subject to verify the problem-solving skill of the amoeba by investigating the correlation between branches and a resource accumulation resembling the learning term in the TOW model. In general, parallel exploration is faster than sequential exploration in terms of actual execution speed. Usually, the two-armed bandit problem is solved by sequential algorithms such as the modified ε-greedy algorithm and the modified softmax algorithm, because the problem assumes that a player can pull only one arm by selecting between the two arms at each trial. The TOW model is an algorithm for conducting parallel explorations, and we changed the two-armed bandit problem into the "two-bandit problem," which allows a player to pull both arms simultaneously.
The actual execution speed of the TOW model can also be increased, because the model can be naturally extended to solve the multi-bandit problem. In the special case in which the reward probabilities PA and PB are constant, algorithms for the two-armed bandit problem are identical to those for the two-bandit problem. In more general cases in which PA and PB are not constant, however, the multi-armed bandit problem is expected to differ substantially from the multi-bandit problem. The investigation of the crucial differences between parallel and sequential explorations will be an interesting subject of our future studies.
References

1. Nakagaki, T., Yamada, H., Toth, A.: Maze-solving by an amoeboid organism. Nature 407, 470 (2000)
2. Tero, A., Kobayashi, R., Nakagaki, T.: Physarum solver: A biologically inspired method of road-network navigation. Physica A 363, 115–119 (2006)
3. Nakagaki, T., Iima, M., Ueda, T., Nishiura, Y., Saigusa, T., Tero, A., Kobayashi, R., Showalter, K.: Minimum-risk path finding by an adaptive amoebal network. Phys. Rev. Lett. 99, 068104 (2007)
4. Saigusa, T., Tero, A., Nakagaki, T., Kuramoto, Y.: Amoebae anticipate periodic events. Phys. Rev. Lett. 100, 018101 (2008)
5. Aono, M., Hara, M., Aihara, K.: Amoeba-based neurocomputing with chaotic dynamics. Communications of the ACM 50(9), 69–72 (2007) 6. Aono, M., Hara, M.: Spontaneous deadlock breaking on amoeba-based neurocomputer. BioSystems 91, 83–93 (2008) 7. Aono, M., Hirata, Y., Hara, M., Aihara, K.: Amoeba-based chaotic neurocomputing: Combinatorial optimization by coupled biological oscillators. New Generation Computing 27, 129–157 (2009) 8. Aono, M., Hirata, Y., Hara, M., Aihara, K.: Resource-competing oscillator network as a model of amoeba-based neurocomputer. In: Calude, C.S., Costa, J.F., Dershowitz, N., Freire, E., Rozenberg, G. (eds.) UC 2009. LNCS, vol. 5715, pp. 56–69. Springer, Heidelberg (2009) 9. Kim, S.-J., Aono, M., Hara, M.: Tug-of-war model for two-bandit problem. In: Calude, C.S., Costa, J.F., Dershowitz, N., Freire, E., Rozenberg, G. (eds.) UC 2009. LNCS, vol. 5715, p. 289. Springer, Heidelberg (2009) 10. Kim, S.-J., Aono, M., Hara, M.: Tug-of-war model for the two-bandit problem: Nonlocally-correlated parallel exploration via resource conservation. BioSystems (to appear) 11. Robbins, H.: Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc. 58, 527–536 (1952) 12. Thompson, W.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25, 285–294 (1933) 13. Gittins, J., Jones, D.: A dynamic allocation index for the sequential design of experiments. In: Gans, J. (ed.) Progress in Statistics, pp. 241–266. North Holland, Amsterdam (1974) 14. Gittins, J.: Bandit processes and dynamic allocation indices. J. R. Stat. Soc. B 41, 148–177 (1979) 15. Lai, T., Robbins, H.: Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics 6, 4–22 (1985) 16. Agrawal, R.: Sample mean based index policies with O(log n) regret for the multiarmed bandit problem. Adv. Appl. Prob. 27, 1054–1078 (1995) 17. 
Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine Learning 47, 235–256 (2002) 18. Vermorel, J., Mohri, M.: Multi-armed bandit algorithms and empirical evaluation. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L., et al. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 437–448. Springer, Heidelberg (2005) 19. Sutton, R., Barto, A.: Reinforcement learning: An introduction. MIT Press, Cambridge (1998) 20. Daw, N., O’Doherty, J., Dayan, P., Seymour, B., Dolan, R.: Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006) 21. Cohen, J., McClure, S., Yu, A.: Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Phil. Trans. R. Soc. B 362 (1481), 933–942 (2007)
Characterising Enzymes for Information Processing: Towards an Artificial Experimenter Chris Lovell, Gareth Jones, Steve R. Gunn, and Klaus-Peter Zauner School of Electronics and Computer Science, University of Southampton, UK, SO17 1BJ {cjl07r,gj07r,srg,kpz}@ecs.soton.ac.uk
Abstract. The information processing capabilities of many proteins are currently unexplored. The complexities and high dimensional parameter spaces make their investigation impractical. Difficulties arise as limited resources prevent intensive experimentation to identify repeatable behaviours. To assist in this exploration, computational techniques can be applied to efficiently search the space and automatically generate probable response behaviours. Here an artificial experimenter is discussed that aims to mimic the abilities of a successful human experimenter, using multiple hypotheses to cope with the small number of observations practicable. Coupling this approach with a lab-on-chip platform currently in development, we seek to create an autonomous experimentation machine capable of enzyme characterisation, which can be used as a tool for developing enzymatic computing.
1 Introduction
Elementary molecular computing has been composed of synthetic molecules used as logical operators [1]. Biomolecules too, for example DNA [2] and enzymes [3], have been employed as Boolean logic gates. However, given the structural complexity of enzymes, and recognising the influence the chemical environment has over enzymatic behaviour, it would appear that enzymatic behaviour is not limited to simple Boolean logic. Instead, by moving beyond mimicking digital electronics, characterising the response behaviour of enzymes could support new modes of information processing, and ultimately facilitate the application of enzymatic computers [3]. However, resources are typically very limited compared to the large parameter spaces, preventing detailed investigation of behaviours. An effective choice of experiments and a physical platform that minimises resource requirements per experiment would therefore be desirable. In consideration here is an artificial experimenter, comprising a set of machine learning techniques that analyse experimental observations to propose possible hypotheses and determine the experiments to perform to test those hypotheses. A key problem for the artificial experimenter is how to produce hypotheses from small numbers of observations. Additionally, as experiments can in some cases be unreliable, resulting in erroneous observations not representative of the true underlying behaviours, the development of hypotheses must also take into consideration the validity of an observation. We use a multiple hypotheses technique, whereby observations that the artificial experimenter considers potentially erroneous are treated as both erroneous and valid in parallel through multiple competing hypotheses, until further experimentation can provide clarity. The ability of a computational system to consider many thousands of hypotheses simultaneously provides an advantage over a human experimenter, who can contemplate only a much more limited number of hypotheses. Also in development is a lab-on-chip platform, where multiple laboratory functions, for instance on-chip optical absorbance measurement, pumping and mixing, are fully automated [4]. The platform uses microfluidic technology to minimise consumption of chemicals per experiment [5]. As shown in Fig. 1, the artificial experimenter described here can be coupled to such an automated platform to allow for autonomous experimentation. In the following, existing techniques for autonomous experimentation are considered in Section 2, the specification of requirements and prototype methods for an artificial experimenter are presented in Sections 3, 4 and 5, and initial results are considered in Section 6.

C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 81–92, 2010. © Springer-Verlag Berlin Heidelberg 2010

Fig. 1. Flow of experimentation between an artificial experimenter and an automated experimentation platform. A prototype of the lab-on-chip platform in development is shown.
2 Existing Approaches
Active learning provides many techniques for determining the next experiment to perform [6,7]. However, these techniques often take a simplified view of experimentation, for example where the problem is binary classification and there is no noise in the classification [8]. They typically provide mathematical guarantees, such as convergence to the optimal hypothesis or decreases in prediction error. However, both the assumptions made and the high number of experiments required make these approaches infeasible for the problem domains of interest here. Approaches that do address more complicated problems, such as response prediction with noisy observations, are evaluated on the basis of several hundred experiments per parameter dimension [9,10], which is not suitable for the present problem. Existing artificial experimenter techniques that have been applied to real experimentation often take more ad hoc approaches, so as to mimic the techniques employed by successful experimenters. For example, the work of Hans Krebs to discover the urea cycle was used as a case study to develop heuristics for scientific discovery in the KEKADA system [11]. One of the heuristics proposed was that of investigating surprising observations, which was also considered an important heuristic in an approach called Scouting [12]. The heuristic of surprise can be applied in the present approach, as a surprising observation, that is, an observation that disagrees with a hypothesis, suggests the discovery of a behaviour not captured by the hypothesis. However, both the KEKADA and Scouting approaches have limited effectiveness in the present problem domain: KEKADA because its other heuristics require explicit a priori domain information, which is not available in the present domain, and Scouting because of its restricted technique for representing the required response behaviours. The requirement of extensive domain information to develop mechanistic hypotheses makes the Robot Scientist approach [13] not applicable to enzymatic characterisation, where the required prior information does not exist and mechanistic hypotheses are not required. An approach using regression to identify response behaviours in electrochemistry, although well suited to identifying interesting phenomena, is restricted by requiring a high number of experiments and by not considering the case of erroneous observations [14]. Symbolic regression has also been applied to rediscover physical laws [15]; however, this technique once more required a large number of observations.
3 Problem Specification
Currently, few models of enzyme dynamics exist. Therefore, to evaluate these artificial experimenter techniques, simulated experiments are used to provide a known target function for comparison. In formulating the problem framework, an experiment is defined as a set of parameters and actions, represented as a vector x. Similarly, the observation for an experiment is represented as a vector y, considered here to contain only a single element. The underlying behaviour present in the experiment parameter space can then be represented as a function:

y = f(x + δ) + ε + φ  (1)

with Gaussian noise affecting the experiment parameters and observations through δ and ε, respectively. In addition, a shock noise term φ is included that can shift the observation. Shock noise accounts for failures within experimentation that yield an observation unrepresentative of the true underlying behaviour. The shock noise can be specified as a percentage of experiments that will lead to erroneous observations, where only that percentage of experiments yields φ ≠ 0. This simple yet general noise model is assumed for testing, as the actual noise model is not known. Having constructed the problem framework, the goals of the experimentation algorithms can be defined. A theoretical goal is to find an accurate representation of f(x); however, in reality, accuracy may be limited by the resources available. Therefore, key aspects, such as any peaks, troughs and sharp changes in behaviour, are to be identified, along with the general trend of the behaviour.
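A sketch of this observation model, using a scalar x for simplicity (the function name and parameter names are our own, not from the paper):

```python
import random

def simulated_experiment(f, x, sigma_x, sigma_y, shock_rate, shock_size, rng):
    """Observation model of Eq. (1): y = f(x + delta) + eps + phi.

    delta, eps -- Gaussian noise on the experiment parameter and observation;
    phi        -- shock noise: non-zero for a fraction `shock_rate` of
                  experiments, modelling failed experiments whose observation
                  is unrepresentative of the underlying behaviour.
    """
    delta = rng.gauss(0.0, sigma_x)
    eps = rng.gauss(0.0, sigma_y)
    phi = shock_size if rng.random() < shock_rate else 0.0
    return f(x + delta) + eps + phi
```

Setting all noise terms to zero recovers the target function exactly, which is convenient when testing the hypothesis-management machinery.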
The existence and knowledge of repeatable behaviours may form the basis of properties harnessed for enzymatic computation. Later, optimisation at particular points of interest can be used to refine accuracy. Additionally, the artificial experimenter should attempt to identify erroneous observations, to prevent them from incorrectly influencing the hypotheses, as discussed next.
4 Hypothesis Management
Responses from enzyme interaction experiments may be nonmonotonic and in rare situations include a phase change [3]. Therefore, response hypotheses need to be general so as to allow different nonmonotonic functions, but also flexible to allow for abrupt feature changes such as phase changes. For these reasons, hypotheses are represented here using a smoothing spline. A smoothing spline is an established regression technique that can be placed within a Bayesian framework to provide error bars and does not impose a particular spectral scale [16]. A single hyperparameter (λ) controls the smoothness of the regression, with a higher value representing a smoother output. A hypothesis is therefore determined by the smoothing parameter λ, the set of observations to train from, and weightings on those observations. The weights are determined through the procedure described below to handle erroneous observations. Techniques such as outlier identification, are typically employed to determine whether it is statistically likely for a particular observation to be valid when compared to all other observations. However, with only a few observations available, these techniques cannot be applied with any acceptable level of certainty. Take the example in Fig. 2. A reasonable suggestion for the given data would be either the linear prediction in h3 or the curve of h4 , which both would be acceptable outcomes from regression using cross validation or similar to learn the parameters. However, if any of the observations are invalid, those hypotheses become less reasonable. Instead a multiple hypotheses approach, where different hypotheses not only provide different response curve predictions, but also have different views about the validity of observations, can be employed. 
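As an illustration of how the smoothing parameter λ and per-observation weights enter such a fit, the discrete analogue below is a Whittaker-style penalized least-squares smoother, not the paper's exact cubic smoothing spline; the function name is ours:

```python
import numpy as np

def weighted_smoothing(y, weights, lam):
    """Discrete weighted smoothing (Whittaker-style penalized least squares):
    minimise  sum_i w_i * (y_i - f_i)^2  +  lam * sum (second diff of f)^2.
    A larger lam gives a smoother f; zero-weighted points are ignored,
    mirroring the zero weight given to suspected erroneous observations."""
    n = len(y)
    W = np.diag(np.asarray(weights, dtype=float))
    D = np.diff(np.eye(n), n=2, axis=0)   # (n-2) x n second-difference matrix
    A = W + lam * (D.T @ D)
    return np.linalg.solve(A, W @ np.asarray(y, dtype=float))
```

Giving an observation zero weight removes it from the fit entirely, so the smoother interpolates across a suspected outlier rather than chasing it.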
By assigning hypotheses a confidence based on how well they represent the observations, all hypotheses can be maintained, so that hypotheses that fail to match some early observations but succeed in matching later observations can recover and become more confident. Hypotheses are generated as follows. Given a number of previous observations, a set of hypotheses will have been generated as splines with differing smoothing parameters. On obtaining a new experimental observation, the previous hypotheses are checked against the observation to determine whether or not they are in agreement. If the observation lies outside the 95% confidence interval of a particular hypothesis, h_original, the observation is declared to be in disagreement with that hypothesis. To handle the problem of determining whether it is the hypothesis or the observation that is invalid, the disagreeing observation is then considered both valid and invalid through two refined hypotheses, h_valid and h_invalid respectively. Both h_valid and h_invalid are based upon
Fig. 2. Validity of observations affecting hypothesis proposal. Hypotheses (lines) are formed after observations (crosses) are obtained. In (a) h1 is formed after observation A and B are obtained, however the effectiveness of the hypothesis is questioned by further observation C. Further hypotheses h2 and h3 consider A, B and C valid, but with differing noise estimates. In (b) D is obtained to test the discrepancy between h1 and C, indicating that h1 is unlikely. Hypothesis h4 considers all observations valid, and hypothesis h5 questions the validity of B.
h_original, which is left unchanged within the current working set of hypotheses. The suspected erroneous observation is weighted accordingly in the smoothing spline calculation, with a high weight (currently set arbitrarily at 100) in h_valid and with a weight of zero in h_invalid. The high weight forms a hypothesis that believes the observation to be true and forces the spline to pass near to the observation. The zero-weighted observation forms a hypothesis that considers the observation to be invalid and removes it from consideration when the spline is trained. Both h_valid and h_invalid are then added to the working set of hypotheses. With a working set of possible hypotheses, the next task is to determine the confidence of each hypothesis.

4.1 Evaluating a Hypothesis
As discussed previously, erroneous observations make hypothesis proposal a harder task, where any outliers could be due to an erroneous observation or an invalid hypothesis. Subsequently, methods for evaluating these hypotheses also need to take into consideration that the observations obtained may be erroneous. Forcing hypotheses to be evaluated against an erroneous observation, will not lead to the identification of hypotheses that fit the underlying phenomena well. For example, if observation B in Fig. 2 is invalid, but the hypotheses are evaluated against all observations, then hypotheses h2 , h3 and h4 will have higher confidences than the actual better fitting hypothesis h5 , as they pass closer to all observations. Ideally, the observations obtained after a hypothesis is created would be used solely for hypothesis evaluation, however we again have to assume that these will be limited and that their validity will not be guaranteed. As such, using a standard mean squared error approach for evaluation appears
C. Lovell et al.
not the best strategy. Instead, hypotheses should be allowed to ignore a certain number of selected observations without penalty. A prediction of the likely percentage of erroneous observations may be determined either through previous experience with the experimental hardware, or by selecting a worst-case value that, if surpassed, would make the outcome of the experimentation not credible. Using this predicted percentage error, we can determine the number of observations a hypothesis can disregard without penalty. The observations a hypothesis may want to disregard are those that the hypothesis determines are invalid, so that the hypothesis is not evaluated on its ability to match observations it hypothesises are erroneous. Next, a prototype equation for evaluating a hypothesis is considered. The current prototype evaluation metric revises an existing metric for evaluating a smoothing spline [16], but uses only the observations that trained the hypothesis, so as to act as an overfitting test that gives a high value for a hypothesis that does not overfit the data:

\upsilon(h) = \left( 1 + \frac{\sum_{i=1}^{\bar{n}} \bar{w}_i \left( y_i - \hat{h}(x_i) \right)^2}{\bar{n} \left( 1 - \operatorname{tr}(A)/\bar{n} \right)^2} \right)^{-1}    (2)
where \bar{w}_i is 0 when observation i is declared invalid by the hypothesis and 1 otherwise, \hat{h}(x_i) is the prediction of the hypothesis for the experiment parameter x_i, y_i is the actual experimental observation for that same experiment, \bar{n} is the number of observations the hypothesis has declared valid from the training set, and A is the hat matrix of the smoothing spline. The observations not used to train the hypothesis evaluate it in a mean squared error fashion, presented here inverted to provide a confidence between 0 and 1:

\gamma(h) = \frac{\bar{m}^2}{\bar{m}^2 + \sum_{i=1}^{m} \bar{w}_i \left( \bar{y}_i - \hat{h}(\bar{x}_i) \right)^2}    (3)

where \bar{m} is the number of test observations the hypothesis believes to be valid, \bar{x} and \bar{y} in this instance represent the test experiment-observation pairs, and \bar{w}_i is 1 if the hypothesis believes the test observation to be valid and 0 otherwise. The confidence is currently defined as:

C(h) = \left( \upsilon(h) + \gamma(h) \right) e(h)    (4)
where e(h) is the penalty term for a hypothesis declaring greater than a given percentage p of the observations erroneous:

e(h) = \begin{cases} \dfrac{1}{1 + \left[ (n+m) - (\bar{n}+\bar{m}) \right] - p(n+m)} & \text{if } (n+m) - (\bar{n}+\bar{m}) > p(n+m) \\ 1 & \text{otherwise} \end{cases}    (5)

where n and m are the total numbers of training and test observations.
Artificial Experimenter for Biomolecular Computation
5 Experiment Selection
Experiment selection, the active learning component of an artificial experimenter, should look to explore the experimental parameter space to discover behaviours not captured by the hypotheses, whilst also looking to find evidence to discriminate between competing hypotheses. The experiment selection algorithm should automatically balance exploration against gaining information that elucidates differences between the hypotheses. This trade-off is described as an exploration-exploitation trade-off [17]. Hypotheses proposed under the above scheme will mostly disagree when there are differing views on the validity of observations. If an observation has caused a difference because it is invalid, the experiment selection method should investigate the difference, realise it was caused by an error and then continue to search elsewhere. However, if the difference is caused by hypotheses failing to model the underlying behaviour and the observations are valid, further experimentation should be performed to capture the behaviour. The prototype approach uses a strategy of placing experiments maximally far from previous experiments as the exploration strategy. For each proposed experiment x, the minimum Euclidean distance to any previously performed experiment x' ∈ X is calculated:

\zeta(x) = \min_{x' \in X} \left| x - x' \right|    (6)
For determining discrepancy, the prototype approach looks to place experiments where the variance in hypothesis predictions is maximal, similar to the approach used, albeit unsuccessfully, in [10]. The lack of success of the approach described in [10] appears to be due in part to the sole use of a variance reduction strategy, the low number of hypotheses contemplated in parallel, and the polynomial kernel functions used. Here the variance calculation is weighted by the confidence of each hypothesis, so that weak hypotheses do not overly influence the decision:

\xi(x) = k \sum_{i=1}^{N} C(h_i) \left( \hat{h}_i(x) - \mu^* \right)^2    (7)

where N is the number of hypotheses, \mu^* is the mean of the hypothesis predictions at x, and k is a normalising constant.
To link the exploration and exploitation strategies, this approach sums the normalised values of the evaluations in (6) and (7). More sophisticated management of exploration and exploitation may be worthwhile; however, the approach here is able to demonstrate the desired behaviour. The following is the prototype experiment selection strategy, where the hyperparameter Γ controls the preference for exploration over exploitation:

x_{\mathrm{perform}} = \operatorname*{arg\,max}_{x \in X} \left( \frac{\Gamma\,\zeta(x)}{\max_{t \in X} \zeta(t)} + \frac{(1-\Gamma)\,\xi(x)}{\max_{t \in X} \xi(t)} \right)    (8)
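The selection rule of Eq. (8), combining the exploration score ζ of Eq. (6) with the confidence-weighted prediction variance ξ of Eq. (7), might be sketched as follows for a one-dimensional parameter space; the function name, candidate handling and tie-breaking details are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def select_experiment(candidates, performed, hypotheses, confidences, gamma_pref=0.5):
    """Pick the candidate experiment maximising the normalised sum of
    exploration (Eq. 6) and exploitation (Eq. 7) scores, as in Eq. (8).
    `hypotheses` are callables x -> predicted observation."""
    candidates = np.asarray(candidates, dtype=float)
    performed = np.asarray(performed, dtype=float)
    C = np.asarray(confidences, dtype=float)

    # zeta(x): minimum distance to any previously performed experiment
    zeta = np.array([np.min(np.abs(performed - x)) for x in candidates])

    # xi(x): confidence-weighted variance of the hypothesis predictions,
    # around the mean prediction mu at each candidate point
    preds = np.array([[h(x) for h in hypotheses] for x in candidates])
    mu = preds @ C / C.sum()
    xi = np.array([(C * (preds[i] - mu[i]) ** 2).sum()
                   for i in range(len(candidates))])

    score = (gamma_pref * zeta / zeta.max()
             + (1.0 - gamma_pref) * xi / max(xi.max(), 1e-12))
    return candidates[np.argmax(score)]
```

With Γ = 0.5 and two equally confident hypotheses that disagree most at the edges of the space, the rule picks a far-away point where the hypotheses also disagree, balancing both criteria.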
6 Preliminary Results
To evaluate the hypothesis management and experiment selection algorithms, we consider the number of experiments required to create a good representation of the phenomenon. Additionally, we consider how different experiment selection strategies alter the predicted response behaviour under investigation. To evaluate this, we simulate experiments using a target function that returns observations for requested experiment parameters. These target functions are designed to be general yet representative of possible responses from enzyme interaction experiments, where we would expect continuous non-monotonic behaviours, with additional phase changes in some circumstances [3]. The model applies additive Gaussian noise N(0, 0.5²) to all observations (ε in (1)), and has a 5% rate of generating observations distorted by shock noise (φ in (1)). Independent variable noise (δ in (1)) has been set to 0, as in early trials altering this value had surprisingly little effect on the ability of the approach demonstrated here. The hypotheses are restricted to λ values of 0.01, 0.001 and 0.0005, determined a priori for use with coded independent variables, as they give good flexibility to the spline. In Fig. 3 we demonstrate the placement of experiments for a simulated underlying phenomenon, using the experiment selection algorithm in (8) with Γ = 0.5. For clarity, we separate here the issue of outlying observations caused by the hypothesis being incorrect from that of outlying observations caused by the observations being erroneous. In Fig. 3(a) and (b) we develop situations where observations will appear as outliers, but are actually true representations of the underlying phenomenon, achieved through the use of a discontinuity between two distinct behaviours.
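The observation model used in these trials (additive Gaussian noise on every observation, an occasional shock producing an erroneous observation, and no independent-variable noise) can be sketched as below; the function and parameter names are illustrative, not from the original implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def observe(target, x, sigma=0.5, shock_rate=0.05, shock_mu=10.0, shock_sigma=1.0):
    """Return a simulated observation of target(x), sketching Eq. (1):
    additive Gaussian noise (epsilon) on every observation, plus a
    shock_rate chance of shock noise (phi) producing an erroneous
    observation; independent-variable noise (delta) is set to 0."""
    y = target(x) + rng.normal(0.0, sigma)      # epsilon: measurement noise
    if rng.random() < shock_rate:               # phi: occasional shock noise
        y += rng.normal(shock_mu, shock_sigma)
    return y
```

Over many simulated observations of a flat target, roughly 1 in 20 values is displaced by about 10 units, mimicking the erroneous observations the hypothesis manager must tolerate.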
The discontinuity provides observations that are close to each other in the experiment parameter space, yet are significantly different in their observed values; these observations appear to disagree, but are actually correct. In the example given, when the 9th experiment is performed, the observation will appear not to fit the current hypotheses, as the 1st and 5th observations either side of the 9th observation gave lower observation values. In Fig. 3(a), a good representation of the underlying phenomenon is identified by the approach after 10 experiments, using the confidence metric in (4). In this case the 5th and 9th observations have been automatically weighted, as the observations did not match the predictions of the previous hypotheses. The effect of this weighting is to force the hypothesis to pass through the transition region. Without the weightings, the smoothing spline would produce a hypothesis that averaged through the data points in that region. In Fig. 3(b), the majority of the additional data points obtained after Fig. 3(a) are in the region of the discontinuity. Here the discontinuity produces a large number of hypotheses that have differing views of the observations found in that region. The exploitation scores of proposed experiments in this region will therefore be high, causing more experiments to be performed there. There is still, however, some exploration occurring, as can be seen from observation 12. In Fig. 3(c) and (d) the case of erroneous observations is considered. Here we consider a simpler underlying phenomenon, similar to that found in [9] and [10],
Fig. 3. Most confident hypothesis produced by the artificial experimenter for two target functions. Shown is the most confident hypothesis (solid line) with error bars shaded, underlying phenomena (dotted line) and observations (dots) numbered in order performed. The presence of a discontinuity (a and b) or an erroneous observation (c and d) causes additional experiments to be performed near those features, whilst also ensuring the entire parameter space is explored.
but here the 5th observation provided was erroneous. In this case, there is continued exploration of the space after the 5th observation is obtained. With observations 6, 8 and 11 not supporting the 5th observation, there are fewer confident alternative hypotheses in the region of the erroneous observation compared to the region of discontinuity in Fig. 3(a) and (b); therefore there is less focus on exploitation-promoting experiments. In Fig. 4 we demonstrate problems caused by using alternative active search strategies. Figure 4(a) considers a fully explorative search, similar to that of reducing error bar uncertainty, where the erroneous observation causes part of the phenomenon to be missed. In Fig. 4(b), a fully exploitative strategy similar to that used in [10] results in experiments being located near the erroneous observation that has caused the hypotheses to differ. Fig. 4(c) and (d) demonstrate how the approach described here is able to form reasonable predictions of the underlying phenomena after a small number of experiments. Finally, in Fig. 5, we compare the reduction in error between the most confident hypothesis and the true underlying behaviour, for the present active experiment selection technique and a passive technique. With the passive technique, the artificial experimenter cannot select the next experiment to perform; in
Fig. 4. Effect of different experiment proposal strategies. Shown is the most confident hypothesis representation (solid line) of a simulated underlying phenomenon (dotted line), with observations (dots) numbered in the order performed. An erroneous observation causes the explorative search (a) and exploitative search (b) to fail to characterise the target function. In (c) and (d) the mixed strategy is shown for two different underlying phenomena, with shock noise in (c).
this case the learner is presented with observations from random experiments. The underlying behaviour used in Fig. 4(a-c) is again used, with Gaussian noise N(0, 0.5²). In Fig. 5(a) there is no shock noise, and in Fig. 5(b) any 1 of the 20 experiments can be erroneous. We demonstrate that in both scenarios the present experiment selection technique outperforms the passive technique. The most significant advantage of the present approach is in the early stages of experimentation, which is ideal in a situation where we may have at most 10 experiments per dimension. As expected, performance is worse when shock noise is applied. However, as an erroneous observation can occur at any stage of experimentation and is distorted by Gaussian noise N(10, 1), those runs where the erroneous observation occurs in the first few experiments will suffer more than those where it occurs later, causing larger error. In this instance the random technique performs far worse, with an average mean squared error of 30 compared to just under 5 for the active strategy, as shown in the inset of Fig. 5(b). Regardless of when the erroneous observation occurs, the results demonstrate that the hypothesis management technique recovers to provide reasonable hypotheses. However, by employing the active strategy, this recovery can occur more quickly than through random experiment selection.
Fig. 5. Error reduction between the most confident hypothesis and the target function over a number of experiments. Shown is the average error after 100 trials for the active strategy (solid line) and passive strategy (dotted line) for the single discontinuity behaviour, with no shock noise in (a), and with 1 of the 20 experiments in each trial erroneous in (b).
7 Discussion
The union of machine learning and automated laboratory hardware can allow for effective investigation of complex experiment parameter spaces. In particular, the lab-on-chip automated platform being developed in parallel will significantly reduce the amount of chemical resources required per experiment [4], whilst the artificial experimenter utilises relatively cheap computational resources to gain as much information as possible from small sets of observations. In situations where hypotheses cannot correctly specify the underlying behaviour and high noise exists, artificial experimenters with few observations available can become misled, resulting in poor representations of the underlying phenomena. However, we have shown that by extending a variance-based approach to better manage the exploration-exploitation trade-off, and by considering a larger corpus of possible hypotheses, such techniques can provide a significant benefit to experimentation with limited resources. This proof-of-principle technique is demonstrated to work in scenarios where existing machine learning techniques will struggle, and shows that response behaviours can be characterised with a very small number of observations. However, we believe improvements can be made in the hypothesis evaluation and experiment selection techniques. Currently the artificial experimenter is designed to build models of the behaviours it identifies. In the future, as we become more accustomed to the information processing capabilities of enzymes, we may wish to design target behaviours we require and then inspect the biological system to see if and where they exist. By modifying the kernel functions used in the hypotheses to those that describe the behaviours required, the present approach can be modified to allow targeted experimentation that searches for particular behaviours, whilst still maintaining the multiple-hypotheses benefits of the present approach.
Overall, the purpose of this autonomous experimentation machine is to allow the human scientist to redirect their time from monotonous experimentation tasks to determining computational functionality from the enzyme behaviours identified.
Acknowledgements. The reported work was supported in part by a Microsoft Research Faculty Fellowship to KPZ.
References

1. de Silva, A.P., Uchiyama, S.: Molecular logic and computing. Nature Nanotechnology 2, 399–410 (2007)
2. Seelig, G., Soloveichik, D., Zhang, D.Y., Winfree, E.: Enzyme-free nucleic acid logic circuits. Science 314, 1585–1588 (2006)
3. Zauner, K.-P., Conrad, M.: Enzymatic computing. Biotechnol. Prog. 17, 553–559 (2001)
4. Jones, G., Lovell, C., Morgan, H., Zauner, K.-P.: Characterising enzymes for information processing: Microfluidics for autonomous experimentation. In: Calude, C.S., et al. (eds.) UC 2010. LNCS, vol. 6079, p. 191. Springer, Heidelberg (2010)
5. Whitesides, G.M.: The origins and future of microfluidics. Nature 442, 368–373 (2006)
6. MacKay, D.J.C.: Information-based objective functions for active data selection. Neural Computation 4, 589–603 (1992)
7. Cohn, D.A., Ghahramani, Z., Jordan, M.I.: Active learning with statistical models. Journal of Artificial Intelligence Research 4, 129–145 (1996)
8. Freund, Y., Seung, H.S., Shamir, E., Tishby, N.: Selective sampling using the query by committee algorithm. Machine Learning 28, 133–168 (1997)
9. Sugiyama, M., Rubens, N.: Active learning with model selection in linear regression. In: SIAM International Conference on Data Mining, pp. 518–529 (2008)
10. Burbidge, R., Rowland, J.J., King, R.D.: Active learning for regression based on query by committee. In: Intelligent Data Engineering and Automated Learning
11. Kulkarni, D., Simon, H.A.: Experimentation in machine discovery. In: Shrager, J., Langley, P. (eds.) Computational Models of Scientific Discovery and Theory Formation, pp. 255–273. Morgan Kaufmann Publishers, San Mateo (1990)
12. Pfaffmann, J.O., Zauner, K.P.: Scouting context-sensitive components. In: The Third NASA/DoD Workshop on Evolvable Hardware, EH 2001, pp. 14–20 (2001)
13. King, R.D., Whelan, K.E., Jones, F.M., Reiser, P.G.K., Bryant, C.H., Muggleton, S.H., Kell, D.B., Oliver, S.G.: Functional genomic hypothesis generation and experimentation by a robot scientist. Nature 427, 247–252 (2004)
14. Żytkow, J.M., Zhu, J., Hussam, A.: Automated discovery in a chemistry laboratory. In: Proceedings of the 8th National Conference on Artificial Intelligence, Boston, MA, pp. 889–894. AAAI Press/MIT Press (1990)
15. Schmidt, M., Lipson, H.: Distilling free-form natural laws from experimental data. Science 324(5923), 81–85 (2009)
16. Wahba, G.: Bayesian "confidence intervals" for the cross-validated smoothing spline. J. R. Statist. Soc. B 45(1), 133–150 (1983)
17. Auer, P.: Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research 3, 397–422 (2002)
Majority Adder Implementation by Competing Patterns in Life-Like Rule B2/S2345

Genaro J. Martínez¹,³, Kenichi Morita², Andrew Adamatzky³, and Maurice Margenstern⁴

¹ Instituto de Ciencias Nucleares and Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, México DF
[email protected]
² Hiroshima University, Higashi-Hiroshima 739-8527, Japan
[email protected]
³ Bristol Institute of Technology, University of the West of England, Bristol, United Kingdom
[email protected]
⁴ Laboratoire d'Informatique Théorique et Appliquée, Université de Metz, Metz Cedex, France
[email protected]
Abstract. We study the Life-like cellular automaton rule B2/S2345. This automaton exhibits chaotic behaviour yet is capable of purposeful computation. The automaton implements Boolean gates via patterns which compete for space as they propagate in channels. Values of Boolean variables are encoded into two types of patterns: symmetric (False) and asymmetric (True). We construct basic logical gates and elementary arithmetical circuits by simulating logical signals using glider reactions taking place in channels built of non-destructible still lifes. We design a binary adder of majority gates realised in rule B2/S2345.
1 Introduction
There are plenty of computing devices 'made of' Conway's Game of Life (GoL) cellular automaton [13]. Examples include a complete set of logical functions [32], a register machine [8], a direct simulation of a Turing machine [9,31], and the design of a universal constructor [16]. These implementations use principles of collision-based computing [8,1], where information is transferred by gliders propagating in an architecture-less medium. The theoretical result regarding GoL universality is only a tiny step in a long journey towards a real-world implementation of collision-based computers [33]. GoL has a long history in which a number of dedicated researchers obtained significant results on its complex dynamics and computing devices. The first result was published by Gardner [13], followed by a newsletter edited by Wainwright [35,3]. A number of results in GoL have been published, some of them really complicated, for example universal computers/constructors [1,6,8,9,12,14,15,16,26,30,31].

C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 93–104, 2010. © Springer-Verlag Berlin Heidelberg 2010
G.J. Martínez et al.
Previously, in phenomenological studies of semi-totalistic CA [5], we reported a selected set of rules named Life 2c22, identified by their periodic structures [28]. The clan closest to the family 2c22, together with the Diffusion Rule (Life rule B2/S7) [21], belongs to a big cluster named Life dc22¹. In this paper we exploit previous results on the construction of feedback channels with still life patterns (previous studies covering B2/S23456 [20] and B2/S2345678 [22]), reducing the number of cells in state 1 in the evolution rule. Every pattern propagation is initiated by a glider reaction that produces a specific static geometric pattern; the interactions of these patterns when they compete yield a binary value representation. Finally, we design specific initial configurations to implement universal logic gates and a binary adder based on majority gates inside B2/S2345.
2 Life Rule B2/S2345
The dynamics of Life rule B2/S2345 are defined by the following conditions. Each cell takes two states, '0' ('dead') and '1' ('alive'), and updates its state depending on its eight closest neighbours (Moore neighbourhood):

a) Birth: a central cell in state 0 at time step t takes state 1 at time step t + 1 if it has exactly two neighbours in state 1.
b) Survival: a central cell in state 1 at time t remains in state 1 at time t + 1 if it has two, three, four or five live neighbours.
c) Death: all other local situations.

Once a resting lattice is perturbed in B2/S2345 (a few cells are assigned live states), patterns of state 1 emerge, grow and propagate on the lattice quickly. The main characteristic is that gliders and oscillators emerge but do not survive for long.
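A minimal sketch of one synchronous update of this rule follows; the periodic boundary is our choice for brevity, but the local rule is exactly B2/S2345 as stated above.

```python
import numpy as np

def step_b2_s2345(grid):
    """One synchronous update of Life-like rule B2/S2345 on a 2D 0/1
    array with periodic boundary conditions."""
    # Count the eight Moore neighbours of every cell via shifted copies.
    n = sum(np.roll(np.roll(grid, dy, 0), dx, 1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0))
    birth = (grid == 0) & (n == 2)                 # B2: born with exactly 2
    survive = (grid == 1) & (n >= 2) & (n <= 5)    # S2345: survive with 2-5
    return (birth | survive).astype(grid.dtype)
```

For example, a domino of two adjacent live cells dies (each has only one live neighbour) while giving birth to the four dead cells that see both of them, illustrating how quickly patterns grow under B2.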
Fig. 1. Basic periodic structures in B2/S2345: (a) glider, (b) oscillator (flip-flop), (c) oscillator (blinker), and (d) still life configuration
A set of minimal particles, or basic periodic structures, in rule B2/S2345 includes one glider (period one), two oscillators (one blinker and one flip-flop, period two), and one still life configuration (see Fig. 1). The still life pattern [23,11] in B2/S2345 has a relevant characteristic: it is not affected by its environment, yet it does affect its environment [20,22]. Therefore still life patterns can be used to build channels, or wires, for signal propagation.

¹ http://uncomp.uwe.ac.uk/genaro/Life_dc22.html
2.1 Indestructible Still Life Pattern in B2/S2345
Some of the still life patterns in rule B2/S2345 belong to a class of indestructible patterns (sometimes referred to as 'glider-proof' patterns in GoL) which cannot be destroyed by any perturbation, including collisions with gliders. A minimal indestructible pattern, a still life occupying a square of 6 × 6 cells, is shown in Fig. 1d.
Fig. 2. Containment of a growing pattern by indestructible patterns. (a) The first example displays an explosion reaction started from a collision between four gliders (see centre); (b) displays the final configuration stopping this growing pattern. (c) Displays the initial positions of a fleet of gliders outside the box walled by still life configurations in our second example; (d) shows how the interior of the box is protected from the growing pattern.
The indestructible patterns can be used to stop 'supernova' explosions in some Life-like rules. Usually a Life-like automaton started from an arbitrary configuration exhibits unlimited growth (generally related to some kind of nucleation phenomenon [17]). A suitable concatenation of still life configurations avoids such continuous expansion.
In rule B2/S2345 such 'uncontrollable' growth can be prevented by a regular arrangement of indestructible patterns. Examples are shown in Fig. 2. In the first example (Fig. 2a), four gliders collide inside a 'box' made of still life patterns. The collision between the gliders leads to the formation of a growing pattern. The propagation of the pattern is stopped by the indestructible wall (Fig. 2b). In the second example, a fleet of gliders collides outside the box (Fig. 2c); however, the interior of the box remains at rest (Fig. 2d) due to the impenetrable walls. Similarly, one can construct a colony of still life patterns immune to local perturbations. Thus the indestructibility exemplified above allows us to use still life patterns to construct information channels in logical circuits.
3 Computing by Competing Patterns
The easiest way to control patterns propagating in a non-linear medium is to constrain them geometrically. Constraining the medium geometrically is a common technique used when designing computational schemes in spatially extended non-linear media. For example, 'strips' or 'channels' are constructed within the medium (e.g. an excitable medium) and connected together, typically using arrangements such as T-junctions. Fronts of propagating phase (excitation) or diffusive waves represent signals, or values of logical variables. When fronts interact at the junctions, some fronts annihilate or new fronts emerge. The propagation in the output channels represents the result of the computation. Hence we build a computing scheme from channels (long areas of '0'-state cells walled by still life blocks) and T-junctions² (sites where two or more channels join together).
Fig. 3. T -shaped system processing information
Each T-junction consists of two horizontal channels A and B (shoulders), acting as inputs, and a vertical channel C, assigned as an output (Fig. 3). This type of circuitry has already been used to implement an xor gate in chemical laboratory precipitating reaction-diffusion systems [4], and in precipitating logical gates imitated in CA [20,22]. The minimal width of each channel equals three widths of the still life block (Fig. 1d) and the width of a glider (Fig. 1a). Boolean values are represented by the position of gliders, positioned initially in the middle of the channel, value 0 (Fig. 4a), or slightly offset, value 1 (Fig. 4c).
² T-junction based control signals were also suggested in von Neumann's [34] works, and used by Banks [7] and Codd [10] as well.
Fig. 4. Feedback channels constructed with still life patterns. (a) and (c) show the initial states with an empty channel and with one glider, respectively. The symmetric pattern represents value 0 (b), and the asymmetric pattern represents value 1 (d), after the glider reaction.
The initial positions of the gliders determine the outcomes of their reactions. A glider corresponding to the value 0 is transformed into a regular symmetric pattern, similar to frozen waves of excitation activity (Fig. 4b). A glider representing signal value 1 is transformed into a transversally asymmetric pattern (Fig. 4d). Both patterns propagate inside the channel at constant speed, advancing one unit of channel length per step of discrete time.
Fig. 5. Configurations of the delay element for signal '0' ((a) and (b)) and signal '1' ((c) and (d)). (a) and (c) show initial configurations; (b) and (d) final states.
Fig. 6. Implementations of or and and gates in Life rule B2/S2345. Input binary values A and B are represented as 'In/0' or 'In/1'; the output result C is represented by 'Out/0' or 'Out/1'. (a) displays the or gate, and (b) the and gate implementation.
Fig. 7. not gate implementation for input '1' (a,b) and input '0' (c,d). (a) and (c) display initial configurations; (b) and (d) display final configurations.
Fig. 8. (a) Initial configurations: majority input values In/0 (first column) and majority input values In/1 (second column); (b) final configurations of the majority gates.
Fig. 9. Circuit and schematic diagram of a full binary adder comprised of not-majority gates. Delay elements are not shown.
3.1 Implementation of Logic Gates
Our first stage is to implement basic universal logic gates. When patterns representing values 0 and 1 meet at a T-junction they compete for the output channel. Depending on the initial distance between gliders, one of the patterns wins and propagates along the output channel. Along the way we can design a delay element, as shown in Fig. 5, useful for delaying signals (wave propagations) and synchronising multiple collisions. Figure 6a shows a way to implement an or gate. Due to the different locations of gliders in the initial configurations of the gates, the patterns in the two implementations are different; however, the results of the computation are the same. Similarly, a codification to implement the and gate is shown in Fig. 6b. The not gate is implemented using an additional channel, where a control pattern is generated, propagates and interferes with the data-signal pattern. Initial and final configurations of the not gate are shown in Fig. 7. A consequence of this idea is that the number of control channels grows proportionally to the number of gates in the circuit. We accept that this may not be the most elegant or efficient way of constructing a not gate, but it is sufficient for our present purposes.

3.2 Majority Gate
A majority gate on three input values can be represented as the logical proposition (a ∧ b) ∨ (a ∧ c) ∨ (b ∧ c), where the result is precisely the most frequent value among the variables [24]. The implementation of the majority gate in B2/S2345 is shown in Fig. 8. The gate has three inputs, the North, West and South channels, and one output, the East channel.
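The majority function itself is straightforward to express; the following sketch simply confirms that the proposition above returns the most frequent input value:

```python
def maj(a, b, c):
    """Majority of three binary inputs: (a and b) or (a and c) or (b and c).
    Returns the value held by at least two of the three inputs."""
    return (a & b) | (a & c) | (b & c)
```

Checking all eight input combinations shows maj(a, b, c) equals 1 exactly when two or more inputs are 1, which is the behaviour the competing-pattern gate must reproduce.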
Fig. 10. Final configuration of the adder for inputs: (a) main stages; (b) a = 0, b = 0, cin = 0; (c) a = 0, b = 1, cin = 0; (d) a = 1, b = 0, cin = 0; (e) a = 1, b = 1, cin = 0; (f) a = 0, b = 0, cin = 1; (g) a = 0, b = 1, cin = 1; (h) a = 1, b = 0, cin = 1; (i) a = 1, b = 1, cin = 1
Three propagating patterns, which represent the inputs, collide at the cross-junction of the gate. The resultant pattern is recorded at the output channel.

3.3 Implementation of a Binary Adder with not-majority Gates
Here we implement a binary adder constructed of three not-majority gates and two inverters. This type of adder appears in several publications, particularly in the construction of arithmetical circuits in quantum-dot cellular automata [29,36]. The original version of the adder using not-majority gates was suggested by Minsky in his designs of artificial neural networks [24]. Figure 9a shows the classic circuit illustrating the dynamics of this adder, and Fig. 9b represents the scheme of the adder to implement in B2/S2345. The scheme highlights critical points where some extra gates/wires are necessary to adjust inputs and synchronise the times of collisions. Figure 10a presents the most important stages of the full adder on the B2/S2345 evolution space, highlighting the delay stages and not gates. The adder is implemented on a 1,402 × 662 lattice, that is, a lattice of 928,124 cells, with an initial population of 56,759 cells in state '1'. Final configurations of the adder for every initial configuration of inputs are shown in Figs. 10b–i, with a final population of 129,923 alive cells after an average of 1,439 generations³.
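The logic realised by the adder can be checked symbolically. The following sketch uses a standard majority-gate full-adder identity with three majority gates and two inverters; the exact wiring of Fig. 9 may differ in detail, but any such construction must satisfy the same truth table.

```python
def maj(a, b, c):
    # Majority vote of three bits
    return (a & b) | (a & c) | (b & c)

def full_adder(a, b, cin):
    """Full adder built from three majority gates and two inverters,
    a standard construction from quantum-dot cellular automata designs;
    the wiring here is illustrative, not a transcription of Fig. 9."""
    cout = maj(a, b, cin)            # first majority gate: carry out
    aux = maj(a, b, 1 - cin)         # second majority gate, inverted carry-in
    s = maj(1 - cout, cin, aux)      # third majority gate: sum bit
    return s, cout
```

Exhaustively enumerating the eight input combinations confirms that 2·cout + sum always equals a + b + cin.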
4 Conclusions
We have demonstrated that the chaotic rule B2/S2345 supports complex patterns. This provides another case in which a chaotic CA exhibits non-evident complex behaviour [21,25], and shows how such systems can perform computation in their evolution space from particular initial conditions. We have shown how to construct basic logic gates and arithmetic circuits by restricting the propagation of patterns to channels built from indestructible still-life blocks. However, we recognize a number of limitations of this model. A disadvantage of the presented approach is that the computing space is geometrically constrained and the computation is one-time-use. At present we also lack a way to implement the signal-crossing and fan-out gates that are essential for a full circuit with feedback. Nevertheless, the geometrical constraints bring some benefits as well. Most computing circuits in Life-like automata use very complex collision dynamics between gliders and still lifes [9,31,16]; in our case gliders are used only to 'ignite' propagating patterns in the channels [4,35]. Let us consider how one-time-use computation can be exploited. We could employ the well-known concept of cascade circuits (see Fig. 11). A cascade circuit is one without feedback in which each box contains a logic circuit realizing the local function of the CA, itself also a cascade circuit. Since it has no feedback,
³ To view enlarged pictures or videos of the simulations, please visit http://uncomp.uwe.ac.uk/genaro/Life_dc22.html
Fig. 11. Simulating a radius-1/2 CA by an infinite (but periodic) cascade circuit
each logic gate is used only once. This way, an initial state of each cell should be set at position t = 0. These results are potentially useful in the search for ways to control large volumes of data in non-linear media. The Life-like rule B2/S2345 is here only an example of how data can be controlled, as single-bit values, in a deterministic setting. We will also explore and develop more complex flows of data, as has been done in reversible devices [19,27]. Consider the problem of controlling chemical wave propagation in reaction-diffusion computers [4]: a competing pattern can represent a fragment of each wave. In future studies we plan to implement the computing architecture designed in this paper to manufacture experimental prototypes of precipitating chemical computers, based on the crystallization of 'hot ice' [2]. Implementations and constructions were done with the Golly system⁴. Source configurations and specific initial conditions (RLE files) to reproduce these results are available on the 'Life dc22' home page⁵.
Acknowledgement. Genaro J. Martínez was partially funded by the Engineering and Physical Sciences Research Council (EPSRC), United Kingdom, grant EP/F054343, and by DGAPA UNAM, Mexico. Kenichi Morita was partially funded by Grant-in-Aid for Scientific Research (C) No. 21500015 from JSPS.
⁴ http://golly.sourceforge.net/
⁵ http://uncomp.uwe.ac.uk/genaro/Life_dc22.html
References

1. Adamatzky, A. (ed.): Collision-Based Computing. Springer, Heidelberg (2002)
2. Adamatzky, A.: Hot ice computer. Physics Letters A 374(2), 264–271 (2009)
3. Adamatzky, A. (ed.): Game of Life Cellular Automata. Springer, Heidelberg (2010)
4. Adamatzky, A., Costello, B.L., Asai, T.: Reaction-Diffusion Computers. Elsevier, Amsterdam (2005)
5. Adamatzky, A., Martínez, G.J., Seck-Tuoh-Mora, J.C.: Phenomenology of reaction-diffusion binary-state cellular automata. Int. J. Bifurcation and Chaos 16(10), 1–21 (2006)
6. Adachi, S., Peper, F., Lee, J., Umeo, H.: Occurrence of gliders in an infinite class of Life-like cellular automata. In: Umeo, H., Morishita, S., Nishinari, K., Komatsuzaki, T., Bandini, S. (eds.) ACRI 2008. LNCS, vol. 5191, pp. 32–41. Springer, Heidelberg (2008)
7. Banks, E.R.: Information Processing and Transmission in Cellular Automata. Ph.D. thesis, Department of Mechanical Engineering, MIT (1971)
8. Berlekamp, E.R., Conway, J.H., Guy, R.K.: Winning Ways for your Mathematical Plays, ch. 25, vol. 2. Academic Press, London (1982)
9. Chapman, P.: Life Universal Computer (2002), http://www.igblan.free-online.co.uk/igblan/ca/
10. Codd, E.F.: Cellular Automata. Academic Press, London (1968)
11. Cook, M.: Still Life Theory. In: [15], pp. 93–118 (2003)
12. Eppstein, D.: Growth and decay in Life-like cellular automata. arXiv:0911.2890v1 (nlin.CG) (2009)
13. Gardner, M.: Mathematical Games - The fantastic combinations of John H. Conway's new solitaire game Life. Scientific American 223, 120–123 (1970)
14. Griffeath, D., Moore, C.: Life Without Death is P-complete. Complex Systems 10, 437–447 (1996)
15. Griffeath, D., Moore, C. (eds.): New Constructions in Cellular Automata. Oxford University Press, Oxford (2003)
16. Goucher, A.: Completed Universal Computer/Constructor (2009), http://pentadecathlon.com/lifeNews/2009/08/post.html
17. Gravner, J.: Growth Phenomena in Cellular Automata. In: [15], pp. 161–181 (2003)
18. Hameroff, S.R.: Ultimate Computing: Biomolecular Consciousness and Nanotechnology. Elsevier Science Publishers BV, Amsterdam (1987)
19. Imai, K., Morita, K.: A computation-universal two-dimensional 8-state triangular reversible cellular automaton. Theoret. Comput. Sci. 231, 181–191 (2000)
20. Martínez, G.J., Adamatzky, A., Costello, B.L.: On logical gates in precipitating medium: cellular automaton model. Physics Letters A 1(48), 1–5 (2008)
21. Martínez, G.J., Adamatzky, A., McIntosh, H.V.: Localization dynamic in a binary two-dimensional cellular automaton: the Diffusion Rule. arXiv:0908.0828v1 (cs.FL) (2009)
22. Martínez, G.J., Adamatzky, A., McIntosh, H.V., Costello, B.L.: Computation by competing patterns: Life rule B2/S2345678. In: Adamatzky, A., et al. (eds.) Automata 2008: Theory and Applications of Cellular Automata. Luniver Press (2008)
23. McIntosh, H.V.: Life's Still Lifes (1988), http://delta.cs.cinvestav.mx/~mcintosh
24. Minsky, M.: Computation: Finite and Infinite Machines. Prentice-Hall, Englewood Cliffs (1967)
25. Mitchell, M.: Life and evolution in computers. History and Philosophy of the Life Sciences 23, 361–383 (2001)
26. Magnier, M., Lattaud, C., Heudin, J.-K.: Complexity Classes in the Two-dimensional Life Cellular Automata Subspace. Complex Systems 11(6), 419–436 (1997)
27. Morita, K., Margenstern, M., Imai, K.: Universality of reversible hexagonal cellular automata. Theoret. Informatics Appl. 33, 535–550 (1999)
28. Martínez, G.J., Méndez, A.M., Zambrano, M.M.: Un subconjunto de autómata celular con comportamiento complejo en dos dimensiones (2005), http://uncomp.uwe.ac.uk/genaro/Papers/Papers_on_CA.html
29. Porod, W., Lent, C.S., Bernstein, G.H., Orlov, A.O., Amlani, I., Snider, G.L., Merz, J.L.: Quantum-dot cellular automata: computing with coupled quantum dots. Int. J. Electronics 86(5), 549–590 (1999)
30. Packard, N., Wolfram, S.: Two-dimensional cellular automata. J. Statistical Physics 38, 901–946 (1985)
31. Rendell, P.: Turing universality of the Game of Life. In: [1], pp. 513–540 (2002)
32. Rennard, J.P.: Implementation of Logical Functions in the Game of Life. In: [1], pp. 491–512 (2002)
33. Toffoli, T.: Non-Conventional Computers. In: Webster, J. (ed.) Encyclopedia of Electrical and Electronics Engineering, vol. 14, pp. 455–471. Wiley & Sons, Chichester (1998)
34. von Neumann, J.: Theory of Self-reproducing Automata. Burks, A.W. (ed. and completed). University of Illinois Press, Urbana (1966)
35. Wainwright, R. (ed.): Lifeline - A Quarterly Newsletter for Enthusiasts of John Conway's Game of Life, vol. 1–11 (March 1971–September 1973)
36. Walus, K., Schulhof, G., Zhang, R., Wang, W., Jullien, G.A.: Circuit design based on majority gates for applications with quantum-dot cellular automata. In: Proceedings of IEEE Asilomar Conference on Signals, Systems, and Computers (2004)
Solving Partial Differential Equation via Stochastic Process Jun Ohkubo Institute for Solid State Physics, University of Tokyo, Kashiwanoha 5-1-5, Kashiwa-shi, Chiba 277-8581, Japan
[email protected]
Abstract. We investigate a theoretical framework to solve partial differential equations by using stochastic processes, e.g., chemical reaction systems. The framework is based on the 'duality' concept in stochastic processes, which has been widely studied and used in mathematics and physics. Using the duality concept, a partial differential equation is connected to a stochastic process via a duality function. Without solving the partial differential equation, information about it can be obtained from a solution of the stochastic process. From the viewpoint of unconventional computation, one may say that the stochastic process can solve the partial differential equation. An algebraic method to derive dual processes is explained, and two examples of partial differential equations are shown.

Keywords: duality, partial differential equation, chemical reaction system, stochastic process.
1 Introduction
Chemical reaction systems are ubiquitous in the real world. Many reaction systems organize living matter. In response to environmental changes, many signal cascades, signal transformations, protein productions, and so on occur to adapt to the new environment. In this sense, one may view these reaction systems as information-processing systems in living matter. In addition, prey-predator relations in ecological systems are expressed mathematically in a way similar to chemical reaction systems. Hence, a mathematical scheme for chemical reaction systems has wide applications, and it would be beneficial to reveal the mathematical characteristics of reaction systems.

In order to investigate chemical reaction systems, a rate-equation approach is sometimes used to describe their dynamics [1]. For example, it is known that the Lotka-Volterra system is effectively described by simultaneous differential equations and shows oscillating behavior. However, the description based on the rate-equation approach rests on the assumption that the system size is large enough. If the system size is not large, stochasticity becomes important, and we need a stochastic description of the dynamics of the reaction systems. Indeed, it has been revealed that fluctuations play important roles in living matter [2]. Hence, one may use master equations, which

C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 105–114, 2010. © Springer-Verlag Berlin Heidelberg 2010
describe the stochastic behavior of the system adequately, or the Gillespie algorithm [1], which can compute the dynamics exactly by numerical simulation. In addition, stochastic differential equations and Fokker-Planck equations are widely used [3]. The Fokker-Planck equations are partial differential equations derived from the original master equations by the Kramers-Moyal expansion or van Kampen's system-size expansion. In this sense, the dynamics of the reaction systems are approximately connected to partial differential equations. Although one may consider that observing the reaction systems corresponds to solving the partial differential equations, the correspondence between the reaction systems and the Fokker-Planck equations is only an approximation; if the system size is small, the description based on the Fokker-Planck equations becomes inadequate.

The aim of the present paper is not to solve the dynamics of chemical systems, but to use the dynamics to solve different problems. That is, we show a theoretical framework to solve a partial differential equation using a stochastic process, i.e., a reaction system. The framework is based on the 'duality' concept in stochastic processes [4], which is an exact correspondence between a master equation and a partial differential equation. (Note that the correspondence between a master equation and a Fokker-Planck equation is an approximation.) The duality concept has been widely studied and used in mathematics and physics; for example, see [4,5,6,7,8,9,10,11,12,13]. Using a solution of the master equation, it is possible to obtain m-th-moment-like quantities of the partial differential equation, without solving the partial differential equation directly. In this sense, one may say that the reaction system can solve the partial differential equation.
Generally, a master equation has a dual process described by a partial differential equation, and an algebraic method to derive a partial differential equation from a master equation has been proposed recently [14]. We will show that these schemes can be combined into a theoretical framework for solving a partial differential equation using a stochastic process.

The present paper is organized as follows. In Sec. 2, we explain our main results, i.e., a calculation scheme to solve partial differential equations using stochastic processes. Section 3 gives the algebraic method to derive a partial differential equation from a master equation. In Sec. 4, two examples of reaction systems and corresponding partial differential equations are shown. Finally, we discuss the applicability of our theoretical framework in Sec. 5.
2 Main Results: Calculation Scheme
We first explain our main results, i.e., a calculation scheme to solve a partial differential equation via a stochastic process. We will show the scheme using a concrete example. In Sec. 3, the theoretical framework will be discussed. Our aim here is to obtain information about a partial differential equation from a solution of a stochastic process, without solving the partial differential equation. From the viewpoint of unconventional computation, one may say that we can solve the partial differential equation using natural objects, i.e., chemical reactions.
Firstly, we consider the following chemical reaction system:

  A → A + A   at rate γ,
  A + A → A   at rate σ.   (1)
The probability of having n particles of A at time t is denoted by P(n, t). The chemical reaction system has a corresponding partial differential equation:

  ∂φ(z, t)/∂t = −(∂/∂z)[γ z(z − 1) φ(z, t)] + (1/2)(∂²/∂z²)[σ (1 − z) z φ(z, t)].   (2)

Note that (2) is not a Fokker-Planck equation for the stochastic process in (1). The method to obtain the partial differential equation from the reaction system will be explained in Secs. 3 and 4. As discussed there, we can show that the stochastic process and the partial differential equation are connected via

  ∫_{−∞}^{∞} dz φ(z, t) Σ_{n=0}^{∞} P(n, 0) z^n = ∫_{−∞}^{∞} dz φ(z, 0) Σ_{n=0}^{∞} P(n, t) z^n.   (3)

This relation is called the duality in stochastic processes. If the following normalization constant can be defined adequately,

  Z(t) = ∫_{−∞}^{∞} dz φ(z, t) < ∞,   (4)

then the following quantity φ̃(z, t) can be considered a probability distribution:

  φ̃(z, t) ≡ φ(z, t) / Z(t).   (5)

Hence, we here define m-th-moment-like quantities as follows:

  z(t)^m ≡ ∫_{−∞}^{∞} dz φ(z, t) z^m.   (6)

The quantities z(t)^m include information about the solution φ(z, t) of the partial differential equation (2). Finally, using the duality relation (3), it is possible to show that the quantities z(t)^m are obtained from the solution of the stochastic process P(n, t). If we set the initial state of the stochastic process as P(n, 0) = δ_{n,m}, where δ_{n,m} is the Kronecker delta, the quantities z(t)^m of the partial differential equation are calculated as follows:

  z(t)^m = ∫_{−∞}^{∞} dz φ(z, t) z^m = ∫_{−∞}^{∞} dz φ(z, 0) Σ_{n=0}^{∞} P(n, t) z^n.   (7)

According to the above discussion, without using the solution φ(z, t), the quantities z(t)^m are calculated from P(n, t) for an arbitrary initial state of the partial differential equation, φ(z, 0). Figure 1 shows the calculation scheme. Using the reaction system as the 'tool', the information about the partial differential equation is obtained. We summarize the calculation scheme as follows:
Fig. 1. Calculation scheme. Although the quantities z(t)^m should in principle be calculated from the solution of a partial differential equation, they are evaluated from a solution of a stochastic process using the duality relation, without solving the partial differential equation directly.
Calculation Scheme
1. In the reaction system, set m particles as the initial state.
2. Using the time development of the reaction system, evaluate its probability distribution P(n, t).
3. Using (7), calculate z(t)^m for an arbitrary initial condition φ(z, 0) of the partial differential equation.

We note again that the partial differential equation (2) is not a Fokker-Planck equation for the reaction system (1). It is, therefore, possible to obtain the information about (2) not approximately, but exactly, from the solution of the reaction system. If we consider small systems, as explained in Sec. 1, the discrete nature of reaction systems must be taken seriously. If we can observe the discrete probability distribution P(n, t), it can be used to investigate partial differential equations. Theoretically, the scheme connects a master equation and a partial differential equation exactly. However, there would be many difficulties in using the scheme in realistic situations. One of the problems is how to find the partial differential equation; an algebraic method to derive it is explained in the following sections. Some of the other problems will be discussed in Sec. 5.
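The three steps above can be sketched numerically. The hedged Python example below estimates P(n, t) for the reaction system A → 2A (rate γ), 2A → A (rate σ) by Gillespie simulation starting from m particles, and evaluates z(t)^m for the simple choice φ(z, 0) = δ(z − z₀), for which (7) reduces to z(t)^m = Σ_n P(n, t) z₀^n = E[z₀^{n_t}]. All parameter values are illustrative assumptions, not values from the paper:

```python
import random

def gillespie(n0, gamma, sigma, t_end, rng):
    """Simulate A -> 2A (rate gamma*n) and 2A -> A (rate sigma*n(n-1)/2)."""
    n, t = n0, 0.0
    while True:
        r_birth = gamma * n
        r_death = sigma * n * (n - 1) / 2.0
        total = r_birth + r_death
        if total == 0.0:
            return n                      # absorbing state
        t += rng.expovariate(total)       # waiting time to next event
        if t > t_end:
            return n
        if rng.random() * total < r_birth:
            n += 1
        else:
            n -= 1

def moment_like(m, gamma, sigma, t, z0, runs=2000, seed=1):
    # For phi(z,0) = delta(z - z0), eq. (7) gives z(t)^m = E[z0^(n_t)]
    # with the stochastic process started from n_0 = m particles.
    rng = random.Random(seed)
    return sum(z0 ** gillespie(m, gamma, sigma, t, rng) for _ in range(runs)) / runs

val = moment_like(m=2, gamma=1.0, sigma=0.5, t=0.5, z0=0.9)
assert 0.0 < val < 1.0
```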
3 Derivation of Dual Process

3.1 Definition of Duality in Stochastic Processes
We briefly review the general discussion of duality; details are given in [4]. Usually, duality in stochastic processes connects two stochastic processes via a duality function. Suppose that (n_t)_{t≥0} and (z_t)_{t≥0} are continuous-time Markov processes on state spaces Ω and Ω_dual, respectively. Here, we denote by E_{n_0} the expectation given that the process (n_t)_{t≥0} starts from n_0; the expectation
in the process (z_t)_{t≥0} starting from z_0 is denoted by E^{dual}_{z_0}. The process (n_t)_{t≥0} is said to be dual to (z_t)_{t≥0} with respect to a duality function D : Ω × Ω_dual → ℝ if for all n ∈ Ω, z ∈ Ω_dual and t ≥ 0 we have

  E_{n_0} D(z_0, n_t) = E^{dual}_{z_0} D(z_t, n_0).   (8)
Although the above duality denotes a dual relation between two stochastic processes, the duality relation can also be seen as a connection between a stochastic process with a discrete state space and a partial differential equation, as explained later.

3.2 Generating Function Approach and Algebraic Method
We here treat only the simple case with one variable, but the extension to the multivariate case is straightforward. A stochastic process for a chemical reaction system is generally described by a birth-death process, and its time evolution obeys a master equation. It is sometimes useful to use a generating function instead of the original master equation [3]. A generating function G(x, t) for the probability P(n, t) is defined as

  G(x, t) = Σ_{n=0}^{∞} P(n, t) x^n,   (9)

where n ∈ ℕ, x ∈ ℝ, and P(n, t) is the probability of having n particles at time t. The time evolution equation for G(x, t) is written as

  ∂G(x, t)/∂t = L(x, ∂/∂x) G(x, t),   (10)

where L(x, ∂/∂x) is a linear operator constructed from the original master equation. It is known that an algebraic scheme is sometimes useful for analyzing the stochastic process. The algebraic scheme is called the Doi-Peliti method [15,16,17], the field-theoretic method, the second-quantization method, and so on [18]. The Doi-Peliti method is equivalent to the generating function approach, and we here employ these two methods to derive the duality relation. Starting from the generating function approach, two operators in the Doi-Peliti method are introduced as follows:
  a† ≡ x,   a ≡ ∂/∂x.   (11)

a† is called a creation operator, and a an annihilation operator. These operators satisfy the commutation relations

  [a, a†] = 1,   [a, a] = [a†, a†] = 0,   (12)

where [A, B] ≡ AB − BA. Each operator acts on a vector |n⟩ in Fock space as follows:

  a†|n⟩ = |n + 1⟩,   a|n⟩ = n|n − 1⟩.   (13)
The vacuum state |0⟩ is characterized by a|0⟩ = 0. It is possible to interpret the state |n⟩ as a state with n particles. Hence, the creation operator adds one particle to the system; the annihilation operator removes one particle from the system and multiplies the state vector by n. The inner product of a bra state ⟨m| and a ket state |n⟩ is defined as

  ⟨m|n⟩ = δ_{m,n} n!.   (14)

Consider the following construction for ket and bra states in the Doi-Peliti method:

  |n⟩ ≡ x^n,   ⟨m| ≡ ∫ dx δ(x) (∂/∂x)^m (·).   (15)

Using the definition of the operators (11), it is easy to see that all properties of the Doi-Peliti method are recovered using x and ∂/∂x. Hence, when we define a time-dependent state |ψ(t)⟩ as

  |ψ(t)⟩ = Σ_{n=0}^{∞} P(n, t)|n⟩,   (16)
we obtain the time evolution of the state |ψ(t)⟩ as

  ∂|ψ(t)⟩/∂t = L(a†, a)|ψ(t)⟩,   (17)

where the linear operator L(a†, a) in (17) is obtained by replacing x and ∂/∂x of L(x, ∂/∂x) in (10) with a† and a, respectively.

3.3 Differential Equation Obtained from Duality
The derivation of the dual process and duality function has been given in [14]. We here briefly review the derivation. A bra state ⟨φ(t)| is defined as

  ⟨φ(t)| ≡ ∫_{−∞}^{∞} dz φ(z, t) ⟨z|,   (18)

where ⟨z| is a coherent state of a† [19]:

  ⟨z| ≡ ⟨0| e^{za},   (19)

which satisfies

  ⟨z| a† = z ⟨z|,   (20)

and z is assumed to be a real variable.
As shown in [14], we can show that

  ⟨φ(t)| L(x, ∂/∂x) = ∫_{−∞}^{∞} dz [L*(z, ∂/∂z) φ(z, t)] ⟨z|,   (21)

where L(z, ∂/∂z) is obtained by simply replacing x and ∂/∂x with z and ∂/∂z, respectively; L*(z, ∂/∂z) is the adjoint operator of L(z, ∂/∂z). While the time evolution operator L(x, ∂/∂x) changes |ψ(0)⟩ to |ψ(t)⟩, (21) means that the time evolution operator L*(z, ∂/∂z) acts on φ(z, t). Hence, we can consider the following time evolution equation for φ(z, t):

  ∂φ(z, t)/∂t = L*(z, ∂/∂z) φ(z, t).   (22)

The above discussion indicates that, instead of the time evolution of |ψ⟩, we can use the time evolution of φ(z, t) to evaluate the quantity ⟨φ(0)|ψ(t)⟩, i.e.,

  ⟨φ(t)|ψ(0)⟩ = ⟨φ(0)|ψ(t)⟩.   (23)

The above equation indicates the duality relation between φ and ψ. Using the definition of |ψ(t)⟩ (i.e., (16)) and noticing |n⟩ = (a†)^n |0⟩, we finally obtain (3).
4 Examples

4.1 Reaction System with One Component
As a simple example, we here consider a reaction system with only one component, i.e., the example considered in Sec. 2. For readability, we write the reaction scheme again:

  A → A + A   at rate γ,
  A + A → A   at rate σ.

The master equation for the reaction system is written as

  ∂P(n, t)/∂t = γ(n − 1)P(n − 1, t) − γ n P(n, t)
                − σ n(n − 1)/2 · P(n, t) + σ (n + 1)n/2 · P(n + 1, t),   (24)

where n is the number of particles of component A. It is easy to confirm that the following linear operator in the Doi-Peliti method recovers the master equation (24):

  L = γ(a† − 1) a† a + (σ/2)(1 − a†) a† a².   (25)

According to the discussion in Sec. 3, the adjoint operator L* is constructed as follows:

  L* = −(∂/∂z)[γ z(z − 1)] + (1/2)(∂²/∂z²)[σ(1 − z) z].   (26)

Hence, it is possible to obtain information about the partial differential equation (2) from the chemical reaction system (1).
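The master equation (24) can also be integrated directly on a truncated state space {0, …, N}. The hedged Python sketch below uses forward Euler starting from P(n, 0) = δ_{n,m}; the truncation size N, the rates, and the step size are illustrative assumptions (chosen so that probability mass stays well below the truncation boundary):

```python
def step(P, gamma, sigma, dt):
    """One forward-Euler step of the master equation (24) on states 0..N."""
    N = len(P) - 1
    Q = P[:]
    for n in range(N + 1):
        dP = -gamma * n * P[n] - sigma * n * (n - 1) / 2.0 * P[n]   # outflow
        if n >= 1:
            dP += gamma * (n - 1) * P[n - 1]                         # birth inflow
        if n + 1 <= N:
            dP += sigma * (n + 1) * n / 2.0 * P[n + 1]               # death inflow
        Q[n] = P[n] + dt * dP
    return Q

N, m = 60, 2
P = [0.0] * (N + 1)
P[m] = 1.0                      # P(n, 0) = delta_{n,m}
for _ in range(1000):           # integrate to t = 1
    P = step(P, gamma=1.0, sigma=0.5, dt=0.001)

# Probability should be conserved up to truncation/discretization error,
# and all entries should remain nonnegative for this step size.
assert abs(sum(P) - 1.0) < 1e-3
assert all(p >= 0.0 for p in P)
```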
4.2 Reaction System with Two Components
We next treat a slightly more complicated case, i.e., a reaction system with two components:

  A → 2A       at rate c1,
  A + B → 2B   at rate c2,
  B → ∅        at rate c3.   (27)
The master equation of the system is

  ∂P(n_A, n_B, t)/∂t = c1 (n_A − 1) P(n_A − 1, n_B, t) − c1 n_A P(n_A, n_B, t)
    + c2 (n_A + 1)(n_B − 1) P(n_A + 1, n_B − 1, t) − c2 n_A n_B P(n_A, n_B, t)
    + c3 (n_B + 1) P(n_A, n_B + 1, t) − c3 n_B P(n_A, n_B, t),   (28)
where n_A and n_B are the numbers of components A and B, respectively. The linear operator L in the Doi-Peliti method is

  L = c1 (a†_A a†_A a_A − a†_A a_A) + c2 (a_A a†_B a†_B a_B − a†_A a_A a†_B a_B) + c3 (a_B − a†_B a_B),   (29)

where a†_A and a_A (or a†_B and a_B) are the creation and annihilation operators for component A (or B), respectively. We finally obtain the following adjoint linear operator,

  L* = −c1 (∂/∂z_A)[z_A(z_A − 1)] − c2 (∂²/(∂z_A ∂z_B))[z_A z_B + z_B²] + c3 (∂/∂z_B)[1 + z_B],   (30)

and the corresponding partial differential equation is

  ∂φ(z_A, z_B, t)/∂t = −c1 (∂/∂z_A)[z_A(z_A − 1) φ(z_A, z_B, t)]
    − c2 (∂²/(∂z_A ∂z_B))[z_B(z_A + z_B) φ(z_A, z_B, t)]
    + c3 (∂/∂z_B)[(1 + z_B) φ(z_A, z_B, t)].   (31)
Hence, the information about the partial differential equation (31) is obtained by using the solution of the reaction system (27).
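The stochastic side of scheme (27) is again straightforward to simulate. A hedged Python sketch of Gillespie dynamics for A → 2A (c1), A + B → 2B (c2), B → ∅ (c3); the rate constants, initial counts, and time horizon are illustrative, not values from the paper:

```python
import random

def gillespie2(nA, nB, c1, c2, c3, t_end, rng):
    """Simulate reaction scheme (27) up to time t_end; return (nA, nB)."""
    t = 0.0
    while True:
        rates = [c1 * nA, c2 * nA * nB, c3 * nB]
        total = sum(rates)
        if total == 0.0:
            return nA, nB                 # no reaction can fire
        t += rng.expovariate(total)
        if t > t_end:
            return nA, nB
        u = rng.random() * total
        if u < rates[0]:                  # A -> 2A
            nA += 1
        elif u < rates[0] + rates[1]:     # A + B -> 2B
            nA -= 1
            nB += 1
        else:                             # B -> empty
            nB -= 1

rng = random.Random(7)
nA, nB = gillespie2(10, 5, c1=1.0, c2=0.05, c3=1.0, t_end=1.0, rng=rng)
assert nA >= 0 and nB >= 0
```

Averaging a dual-variable observable z_A^{n_A} z_B^{n_B} over many such runs would play the same role here as E[z₀^{n_t}] did in the one-component example.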
5 Discussions
In the present paper, we proposed a theoretical framework to solve a partial differential equation by chemical reaction systems, i.e., stochastic processes. The framework is based on the duality concept in stochastic processes. Extending the
duality concept, it is possible to evaluate the m-th-moment-like quantities z(t)^m, which include information about the solution of the partial differential equation. In order to find the partial differential equation corresponding to a reaction system, the algebraic method is available.

As discussed in Sec. 1, the discrete property would be important in a wide range of systems, from living matter and nanotechnology to unconventional computation in small systems. In such small systems, a continuous description of reaction systems becomes inadequate, and we need an appropriate description of the discreteness. One application of such discreteness was shown in the present paper, with the aid of the duality concept from physics and mathematics.

Of course, our proposition is only a theoretical one; there may be difficulties in using the framework in experiments. For example, it might be difficult to set specific initial states; it is necessary to prepare the initial state with m particles to evaluate the m-th-moment-like quantity z(t)^m. In addition, accurate observation of the probability distribution might be difficult in usual chemical reaction systems; we may have to construct more tractable systems of small size. These difficulties raise new challenging problems for experiments operating such small systems.

From a theoretical point of view, there is at least one open problem: we have not obtained a general scheme to derive a chemical system from a given partial differential equation, although the reverse procedure has been clarified. Due to the construction of the algebraic method used to derive the duality, one may expect that only a limited class of partial differential equations can be solved using this scheme. It is also unclear whether there are other duality-like schemes that can solve arbitrary partial differential equations. Compared with large systems, small systems would work efficiently from the viewpoint of energy consumption.
In this sense, we hope that new computation concepts inspired by small systems or stochasticity will be important and suggestive for future research in unconventional computation.

Acknowledgments. This work was supported in part by grants-in-aid for scientific research (Grants No. 20115009 and No. 21740283) from the Ministry of Education, Culture, Sports, Science and Technology, Japan.
References

1. Wilkinson, D.J.: Stochastic Modelling for Systems Biology. CRC Press, London (2006)
2. Rao, C.V., Wolf, D.M., Arkin, A.P.: Control, exploitation and tolerance of intracellular noise. Nature (London) 420, 231 (2002)
3. Gardiner, C.: Stochastic Methods, 4th edn. Springer, Berlin (2004)
4. Liggett, T.M.: Interacting Particle Systems (Classics in Mathematics). Reprint of the 1985 edition. Springer, Berlin (2005)
5. Kipnis, C., Marchioro, C., Presutti, E.: Heat Flow in an Exactly Solvable Model. J. Stat. Phys. 27, 65 (1982)
6. Spohn, H.: Long range correlations for stochastic lattice gases in a non-equilibrium steady state. J. Phys. A: Math. Gen. 16, 4275 (1983)
7. Shiga, T., Uchiyama, K.: Stationary States and their Stability of the Stepping Stone Model Involving Mutation and Selection. Probab. Th. Rel. Fields 73, 87 (1986)
8. Schütz, G., Sandow, S.: Non-Abelian symmetries of stochastic processes: Derivation of correlation functions for random-vertex models and disordered-interacting-particle systems. Phys. Rev. E 49, 2726 (1994)
9. Schütz, G.M.: Duality Relations for Asymmetric Exclusion Processes. J. Stat. Phys. 86, 1265 (1997)
10. Möhle, M.: The concept of duality and applications to Markov processes arising in neutral population genetics models. Bernoulli 5, 761 (1999)
11. Doering, C.R., Mueller, C., Smereka, P.: Interacting particles, the stochastic Fisher-Kolmogorov-Petrovsky-Piscounov equation, and duality. Physica A 325, 243 (2003)
12. Giardinà, C., Kurchan, J., Redig, F.: Duality and exact correlations for a model of heat conduction. J. Math. Phys. 48, 033301 (2007)
13. Giardinà, C., Kurchan, J., Redig, F., Vafayi, K.: Duality and Hidden Symmetries in Interacting Particle Systems. J. Stat. Phys. 135, 25 (2009)
14. Ohkubo, J.: Duality in interacting particle systems and boson representation. To appear in J. Stat. Phys., arXiv:0909.5290
15. Doi, M.: Second quantization representation for classical many particle system. J. Phys. A: Math. Gen. 9, 1465 (1976)
16. Doi, M.: Stochastic theory of diffusion-controlled reaction. J. Phys. A: Math. Gen. 9, 1479 (1976)
17. Peliti, L.: Path integral approach to birth-death processes on a lattice. J. Physique 46, 1469 (1985)
18. Täuber, U.C., Howard, M., Vollmayr-Lee, B.P.: Applications of field-theoretic renormalization group methods to reaction-diffusion problems. J. Phys. A: Math. Gen. 38, R79 (2005)
19. Perelomov, A.: Generalized Coherent States and Their Applications. Springer, Berlin (1986)
Postselection Finite Quantum Automata

Oksana Scegulnaja-Dubrovska, Lelde Lāce, and Rūsiņš Freivalds

Department of Computer Science, University of Latvia, Raiņa bulvāris 29, Riga, Latvia
Abstract. Postselection for quantum computing devices was introduced by S. Aaronson [2] as an excitingly efficient tool to solve long-standing problems of computational complexity related to classical computing devices only. This was a surprising usage of the notions of quantum computation. We introduce Aaronson-type postselection in quantum finite automata. There are several nonequivalent definitions of quantum finite automata. Nearly all of them recognize only regular languages, but not all regular languages. We prove that PALINDROMES can be recognized by MM-quantum finite automata with postselection. At first we prove by a direct construction that the complement of this language can be recognized in this way. This result distinguishes quantum automata from probabilistic automata, because probabilistic finite automata with non-isolated cut-point 0 can recognize only regular languages, but PALINDROMES is not a regular language.
1 Introduction
S. Aaronson [2] introduced an interesting notion of postselection for quantum computing devices. It is clear from the very beginning that they can never be implemented, because they contradict the laws of Quantum Mechanics. However, this notion appears to be extremely useful for proving properties of existing types of algorithms and machines. The definition of postselection by S. Aaronson [2] cannot be used for finite automata directly, because his construction needs an unlimited amount of memory.

Definition 1. A postselection quantum finite automaton is a quantum finite automaton (MO- or MM-quantum automaton) with a set of states called the postselection set of states and a special state q+. At the very end of the work of the automaton, when the end-marker is already read but before the measurement of accepting and rejecting states, the amplitudes of all the states outside the postselection set are mechanically made equal to zero. If at least one of the postselection amplitudes is nonzero, then the amplitudes of all the postselection states are normalized, i.e., multiplied by a positive real number such that as a result of this
The research was supported by Grant No. 09.1570 from the Latvian Council of Science and by Project 2009/0216/1DP/1.1.2.1.2/09/IPIA/VIA/004 from the European Social Fund.
C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 115–126, 2010. c Springer-Verlag Berlin Heidelberg 2010
O. Scegulnaja-Dubrovska, L. Lāce, and R. Freivalds
normalization the total of the squares of the moduli of the amplitudes of the postselection states equals 1. If at the moment of postselection all the amplitudes of the postselection states are equal to zero, then these amplitudes stay equal to zero, but the state q+ gets amplitude 1. This way, as the result of the postselection, the total of the squares of the moduli of the amplitudes of all the states equals 1.

Postselection is the power of discarding all runs of a computation in which a given event does not occur. To illustrate, suppose we are given a Boolean formula in a large number of variables, and we wish to find a setting of the variables that makes the formula true. Provided such a setting exists, this problem is easy to solve using postselection: we simply set the variables randomly, then postselect on the formula being true. This paper studies the power of postselection in a quantum computing context. S. Aaronson [2] defines a new complexity class called PostBQP (postselected bounded-error quantum polynomial-time), which consists of all problems solvable by a quantum computer in polynomial time, given the ability to postselect on a measurement yielding a specific outcome. The main result is that PostBQP equals the well-known classical complexity class PP (probabilistic polynomial-time). Here PP is the class of problems for which there exists a probabilistic polynomial-time Turing machine that accepts with probability greater than 1/2 if and only if the answer is yes. For example, given a Boolean formula, a PP machine can decide whether the majority of settings to the variables make the formula true. Indeed, this problem turns out to be PP-complete (that is, among the hardest problems in PP). S. Aaronson himself describes his aim as follows: "The motivation for the PostBQP = PP result comes from two quite different sources.
The original motivation was to analyse the computational power of fantasy versions of quantum mechanics, and thereby gain insight into why quantum mechanics is the way it is. In particular, Section 4 will show that if we changed the measurement probability rule from |ψ|^2 to |ψ|^p for some p ≠ 2, or allowed linear but non-unitary evolution, then we could simulate postselection, and thereby solve PP-complete problems in polynomial time. If we consider such an ability extravagant, then we might take these results as helping to explain why quantum mechanics is unitary, and why the measurement rule is |ψ|^2. A related motivation comes from an idea that might be called anthropic computing: arranging things so that we are more likely to exist if a computer produces a desired output than if it does not. As a simple example, under the many-worlds interpretation of quantum mechanics, we might kill ourselves in all universes where a computer fails! My result implies that, using this technique, we could solve not only NP-complete problems efficiently, but PP-complete problems as well. However, the PostBQP = PP result also has a more unexpected implication. One reason to study quantum computing is to gain a new, more general perspective on classical computer science. By analogy, many famous results in computer science involve only deterministic computation, yet it is hard to imagine how anyone could have proved these results had researchers not long ago
Postselection Finite Quantum Automata
taken aboard the notion of randomness. Likewise, taking quantum mechanics aboard has already led to some new results about classical computation [1,5,20]. What this paper will show is that, even when classical results are already known, quantum computing can sometimes provide new and simpler proofs for them."
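As a concrete reading of Definition 1, the renormalization step can be sketched in a few lines of Python. This is only an illustrative classical calculation on an amplitude vector, with hypothetical function and state names; it is not a physical procedure (which, as noted above, cannot exist).

```python
import math

def postselect(amplitudes, post_set, q_plus):
    """Zero out every amplitude outside the postselection set, then
    renormalize, following Definition 1.  `amplitudes` maps state name ->
    amplitude, `post_set` is the postselection set of states, and `q_plus`
    is the special state that receives amplitude 1 when all postselection
    amplitudes are zero.  All names here are illustrative."""
    kept = {q: (a if q in post_set else 0.0) for q, a in amplitudes.items()}
    norm = math.sqrt(sum(abs(a) ** 2 for a in kept.values()))
    if norm == 0.0:
        # Every postselection amplitude is zero: q+ gets amplitude 1.
        kept = {q: 0.0 for q in kept}
        kept[q_plus] = 1.0
    else:
        # Multiply by a positive real so the squared moduli sum to 1.
        kept = {q: a / norm for q, a in kept.items()}
    return kept
```

After the call, the total of the squares of the moduli of all amplitudes equals 1 in both branches, exactly as the definition requires.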
2 Definitions
A quantum finite automaton (QFA) is a theoretical model of a quantum computer with a finite memory. Compared with their classical (non-quantum) counterparts, QFAs have both strengths and weaknesses. The strength of QFAs is shown by the fact that quantum automata can be exponentially more space-efficient than deterministic or probabilistic automata [8]. The weakness of QFAs is caused by the fact that any quantum process has to be reversible (unitary). This makes quantum automata unable to recognize some regular languages. We start by reviewing the concept of a probabilistic finite state transducer. For a finite set X we denote by X∗ the set of all finite strings formed from X, including the empty string.

Definition 2. A probabilistic finite state transducer (pfst) is a tuple T = (Q, Σ1, Σ2, V, f, q0, Qacc, Qrej), where Q is a finite set of states, Σ1, Σ2 are the input and output alphabets, q0 ∈ Q is the initial state, and Qacc, Qrej ⊂ Q are (disjoint) sets of accepting and rejecting states, respectively. (The other states, forming the set Qnon, are called non-halting.) The transition function V : Σ1 × Q → Q is such that for all a ∈ Σ1 the matrix (Va)qp is stochastic, and fa : Q → Σ2∗ is the output function. If all matrix entries are either 0 or 1 the machine is called a deterministic finite state transducer (dfst). The meaning of this definition is that, being in state q and reading the input symbol a, the transducer prints fa(q) on the output tape and changes to state p with probability (Va)qp, moving the input and output heads to the right. After each such step, if the machine is found in a halting state, the computation stops, accepting or rejecting the input.

Definition 3. Let R ⊂ Σ1∗ × Σ2∗.
For α > 1/2 we say that T computes the relation R with probability α if for all v, whenever (v, w) ∈ R, then T(w|v) ≥ α, and whenever (v, w) ∉ R, then T(w|v) ≤ 1 − α. For 0 < α < 1 we say that T computes the relation R with isolated cut-point α if there exists ε > 0 such that for all v, whenever (v, w) ∈ R, then T(w|v) ≥ α + ε, but whenever (v, w) ∉ R, then T(w|v) ≤ α − ε. The following definition of quantum finite state transducers is modelled after the one for pfst [21]:
Definition 4. A quantum finite state transducer (qfst) is a tuple T = (Q, Σ1, Σ2, V, f, q0, Qacc, Qrej), where Q is a finite set of states, Σ1, Σ2 are the input and output alphabets, q0 ∈ Q is the initial state, and Qacc, Qrej ⊂ Q are (disjoint) sets of accepting and rejecting states, respectively. The transition function V : Σ1 × Q → Q is such that for all a ∈ Σ1 the matrix (Va)qp is unitary, and fa : Q → Σ2∗ is the output function. Probabilistic and quantum finite automata are the special cases of these transducers where the result can only be 0 or 1. Nothing needs to be added for the definition of probabilistic automata. However, the case of quantum automata is much more complicated.
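The probabilistic special case of Definition 2 can be illustrated by computing the acceptance probability of a word under stochastic transition matrices. The sketch below ignores the mid-word halting of Definition 2 for simplicity; the function name and the encoding of the matrices are illustrative choices, not notation from the paper.

```python
def pfa_accept_prob(matrices, word, n_states, init, acc):
    """Acceptance probability of a probabilistic finite automaton:
    `matrices[a][q][p]` is the probability of moving from state q to
    state p on letter a (each `matrices[a]` is a stochastic matrix)."""
    dist = [0.0] * n_states
    dist[init] = 1.0
    for a in word:
        M = matrices[a]
        # Propagate the probability distribution through one letter.
        dist = [sum(dist[q] * M[q][p] for q in range(n_states))
                for p in range(n_states)]
    return sum(dist[q] for q in acc)
```

For instance, with a two-state automaton that moves from state 0 to the absorbing accepting state 1 with probability 1/2 on each letter, the word "aa" is accepted with probability 3/4.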
3 Specifics of Quantum Finite Automata
Quantum finite automata (QFA) were introduced independently by Moore and Crutchfield [23] and by Kondacs and Watrous [21]. The definitions differ in a seemingly small detail. The first definition allows a measurement only at the very end of the computation process; hence the computation is performed on quantum information only. The second definition allows a measurement at every step of the computation. In the process of measurement the quantum information (or rather, a part of it) is transformed into classical information. The classical information is not processed in the subsequent steps of the computation; however, we add up the classical probabilities obtained during these many measurements. We will see below that this leads to unusual properties of the quantum automata and of the languages recognized by these automata. To distinguish these quantum automata, we call them, correspondingly, MO-QFA (measure-once) and MM-QFA (measure-many).

Definition 5. An MM-QFA is a tuple M = (Q; Σ; V; q0; Qacc; Qrej) where Q is a finite set of states, Σ is an input alphabet, V is a transition function, q0 ∈ Q is the starting state, and Qacc ⊆ Q and Qrej ⊆ Q are sets of accepting and rejecting states (Qacc ∩ Qrej = ∅). The states in Qacc and Qrej are called halting states and the states in Qnon = Q − (Qacc ∪ Qrej) are called non-halting states. κ and $ are symbols that do not belong to Σ; we use κ and $ as the left and the right endmarker, respectively. The working alphabet of M is Γ = Σ ∪ {κ; $}. The state of M can be any superposition of states in Q (i.e., any linear combination of them with complex coefficients). We use |q⟩ to denote the superposition consisting of state q only. l2(Q) denotes the linear space consisting of all superpositions, with the l2-distance on this linear space.
The transition function V is a mapping from Γ × l2(Q) to l2(Q) such that, for every a ∈ Γ, the function Va : l2(Q) → l2(Q) defined by Va(x) = V(a, x) is a unitary transformation (a linear transformation on l2(Q) that preserves the l2 norm). The computation of an MM-QFA starts in the superposition |q0⟩. Then the transformations corresponding to the left endmarker κ, the letters of the input word
Postselection Finite Quantum Automata
119
x and the right endmarker $ are applied. The transformation corresponding to a ∈ Γ consists of two steps.

1. First, Va is applied. The new superposition ψ′ is Va(ψ), where ψ is the superposition before this step.

2. Then, ψ′ is observed with respect to Eacc, Erej, Enon, where Eacc = span{|q⟩ : q ∈ Qacc}, Erej = span{|q⟩ : q ∈ Qrej}, Enon = span{|q⟩ : q ∈ Qnon}. It means that if the system's state before the measurement was

ψ = Σ_{qi ∈ Qacc} αi |qi⟩ + Σ_{qj ∈ Qrej} βj |qj⟩ + Σ_{qk ∈ Qnon} γk |qk⟩
then the measurement accepts ψ with probability Σ|αi|², rejects with probability Σ|βj|², and continues the computation (applies transformations corresponding to the next letters) with probability Σ|γk|², with the system having state ψ = Σ γk |qk⟩. We regard these two transformations as reading a letter a. We use Va′ to denote the transformation consisting of Va followed by the projection to Enon; this is the transformation mapping ψ to the non-halting part of Va(ψ). We use Vw′ to denote the product of transformations Vw′ = Van′ Van−1′ ... Va2′ Va1′, where ai is the i-th letter of the word w. We also use ψy to denote the non-halting part of the QFA's state after reading the left endmarker κ and the word y ∈ Σ∗. From the notation it follows that ψw = Vκw′(|q0⟩). We say that an automaton recognizes a language L with probability p (p > 1/2) if it accepts every word x ∈ L with probability > p and accepts every word x ∉ L with probability ≤ p. The MO-QFA differ from MM-QFA only in the additional requirement that non-zero amplitudes can be obtained by the accepting and rejecting states no earlier than on reading the end-marker of the input word. A probability distribution {(pi, φi) | 1 ≤ i ≤ k} on pure states {φi}, with probabilities 0 ≤ pi ≤ 1 and Σ_{i=1}^{k} pi = 1, is called a mixed state or mixture. A quantum finite automaton with mixed states is a tuple (Q, Σ, φinit, {Tδ}, Qa, Qr, Qnon), where Q is a finite set of states, Σ is an input alphabet, φinit is an initial mixed state, {Tδ} is a set of quantum transformations, which consists of a defined sequence of measurements and unitary transformations, and Qa ⊆ Q, Qr ⊆ Q and Qnon ⊆ Q are the sets of accepting, rejecting and non-halting states.

Comment 1. For quantum finite automata the term rejection is misleading. One can imagine that if an input word is accepted with a probability p then this word is rejected with probability 1 − p. Instead the reader should imagine that the only possible result of our automata is acceptance.
The counterpart of our notion in recursive function theory is recursive enumerability, not recursivity. For probabilistic automata all the results by M. O. Rabin [24] are valid for both possible definitions, but for quantum automata the difference is striking. Sometimes even MO-QFA can be size-efficient compared with classical FA.
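The MM-QFA evolution described above (apply Va, then measure with respect to Eacc, Erej, Enon, accumulating the classical probabilities) can be sketched as a classical simulation. This is an illustrative reading of Definition 5, not an implementation from the paper; note that the non-halting part of the state is deliberately left unnormalized, so that the accept and reject probabilities of successive steps simply add up.

```python
def mm_qfa_step(psi, U, acc, rej):
    """One MM-QFA step: apply the unitary U to the superposition psi
    (a list of amplitudes), then measure with respect to E_acc, E_rej,
    E_non.  Returns this step's accepting and rejecting probabilities
    and the (unnormalized) non-halting part of the state."""
    n = len(psi)
    # Apply the unitary: new amplitude p is sum_q U[p][q] * psi[q].
    psi = [sum(U[p][q] * psi[q] for q in range(n)) for p in range(n)]
    p_acc = sum(abs(psi[q]) ** 2 for q in acc)
    p_rej = sum(abs(psi[q]) ** 2 for q in rej)
    # Project onto E_non: zero out the halting states, do not renormalize.
    psi = [0.0 if (q in acc or q in rej) else psi[q] for q in range(n)]
    return p_acc, p_rej, psi
```

For example, applying the Hadamard operation to |q0⟩ with q1 accepting yields accepting probability 1/2 and the unnormalized non-halting state (1/√2)|q0⟩.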
Theorem 1. [8]
1. For every prime p the language Lp = { the length of the input word is a multiple of p } can be recognized by an MO-QFA with no more than const · log p states.
2. For every p a deterministic FA recognizing Lp needs at least p states.
3. For every p a probabilistic FA with a bounded error recognizing Lp needs at least p states.

The first results on MM-quantum finite automata were obtained by Kondacs and Watrous [21]. They showed that the class of languages recognized by QFAs is a proper subset of the regular languages.

Theorem 2. [21]
1. All languages recognized by 1-way MM-QFAs are regular.
2. There is a regular language that cannot be recognized by a 1-way MM-QFA with probability 1/2 + ε for any ε > 0.

Brodsky and Pippenger [14] generalized the second part of Theorem 2 by showing that any language satisfying a certain property is not recognizable by an MM-QFA.

Theorem 3. [14] Let L be a language and M be its minimal automaton (the smallest DFA recognizing L). Assume that there is a word x such that M contains states q1, q2 satisfying:
1. q1 ≠ q2,
2. if M starts in the state q1 and reads x, it passes to q2,
3. if M starts in the state q2 and reads x, it passes to q2, and
4. there is a word y such that if M starts in q2 and reads y, it passes to q1,
then L cannot be recognized by any 1-way quantum finite automaton (Fig. 1).

Fig. 1. Conditions of Theorem 3: x leads from q1 to q2 and from q2 to itself, and y leads from q2 back to q1
Theorem 4. [10] The class of languages recognizable by a MM-QFA is not closed under union. Corollary 1. [10] The class of languages recognizable by a MM-QFA is not closed under any binary boolean operation where both arguments are significant. Another direction of research is studying the accepting probabilities of QFAs.
Theorem 5. [8] The language a∗b∗ is recognizable by an MM-QFA with probability 0.68... but not with probability 7/9 + ε for any ε > 0.

This shows that the classes of languages recognizable with different probabilities are different. The next results in this direction were inspired by [9], where the probabilities with which the languages a1∗ ... an∗ can be recognized are studied. There are also many results about the number of states needed for a QFA to recognize different languages. In some cases, it can be exponentially smaller than for deterministic or even for probabilistic automata [8]. In other cases, it can be exponentially bigger than for deterministic automata [11]. Summarizing these results we can see that, in spite of the seeming naturalness of the notion of MM-quantum finite automata with isolated cut-point, this class of recognizable languages has rather specific properties. On the other hand, there have been many results on probabilistic and quantum algorithms working with non-isolated cut-point and on relations between recognition of languages with isolated and non-isolated cut-point [4]. However, it needs to be added that most of these papers, when describing quantum automata, restrict themselves to MO-quantum automata. MM-quantum automata are the most popular ones among the papers studying recognition with isolated cut-point, and MO-quantum automata are the most popular ones among the papers studying recognition with non-isolated cut-point.
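The single-qubit rotation underlying Theorem 1 can be illustrated numerically: each input letter rotates the qubit by the angle 2π/p, and at the end the starting state is measured as accepting, so a word of length n is accepted with probability cos²(2πn/p), which equals 1 exactly when the odd prime p divides n. The const · log p bound of Theorem 1 is obtained by combining O(log p) such rotation automata with different angles; the sketch below shows only the one-rotation building block, under that stated simplification.

```python
import math

def rotation_qfa_accept_prob(n, p):
    """Acceptance probability of a single 'rotate by 2*pi/p' MO-QFA on a
    word of length n: the qubit starts in |q0>, each letter rotates it by
    2*pi/p, and at the end |q0> is measured as accepting."""
    angle = 2 * math.pi * n / p
    return math.cos(angle) ** 2
```

Words whose length is a multiple of p are accepted with probability 1; all other words are accepted with probability strictly less than 1, giving the cut-point separation from which the bounded-error construction is built.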
4 Co-PALINDROMES Can Be Recognized by Finite Automata
There exist nonregular languages recognizable by probabilistic finite automata with non-isolated cut-point (they are called stochastic languages) and languages recognizable by quantum finite automata with non-isolated cut-point. Since MO-quantum finite automata differ from MM-quantum finite automata, it is possible that these classes are different as well. However, most natural problems on these automata are still open. We concentrate here on a very special subclass of these languages, namely, on classes of languages recognizable with cut-point 0. In the case of probabilistic recognition this is not an interesting notion, because in this case the input word is declared accepted if the probability of acceptance exceeds 0, and it is declared rejected if the probability of acceptance equals 0. It is obvious that such automata are equivalent to nondeterministic automata, but nondeterministic finite automata recognize the same regular languages as deterministic automata do. The case of quantum finite automata is different. We consider the language PALINDROMES, i.e. the language PALINDROMES = { x | x ∈ {0, 1}∗ and x = x^rev }.

Theorem 6. The complement of PALINDROMES is recognizable by 1-way MM-quantum finite automata with non-isolated cut-point 0.
Sketch of proof. We denote the real number 0.00···0x(1)x(2)x(3)···x(n) (with n zeros after the binary point) by 0.0{n}x(1)x(2)x(3)···x(n). The main idea is to maintain, at every moment of processing the input word x(1)x(2)x(3)···x(n), two special states of the automaton (say, q2 and q3) whose amplitudes are, respectively, k1(n) := 0.0{n}x(1)x(2)x(3)···x(n) and k2(n) := 0.0{n}x(n)···x(3)x(2)x(1). We have

k1(n+1) = 0.0{n+1}x(1)x(2)x(3)···x(n+1) = (0.0{n}x(1)x(2)x(3)···x(n)) × (1/2) + ε_{n+1},

where

ε_{n+1} = (1/2)^{2n+2} if x(n+1) = 1, and ε_{n+1} = 0 if x(n+1) = 0,

and ε_{n+1} = ε_n × (1/4). We also have

k2(n+1) = 0.0{n+1}x(n+1)···x(3)x(2)x(1) = (0.0{n}x(n)···x(3)x(2)x(1)) × (1/4) + δ_{n+1},

where

δ_{n+1} = (1/2)^{n+2} if x(n+1) = 1, and δ_{n+1} = 0 if x(n+1) = 0,

and δ_{n+1} = δ_n × (1/2). Two states (q4 and q5) are used to hold the amplitudes (1/2)^{2n} and (1/2)^{n+1}, respectively, in order to produce the current ε_n and δ_n. It is not possible to halve amplitudes unlimitedly in a quantum automaton, but we have an MM-quantum automaton, and we use the Hadamard operation

( 1/√2   1/√2 )
( 1/√2  −1/√2 )

instead, following the Hadamard operation by measuring part of the amplitude to REJECT. We consider below the part of the states (q3, q6, q5, q7, q8), among which the first one is q3 and the third one is q5. The rest of them are auxiliary states used to ensure that during the processing of the input symbol x(n) the amplitudes are changed from (k2(n), 0, (1/2)^n, 0, 0) to (k2(n+1), 0, (1/2)^{n+1}, 0, 0), where k2(n) = 0.0{n}x(n)···x(3)x(2)x(1).
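The recurrences for k1 and k2 in the sketch of proof can be checked numerically. The helper functions below (hypothetical names, not notation from the paper) compute 0.0{n}x(1)···x(n) and its reversed counterpart directly from a bit list.

```python
def k1(x):
    """The real 0.0{n}x(1)...x(n) for a bit list x of length n:
    n zeros after the binary point, followed by the bits of x."""
    n = len(x)
    return sum(b * 2.0 ** -(n + i) for i, b in enumerate(x, start=1))

def k2(x):
    """The real 0.0{n}x(n)...x(1): the same with the bits reversed."""
    return k1(list(reversed(x)))
```

For any bit list x of length n+1, one can verify k1(x) = k1(x[:-1])·(1/2) + x(n+1)·(1/2)^{2n+2} and k2(x) = k2(x[:-1])·(1/4) + x(n+1)·(1/2)^{n+2}, matching the displayed recurrences.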
If x(n) = 1 then we use the following operation (obtained as the product of two unitary transformations):

(  1/4      −√7/(2√2)   0      1/4     0    )
(  √7/4      1/(2√2)    0      √7/4    0    )
(  1/2       0          1/2   −1/2     1/2  )
( −1/2       0          1/2    1/2     1/2  )
(  0         0          1/√2   0      −1/√2 )
If x(n) = 0 then we use the following operation:

(  1/4    −√15/4   0      0      0 )
(  √15/4   1/4     0      0      0 )
(  0       0       1/2   −√3/2   0 )
(  0       0       √3/2   1/2    0 )
(  0       0       0      0      1 )
Now we consider the part of the states (q2, q9, q4, q10, q11), among which the first one is q2 and the third one is q4. The rest of them are auxiliary states used to ensure that during the processing of the input symbol x(n) the amplitudes are changed from (k1(n), 0, (1/2)^{2n}, 0, 0) to (k1(n+1), 0, (1/2)^{2n+2}, 0, 0), where k1(n) = 0.0{n}x(1)x(2)x(3)···x(n). If x(n) = 1 then we use an analogous operation, again the product of two unitary transformations (with entries involving 1/2, 1/√2, 1/4, √7/4, √13/4 and √13/√14).
If x(n) = 0 then we use the following operation:

(  1/2    −√3/2   0       0      0 )
(  √3/2    1/2    0       0      0 )
(  0       0      1/4    −√15/4  0 )
(  0       0      √15/4   1/4    0 )
(  0       0      0       0      1 )

When the whole input word has been read, the operation corresponding to the end-marker confronts the states q2 and q3 with the Hadamard operation
( 1/√2   1/√2 )
( 1/√2  −1/√2 )

and the resulting amplitude of q3 is sent by measuring to ACCEPT. If the amplitudes of q2 and q3 were equal before this operation, the word is rejected; otherwise it is accepted.

We are interested in the recognition of PALINDROMES, but Theorem 6 considers only the complement of this language. It is not true in general that recognizability of a language implies recognizability of its complement. This holds for deterministic finite automata and even for nondeterministic finite automata; however, for nondeterministic automata the size of the recognizing automaton may differ even exponentially. For probabilistic and quantum finite automata with isolated cut-point it holds as well, but in the case of a non-isolated cut-point this has been an open problem for a long time. We study the case of non-isolated cut-point 0 here. There is no problem for probabilistic automata, because in this case probabilistic automata are equivalent to nondeterministic automata, and they recognize only regular languages, and regular languages are closed under complementation. We prove below in Section 5 that PALINDROMES can be recognized by MM-quantum finite automata with non-isolated cut-point 0.
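The end-marker step can be traced through arithmetically: the Hadamard operation sends the amplitude pair (k1, k2) to ((k1 + k2)/√2, (k1 − k2)/√2), so the accepting amplitude of q3 is nonzero exactly when k1 ≠ k2, i.e. exactly when the input word is not a palindrome. A sketch with hypothetical function names:

```python
import math

def k1(x):
    """The real 0.0{n}x(1)...x(n), held as the amplitude of q2."""
    n = len(x)
    return sum(b * 2.0 ** -(n + i) for i, b in enumerate(x, start=1))

def accept_amplitude(x):
    """Amplitude of q3 after the final Hadamard on (q2, q3):
    (k1 - k2)/sqrt(2), nonzero iff the bit list x is not a palindrome."""
    k2 = k1(list(reversed(x)))
    return (k1(x) - k2) / math.sqrt(2)
```

Since the cut-point is the non-isolated 0, a word is accepted precisely when this amplitude (and hence the acceptance probability) is nonzero.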
5 Postselection
Theorem 7. Co-PALINDROMES can be recognized by an MM-quantum postselection finite automaton with probability 1.

Proof. The MM-quantum finite automaton recognizing Co-PALINDROMES, after the final application of the Hadamard operation

( 1/√2   1/√2 )
( 1/√2  −1/√2 )

measures the resulting amplitude of q3. If the amplitudes of q2 and q3 were equal before this operation, the word is rejected; otherwise it is accepted.
The MM-quantum postselection finite automaton makes the postselection after the Hadamard operation but before the final measurement. The postselection set of states consists of one state only, namely, the state q3. If the amplitudes of q2 and q3 were not equal before the Hadamard operation, the postselection normalizes the amplitude of q3 to 1 or to −1. If the amplitudes of q2 and q3 were equal before the Hadamard operation, the postselection does not change the amplitude 0.

Theorem 8. If a language L can be recognized by an MM-quantum postselection finite automaton with probability 1, then the complement of the language L can also be recognized by an MM-quantum postselection finite automaton with probability 1.

Proof. Obvious.

Corollary 2. PALINDROMES can be recognized by an MM-quantum postselection finite automaton with probability 1.
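The effect of the one-state postselection set {q3} in the proof of Theorem 7 can be summarized in a single function: a nonzero real amplitude is normalized to +1 or −1, so the subsequent measurement accepts with probability 1, while a zero amplitude is left at zero (the amplitude 1 going to the special state q+ instead), so a palindrome is never accepted. The function name is illustrative.

```python
def postselect_on_q3(amplitude):
    """Postselection with postselection set {q3} on a real amplitude:
    normalize a nonzero amplitude to +1 or -1; leave zero unchanged
    (the special state q+ then carries amplitude 1)."""
    if amplitude != 0:
        return amplitude / abs(amplitude)
    return 0.0
```

This is exactly why postselection boosts the cut-point-0 recognition of Theorem 6 to recognition with probability 1.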
References

1. Aaronson, S.: Lower bounds for local search by quantum arguments. SIAM Journal on Computing 35(4), 805–824 (2004) (see also quant-ph/0307149)
2. Aaronson, S.: Quantum Computing, Postselection, and Probabilistic Polynomial-Time. Proceedings of the Royal Society A 461(2063), 3473–3482 (2005) (see also quant-ph/0412187)
3. Ablayev, F.M.: On Comparing Probabilistic and Deterministic Automata Complexity of Languages. In: Kreczmar, A., Mirkowska, G. (eds.) MFCS 1989. LNCS, vol. 379, pp. 599–605. Springer, Heidelberg (1989)
4. Ablayev, F.M., Freivalds, R.: Why Sometimes Probabilistic Algorithms Can Be More Effective. In: Gruska, J., Rovan, B., Wiedermann, J. (eds.) MFCS 1986. LNCS, vol. 233, pp. 1–14. Springer, Heidelberg (1986)
5. Aharonov, D., Regev, O.: Lattice problems in NP ∩ coNP. Journal of the ACM 52(5), 749–765 (2005)
6. Ambainis, A., Beaudry, M., Golovkins, M., Ķikusts, A., Mercer, M., Thérien, D.: Algebraic Results on Quantum Automata. Theory of Computing Systems 39(1), 165–188 (2006)
7. Ambainis, A., Bonner, R., Freivalds, R., Ķikusts, A.: Probabilities to accept languages by quantum finite automata. In: Asano, T., Imai, H., Lee, D.T., Nakano, S.-i., Tokuyama, T. (eds.) COCOON 1999. LNCS, vol. 1627, pp. 174–183. Springer, Heidelberg (1999) (see also quant-ph/9904066)
8. Ambainis, A., Freivalds, R.: 1-way quantum finite automata: strengths, weaknesses and generalizations. In: Proc. FOCS 1998, pp. 332–341 (1998) (see also quant-ph/9802062)
9. Ambainis, A., Ķikusts, A.: Exact results for accepting probabilities of quantum automata. Theoretical Computer Science 295(1-3), 3–25 (2003)
1 quant-ph preprints are available at http://www.arxiv.org/abs/quant-ph/preprint-number.
10. Ambainis, A., Ķikusts, A., Valdats, M.: On the class of languages recognizable by 1-way quantum finite automata. In: Ferreira, A., Reichel, H. (eds.) STACS 2001. LNCS, vol. 2010, pp. 75–86. Springer, Heidelberg (2001)
11. Ambainis, A., Nayak, A., Ta-Shma, A., Vazirani, U.: Quantum dense coding and quantum finite automata. Journal of the ACM 49, 496–511 (2002); earlier version: STOC 1999 and quant-ph/9804043
12. Barenco, A., Bennett, C.H., Cleve, R., DiVincenzo, D.P., Margolus, N., Shor, P.W., Sleator, T., Smolin, J., Weinfurter, H.: Elementary gates for quantum computation. Physical Review A 52, 3457–3467 (1995)
13. Bernstein, E., Vazirani, U.: Quantum complexity theory. SIAM Journal on Computing 26, 1411–1473 (1997)
14. Brodsky, A., Pippenger, N.: Characterizations of 1-way quantum finite automata. SIAM Journal on Computing 31(5), 1456–1478 (2002) (see also quant-ph/9903014)
15. Bukharaev, R.G.: Probabilistic automata. Journal of Mathematical Sciences 13(3), 359–386 (1980)
16. Freivalds, R. (Freivald, R.V.): Recognition of languages with high probability on different classes of automata. Doklady Akademii Nauk SSSR 239(1), 60–62 (1978) (Russian)
17. Freivalds, R.: Projections of Languages Recognizable by Probabilistic and Alternating Finite Multitape Automata. Information Processing Letters 13(4/5), 195–198 (1981)
18. Freivalds, R.: Complexity of Probabilistic Versus Deterministic Automata. In: Barzdins, J., Bjorner, D. (eds.) Baltic Computer Science. LNCS, vol. 502, pp. 565–613. Springer, Heidelberg (1991)
19. Freivalds, R.: Non-Constructive Methods for Finite Probabilistic Automata. International Journal of Foundations of Computer Science 19(3), 565–580 (2008)
20. Kerenidis, I., de Wolf, R.: Exponential lower bound for 2-query locally decodable codes via a quantum argument. Journal of Computer and System Sciences 69(3), 395–420 (2004) (see also quant-ph/0208062)
21. Kondacs, A., Watrous, J.: On the power of quantum finite state automata. In: Proc. FOCS 1997, pp. 66–75 (1997)
22. Macarie, I.I.: Space-Efficient Deterministic Simulation of Probabilistic Automata. SIAM Journal on Computing 27(2), 448–465 (1998)
23. Moore, C., Crutchfield, J.: Quantum automata and quantum grammars. Theoretical Computer Science 237, 275–306 (2000) (also quant-ph/9707031)
24. Rabin, M.O.: Probabilistic automata. Information and Control 6(3), 230–245 (1963)
A New Representation of Chaitin Ω Number Based on Compressible Strings

Kohtaro Tadaki

Research and Development Initiative, Chuo University
JST CREST
1–13–27 Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan
[email protected]
Abstract. In 1975 Chaitin introduced his Ω number as a concrete example of a random real. The real Ω is defined based on the set of all halting inputs for an optimal prefix-free machine U, which is a universal decoding algorithm used to define the notion of program-size complexity. Chaitin showed Ω to be random by discovering the property that the first n bits of the base-two expansion of Ω solve the halting problem of U for all binary inputs of length at most n. In this paper, we introduce a new representation Θ of the Chaitin Ω number. The real Θ is defined based on the set of all compressible strings. We investigate the properties of Θ and show that Θ is random. In addition, we generalize Θ in two directions, Θ(T) and Θ(T), with a real T > 0. We then study their properties. In particular, we show that the computability of the real Θ(T) gives a sufficient condition for a real T ∈ (0, 1) to be a fixed point on partial randomness, i.e., to satisfy the condition that the compression rate of T equals T.

Keywords: algorithmic information theory, Chaitin Ω number, randomness, partial randomness, fixed point, program-size complexity.
1 Introduction
Algorithmic information theory (AIT, for short) is a framework for applying information-theoretic and probabilistic ideas to recursive function theory. One of the primary concepts of AIT is the program-size complexity (or Kolmogorov complexity) H(s) of a finite binary string s, which is defined as the length of the shortest binary input for a universal decoding algorithm U , called an optimal prefix-free machine, to output s. By the definition, H(s) is thought to represent the amount of randomness contained in a finite binary string s, which cannot be captured in an effective manner. In particular, the notion of program-size complexity plays a crucial role in characterizing the randomness of an infinite binary string, or equivalently, a real. In [3] Chaitin introduced the halting probability Ω as a concrete example of random real. His Ω is defined based on the set of all halting inputs for U , and plays a central role in the metamathematical development of AIT [5]. The first n bits of the base-two expansion of Ω solve the halting problem of U for inputs of length at most n. Based on this property, Chaitin showed that Ω is random. C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 127–139, 2010. c Springer-Verlag Berlin Heidelberg 2010
In this paper, we introduce a new representation Θ of the Chaitin Ω number. The real Θ is defined based on the set of all compressible strings, i.e., all finite binary strings s such that H(s) < |s|, where |s| is the length of s. The first n bits of the base-two expansion of Θ enable us to calculate a random finite string of length n, i.e., a finite binary string s for which |s| = n and |s| ≤ H(s). Based on this property, we show that Θ is random. In the work [8] we introduced the notion of partial randomness for a real as a stronger representation of the compression rate of a real by means of program-size complexity. At the same time, we generalized the halting probability Ω to Z(T) with a real T so that, for every T ∈ (0, 1], if T is computable then the partial randomness of the real Z(T) exactly equals T.¹ In the case of T = 1, Z(T) results in Ω, i.e., Z(1) = Ω. Later on, in the work [9] we revealed a special significance of the computability of the value Z(T). Namely, we proved the fixed point theorem on partial randomness,² which states that, for every T ∈ (0, 1), if Z(T) is a computable real, then the partial randomness of T equals T, and therefore the compression rate of T equals T, i.e., lim_{n→∞} H(T↾n)/n = T, where T↾n is the first n bits of the base-two expansion of T. In a similar manner to the generalization of Ω to Z(T), in this paper we generalize Θ in two directions, Θ(T) and Θ(T). We then show that the reals Θ(T) and Θ(T) both have the same randomness properties as Z(T). In particular, we show that the fixed point theorem on partial randomness, which has the same form as for Z(T), holds for Θ(T). The paper is organized as follows. We begin in Section 2 with some preliminaries to AIT and partial randomness. In Section 3 we introduce Θ and study its properties. Subsequently, we generalize Θ to the two directions Θ(T) and Θ(T) in Section 4 and Section 5, respectively.
In Section 6, we prove the fixed point theorem on partial randomness based on the computability of the value Θ(T ).
2 Preliminaries
We start with some notation about numbers and strings which will be used in this paper. #S is the cardinality of S for any set S. N = {0, 1, 2, 3, ...} is the set of natural numbers, and N+ is the set of positive integers. Q is the set of rationals, and R is the set of reals. A sequence {an}n∈N of numbers (rationals or reals) is called increasing if an+1 > an for all n ∈ N. Normally, O(1) denotes any function f : N+ → R such that there is C ∈ R with the property that |f(n)| ≤ C for all n ∈ N+. On the other hand, o(n) denotes any function g : N+ → R such that lim_{n→∞} g(n)/n = 0. {0, 1}∗ = {λ, 0, 1, 00, 01, 10, 11, 000, ...} is the set of finite binary strings, where λ denotes the empty string, and {0, 1}∗ is ordered as indicated. We identify any string in {0, 1}∗ with a natural number in this order, i.e., we consider ϕ : {0, 1}∗ → N such that ϕ(s) = 1s − 1, where the concatenation 1s of strings 1
¹ In [8], Z(T) is denoted by Ω^T. ² The fixed point theorem on partial randomness is called a fixed point theorem on compression rate in [9].
A New Representation of Chaitin Ω Number Based on Compressible Strings
129
and s is regarded as a dyadic integer, and then we identify s with ϕ(s). For any s ∈ {0,1}*, |s| is the length of s. For any n ∈ N, we denote by {0,1}^n the set { s | s ∈ {0,1}* & |s| = n }. A subset S of {0,1}* is called prefix-free if no string in S is a prefix of another string in S. For any function f, the domain of definition of f is denoted by dom f. We write "r.e." instead of "recursively enumerable." Let α be an arbitrary real. ⌊α⌋ is the greatest integer less than or equal to α, and ⌈α⌉ is the smallest integer greater than or equal to α. For any n ∈ N+, we denote by αn ∈ {0,1}* the first n bits of the base-two expansion of α − ⌊α⌋ with infinitely many zeros. For example, in the case of α = 5/8, α6 = 101000. A real α is called right-computable if there exists a total recursive function f : N+ → Q such that α ≤ f(n) for all n ∈ N+ and lim_{n→∞} f(n) = α. On the other hand, a real α is called left-computable if −α is right-computable. A left-computable real is also called an r.e. real. A real α is called computable if there exists a total recursive function f : N+ → Q such that |α − f(n)| < 1/n for all n ∈ N+. It is then easy to show the following theorem.

Theorem 1. Let α ∈ R. (i) α is computable if and only if α is both right-computable and left-computable. (ii) α is right-computable if and only if the set { r ∈ Q | α < r } is r.e.
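The string–natural-number identification ϕ(s) = 1s − 1 and the "first n bits of the base-two expansion" operator can be made concrete in a short Python sketch (the function names are ours, not the paper's):

```python
from fractions import Fraction

def phi(s: str) -> int:
    """Map a binary string s to a natural number via phi(s) = 1s - 1,
    reading the concatenation 1s as a dyadic (base-two) integer."""
    return int("1" + s, 2) - 1

def phi_inv(n: int) -> str:
    """Inverse map: write n + 1 in binary and drop the leading 1."""
    return bin(n + 1)[3:]

def first_n_bits(alpha: Fraction, n: int) -> str:
    """First n bits of the base-two expansion of alpha - floor(alpha);
    for dyadic rationals this yields the expansion with infinitely many zeros."""
    frac = alpha - (alpha.numerator // alpha.denominator)
    bits = []
    for _ in range(n):
        frac *= 2
        bit = int(frac)          # next bit, 0 or 1
        bits.append(str(bit))
        frac -= bit
    return "".join(bits)

# lambda, 0, 1, 00, 01, 10, 11 correspond to 0, 1, 2, 3, 4, 5, 6.
assert [phi(s) for s in ["", "0", "1", "00", "01", "10", "11"]] == list(range(7))
assert first_n_bits(Fraction(5, 8), 6) == "101000"   # the paper's example
```

The bijection explains why the length-lexicographic order on strings coincides with the usual order on N.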
2.1
Algorithmic Information Theory
In the following we concisely review some definitions and results of AIT [3,5]. A prefix-free machine is a partial recursive function C : {0,1}* → {0,1}* such that dom C is a prefix-free set. For each prefix-free machine C and each s ∈ {0,1}*, H_C(s) is defined by H_C(s) = min{ |p| | p ∈ {0,1}* & C(p) = s } (may be ∞). A prefix-free machine U is said to be optimal if for each prefix-free machine C there exists d ∈ N with the following property: if p ∈ dom C, then there is q ∈ dom U for which U(q) = C(p) and |q| ≤ |p| + d. It is easy to see that there exists an optimal prefix-free machine. We choose a particular optimal prefix-free machine U as the standard one for use, and define H(s) as H_U(s), which is referred to as the program-size complexity of s or the Kolmogorov complexity of s. It follows that for every prefix-free machine C there exists d ∈ N such that, for every s ∈ {0,1}*,

H(s) ≤ H_C(s) + d.   (1)

Based on this we can show that, for every partial recursive function Ψ : {0,1}* → {0,1}*, there exists d ∈ N such that, for every s ∈ dom Ψ,

H(Ψ(s)) ≤ H(s) + d.   (2)
Based on (1) we can also show that there exists d ∈ N such that, for every s ≠ λ,

H(s) ≤ |s| + 2 log₂ |s| + d.   (3)
For any s ∈ {0, 1}∗, we define s∗ as min{ p ∈ {0, 1}∗ | U (p) = s}, i.e., the first element in the ordered set {0, 1}∗ of all strings p such that U (p) = s. Then, |s∗ | = H(s) for every s ∈ {0, 1}∗ .
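For intuition, a prefix-free machine with a finite domain can be modelled as a dictionary whose keys form a prefix-free set; H_C(s) and s* are then simple minimizations. The machine below is a toy example of ours, not the optimal machine U of the text:

```python
INF = float("inf")

# A toy prefix-free machine: the domain {"0", "10", "110"} is prefix-free.
C = {"0": "11", "10": "11", "110": "0101"}

def is_prefix_free(domain):
    """Check that no key is a proper prefix of another key."""
    return not any(p != q and q.startswith(p) for p in domain for q in domain)

def H(machine, s):
    """Program-size complexity of s with respect to the machine (may be infinite)."""
    return min((len(p) for p, out in machine.items() if out == s), default=INF)

def star(machine, s):
    """A shortest program for s; ties broken by the canonical length-lex order."""
    progs = [p for p, out in machine.items() if out == s]
    return min(progs, key=lambda p: (len(p), p), default=None)

assert is_prefix_free(C)
assert H(C, "11") == 1 and star(C, "11") == "0"   # two programs; "0" is shortest
assert H(C, "0101") == 3
assert H(C, "000") == INF                          # never output by this machine
```

For the genuine optimal machine U, H(s) is not computable; only upper bounds can be enumerated.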
130
K. Tadaki
Chaitin [3] introduced the Ω number as follows. For each optimal prefix-free machine V, the halting probability Ω_V of V is defined by

Ω_V = Σ_{p ∈ dom V} 2^{−|p|}.
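Since dom V is r.e., Ω_V is left-computable: enumerating halting programs yields an increasing sequence of rational lower bounds. A sketch with the toy machine from above (for a real machine the loop would dovetail computations rather than iterate over a finite dictionary):

```python
from fractions import Fraction

C = {"0": "11", "10": "11", "110": "0101"}   # toy prefix-free machine

def omega_approximations(machine):
    """Yield increasing partial sums of 2^{-|p|} over enumerated p in dom V."""
    total = Fraction(0)
    for p in machine:        # stands in for a dovetailed enumeration of dom V
        total += Fraction(1, 2 ** len(p))
        yield total

approx = list(omega_approximations(C))
# Kraft's inequality for a prefix-free domain guarantees the limit is <= 1.
assert approx == [Fraction(1, 2), Fraction(3, 4), Fraction(7, 8)]
assert all(a < b for a, b in zip(approx, approx[1:]))
```

For an optimal machine, of course, no computable bound on the convergence rate exists; that is exactly what makes Ω_V random.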
For every optimal prefix-free machine V, since dom V is prefix-free, Ω_V converges and 0 < Ω_V ≤ 1. For any α ∈ R, we say that α is weakly Chaitin random if there exists c ∈ N such that n − c ≤ H(αn) for all n ∈ N+ [3,5]. Chaitin [3] showed that Ω_V is weakly Chaitin random for every optimal prefix-free machine V. Therefore 0 < Ω_V < 1 for every optimal prefix-free machine V.
2.2
Partial Randomness
In the work [8], we generalized the notion of the randomness of a real so that the degree of randomness, which is nowadays often referred to as partial randomness, can be characterized by a real T with 0 < T ≤ 1 as follows.

Definition 1 (weak Chaitin T-randomness). Let T ∈ (0, 1] and let α ∈ R. We say that α is weakly Chaitin T-random if there exists c ∈ N such that, for all n ∈ N+, Tn − c ≤ H(αn).
In the case where T = 1, weak Chaitin T-randomness reduces to weak Chaitin randomness.

Definition 2 (T-compressibility and strict T-compressibility). Let T ∈ (0, 1] and let α ∈ R. We say that α is T-compressible if H(αn) ≤ Tn + o(n), namely, if lim sup_{n→∞} H(αn)/n ≤ T. We say that α is strictly T-compressible if there exists d ∈ N such that, for all n ∈ N+, H(αn) ≤ Tn + d.
For every real α, if α is weakly Chaitin T-random and T-compressible, then lim_{n→∞} H(αn)/n = T, i.e., the compression rate of α equals T. In the work [8], we generalized the Chaitin Ω number to Z(T) as follows. For each optimal prefix-free machine V and each real T > 0, the generalized halting probability Z_V(T) of V is defined by

Z_V(T) = Σ_{p ∈ dom V} 2^{−|p|/T}.
Thus, ZV (1) = ΩV . If 0 < T ≤ 1, then ZV (T ) converges and 0 < ZV (T ) < 1, since ZV (T ) ≤ ΩV < 1. The following theorem holds for ZV (T ). Theorem 2 (Tadaki [8]). Let V be an optimal prefix-free machine. (i) If 0 < T ≤ 1 and T is computable, then ZV (T ) is a left-computable real which is weakly Chaitin T -random and T -compressible. (ii) If 1 < T , then ZV (T ) diverges to ∞.
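For a computable T the partial sums of Z_V(T) can be evaluated effectively. With a rational "temperature" such as T = 1/2, each term 2^{−|p|/T} = 2^{−2|p|} is itself a dyadic rational, so the sketch below (toy machine as before; names ours) stays exact:

```python
from fractions import Fraction

C = {"0": "11", "10": "11", "110": "0101"}   # toy prefix-free machine

def Z(machine, T: Fraction) -> Fraction:
    """Generalized halting probability sum over p in dom V of 2^{-|p|/T};
    exact whenever |p|/T is an integer for every program p."""
    total = Fraction(0)
    for p in machine:
        exponent = Fraction(len(p), 1) / T
        assert exponent.denominator == 1, "use a T that keeps terms dyadic"
        total += Fraction(1, 2 ** int(exponent))
    return total

omega = Z(C, Fraction(1))        # Z_V(1) = Omega_V
half = Z(C, Fraction(1, 2))      # terms 2^{-2|p|}
assert omega == Fraction(7, 8)
assert half == Fraction(21, 64)
assert half < omega              # Z_V(T) <= Omega_V for 0 < T <= 1
```

Raising T above 1 boosts every term, which is the mechanism behind part (ii) of Theorem 2: for T > 1 the series diverges for an optimal machine.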
The computability of the value Z_V(T) has a special implication for T, as follows.
Theorem 3 (fixed point theorem on partial randomness, Tadaki [9]). Let V be an optimal prefix-free machine. For every T ∈ (0, 1), if Z_V(T) is computable, then T is weakly Chaitin T-random and T-compressible, and therefore

lim_{n→∞} H(Tn)/n = T.   (4)
The equality (4) means that the compression rate of T equals T itself. Intuitively, we might interpret (4) as follows: consider, imaginarily, a file of infinite size whose content is "The compression rate of this file is 0.100111001......". When this file is compressed, its compression rate actually equals 0.100111001......, as its content says. This situation is self-referential and forms a fixed point. For a simple and self-contained proof of Theorem 3, see Section 5 of Tadaki [11]. A left-computable real has a special property with respect to partial randomness, as shown in Theorem 4 below.

Definition 3 (T-convergence, Tadaki [10]). Let T ∈ (0, 1]. An increasing sequence {an} of reals is called T-convergent if Σ_{n=0}^{∞} (a_{n+1} − a_n)^T < ∞. A left-computable real α is called T-convergent if there exists a T-convergent computable, increasing sequence of rationals which converges to α.
Theorem 4 (Tadaki [12]). Let T be a computable real with 0 < T < 1. For every left-computable real α, if α is T -convergent then α is strictly T -compressible.
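T-convergence can be probed numerically on a concrete sequence. For instance, a_n = 1 − 2^{−n} increases to 1 and Σ_n (a_{n+1} − a_n)^T = Σ_n 2^{−(n+1)T} is a finite geometric series for every T > 0, so this particular sequence is T-convergent for all T ∈ (0, 1] (our illustrative example, not one from the paper):

```python
from fractions import Fraction

def t_convergence_partial_sums(T: Fraction, terms: int) -> float:
    """Partial sums of sum_n (a_{n+1} - a_n)^T for a_n = 1 - 2^{-n}."""
    total = 0.0
    for n in range(terms):
        gap = float(Fraction(1, 2 ** (n + 1)))   # a_{n+1} - a_n = 2^{-(n+1)}
        total += gap ** float(T)
    return total

# Geometric series: the limit is 2^{-T} / (1 - 2^{-T}).
T = Fraction(1, 2)
limit = 2 ** -0.5 / (1 - 2 ** -0.5)
assert abs(t_convergence_partial_sums(T, 60) - limit) < 1e-8
```

The interesting case in Theorem 4 is, of course, when the limit α is left-computable but not computable; the numerics only illustrate the definition.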
3
New Representation of Chaitin Ω Number
In this section, we introduce a new representation Θ of the Chaitin Ω number based on the set of all compressible strings, and investigate its properties.

Definition 4. For any optimal prefix-free machine V, Θ_V is defined by

Θ_V = Σ_{H_V(s)<|s|} 2^{−|s|},

where the sum is over all s ∈ {0,1}* such that H_V(s) < |s|.

For each optimal prefix-free machine V, we see that

Θ_V < Σ_{H_V(s)<|s|} 2^{−H_V(s)} ≤ Σ_{s∈{0,1}*} 2^{−H_V(s)} ≤ Σ_{p∈dom V} 2^{−|p|} = Ω_V.
Thus, ΘV converges and 0 < ΘV < ΩV for every optimal prefix-free machine V . It is important to evaluate how many strings s satisfy the condition HV (s) < |s|. For that purpose, we define SV (n) = { s ∈ {0, 1}∗ | |s| = n & HV (s) < n } for each optimal prefix-free machine V and each n ∈ N. We can then show the following theorem.
Theorem 5. Let V be an optimal prefix-free machine. Then S_V(n) ⊊ {0,1}^n for every n ∈ N. Moreover, #S_V(n) = 2^{n−H(n)+O(1)} for all n ∈ N+, i.e., there exists d ∈ N such that (i) #S_V(n) ≤ 2^{n−H(n)+d} for all n ∈ N, and (ii) 2^{n−H(n)−d} ≤ #S_V(n) for all sufficiently large n ∈ N.

The first half of Theorem 5 is easily shown by counting the number of binary strings of length less than n. Solovay [7] showed that #{ s ∈ {0,1}* | H_V(s) < n } = 2^{n−H(n)+O(1)} for every optimal prefix-free machine V. The second half of Theorem 5 slightly improves this result.

Theorem 6. For every optimal prefix-free machine V, Θ_V is a left-computable real which is weakly Chaitin random.
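The objects of this section — the set of compressible strings, the slices S_V(n), and Θ_V itself — can all be computed exactly for a toy machine (for a genuine optimal machine the set { s | H_V(s) < |s| } is only r.e., so one obtains increasing lower approximations rather than the exact value):

```python
from fractions import Fraction

# Toy machine: two 4-bit strings compressed by 2- and 3-bit programs,
# plus a 2-bit string whose shortest program is longer than itself.
C = {"00": "1111", "010": "1010", "011": "00"}

def H(machine, s):
    return min((len(p) for p, out in machine.items() if out == s),
               default=float("inf"))

def compressible(machine):
    """All outputs s with H(s) < |s| (finite here; r.e. in general)."""
    return {s for s in machine.values() if H(machine, s) < len(s)}

def S(machine, n):
    """The slice S_V(n): compressible strings of length exactly n."""
    return {s for s in compressible(machine) if len(s) == n}

def theta(machine) -> Fraction:
    """Theta_V = sum of 2^{-|s|} over compressible s."""
    return sum((Fraction(1, 2 ** len(s)) for s in compressible(machine)),
               Fraction(0))

assert compressible(C) == {"1111", "1010"}    # "00" has H = 3 >= |"00"| = 2
assert S(C, 4) == {"1111", "1010"} and S(C, 2) == set()
assert theta(C) == Fraction(1, 8)
```

Note how S(C, 4) is a proper subset of {0,1}^4, in line with the first half of Theorem 5.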
Theorem 6 results from each of Theorem 7 (i) and Theorem 8 (i) below by setting T = 1; thus, we omit the proof of Theorem 6 here. The works of Calude et al. [1] and Kučera and Slaman [6] showed that, for every α ∈ (0, 1), α is left-computable and weakly Chaitin random if and only if there exists an optimal prefix-free machine V such that α = Ω_V. Thus, it follows from Theorem 6 that, for every optimal prefix-free machine V, there exists an optimal prefix-free machine W such that Θ_V = Ω_W. However, it is open whether the converse holds: for every optimal prefix-free machine W, does there exist an optimal prefix-free machine V such that Ω_W = Θ_V? In the subsequent two sections, we generalize Θ_V in two directions, Θ_V(T) and Θ̄_V(T), with a real T > 0. We see that the reals Θ_V(T) and Θ̄_V(T) both have the same randomness properties as Z_V(T) (i.e., the properties shown in Theorem 2 for Z_V(T)).
4
Generalization of Θ to Θ(T )
Definition 5. For any optimal prefix-free machine V and any real T > 0, Θ_V(T) is defined by

Θ_V(T) = Σ_{H_V(s)<|s|} 2^{−|s|/T}.
Thus, ΘV (1) = ΘV . If 0 < T ≤ 1, then ΘV (T ) converges and 0 < ΘV (T ) < 1, since ΘV (T ) ≤ ΘV < 1. The following theorem holds for ΘV (T ). Theorem 7. Let V be an optimal prefix-free machine, and let T > 0. (i) If T is computable and 0 < T ≤ 1, then ΘV (T ) is a left-computable real which is weakly Chaitin T -random. (ii) If T is computable and 0 < T < 1, then ΘV (T ) is strictly T -compressible. (iii) If 1 < T , then ΘV (T ) diverges to ∞. Proof. Let V be an optimal prefix-free machine. We first note that, for every s ∈ {0, 1}∗, HV (s) < |s| if and only if there exists p ∈ dom V such that V (p) = s and |p| < |s|. Thus, the set { s ∈ {0, 1}∗ | HV (s) < |s| } is r.e. and, obviously, infinite. Let s1 , s2 , s3 , . . . be a particular recursive enumeration of this set.
(i) Suppose that T is a computable real and 0 < T ≤ 1. Then, since Θ_V(T) = Σ_{i=1}^{∞} 2^{−|s_i|/T}, it is easy to see that Θ_V(T) is left-computable.
For each n ∈ N+, let αn be the first n bits of the base-two expansion of Θ_V(T) with infinitely many ones. Then, since 0.αn < Θ_V(T) for every n ∈ N+, Σ_{i=1}^{∞} 2^{−|s_i|/T} = Θ_V(T), and T is computable, there exists a partial recursive function ξ : {0,1}* → N+ such that, for every n ∈ N+, 0.αn < Σ_{i=1}^{ξ(αn)} 2^{−|s_i|/T}. It is then easy to see that

Σ_{i=ξ(αn)+1}^{∞} 2^{−|s_i|/T} = Θ_V(T) − Σ_{i=1}^{ξ(αn)} 2^{−|s_i|/T} < Θ_V(T) − 0.αn < 2^{−n}

for every n ∈ N+. It follows that, for all i > ξ(αn), 2^{−|s_i|/T} < 2^{−n} and therefore Tn < |s_i|. Thus, given αn, by calculating the set { s_i | i ≤ ξ(αn) & |s_i| = ⌊Tn⌋ } and picking any one finite binary string of length ⌊Tn⌋ which is not in this set, one can obtain s ∈ {0,1}^{⌊Tn⌋} such that |s| ≤ H_V(s). This is possible since { s_i | i ≤ ξ(αn) & |s_i| = ⌊Tn⌋ } = S_V(⌊Tn⌋) ⊊ {0,1}^{⌊Tn⌋}, where the last proper inclusion is due to the first half of Theorem 5. Hence, there exists a partial recursive function Ψ : {0,1}* → {0,1}* such that ⌊Tn⌋ ≤ H_V(Ψ(αn)). Using the optimality of V, we then see that Tn ≤ H(Ψ(αn)) + O(1) for all n ∈ N+. On the other hand, it follows from (2) that there exists c_Ψ ∈ N such that H(Ψ(αn)) ≤ H(αn) + c_Ψ. Therefore, we have

Tn ≤ H(αn) + O(1)   (5)

for all n ∈ N+. This inequality implies that Θ_V(T) is not computable, and therefore the base-two expansion of Θ_V(T) with infinitely many ones has infinitely many zeros as well. Hence αn = Θ_V(T)n for every n ∈ N+. It follows from (5) that Θ_V(T) is weakly Chaitin T-random.
(ii) Suppose that T is a computable real and 0 < T < 1. Note that Θ_V(T) = Σ_{i=1}^{∞} 2^{−|s_i|/T} and Σ_{i=1}^{∞} (2^{−|s_i|/T})^T = Σ_{i=1}^{∞} 2^{−|s_i|} = Θ_V < ∞. Thus, since T is computable, it is easy to show that Θ_V(T) is a T-convergent left-computable real. It follows from Theorem 4 that Θ_V(T) is strictly T-compressible.
(iii) Suppose that T > 1. We then choose a particular computable real t satisfying 1 < t ≤ T. Let us first assume that Θ_V(t) converges. Based on an argument similar to the proof of Theorem 7 (i), it is easy to show that Θ_V(t) is weakly Chaitin t-random, i.e., there exists c ∈ N such that tn − c ≤ H(Θ_V(t)n) for all n ∈ N+. It follows from (3) that tn − c ≤ n + o(n). Dividing by n and letting n → ∞, we have t ≤ 1, which contradicts t > 1. Thus, Θ_V(t) diverges to ∞. By noting Θ_V(t) ≤ Θ_V(T), we see that Θ_V(T) diverges to ∞.
5
Generalization of Θ to Θ̄(T)
Definition 6. For any optimal prefix-free machine V and any real T > 0, Θ̄_V(T) is defined by

Θ̄_V(T) = Σ_{H_V(s)<T|s|} 2^{−|s|},

where the sum is over all s ∈ {0,1}* such that H_V(s) < T|s|.

Thus, Θ̄_V(1) = Θ_V. For each optimal prefix-free machine V and each real T with 0 < T ≤ 1, we see that

Θ̄_V(T) < Σ_{H_V(s)<T|s|} 2^{−H_V(s)/T} ≤ Σ_{s∈{0,1}*} 2^{−H_V(s)/T} ≤ Σ_{p∈dom V} 2^{−|p|/T} = Z_V(T).
Thus, Θ̄_V(T) converges and 0 < Θ̄_V(T) < Z_V(T) for every optimal prefix-free machine V and every real T with 0 < T ≤ 1. We define S_{V,T}(n) = { s ∈ {0,1}* | |s| = n & H_V(s) < Tn } for each optimal prefix-free machine V, each T ∈ (0,1], and each n ∈ N. It follows from Theorem 5 that S_{V,T}(n) ⊂ S_V(n) ⊊ {0,1}^n for every optimal prefix-free machine V, every T ∈ (0,1], and every n ∈ N. The following theorem holds for Θ̄_V(T).

Theorem 8. Let V be an optimal prefix-free machine, and let T > 0. (i) If T is left-computable and 0 < T ≤ 1, then Θ̄_V(T) is a left-computable real which is weakly Chaitin T-random. (ii) If T is computable and 0 < T < 1, then Θ̄_V(T) is strictly T-compressible. (iii) If 1 < T, then Θ̄_V(T) diverges to ∞.

Proof. Let V be an optimal prefix-free machine.
(i) Suppose that T is a left-computable real and 0 < T ≤ 1. We first note that, for every s ∈ {0,1}*, H_V(s) < T|s| if and only if there exists p ∈ dom V such that V(p) = s and |p| < T|s|. Thus, since T is left-computable, the set { s ∈ {0,1}* | H_V(s) < T|s| } is r.e. and, obviously, infinite. Let s_1, s_2, s_3, ... be a particular recursive enumeration of this set. Then, since Θ̄_V(T) = Σ_{i=1}^{∞} 2^{−|s_i|}, it is easy to see that Θ̄_V(T) is left-computable.
For each n ∈ N+, let αn be the first n bits of the base-two expansion of Θ̄_V(T) with infinitely many ones. Then, since 0.αn < Θ̄_V(T) for every n ∈ N+ and Σ_{i=1}^{∞} 2^{−|s_i|} = Θ̄_V(T), there exists a partial recursive function ξ : {0,1}* → N+ such that, for every n ∈ N+, 0.αn < Σ_{i=1}^{ξ(αn)} 2^{−|s_i|}. It is then easy to see that

Σ_{i=ξ(αn)+1}^{∞} 2^{−|s_i|} = Θ̄_V(T) − Σ_{i=1}^{ξ(αn)} 2^{−|s_i|} < Θ̄_V(T) − 0.αn < 2^{−n}

for every n ∈ N+. It follows that, for all i > ξ(αn), 2^{−|s_i|} < 2^{−n} and therefore n < |s_i|. Thus, given αn, by calculating the set { s_i | i ≤ ξ(αn) & |s_i| = n } and picking any one finite binary string of length n which is not in this set, one can obtain s ∈ {0,1}^n such that T|s| ≤ H_V(s). This is possible since { s_i | i ≤ ξ(αn) & |s_i| = n } = S_{V,T}(n) ⊊ {0,1}^n. Hence, there exists a partial recursive function Ψ : {0,1}* → {0,1}* such that Tn ≤ H_V(Ψ(αn)). Using the optimality of V, we then see that Tn ≤ H(Ψ(αn)) + O(1) for all n ∈ N+. On the other hand, it follows from (2) that there exists c_Ψ ∈ N such that H(Ψ(αn)) ≤ H(αn) + c_Ψ. Therefore, we have

Tn ≤ H(αn) + O(1)   (6)

for all n ∈ N+. This inequality implies that Θ̄_V(T) is not computable, and therefore the base-two expansion of Θ̄_V(T) with infinitely many ones has infinitely many zeros as well. Hence αn = Θ̄_V(T)n for every n ∈ N+. It follows from (6) that Θ̄_V(T) is weakly Chaitin T-random.
(ii) Suppose that T is a computable real and 0 < T < 1. Note that

Σ_{H_V(s)<T|s|} (2^{−|s|})^T = Σ_{H_V(s)<T|s|} 2^{−T|s|} < Σ_{H_V(s)<T|s|} 2^{−H_V(s)} ≤ Σ_{s∈{0,1}*} 2^{−H_V(s)} ≤ Σ_{p∈dom V} 2^{−|p|} = Ω_V < ∞.

Thus, since T is computable, it is easy to show that Θ̄_V(T) is a T-convergent left-computable real. It follows from Theorem 4 that Θ̄_V(T) is strictly T-compressible.
(iii) Suppose that T > 1. Using (3), it is easy to show that there exists n_0 ∈ N such that, for every s ∈ {0,1}*, if |s| ≥ n_0 then H_V(s) < T|s|. Thus, obviously, Θ̄_V(T) diverges to ∞.
6
Fixed Point Theorem on Partial Randomness by ΘV (T )
In this section, we prove the following form of the fixed point theorem on partial randomness, based on the computability of the value Θ_V(T). Note that this theorem has the same form as Theorem 3.

Theorem 9 (fixed point theorem on partial randomness by Θ_V(T)). Let V be an optimal prefix-free machine. For every T ∈ (0,1), if Θ_V(T) is computable, then T is weakly Chaitin T-random and T-compressible.

Let V be an arbitrary optimal prefix-free machine in what follows. Theorem 9 follows immediately from Theorem 10, Theorem 11, and Theorem 12 below, together with Theorem 1 (i). Let s_1, s_2, s_3, ... be a particular recursive enumeration of the infinite r.e. set { s ∈ {0,1}* | H_V(s) < |s| }. For each k ∈ N+ and each real x > 0, we define Z_k(x) = Σ_{i=1}^{k} 2^{−|s_i|/x}. Note then that lim_{k→∞} Z_k(x) = Θ_V(x) for every x ∈ (0,1].

Theorem 10. For every T ∈ (0,1), if Θ_V(T) is right-computable then T is weakly Chaitin T-random.

Proof. First, we define W_k(x) = Σ_{i=1}^{k} |s_i| 2^{−|s_i|/x} for each k ∈ N+ and each real x > 0. We show that, for each x ∈ (0,1), W_k(x) converges as k → ∞. Let x be an arbitrary real with x ∈ (0,1). Since x < 1, there is l_0 ∈ N+ such that (log₂ l)/l ≤ 1/x − 1 for all l ≥ l_0. Then there is k_0 ∈ N+ such that |s_i| ≥ l_0 for all i > k_0. Thus, we see that, for each i > k_0,

|s_i| 2^{−|s_i|/x} = 2^{−(1/x − (log₂|s_i|)/|s_i|)|s_i|} ≤ 2^{−|s_i|}.

Hence, for each k > k_0, W_k(x) − W_{k_0}(x) = Σ_{i=k_0+1}^{k} |s_i| 2^{−|s_i|/x} ≤ Σ_{i=k_0+1}^{k} 2^{−|s_i|} < Θ_V. Therefore, since {W_k(x)}_k is an increasing sequence of reals bounded from above, it converges as k → ∞, as desired. For each x ∈ (0,1), we define a positive real W(x) as lim_{k→∞} W_k(x).
On the other hand, since Θ_V(T) is right-computable by assumption, there exists a total recursive function f : N+ → Q such that Θ_V(T) ≤ f(m) for all m ∈ N+ and lim_{m→∞} f(m) = Θ_V(T). We choose a particular real t with T < t < 1. Then, for each i ∈ N+, using the mean value theorem we see that

2^{−|s_i|/x} − 2^{−|s_i|/T} < (|s_i| ln 2 / T²) 2^{−|s_i|/t} (x − T)

for all x ∈ (T, t). We then choose a particular c ∈ N with W(t) ln 2 / T² ≤ 2^c. Here, the limit value W(t) exists, since 0 < t < 1. It follows that

Z_k(x) − Z_k(T) < 2^c (x − T)   (7)

for all k ∈ N+ and x ∈ (T, t). We also choose a particular n_0 ∈ N+ such that 0.(Tn) + 2^{−n} < t for all n ≥ n_0. Such n_0 exists since T < t and lim_{n→∞} (0.(Tn) + 2^{−n}) = T. Since Tn is the first n bits of the base-two expansion of T with infinitely many zeros, we then see that T < 0.(Tn) + 2^{−n} < t for all n ≥ n_0. In addition, we choose a particular n_1 ∈ N+ such that (n − c) 2^{−n} ≤ 1 for all n ≥ n_1. For each n ≥ n_1, since |T − 0.(Tn)| < 2^{−n}, we see that |T(n − c) − 0.(Tn)(n − c)| < (n − c) 2^{−n} ≤ 1. Hence, we have

⌊0.(Tn)(n − c)⌋ ≤ T(n − c)  and  T(n − c) − 2 ≤ ⌊0.(Tn)(n − c)⌋   (8)

for every n ≥ n_1. We define n_2 = max{n_0, n_1, c + 1}. Now, given Tn with n ≥ n_2, one can find k_0, m_0 ∈ N+ such that f(m_0) < Z_{k_0}(0.(Tn) + 2^{−n}). This is possible from Θ_V(T) < Θ_V(0.(Tn) + 2^{−n}), lim_{k→∞} Z_k(0.(Tn) + 2^{−n}) = Θ_V(0.(Tn) + 2^{−n}), and the properties of f. It follows from Θ_V(T) ≤ f(m_0) and (7) that

Σ_{i=k_0+1}^{∞} 2^{−|s_i|/T} = Θ_V(T) − Z_{k_0}(T) ≤ f(m_0) − Z_{k_0}(T) < Z_{k_0}(0.(Tn) + 2^{−n}) − Z_{k_0}(T) < 2^{c−n}.

Hence, for every i > k_0, 2^{−|s_i|/T} < 2^{c−n} and therefore T(n − c) < |s_i|. Thus, by calculating the set { s_i | i ≤ k_0 & |s_i| = ⌊0.(Tn)(n − c)⌋ } and picking any one finite binary string of length ⌊0.(Tn)(n − c)⌋ which is not in this set, one can obtain s ∈ {0,1}^{⌊0.(Tn)(n−c)⌋} such that |s| ≤ H_V(s). This is possible since { s_i | i ≤ k_0 & |s_i| = ⌊0.(Tn)(n − c)⌋ } = S_V(⌊0.(Tn)(n − c)⌋) ⊊ {0,1}^{⌊0.(Tn)(n−c)⌋}, where the first equality follows from the first inequality in (8) and the last proper inclusion is due to the first half of Theorem 5. Hence, there exists a partial recursive function Ψ : {0,1}* → {0,1}* such that ⌊0.(Tn)(n − c)⌋ ≤ H(Ψ(Tn)) for all n ≥ n_2. Using (2), there is c_Ψ ∈ N such that H(Ψ(Tn)) ≤ H(Tn) + c_Ψ for all n ≥ n_2. Thus, it follows from the second inequality in (8) that Tn − Tc − 2 − c_Ψ < H(Tn) for all n ≥ n_2, which implies that T is weakly Chaitin T-random.
Theorem 11. For every T ∈ (0, 1), if ΘV (T ) is right-computable, then T is also right-computable. Proof. Since ΘV (T ) is right-computable, there exists a total recursive function f : N+ → Q such that ΘV (T ) ≤ f (m) for all m ∈ N+ , and limm→∞ f (m) = ΘV (T ). Thus, since ΘV (x) is an increasing function of x ∈ (0, 1], we see that, for every x ∈ Q with 0 < x < 1, T < x if and only if there are m, k ∈ N+ such that f (m) < Zk (x). It follows from Theorem 1 (ii) that T is right-computable.
Theorem 12. For every T ∈ (0,1), if Θ_V(T) is left-computable and T is right-computable, then T is T-compressible.

Proof. Using the mean value theorem, we see that

2^{−|s_1|/t} − 2^{−|s_1|/T} > (ln 2) |s_1| 2^{−|s_1|/T} (t − T)

for all t ∈ (T, 1). We choose a particular c ∈ N+ such that (ln 2) |s_1| 2^{−|s_1|/T} ≥ 2^{−c}. Then, it follows that

Z_k(t) − Z_k(T) > 2^{−c} (t − T)   (9)

for all k ∈ N+ and t ∈ (T, 1). Since T is a right-computable real with T < 1 by assumption, there exists a total recursive function f : N+ → Q such that T < f(l) < 1 for all l ∈ N+ and lim_{l→∞} f(l) = T. On the other hand, since Θ_V(T) is left-computable by assumption, there exists a total recursive function g : N+ → Q such that g(m) ≤ Θ_V(T) for all m ∈ N+ and lim_{m→∞} g(m) = Θ_V(T).
By Theorem 6, Θ_V is weakly Chaitin random and therefore Θ_V ∉ Q. Thus, the base-two expansion of Θ_V is unique and contains infinitely many ones, and 0 < Θ_V < 1 in particular. Given n and (Θ_V)⌈Tn⌉ (i.e., the first ⌈Tn⌉ bits of the base-two expansion of Θ_V), one can find k_0 ∈ N+ such that 0.((Θ_V)⌈Tn⌉) < Σ_{i=1}^{k_0} 2^{−|s_i|}. This is possible since 0.((Θ_V)⌈Tn⌉) < Θ_V and lim_{k→∞} Σ_{i=1}^{k} 2^{−|s_i|} = Θ_V. It is then easy to see that Σ_{i=k_0+1}^{∞} 2^{−|s_i|} = Θ_V − Σ_{i=1}^{k_0} 2^{−|s_i|} < 2^{−⌈Tn⌉} ≤ 2^{−Tn}. Using the inequality a^d + b^d ≤ (a + b)^d for any reals a, b > 0 and d ≥ 1, it follows that

Θ_V(T) − Z_{k_0}(T) = Σ_{i=k_0+1}^{∞} 2^{−|s_i|/T} = Σ_{i=k_0+1}^{∞} (2^{−|s_i|})^{1/T} ≤ (Σ_{i=k_0+1}^{∞} 2^{−|s_i|})^{1/T} < 2^{−n}.   (10)

Note that lim_{l→∞} Z_{k_0}(f(l)) = Z_{k_0}(T). Thus, since Z_{k_0}(T) < Θ_V(T), one can then find l_0, m_0 ∈ N+ such that Z_{k_0}(f(l_0)) < g(m_0). It follows from (10) and (9) that 2^{−n} > g(m_0) − Z_{k_0}(T) > Z_{k_0}(f(l_0)) − Z_{k_0}(T) > 2^{−c} (f(l_0) − T). Thus, 0 < f(l_0) − T < 2^{c−n}. Let t_n be the first n bits of the base-two expansion of the rational number f(l_0) with infinitely many zeros. Then |f(l_0) − 0.t_n| < 2^{−n}. It follows from |T − 0.(Tn)| < 2^{−n} that |0.(Tn) − 0.t_n| < (2^c + 2) 2^{−n}. Hence, Tn = t_n, t_n ± 1, t_n ± 2, ..., or t_n ± (2^c + 1), where Tn and t_n are regarded as dyadic integers. Thus, there are only 2^{c+1} + 3 possibilities for Tn, so one needs only c + 2 bits more in order to determine Tn.
Thus, there exists a partial recursive function Φ : N+ × {0,1}* × {0,1}* → {0,1}* such that

∀ n ∈ N+  ∃ s ∈ {0,1}*  [ |s| = c + 2 & Φ(n, (Θ_V)⌈Tn⌉, s) = Tn ].   (11)

Let us consider a prefix-free machine D which satisfies the following two conditions (i) and (ii): (i) For each p, q ∈ dom U and v, s ∈ {0,1}*, pqvs ∈ dom D if and only if |v| = U(q) and |s| = c + 2. (ii) For each p, q ∈ dom U and v, s ∈ {0,1}* such that |v| = U(q) and |s| = c + 2, D(pqvs) = Φ(U(p), v, s). It is easy to see that such a prefix-free machine D exists. For each n ∈ N+, note that n = U(n*) and ⌈Tn⌉ = U(⌈Tn⌉*). Thus, it follows from (11) that there exists s ∈ {0,1}* with |s| = c + 2 such that D(n* ⌈Tn⌉* (Θ_V)⌈Tn⌉ s) = Φ(n, (Θ_V)⌈Tn⌉, s) = Tn. Hence, H_D(Tn) ≤ |n*| + |⌈Tn⌉*| + ⌈Tn⌉ + |s| = H(n) + H(⌈Tn⌉) + ⌈Tn⌉ + c + 2. It follows from (3) that H_D(Tn) ≤ Tn + 2 log₂ n + 2 log₂ log₂ n + O(1) for all n ∈ N+. Using (1), we see that T is T-compressible.

Acknowledgments. This work was supported by KAKENHI, Grant-in-Aid for Scientific Research (C) (20540134), by SCOPE from the Ministry of Internal Affairs and Communications of Japan, and by CREST from the Japan Science and Technology Agency.
References

1. Calude, C.S., Hertling, P.H., Khoussainov, B., Wang, Y.: Recursively enumerable reals and Chaitin Ω numbers. Theoret. Comput. Sci. 255, 125–149 (2001)
2. Calude, C.S., Hay, N.J., Stephan, F.C.: Representation of left-computable ε-random reals. Research Report of CDMTCS 365 (May 2009), http://www.cs.auckland.ac.nz/CDMTCS/researchreports/365cris.pdf
3. Chaitin, G.J.: A theory of program size formally identical to information theory. J. Assoc. Comput. Mach. 22, 329–340 (1975)
4. Chaitin, G.J.: Algorithmic entropy of sets. Computers & Mathematics with Applications 2, 233–245 (1976)
5. Chaitin, G.J.: Algorithmic Information Theory. Cambridge University Press, Cambridge (1987)
6. Kučera, A., Slaman, T.A.: Randomness and recursive enumerability. SIAM J. Comput. 31(1), 199–211 (2001)
7. Solovay, R.M.: Draft of a paper (or series of papers) on Chaitin's work... done for the most part during the period of September–December 1974. IBM Thomas J. Watson Research Center, Yorktown Heights, New York, 215 p. (1975) (unpublished manuscript)
8. Tadaki, K.: A generalization of Chaitin's halting probability Ω and halting self-similar sets. Hokkaido Math. J. 31, 219–253 (2002)
9. Tadaki, K.: A statistical mechanical interpretation of algorithmic information theory. In: Local Proceedings of Computability in Europe 2008 (CiE 2008), June 15–20, pp. 425–434. University of Athens, Greece (2008), http://www.cs.swan.ac.uk/cie08/cie2008-local.pdf
10. Tadaki, K.: Partial randomness and dimension of recursively enumerable reals. In: Královič, R., Niwiński, D. (eds.) MFCS 2009. LNCS, vol. 5734, pp. 687–699. Springer, Heidelberg (2009)
11. Tadaki, K.: Fixed points on partial randomness. In: Proceedings of the 6th Workshop on Fixed Points in Computer Science (FICS 2009), Coimbra, Portugal, September 12–13, pp. 100–107 (2009), http://cs.ioc.ee/fics09/fics09proc.pdf
12. Tadaki, K.: One-wayness and two-wayness in algorithmic randomness. Submitted to the 35th International Symposium on Mathematical Foundations of Computer Science (MFCS 2010), Brno, Czech Republic, August 23–27. LNCS, Springer, Heidelberg (2010)
Quantum Query Algorithms for Conjunctions Alina Vasilieva and Taisia Mischenko-Slatenkova Faculty of Computing, University of Latvia Raina bulv. 29, LV-1459, Riga, Latvia
[email protected],
[email protected]
Abstract. Every Boolean function can be represented as a logical formula in conjunctive normal form. A fast algorithm for conjunction therefore plays a significant role in an overall algorithm for computing an arbitrary Boolean function. First, we present a quantum query algorithm for the conjunction of two bits. Our algorithm uses one quantum query and obtains the correct result with probability p = 4/5, which improves the previous result. Then, we present the main result: a generalization of our approach to the design of efficient quantum algorithms for computing the conjunction of two Boolean functions. Finally, we demonstrate another kind of algorithm for the conjunction of two bits, with a correct-answer probability of p = 9/10. This algorithm improves the success probability by 10%, but it stands apart and cannot be extended to compute the conjunction of Boolean functions. Keywords: Quantum computing, query algorithm, Boolean function, algorithm design.
1
Introduction
Quantum computing is an exciting alternative way of computation, based on the laws of quantum mechanics. Quantum algorithms can solve certain problems faster than classical algorithms; the most famous examples are Shor's [1] and Grover's [2] algorithms. One of the most important open problems in quantum computing is a proper understanding of why quantum algorithms can have any advantage over probabilistic algorithms. This problem has arisen in most unusual contexts [3,4,5,6,7,8]. We concentrate on the case where these advantages are least expected, namely, when the computing device is finite and the computation time is also finite. We discuss quantum versus classical query algorithms. Let f(x1, x2, ..., xN) : {0,1}^N → {0,1} be a Boolean function. We consider the query model, where the definition of the function is known, but a black box contains the input X = (x1, x2, ..., xN) and can be accessed only by querying xi values. The goal is to compute the value of the function. The complexity of a query algorithm is measured by the number of questions it asks. The classical version of this model is known as decision trees [9].
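The black-box query model can be mimicked directly in code: the algorithm sees only an oracle for the bits x_i and is charged one unit per access. A minimal sketch (names ours):

```python
class BlackBox:
    """Hides the input; every access to a bit costs one query."""
    def __init__(self, bits):
        self._bits = bits
        self.queries = 0

    def query(self, i: int) -> int:
        self.queries += 1
        return self._bits[i]

def and_all(box: BlackBox, n: int) -> int:
    """Deterministic conjunction: must look at every bit in the worst case."""
    for i in range(n):
        if box.query(i) == 0:
            return 0
    return 1

box = BlackBox([1, 1, 1, 1])
assert and_all(box, 4) == 1 and box.queries == 4   # all-ones input forces N queries
box = BlackBox([1, 0, 1, 1])
assert and_all(box, 4) == 0 and box.queries == 2   # early exit on the first zero
```

The all-ones input illustrates why D(AND) = N for deterministic algorithms: no query can be skipped without risking a wrong answer.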
This research was supported by Grant No. 09.1570 from the Latvian Council of Science and by Project Nr. 2009/0138/1DP/1.1.2.1.2/09/IPIA/VIAA/004 from the European Social Fund.
C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 140–151, 2010. c Springer-Verlag Berlin Heidelberg 2010
The quantum query model differs from the quantum circuit model [10,11,12,13], and algorithm design techniques for it are less developed. The problem of quantum query algorithm construction is highly non-trivial, and the goal of this research is to find new approaches to algorithm design. Every Boolean function can be represented as a logical formula in conjunctive normal form (CNF). A formula is in CNF if it is a conjunction (AND) of disjunctions (ORs) of variables or negated variables. Therefore, we need a fast and efficient algorithm for conjunction. While there is an exact quantum algorithm for XOR that uses N/2 queries, exact quantum algorithms for disjunction and conjunction require N queries in the N-bit case; this is a proved lower bound [10]. To enlarge the gap between classical and quantum algorithm complexity, it is important to work out as efficient a quantum algorithm for conjunction as possible. Grover's search algorithm [2] could potentially be adjusted to compute the N-bit conjunction using O(√N) queries. However, such an approach is more efficient than the classical one only for sufficiently large N. Usually there is a need to evaluate the conjunction of a rather small number of variables, although the total number of such distinct evaluations may be huge. This reasoning motivated us to search for other approaches to computing conjunctions, which would be preferable when the number of variables is not very large. The two-bit conjunction already has the following results. The approach of [14] gives a bounded-error quantum algorithm for the function AND(x1, x2) with one query and a correct-answer probability p = 2/3. A better one-query algorithm with probability p = 3/4 was obtained in [15]. In this paper, first, we demonstrate a quantum query algorithm for the conjunction of two bits, which asks one query and outputs the correct result with probability p = 4/5. This probability is more than 13% better than the maximal correct-answer probability of the best possible classical analogue, which cannot exceed p = 2/3. Then, we extend our approach and formulate a general method for computing a conjunction of two Boolean functions: F = f1 ∧ f2. Finally, we show another efficient algorithm for the two-bit conjunction that achieves a correct-answer probability p = 9/10 and thus improves our earlier result by 10%. However, we are unable to extend it to computing the conjunction of two functions.
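As an illustration of the classical benchmark p = 2/3 cited above, the following sketch evaluates one concrete randomized one-query strategy for AND(x1, x2) — query a uniformly random variable, answer 0 on seeing 0, and answer 1 with probability 2/3 on seeing 1 — and computes its worst-case success probability (the strategy and its 2/3 tuning are our example; the text only states the 2/3 classical optimum):

```python
from fractions import Fraction

def success_probability(x, q: Fraction) -> Fraction:
    """Probability that the strategy outputs AND(x) on input x = (x1, x2)."""
    target = x[0] & x[1]
    total = Fraction(0)
    for i in (0, 1):                  # probabilistic branch: pick a variable
        p_branch = Fraction(1, 2)
        if x[i] == 0:                 # leaf: always answer 0 (always correct)
            total += p_branch * (1 if target == 0 else 0)
        else:                         # leaf: answer 1 w.p. q, answer 0 w.p. 1-q
            p_correct = q if target == 1 else 1 - q
            total += p_branch * p_correct
    return total

q = Fraction(2, 3)
worst = min(success_probability(x, q)
            for x in [(0, 0), (0, 1), (1, 0), (1, 1)])
assert worst == Fraction(2, 3)        # matches the classical one-query optimum
```

Tuning q balances the error on input (1,1) against the error on inputs with exactly one zero, which is why q = 2/3 equalizes the worst cases.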
2
Background and Definitions
In this section, we introduce the reader to the theoretical background and provide all necessary definitions. First, we define the classical query model. Next, we give a brief overview of the basics of quantum computing. Finally, we describe the quantum query model, which is the main subject of this paper.
2.1
Classical Query Model
The classical version of the query model is known as decision trees. This model is intended for computing Boolean functions [9]. The definition of the Boolean function is known in advance, but the input X = (x1, x2, ..., xN) is hidden in
142
A. Vasilieva and T. Mischenko-Slatenkova
a black box and can be accessed only by querying the values xi. The algorithm must determine the value of the function correctly for an arbitrary input. The complexity of the algorithm is measured by the number of queries on the worst-case input. For more details, see the survey by Buhrman and de Wolf [9].

Definition 1. [9] The deterministic complexity of a function f, denoted by D(f), is the maximum number of questions that must be asked on any input by a deterministic algorithm for f.

A probabilistic decision tree may contain internal nodes with probabilistic branching, i.e., multiple arrows exiting from such a node, each labeled with the probability that the algorithm follows that branch. The total probability of obtaining result b ∈ {0, 1} after the execution of an algorithm on a certain input X equals the sum of the probabilities of reaching each leaf labeled with b. The total probability that an algorithm produces the correct result is the probability on the worst-case input.

2.2 Quantum Computing
This section briefly outlines the basic notions of quantum computing that are necessary to define the computational model used in this paper. For more details, see the excellent textbooks by Nielsen and Chuang [11] and Kaye et al. [12].

An n-dimensional quantum pure state is a unit vector in a Hilbert space. Let |0⟩, |1⟩, ..., |n−1⟩ be an orthonormal basis for C^n. Then any state can be expressed as |ψ⟩ = Σ_{i=0}^{n−1} a_i|i⟩ for some a_i ∈ C. Since the norm of |ψ⟩ is 1, we have Σ_{i=0}^{n−1} |a_i|² = 1. The states |0⟩, |1⟩, ..., |n−1⟩ are called basis states. Any state of the form Σ_{i=0}^{n−1} a_i|i⟩ is called a superposition of basis states. The coefficient a_i is called the amplitude of |i⟩.

The state of a system can be changed by applying a unitary transformation. A unitary transformation U is a linear transformation on C^n that maps vectors of unit norm to vectors of unit norm. The transpose of a matrix A is denoted by A^T, with (A^T)_{ij} = A_{ji}.

The simplest case of quantum measurement is used in our query model: the full measurement in the computational basis. Performing this measurement on a state |ψ⟩ = a_0|0⟩ + ... + a_{n−1}|n−1⟩ gives the outcome i with probability |a_i|². The measurement changes the state of the system to |i⟩ and destroys the original state.

2.3 Quantum Query Model
The quantum query model is the quantum counterpart of decision trees. For a detailed description, see the survey by Ambainis [13] and the textbooks by Kaye, Laflamme, Mosca [12] and de Wolf [10]. A quantum computation with T queries is a sequence of unitary transformations: U_0 → Q_0 → U_1 → Q_1 → ... → U_{T−1} → Q_{T−1} → U_T.
Quantum Query Algorithms for Conjunctions
143
The U_i's are arbitrary unitary transformations that do not depend on the input bits. The Q_i's are query transformations. Computation starts in the initial state |0⟩. Then we apply U_0, Q_0, ..., Q_{T−1}, U_T and measure the final state. We use the following definition of the query transformation: if the input is a state |ψ⟩ = Σ_i a_i|i⟩, then the output is

|φ⟩ = Σ_i (−1)^{φ_i} a_i|i⟩, where φ_i ∈ {x_1, ..., x_N, 0, 1}.   (1)
For each query we may arbitrarily choose a variable assignment φ_i for each basis state. If the value of the assigned variable φ_i ∈ {x_1, ..., x_N} is 1, then the sign of the i-th amplitude a_i changes to the opposite. Each quantum basis state corresponds to an output of the algorithm, and we assign a value of the function to each output. The probability of obtaining result j ∈ {0, 1} after executing an algorithm on input X equals the sum of the squared moduli of all amplitudes that correspond to outputs with value j.

Definition 2. [9] A quantum query algorithm computes f exactly if the output equals f(x) with probability p = 1 for all x ∈ {0, 1}^N.

Definition 3. [9] A quantum query algorithm computes f with bounded error if the output equals f(x) with probability p > 1/2 for all x ∈ {0, 1}^N. The complexity is denoted by Q_p(f).

Quantum query algorithms can be conveniently represented by diagrams, and we will present our algorithms this way throughout this paper.
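The sign-flip action of a query can be written as a small diagonal matrix. The sketch below is our own illustration (not code from the paper), and the particular assignment list is hypothetical:

```python
import numpy as np

# A query transformation (Eq. (1)) is diagonal: it flips the sign of the
# amplitude of basis state |i> whenever the variable assigned to that
# state evaluates to 1.  'None' marks states assigned the constant 0.
def query_matrix(assignment, x):
    """assignment[i]: index into x, or None for the constant 0."""
    signs = [(-1) ** (0 if v is None else x[v]) for v in assignment]
    return np.diag(signs)

x = (1, 0)                               # x1 = 1, x2 = 0
Q = query_matrix([None, 0, None, 1], x)  # query x1 on state 2, x2 on state 4
print(np.diag(Q))                        # -> [ 1 -1  1  1]
```

Applied to a state vector, such a matrix leaves the squared moduli, and hence the measurement probabilities, unchanged; only relative phases are affected.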
3 Quantum Query Algorithms for Conjunctions
In this section, we present our results on designing quantum query algorithms for a set of Boolean functions based on the AND operation. We start with the classical complexity of the two-argument Boolean function AND2(x1, x2) = x1 ∧ x2. Then, we demonstrate a bounded-error quantum query algorithm that computes AND2(x1, x2) with one query and probability p = 4/5. Next, we generalize our approach and present a general method for designing efficient quantum query algorithms for computing a composite function AND2[f1, f2], where f1 and f2 are Boolean functions. Finally, we show another quantum query algorithm for the two-variable conjunction that achieves a correct answer probability p = 9/10. To our regret, we did not succeed in extending this algorithm to compute AND2[f1, f2].

Definition 4. The ANDn[f1, ..., fn] construction (n ∈ N) is a composite Boolean function whose arguments are arbitrary Boolean functions fi, defined by

ANDn[f1, f2, ..., fn](X) = 1 ⟺ Σ_{i=1}^{n} f_i(X_i) = n,

where X = X1X2...Xn, Xi is the input of the i-th function, and the fi's are called base functions.
3.1 Classical Complexity of AND2(x1, x2)
The classical deterministic complexity of the Boolean function AND2(x1, x2) is obviously equal to the number of variables: D(AND2) = 2. Next, we show that the best probability with which a classical randomized decision tree can compute this function with one query is p = 2/3. The general form of the optimal randomized decision tree is shown in Fig. 1.
Fig. 1. General form of the optimal randomized decision tree for computing the Boolean function AND2(x1, x2)
We denote the probability of seeing result b ∈ {0, 1} by Pr("b"|X). The correct answer probabilities are calculated as follows:

1. Pr("0"|X = 00) = (1 − s) + (1/2)s + (1/2)s = 1
2. Pr("0"|X = 01 or X = 10) = (1 − s) + (1/2)s + (1/2)sq = 1 − (1/2)(s − sq)
3. Pr("1"|X = 11) = (1/2)s(1 − q) + (1/2)s(1 − q) = s − sq

We denote z = s − sq. Then the total probability of the correct answer is

p = min(p("0"), p("1")) = min(1 − z/2, z).   (2)
The best probability is obtained when p("0") = p("1"):

1 − z/2 = z  ⟹  z = 2/3.   (3)
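The maximisation in (2)-(3) can also be checked numerically; the following is our own illustration, scanning the combined parameter z = s − sq over its range:

```python
import numpy as np

# Worst-case success probability of the one-query randomized tree is
# min(1 - z/2, z) with z = s(1 - q); the maximum over z in [0, 1] is 2/3.
z = np.linspace(0.0, 1.0, 100001)
p = np.minimum(1.0 - z / 2.0, z)
print(z[p.argmax()], p.max())   # both approach 2/3
```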
Corollary 1. The maximal correct answer probability with which a randomized classical decision tree can compute the Boolean function AND2(x1, x2) with one query is p = 2/3.

3.2 First Quantum Query Algorithm for AND2(x1, x2)
In this section, we present the first version of a bounded-error quantum query algorithm for the simplest case, the two-variable function AND2(x1, x2).
Theorem 1. There exists a quantum query algorithm Q1 that computes the Boolean function AND2(x1, x2) with one quantum query and correct answer probability p = 4/5: Q4/5(Q1) = 1.

Proof. The algorithm is presented in Fig. 2.
Fig. 2. Bounded-error quantum query algorithm Q1 for computing AND2(x1, x2)
Our algorithm uses a 3-qubit quantum system. Each horizontal line corresponds to the amplitude of a basis state. Computation starts with the amplitude distribution |ϕ⟩ = (2/√5, 0, 0, 0, 1/√5, 0, 0, 0)^T. The two large rectangles correspond to the 8×8 unitary matrices U0 and U1. A vertical layer of circles specifies the queried variable order for the single query Q0: during this query the sign of the second amplitude changes to the opposite if x1 = 1, and the sign of the fourth amplitude changes to the opposite if x2 = 1. Finally, the eight small squares at the end of each horizontal line define the function value assigned to each basis state. An important point is that the amplitude value a = 1/√5 is assigned to the basis state |100⟩ and does not change until the end of the execution; the overall correct answer probability goes up due to this special feature. After the transformation U0 the quantum state becomes

|ϕ2⟩ = U0 · |ϕ⟩ = U0 · (2/√5, 0, 0, 0, 1/√5, 0, 0, 0)^T = (1/√5, 1/√5, 1/√5, 1/√5, 1/√5, 0, 0, 0)^T.

The further evolution of the quantum system for each input X is shown in Table 1.
Table 1. Computation process of the quantum algorithm Q1 for computing the Boolean function AND2(x1, x2)

X  | |ϕ3⟩ = Q0 U0 |ϕ⟩                           | |ϕ_FINAL⟩ = U1 Q0 U0 |ϕ⟩                  | p(AND2 = 1)
00 | (1/√5, 1/√5, 1/√5, 1/√5, 1/√5, 0, 0, 0)^T  | (√(2/5), 0, √(2/5), 0, 1/√5, 0, 0, 0)^T   | 0
01 | (1/√5, 1/√5, 1/√5, −1/√5, 1/√5, 0, 0, 0)^T | (√(2/5), 1/√5, 0, −1/√5, 1/√5, 0, 0, 0)^T | 1/5
10 | (1/√5, −1/√5, 1/√5, 1/√5, 1/√5, 0, 0, 0)^T | (0, 1/√5, √(2/5), 1/√5, 1/√5, 0, 0, 0)^T  | 1/5
11 | (1/√5, −1/√5, 1/√5, −1/√5, 1/√5, 0, 0, 0)^T| (0, 2/√5, 0, 0, 1/√5, 0, 0, 0)^T          | 4/5
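Table 1 can be reproduced by direct matrix arithmetic. The NumPy sketch below is our reconstruction of Q1; the 4×4 blocks follow the factorisation of U0 and U1 described in the next subsection, so treat the exact matrices as an assumption rather than the paper's verbatim figures:

```python
import numpy as np

H2 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def embed(block4):
    """Place a 4x4 block in the top-left corner of an 8x8 identity."""
    U = np.eye(8)
    U[:4, :4] = block4
    return U

# U0A spreads the initial amplitude over states 1 and 3;
# U0B and U1A apply Hadamards to the pairs (1,2) and (3,4);
# U1B correlates the accepting amplitudes 2 and 4.
U0A = embed(np.array([[1, 0, 1, 0],
                      [0, np.sqrt(2), 0, 0],
                      [1, 0, -1, 0],
                      [0, 0, 0, np.sqrt(2)]]) / np.sqrt(2))
U0B = U1A = embed(np.kron(np.eye(2), H2))
U1B = embed(np.array([[np.sqrt(2), 0, 0, 0],
                      [0, 1, 0, 1],
                      [0, 0, np.sqrt(2), 0],
                      [0, 1, 0, -1]]) / np.sqrt(2))

def Q1(x1, x2):
    """Probability of the accepting basis state |001> after one query."""
    psi = np.zeros(8)
    psi[0], psi[4] = 2 / np.sqrt(5), 1 / np.sqrt(5)   # initial state
    # Query Q0: flip amplitude 2 if x1 = 1 and amplitude 4 if x2 = 1.
    Q0 = np.diag([1.0, (-1.0) ** x1, 1.0, (-1.0) ** x2, 1, 1, 1, 1])
    psi = U1B @ U1A @ Q0 @ U0B @ U0A @ psi
    return psi[1] ** 2

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, round(Q1(*x), 3))   # -> 0.0, 0.2, 0.2, 0.8
```

Only the input 11 drives the accepting amplitude up to 2/√5, which reproduces the success probabilities 0, 1/5, 1/5 and 4/5 of Table 1.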
3.3 Internal Details of the Algorithm Q1
This section is a transitional point to the generalized method for computing the composite construction AND2[f1, f2]. We now reveal the internal details of the algorithm Q1 that allow us to adapt its structure to compute a much wider set of Boolean functions. The seemingly chaotic and asymmetric matrix U0 is actually a product of two other matrices, U0 = U0B · U0A, each acting as the identity on the last four amplitudes, with upper-left 4×4 blocks

U0B = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0 \\ 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix}, \quad
U0A = \begin{pmatrix} \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} & 0 \\ 0 & 1 & 0 & 0 \\ \frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.

Similarly, the matrix U1 is a product U1 = U1B · U1A, with upper-left 4×4 blocks

U1B = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ 0 & 0 & 1 & 0 \\ 0 & \frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \end{pmatrix}, \quad
U1A = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0 \\ 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix}.
The detailed algorithm flow now looks as follows:

|ϕ⟩ → U0A → U0B → Q0 → U1A → U1B → [Measure]

Now the most important point: the part of the algorithm represented by the transformations U0B → Q0 → U1A actually executes two instances of the exact quantum algorithm for f(x) = x in parallel. Figures 3 and 4 graphically demonstrate this significant detail. In other words, quantum parallelism is first employed to evaluate each variable. Then, the unitary transformation U1B is applied to correlate the amplitude
Fig. 3. Exact quantum query algorithm for computing f(x) = x
Fig. 4. Quantum query algorithm Q1 for computing AND2(x1, x2), revised
distribution in such a way that the resulting quantum algorithm computes AND2(x1, x2) with an acceptable error probability. In the next section we generalize this approach to allow using other Boolean functions as subroutines.

3.4 General Method for Computing AND2[f1, f2]
It is possible to replace the sub-algorithm for f(x) = x by any other quantum query algorithm that satisfies specific properties. We define a set of algorithm classes, where each class is denoted QQA^p.

Definition 5. A quantum query algorithm belongs to the class QQA^p (−1 ≤ p ≤ +1) iff there is exactly one accepting basis state and, on any input, only two values are possible for its amplitude a ∈ C before the final measurement: either a = 0 or a = p.

Our method for computing AND2[f1, f2] is applicable to base algorithms that belong to the class QQA^{+1}.

Theorem 2. If there exist exact quantum query algorithms A1 and A2 for computing Boolean functions f1(X1) and f2(X2) that belong to the class QQA^{+1}, then the composite Boolean function AND2[f1, f2] can be computed with probability p = 4/5 using max(QE(A1), QE(A2)) queries to the black box.

Proof. The general method for designing an algorithm that computes the Boolean function AND2[f1, f2] is described below.
The general method for computing AND2[f1, f2].
Input. Two exact quantum query algorithms A1, A2 ∈ QQA^{+1} that compute Boolean functions f1(X1), f2(X2). We denote the dimension of the Hilbert space utilized by the first algorithm (the number of amplitudes) by m1, and that of the second algorithm by m2. We denote the positions of the accepting outputs of A1 and A2 by acc1 and acc2.
Constructing steps.
1. If m1 = m2, utilize a quantum system with 4m1 amplitudes for the new algorithm. The first 2m1 amplitudes will be used for the parallel execution of A1 and A2. An additional qubit is required to provide a separate amplitude for storing the value 1/√5.
2. If m1 ≠ m2 (without loss of generality assume m1 > m2), utilize a quantum system with 2m1 amplitudes. The first (m1 + m2) amplitudes will be used for the parallel execution of A1 and A2. The first remaining free amplitude will be used for storing the value 1/√5.
3. Combine the unitary transformations and queries of A1 and A2 as follows:

U_i = \begin{pmatrix} U_i^1 & O_{m_1 \times m_2} & O_{m_1 \times k} \\ O_{m_2 \times m_1} & U_i^2 & O_{m_2 \times k} \\ O_{k \times m_1} & O_{k \times m_2} & I_k \end{pmatrix},

where k = 2m1 if m1 = m2 and k = m1 − m2 otherwise; O_{mi×mj} are mi × mj zero matrices, I_k is the k × k identity matrix, and U_i^1, U_i^2 are either unitary transformations or query transformations of A1 and A2.
4. Start the computation from the state

|ϕ⟩ = (√(2/5), 0, ..., 0, √(2/5), 0, ..., 0, 1/√5, 0, ..., 0)^T,

where the first √(2/5) heads the m1 amplitudes of A1, the second heads the m2 amplitudes of A2, and 1/√5 occupies the first of the remaining amplitudes.
5. Before the final measurement apply an additional unitary gate U = (u_ij) with

u_ij = 1, if i = j, i ≠ acc1 and i ≠ (m1 + acc2);
u_ij = 1/√2, if i = j = acc1;
u_ij = 1/√2, if (i = acc1 and j = m1 + acc2) or (i = m1 + acc2 and j = acc1);
u_ij = −1/√2, if i = j = (m1 + acc2);
u_ij = 0, otherwise.

6. Define exactly one basis state, |acc1⟩, as the accepting output.

Output. A bounded-error quantum query algorithm A for computing the Boolean function F(X) = f1(X1) ∧ f2(X2) with probability p = 4/5 and complexity Q4/5(A) = max(QE(A1), QE(A2)).

The most significant property of our method is the fact that the overall algorithm complexity does not exceed the largest complexity of the sub-algorithms; additional queries are not required to compute the composite function. However, the error probability is the cost of this efficient computation.
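As a concrete check of the construction, the sketch below (our own illustration) applies steps 1-6 with both base algorithms taken to be the one-query exact algorithm for f(x) = x from Fig. 3, so m1 = m2 = 2 and the second basis state of each block accepts:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
m1 = m2 = 2
dim = 4 * m1             # step 1: one extra qubit for the 1/sqrt(5) amplitude

def combine(U1, U2):
    """Step 3: block-diagonal embedding of the two sub-unitaries."""
    U = np.eye(dim)
    U[:m1, :m1] = U1
    U[m1:m1 + m2, m1:m1 + m2] = U2
    return U

def and2(x1, x2):
    psi = np.zeros(dim)                         # step 4: initial state
    psi[0], psi[m1], psi[m1 + m2] = np.sqrt(2 / 5), np.sqrt(2 / 5), 1 / np.sqrt(5)
    q1, q2 = np.diag([1, (-1) ** x1]), np.diag([1, (-1) ** x2])
    psi = combine(H, H) @ combine(q1, q2) @ combine(H, H) @ psi
    acc1 = 1                                    # 0-based accepting positions
    i, j = acc1, m1 + acc1
    U = np.eye(dim)                             # step 5: correlating gate
    U[i, i] = U[i, j] = U[j, i] = 1 / np.sqrt(2)
    U[j, j] = -1 / np.sqrt(2)
    psi = U @ psi
    return psi[acc1] ** 2                       # step 6: accept on |acc1>

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, round(and2(*x), 3))   # -> 0.0, 0.2, 0.2, 0.8
```

Each accepted sub-function contributes √(2/5) to the correlating gate, so both accepting gives (2·√(2/5))/√2 = 2/√5 and hence p = 4/5, with no extra queries beyond the parallel ones.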
3.5 Iterative Application of the Method for Computing AND2[f1, f2]
The algorithm design method for the AND2[f1, f2] construction has useful properties that allow it to be applied repeatedly.

Theorem 3. Let F1 = AND2[f11, f12] and F2 = AND2[f21, f22] be composite Boolean functions. Let Q1 and Q2 be bounded-error quantum query algorithms that have been constructed using the method for computing AND2[f1, f2] and that
compute F1 and F2 with probability p = 4/5. Then a bounded-error quantum query algorithm Q can be constructed that computes the composite Boolean function F = AND2[F1, F2] with probability p = 16/25.

Proof. We straightforwardly apply our method for computing AND2[f1, f2] to the algorithms Q1 and Q2 in place of instances of the class QQA^{+1}. As a result, the obtained algorithm computes F = AND2[F1, F2] with probability p = 4/5 · 4/5 = 16/25.
The next iteration produces quantum algorithms that compute conjunctions of the form F = f1 ∧ f2 ∧ f3 ∧ f4 ∧ f5 ∧ f6 ∧ f7 ∧ f8,
(4)
with a probability p = 64/125, which is just slightly more than one half.
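A quick arithmetic check of this probability decay under iteration (our own illustration):

```python
from fractions import Fraction

# After d nested applications of the method, 2**d base functions are
# conjoined with success probability (4/5)**d; the result stays
# bounded-error (p > 1/2) only up to d = 3, i.e. eight clauses.
p = Fraction(4, 5)
for d in range(1, 5):
    print(2 ** d, p ** d, p ** d > Fraction(1, 2))
```

The last line prints "16 256/625 False": a fourth iteration would drop below the bounded-error threshold, which is why improving the iterated probability is listed as future work in the conclusion.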
3.6 Better Quantum Query Algorithm for AND2(x1, x2)
In this section, we demonstrate another bounded-error quantum query algorithm for computing AND2(x1, x2) that achieves a better correct answer probability, p = 9/10. This improves the algorithm's precision by 10% in comparison with our first algorithm. Unfortunately, this algorithm cannot be straightforwardly substituted into the method for constructing an algorithm for the AND2[f1, f2] construction.

Theorem 4. There exists a quantum query algorithm Q2 that computes the Boolean function AND2(x1, x2) with one quantum query and correct answer probability p = 9/10: Q9/10(Q2) = 1.

Proof. The algorithm is presented in Fig. 5. The computation process for each input X is shown in Table 2.
The structure of the algorithm Q2 differs from that of the first algorithm Q1. The fundamental difference is in the behavior of the query, and to our regret this difference does not allow extending the algorithm to compute the AND of sub-functions.
Fig. 5. Bounded-error quantum query algorithm Q2 for computing AND2(x1, x2)
Table 2. Computation process of the quantum algorithm Q2 for computing the Boolean function AND2(x1, x2)

X  | |ϕ2⟩ = Q0 U0 |00⟩          | |ϕ_FINAL⟩ = U2 U1 Q0 U0 |00⟩   | p(AND2 = 1)
00 | (1/2, 1/2, 1/2, −1/2)^T    | (1/2, 1/√10, √(2/5), −1/2)^T   | 1/10
01 | (1/2, 1/2, −1/2, −1/2)^T   | (0, −1/√10, 3/√10, 0)^T        | 1/10
10 | (−1/2, −1/2, 1/2, −1/2)^T  | (1/2, −1/√10, −√(2/5), −1/2)^T | 1/10
11 | (−1/2, −1/2, −1/2, −1/2)^T | (0, −3/√10, −1/√10, 0)^T       | 9/10
On the other hand, the correct answer probability of the algorithm Q2 is better than that of Q1. It is closer to p = 1, so in some applications it may be more advantageous to use this algorithm instead of the previous one. Algorithm Q2 also uses fewer qubits than algorithm Q1, which may be considered an advantage as well.
4 Conclusion
This paper is devoted to computing Boolean functions represented by logical formulas in conjunctive normal form. The computation of conjunctions is considered in the quantum bounded-error setting. We presented a quantum query algorithm that computes the conjunction of two bits by asking only one query, with correct answer probability p = 4/5. Then we extended our approach and formulated a general method for computing the conjunction of two Boolean functions with the same probability and a number of queries equal to max(QE(f1), QE(f2)). The suggested approach allows designing quantum algorithms for complex functions based on already known algorithms. Significantly, the overall algorithm complexity does not increase: additional queries are not required to compute the composite function. We also presented an even more precise quantum query algorithm of the same efficiency for the two-bit conjunction, which asks one query and outputs the correct result with probability p = 9/10. This improved the precision by 10% compared to our earlier result, although we did not manage to extend it to computing the conjunction of two functions. The proposed quantum algorithms are more efficient than the best possible classical deterministic or exact quantum algorithms, and provide higher accuracy than the best possible classical randomized decision trees.

We see many possibilities for future research in the area of quantum query algorithm design. Regarding the computation of conjunctions, we would like to extend the number of clauses from two to N. We would like to improve the probability of the iterative application of the construction method. Another fundamental goal is to develop a framework for building efficient ad-hoc quantum query algorithms for arbitrary Boolean functions.

Acknowledgments. This research was supported by Grant No. 09.1570 from the Latvian Council of Science and by Project Nr. 2009/0138/1DP/1.1.2.1.2/09/IPIA/VIAA/004 from the European Social Fund.
References
1. Shor, P.W.: Polynomial time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Journal on Computing 26(5), 1484–1509 (1997)
2. Grover, L.: A fast quantum mechanical algorithm for database search. In: Proceedings of 28th STOC, pp. 212–219 (1996)
3. Calude, C.S.: Information and Randomness: An Algorithmic Perspective, 2nd edn. Springer, Heidelberg (2002)
4. Calude, C.S., Pavlov, B.: Coins, quantum measurements and Turing's barrier. Quantum Information Processing 1(1-2), 107–127 (2002)
5. Calude, C.S., Stay, M.A.: Natural halting probabilities, partial randomness and zeta functions. Information and Computation 204(11), 1718–1739 (2006)
6. Freivalds, R.: Complexity of Probabilistic Versus Deterministic Automata. In: Barzdins, J., Bjorner, D. (eds.) Baltic Computer Science. LNCS, vol. 502, pp. 565–613. Springer, Heidelberg (1991)
7. Ambainis, A., Freivalds, R.: 1-Way Quantum Finite Automata: Strengths, Weaknesses and Generalizations. In: FOCS, pp. 332–341 (1998)
8. Morita, K.: Reversible computing and cellular automata: A survey. Theoretical Computer Science 395(1), 101–131 (2008)
9. Buhrman, H., de Wolf, R.: Complexity Measures and Decision Tree Complexity: A Survey. Theoretical Computer Science 288(1), 21–43 (2002)
10. de Wolf, R.: Quantum Computing and Communication Complexity. University of Amsterdam (2001)
11. Nielsen, M., Chuang, I.: Quantum Computation and Quantum Information. Cambridge University Press, Cambridge (2000)
12. Kaye, P., Laflamme, R., Mosca, M.: An Introduction to Quantum Computing. Oxford University Press, Oxford (2007)
13. Ambainis, A.: Quantum query algorithms and lower bounds (survey article). In: Proceedings of FOTFS III, Trends in Logic, vol. 23, pp. 15–32 (2004)
14. Lace, L.: Doctoral Thesis. University of Latvia (2008)
15. Vasilieva, A.: Quantum Query Algorithms for AND and OR Boolean Functions. In: Logic and Theory of Algorithms, Proceedings of the Fourth Conference on Computability in Europe, pp. 453–462 (2008)
Universal Continuous Variable Quantum Computation in the Micromaser

Rob C. Wagner¹, Mark S. Everitt², Viv M. Kendon¹, and Martin L. Jones¹

¹ School of Physics and Astronomy, University of Leeds, Leeds, UK, LS2 9JT
² National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
Abstract. We present universal continuous variable quantum computation (CVQC) in the micromaser. With a brief history as motivation we present the background theory and define universal CVQC. We then show how to generate a set of operations in the micromaser which can be used to achieve universal CVQC. It then follows that the micromaser is a potential architecture for CVQC but our proof is easily adaptable to other potential physical systems.
1 Introduction
Analogue computation has a long-running history, from the invention of the Astrolabe [1] for plotting the heavens in around 200 BC, through the slide rule and mechanical differential analyser, to more modern electronic devices. Analogue computation is less well developed than its digital counterpart, but offers many opportunities both in the theoretical advancement and the physical realisation of computers [2,3,4]. Quantum mechanics is a more recent invention, conceived and developed since the early 20th century. Much of the original research in quantum mechanics used continuous variable systems, such as operations on the positions and momenta of particles. It would thus appear that quantum mechanics offers a breeding ground for new theories of continuous variable (CV) computation. However, perhaps inspired by the prevalence of classical digital computation, most of the research into quantum computation is aimed at discrete variables, in the form of qubits [5,6,7]. Lloyd and Braunstein [8] laid the groundwork for continuous variable quantum computation (CVQC) in 1999. Almost immediately research went into implementing CV algorithms, such as analogues of Grover's, Deutsch & Jozsa's and Shor's algorithms [9,10,11], and into investigating the general structure of computation and simulation with CVs. Some of this work looked at implementing discrete computation embedded in CV systems [12,13] and other work looked at using the physics of CV systems directly to implement CV computing [14,15]. There are two major schemes for CVQC: encoding the information in infinitely squeezed states, such as the position eigenstates, or encoding the information in Gaussian states, such as the quantum coherent states of light. Previous work
C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 152–163, 2010. © Springer-Verlag Berlin Heidelberg 2010
favours the first approach but we argue that the second is more appropriate for a realistic view and implementation of CVQC. In this paper we will show how to achieve universal CVQC in a particular experiment, the micromaser. Computations and universality are described in Section 2 with recipes for universality in two different encodings of variables. The micromaser itself is described in Section 3 along with the background physics. Our results are laid out in Section 4, which is how to achieve universal CVQC in the micromaser and in Section 5 we give our plans for further work.
2 Universal CVQC
The notion of universality is important in all branches of computation theory. We call our system a universal computer for our purposes if it can perform any computation for which we wish to use it. In discrete variable quantum computation (DVQC), universal computation means being able to achieve any unitary operation on the states encoding the variables. Since realising an arbitrary unitary operation on a continuous variable would require an infinite number of parameters, for continuous variable quantum computation (CVQC) we restrict ourselves to exponentials of Hermitian polynomials on the space of continuous variables. This is sensible, as unitary operations are usually considered to be the result of applying a Hamiltonian for a period of time, and Hamiltonians are Hermitian polynomials¹. We encode the CV information in the eigenstates of some continuous-spectrum operator, and computations are embodied as physical manipulations which correspond to operations on the eigenstates. This is the definition of continuous variable quantum computation that we employ.

To encode our variables and to describe the physical modes involved we make use of the quadrature operators x̂ and p̂, which are orthogonal in the sense that [x̂, p̂] = i, up to a real normalisation constant. Any polynomial in x̂ and p̂ can be generated given a certain set of available operators, as stated by Lloyd & Braunstein [8]: "Simple linear operations on continuous variables, together with a nonlinear operation and any interaction suffices to enact to an arbitrary degree of accuracy arbitrary Hermitian polynomials of the set of continuous variables." In terms of operators, the simple linear operations are {±x̂, ±p̂}, a non-linear operation might be the Kerr Hamiltonian Ĥ_Kerr = (x̂² + p̂²)², and an interaction is a coupling of modes, e.g. two-mode squeezing or the sum gate described later in this section.
The way in which polynomials are generated is as follows: given Hamiltonians Â and B̂, for some small time δt,

e^{−iÂδt} e^{−iB̂δt} e^{iÂδt} e^{iB̂δt} = e^{[Â,B̂]δt²} + O(δt³)   (1)

and

e^{iÂδt/2} e^{iB̂δt/2} e^{iB̂δt/2} e^{iÂδt/2} = e^{i(Â+B̂)δt} + O(δt³).   (2)

¹ A polynomial f in the position and momentum operators x̂, p̂ is said to be Hermitian iff it is its own adjoint: f† = f.
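Both product formulas can be sanity-checked numerically with random finite-dimensional Hermitian matrices standing in for the Hamiltonians. This is our own illustration; the commutator identity is checked in the equivalent group-commutator form e^X e^Y e^{−X} e^{−Y} ≈ e^{[X,Y]} with X = −iÂδt, Y = −iB̂δt:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)

def random_hermitian(n):
    m = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (m + m.conj().T) / 2

A, B, dt = random_hermitian(4), random_hermitian(4), 1e-3
X, Y = -1j * A * dt, -1j * B * dt

# Group-commutator form of Eq. (1): the residual shrinks like dt**3.
err1 = np.linalg.norm(expm(X) @ expm(Y) @ expm(-X) @ expm(-Y)
                      - expm(X @ Y - Y @ X))
# Symmetric (Strang) splitting, Eq. (2).
err2 = np.linalg.norm(expm(X / 2) @ expm(Y / 2) @ expm(Y / 2) @ expm(X / 2)
                      - expm(X + Y))
print(err1, err2)   # both of order dt**3
```

Halving δt should shrink both residuals by roughly a factor of eight, which is the practical meaning of the O(δt³) error terms.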
154
R.C. Wagner et al.
So we can generate the Hamiltonians ±i[Â, B̂] and ±Â ± B̂ from Â and B̂ very easily, to arbitrary fidelity. Note that we are converting between the space of unitary operators and the space of Hamiltonians to generate our Hamiltonians, since unitary evolution is how nature evolves. A non-linear operation Ĉ is one of order cubed or higher in x̂ and p̂, meaning that when it is commuted with another operator Ĥ, an operator of higher order in x̂ and p̂ than Ĥ is obtained. Non-linear operations can't be efficiently simulated on a classical discrete variable computer [14]. Recursively, then, polynomials of any order can be generated.

While this gives us universality by our definition of obtaining any Hermitian polynomial in x̂ and p̂, there is some choice left in how to encode the CV quantities into the states. The set of continuous variable states is generally considered to belong to one of two distinct classes. One consists of variables encoded into a set of infinitely squeezed states (eigenstates of a quadrature operator) and the other is the set of Gaussian states. The infinitely squeezed states are a limit case of the Gaussian states, but the two have slightly differing sets of universal operations. The two are not equivalent but lead to the same set of computable functions, since we are looking for any Hermitian polynomial in the operators. We choose to encode our information in finitely squeezed Gaussian states because infinitely squeezed states are somewhat unphysical. To achieve universal computation we need a non-linear operation (order cubed or higher) and the following list of linear operations (in order: displacement, Fourier transform, one-mode squeezing, two-mode squeezing):

X̂(x) ≡ exp(−2ix p̂)   (3)
F̂ ≡ exp(iπ(x̂² + p̂²)/2)   (4)
Ŝ(ζ) ≡ exp((ζ/2) â² − (ζ*/2) â†²)   (5)
Ŝ_{i,j}(ζ) ≡ exp(ζ â_i â_j − ζ* â_i† â_j†),   (6)

where â = (x̂ + ip̂)/√2 and â† = (x̂ − ip̂)/√2 are the ladder operators. Here the first three are single-mode operations and the last is our interaction. We describe these operations in more detail in Section 4. For one-mode squeezing, Equation (5), the variable ζ ∈ C describes the orientation and amount of squeezing (described later), and on the annihilation operator â, Ŝ(ζ) acts as

Ŝ(ζ)† â Ŝ(ζ) = e^{+r} x̂(θ) + i e^{−r} p̂(θ),   (7)
where x̂(θ) = x̂ cos θ + p̂ sin θ and p̂(θ) = −x̂ sin θ + p̂ cos θ are the rotated quadratures. The last operation is the two-mode squeezing operator Ŝ_{i,j}(ζ), which reduces the variance in the relative position and total momentum of the two modes being coupled. One way to easily visualise the effect of these operations on the phase space of the modes is using Husimi Q-function plots. For example, in Figure 1 we present
Fourier transform, Fˆ in Figure 1, gives a rotation of π/2 in phase space. We see that we can easily generalise the Fourier transform to Fˆ (t) ≡ exp it xˆ2 + pˆ2 . Figure 2 shows the effect of one-mode squeezing applied to the ground state. The variance in one axis is increased while the variance in the conjugate axis is decreased to compensate. Two-mode squeezing generates a similar effect, but between the pair-wise positions and momenta of the two modes. However, if we look at the relative positions and momenta instead, we see the correlations between the two modes. Fˆ
ˆ Z(p)
ˆ X(x) +ˆ x
+ˆ p
Fig. 1. A Q-function plot of positive displacements of position and momentum from the ground state to other coherent states. The third Figure demonstrates the Fourier transform for a rotation of π/2.
ˆ iθ ) S(re e−r
θ
Fig. 2. One-mode squeezing applied to the ground state. Shown is how r and θ parameterise the squeezing in phase space.
To achieve general universal CVQC, we need to go beyond the realms of Gaussian states (which are the eigenstates of the class of Hamiltonians only quadratic in xˆ and pˆ). To do this requires a non-Gaussian operation, such as a non-linearity (see the above mentioned Kerr Hamiltonian), a cubic-phase gate [16], or simply measurement - although without a clever scheme this may not be very useful for computation due to the probabilistic nature of the process. Having established a set of universal gates for computation, we briefly describe how to carry out a computation. We can create any eigenstate of an operator which can be generated by our universal set (equations 3-6 plus a non-linear
156
R.C. Wagner et al.
operation). To do this we first initialise to an appropriate coherent state (say the ground state |0) and perform the appropriate sequence of operations which generates the desired unitary operation. We have shown (equations 1 & 2) that we can generate any desired unitary operation from the space of those available to us starting with our basis set but in practice there could be a long sequence of elementary operations to achieve the desired operation. After applying the desired operation we can use homodyne-like measurements to find the output state and hence determine the result of the computation. The feasibility of an individual computation is at the mercy of the comparison of the resources available and the resources required, such as coherence times, number of modes etc.
3
What Is a Micromaser?
To perform CV quantum computations we propose the use of a micromaser-like system. The micromaser [17], or microscopically pumped maser, is essentially a very-high-quality microwave cavity with a rarefied beam of atoms passing through it. Single atoms moving through the cavity one at a time, making a transition from one highly excited Rydberg state to another, keep the microwave field in the cavity pumped. This is an extreme case of a beam maser, which would normally use a dense beam of atoms to pump a lossier cavity. The micromaser has been historically important as a test of cavity QED and as a physical realisation of the Jaynes-Cummings model [18], which predicts the behaviour of a two-level atom in a single-mode field without semi-classical approximations. This model is important as the foundation of our understanding of masers and lasers. A simplified schematic can be found in Figure 3. By virtue of the very high quality of a micromaser cavity, the best examples of which may retain a photon for 0.3 s [19], the linewidth of the cavity mode is very small. For a cavity lifetime of 0.3 s the linewidth is less than a hertz at a mode frequency of 21.456 GHz. The field in a micromaser cavity is well suited to CVQC due to this long lifetime and well-defined mode frequency. Quantum coherence of the cavity field is also maintained by the high-quality cavity. We propose the use
Fig. 3. The ‘phase sensitive’ micromaser. Atoms are emitted by the source in the excited state and may be rotated coherently between the ground and excited states (both of which are actually highly excited Rydberg states of the atom) with a lossy microwave field, labelled as rotation here. The cavity in the centre may then interact with the atom, and a final rotation on the atom allows a measurement basis to be selected for measurement of the atomic state using state selective field ionisation.
Universal Continuous Variable Quantum Computation in the Micromaser
157
of the quadratures of the cavity field as continuous variables for CVQC. Multi-variable computations require extensions of the micromaser to multi-mode fields, which in turn require modified cavity designs and carefully chosen energy levels of the atom. The Rydberg atoms commonly used in micromaser experiments provide a multitude of possible transitions to couple to many modes. 3.1
The Jaynes-Cummings Model
The Jaynes-Cummings model describes the simplest non-trivial interaction between atoms and light, consisting of a single atom with two states, labelled |e⟩ for the upper and |g⟩ for the lower², interacting with a single-mode field. The Hamiltonian of this interaction is [20]

    Ĥ = (ωₐ/2) σ̂₃ + ω â†â − ig (σ̂⁺â − σ̂⁻â†),    (8)

with the three terms describing the atom, the field, and the atom-field interaction, respectively,
where ωₐ is the transition energy of the atom from |g⟩ to |e⟩, ω is the frequency of the mode, g is the atom-field coupling constant, â† (â) is the creation (annihilation) operator of the field mode, and σ̂⁺ (σ̂⁻) is the atomic raising (lowering) operator. This model can be generalised easily to include more field modes or atomic levels. 3.2
Some Experimental Considerations
The source of atoms used in a micromaser is typically an effusive oven. The atoms emitted will thus have Poisson-distributed arrival statistics and a velocity distribution determined by the oven temperature. The velocity distribution can be refined by the use of a Fizeau wheel, or by detuning the first-step excitation laser and placing it at an angle to the atomic beam [21], to select the velocity which Doppler-shifts the laser into resonance. The Poisson statistics of the atomic beam mean that there is some chance of undesirable multiple-atom events in the cavity. The strategy used is to rarefy the beam such that the chance of two-or-more-atom events is lowered. The probability of a (desirable) one-atom event is given by P₁ = e^(−2rL/v) [21], where r is the rate of atomic arrivals at the cavity, L is the length of the cavity and v is the velocity of the atoms. Given that a typical micromaser cavity is approximately 3 cm long and that the selected velocity of the beam will be approximately 300 m s⁻¹, an average rate of 10 atoms per second will result in a 99.8% probability that an event is a one-atom event. Operations that require many atoms may have to sacrifice some precision due to the greater atomic rates necessary to complete a computation within the decoherence time of the cavity field. For experiments run by the micromaser

² These stand for 'excited' and 'ground' respectively, although in a laboratory two highly excited Rydberg states of rubidium are chosen such that the difference in energy between the states is close to resonance with the microwave cavity field.
group in Garching, cavities with quality factors of up to Q = 4 × 10¹⁰ were used, corresponding to a field lifetime of 0.3 s. Some of the most recent advances in micromaser physics have involved the generation of Fock (number) states [22]. Using a micromaser field in a Fock state it is possible to produce a single atom in a particular state on demand [19]. If atoms can be supplied on demand then the rate can be greatly increased, as multiple-atom events are effectively eliminated, and many more computational operations can be performed. Another intriguing possibility for a single-atom source is the use of a standing-wave dipole trap, which can be used to accelerate single atoms deterministically [23].
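The numbers quoted in this section are straightforward to check: the linewidth follows from Δν = 1/(2πτ), the quality factor from Q = 2πντ, and the one-atom-event probability from P₁ = e^(−2rL/v). A quick sanity check (our own, not part of the original paper):

```python
import math

# Cavity linewidth and quality factor for a 0.3 s photon lifetime at 21.456 GHz.
tau = 0.3            # cavity field lifetime in seconds
nu = 21.456e9        # mode frequency in Hz
linewidth = 1 / (2 * math.pi * tau)   # FWHM in Hz
Q = 2 * math.pi * nu * tau

print(round(linewidth, 2))   # ~0.53 Hz: below one hertz, as claimed
print(f"{Q:.1e}")            # ~4.0e+10, matching the Garching cavities

# Probability that an event is a one-atom event, P1 = exp(-2 r L / v).
r = 10.0     # atomic arrival rate, atoms per second
L = 0.03     # cavity length, metres
v = 300.0    # selected atomic velocity, m/s
P1 = math.exp(-2 * r * L / v)
print(round(P1, 3))          # 0.998, i.e. 99.8% one-atom events
```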
4
Operations in the Micromaser
Having described the two main schemes for encoding CVs in quantum systems for computation in Section 2, we need to choose a scheme for use in the micromaser. To decide this, it is necessary to look first at which states we can initialise and what operations we can perform on them. Coherent states³ [20], number (Fock) states (including trapping states) [22], steady states [24], and tangent & cotangent states [25] may all be produced in a micromaser cavity field. Most of these are unsuitable for UCVQC, but we see that we can create coherent states. We now show how we can achieve UCVQC on the Gaussian states, going through the list of required operations. 4.1
Displacement Operations
The single-quadrature displacement functions

    X̂(x) ≡ e^(−2ix p̂),    Ẑ(p) ≡ e^(2ip x̂)    (9)

are of the form of the generalised displacement operator,

    D̂(α ≡ x + ip) ≡ exp(2ip x̂ − 2ix p̂).    (10)
Explicitly, X̂(x) = D̂(x) and Ẑ(p) = D̂(ip). These operations are depicted in Figure 1. Displacement is remarkably simple to achieve in the micromaser cavity field [26,27]. By applying the appropriate external coherent field we can displace the state in the cavity by D̂(α); thus by choosing an appropriate α, we can achieve X̂(x) and Ẑ(p) in the micromaser. The experiments of Lange and Walther [27] demonstrated control over the average photon number over a large domain, from the sub-photon level for states up to α = 100. Via feedback we can make these as stable as we require [28]. With modern microwave synthesisers much better performance is expected; however, without knowledge of the device used it is difficult to estimate this improvement over the performance in the original experiment. In an actual experiment the power required for specific shifts should

³ We use the standard definition of a coherent state, |α⟩ = e^(−|α|²/2) Σₙ (αⁿ/√n!) |n⟩.
be determined experimentally as each cavity will couple at a different strength to an external field due to differences in machining. 4.2
The Fourier Transform
The Fourier transform F̂ ≡ e^(iπ(x̂²+p̂²)/2) = e^(iπ(N̂+1)/2) is simply a π/2 rotation in phase space about the origin. For example, on a ground state displaced in the positive-x̂ direction, we obtain a ground state displaced in the positive-p̂ direction, as shown in Figure 1 in Section 2. It has a trivial class of eigenstates, the number states:
    F̂ |n⟩ = e^(iπ(n+1)/2) |n⟩ = i^(n+1) |n⟩.    (11)
This can easily be achieved using a micromaser with the atom detuned from the mode to which we wish to apply the Fourier transform, and far detuned from the other modes so as not to act on those. Given that the linewidth of each mode is so small, this is a trivial requirement. For a mode detuned by Δ from an atomic transition the atom does not make a transition from |e⟩ to |g⟩, but the combined system evolves for a time t as

    |e, n⟩ → e^(−ig²(n+1)t/Δ) |e, n⟩,    (12)
and the atom may be neglected after interaction. The detuning can clearly be chosen to satisfy the Fourier transform in Equation (11). 4.3
One-Mode Squeezing
As there is no direct squeezing operation in the micromaser, we will first describe squeezing in linear optics, since the formalism is the same. In linear optics, one-mode squeezing

    Ŝ(ζ) ≡ exp[(ζ/2) â² − (ζ*/2) â†²]    (13)

is generated via a nonlinear-optical χ⁽²⁾ interaction and yields the attenuation of one quadrature and the amplification of its conjugate. This is called nonlinear in optics, but to us it is a linear operation, since it is only quadratic in the field operators; we will still need a higher-order non-linearity for our universal gate set. Since we are not considering linear optics, an analogue must be found for a micromaser system. The simplest system to consider is the two-mode micromaser. This is best described as a three-level ladder of atomic states for which the difference in energy between the uppermost and lowermost states is twice the frequency of the field, with the middle state detuned from the one-photon transition. As the central state is detuned from resonance with the field, the effective process is two-photon transitions between the upper and lower states of the atom. The atom must be prepared in a particular superposition, and after the interaction the atom must be pulsed with a classical field so that it does not 'give away' information about the cavity field and lead to decoherence.
The particulars of this system were discussed in a paper by Orszag et al. [29], but it is intuitive that Ĥᵢ = κ(â² − â†²) will be the form of the two-photon process. By analysing the cavity's steady state, if it begins in a superposition Σₙ Aₙ|n⟩, then it tends towards a squeezed state if |α/γ| = 1 − ε for a very small ε. This state is highly squeezed in momentum but is not quite of minimum uncertainty. We cannot take an infinite limit, as the amount of squeezing depends on ε and all the squeezing disappears for ε = 0, but we can get an arbitrary amount of squeezing. After many atoms have passed through the cavity, we approach a pure squeezed state with (Δp̂)² ≈ 0. Since it can take a long time to achieve this amount of squeezing, we may wish to prepare a sufficient quantity of squeezed states in advance, so that we may transfer the squeezing onto a state during the computation. Lam et al. [30] have demonstrated this using optical parametric oscillators in linear optics, with the crucial component being a beam splitter to couple the modes. We show in Section 4.4 that the beam splitter may be replaced by any interaction, and then explain an achievable interaction for the micromaser. 4.4
Two-Mode Squeezing
In linear optics, two-mode squeezing

    Ŝᵢ,ⱼ(ζ) ≡ exp(ζ âᵢâⱼ − ζ* âᵢ†âⱼ†)    (14)
is generated via a non-degenerate optical parametric amplifier (NOPA). It generates correlations between two modes, so it is an entangling operation. While it is a simple operation in mathematical terms, it is not as easy to achieve in the micromaser. However, the result for universality given in Section 2 tells us that we need any interaction between two modes. By passing a 3-level atom with the energy level diagram in Figure 4 through a cavity with two modes we can get an interaction between the modes which effectively excludes the atom.
Fig. 4. The energy level diagram for the 3-level atom. δₙ is the detuning of the relevant mode from each transition, and ωₙ is the frequency of each mode. The thick arrow between |a⟩ and |c⟩ denotes a classical pump field. If the atom moves up the ladder from |a⟩ → |b⟩ → |c⟩ it will then be pumped back down to |a⟩ by the classical field, acting on the field as the net operator â₁â₂. The converse is also true, so that this system will behave like two-mode squeezing.
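The correlations generated by two-mode squeezing can be seen at the level of covariance matrices: acting on the vacuum, one pair of joint quadratures is squeezed below the vacuum level while the conjugate pair is anti-squeezed. The numpy sketch below is our own illustration under the Bogoliubov transformation x̂₁ → x̂₁ cosh r − x̂₂ sinh r, p̂₁ → p̂₁ cosh r + p̂₂ sinh r (and symmetrically for mode 2); which combinations come out squeezed depends on the sign convention chosen:

```python
import numpy as np

r = 0.8                      # squeezing parameter
c, s = np.cosh(r), np.sinh(r)

# Symplectic matrix of two-mode squeezing on (x1, p1, x2, p2).
S = np.array([[ c, 0, -s, 0],
              [ 0, c,  0, s],
              [-s, 0,  c, 0],
              [ 0, s,  0, c]])

V0 = 0.25 * np.eye(4)        # two-mode vacuum, Var(x) = Var(p) = 1/4
V = S @ V0 @ S.T

def var(combo, V):
    """Variance of the linear combination combo . (x1, p1, x2, p2)."""
    u = np.asarray(combo, dtype=float)
    return u @ V @ u

# Joint quadratures x1 + x2 and p1 - p2 are squeezed, the conjugate
# combinations anti-squeezed, exhibiting the two-mode correlations.
print(var([1, 0, 1, 0], V))    # Var(x1 + x2) = e^{-2r}/2
print(var([0, 1, 0, -1], V))   # Var(p1 - p2) = e^{-2r}/2
print(var([1, 0, -1, 0], V))   # Var(x1 - x2) = e^{+2r}/2
```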
The interaction also involves a coherently pumped lossy field and allows photon transfer between the two relevant modes through virtual processes. Under the assumption that δ₁, δ₂, δ₃ ≫ g₁, g₂, Γ, it is governed by the effective Hamiltonian

    Ĥ_eff = Θ + i (g₁g₂Γ / P̂Q̂) (â₁†â₂† − â₁â₂),    (15)

where Θ acts as a Fourier-transform component on each mode. The rest is clearly related to two-mode squeezing, as the Schrödinger equation is solved to give the evolution operator Û = e^(−iĤ_eff t). The coupling strength g₁ (g₂) is between the first (second) cavity mode and the transition between states |a⟩ and |b⟩ (|b⟩ and |c⟩) of the atom, Γ is the coupling strength between the coherently pumped lossy field and the transition between levels |a⟩ and |c⟩, and

    P̂ = Σ_{i=1,2} (âᵢ†âᵢ − δᵢ),    Q̂ = P̂ + δ₂ − δ₃/2.    (16)

In an actual experiment this final transition would likely be replaced with a two-photon transition in order to follow selection rules. We need to modify the state vector simultaneously by |ψ⟩ → exp[i Σ_{i=1,2} âᵢ†âᵢ (ωᵢ + δᵢ − δ₃/2) t] |ψ⟩. Now two modes inside a single cavity can be coupled, and we have an interaction polynomial which is good enough. The effective coupling constant of the interaction is g₁g₂Γ/P̂Q̂, and the interaction is very close to two-mode squeezing. 4.5
Non-linearity
As well as the simple linear and interaction terms listed above, we need some non-linearity to achieve universal (not just Gaussian-state) computation. We can simply measure the state of a 2-level atom (see Section 3) after transit through the cavity. This does not give a clean Hamiltonian, but any non-linearity suffices to generate any Hermitian polynomial of the continuous variables.
5
Summary and Future Work
In this paper we have given an overview of the history of continuous variables in computation, both classical and quantum. We stated the standard result for universal continuous variable quantum computation (CVQC) for two different encodings. We then gave an account of the micromaser, both what we can do in the experiment and what states can be created in the cavity. Given this, we showed how we can in principle achieve universal CVQC using Gaussian states in the micromaser using simple interactions. Having a system which can, in principle, perform universal CVQC is very useful but we must consider how feasible it is to use the micromaser in such a way. All the necessary operations can be produced quickly and accurately in the system and any Hamiltonian can be generated by a polynomial number of the base operations. We may have a problem of scalability since our current
interaction is between two modes in one cavity. However, for low numbers of modes the micromaser is a perfect candidate for efficient universal CVQC.
Acknowledgments We thank Bill Munro and Kae Nemoto for first suggesting that CVQC was significant and interesting. We thank our funders: RCW is funded by the UK EPSRC; MSE is funded by the Japanese Society for the Promotion of Science; VMK is funded by a UK Royal Society University Research Fellowship.
References
1. Morrison, J.E.: The Astrolabe, Softcover edn. Janus (November 2007)
2. Shannon, C.E.: Mathematical Theory of the Differential Analyzer. Journal of Mathematics and Physics 20, 337–354 (1941)
3. Rubel, L.: The Extended Analog Computer. Advances in Applied Mathematics 14(1), 39–50 (1993)
4. Moore, C.: Recursion theory on the reals and continuous-time computation. Theoretical Computer Science 162(1), 23–44 (1996)
5. Feynman, R.P.: Simulating physics with computers. International Journal of Theoretical Physics 21(6-7), 467–488 (1982)
6. Deutsch, D.: Quantum Theory, the Church-Turing Principle and the Universal Quantum Computer. Proceedings of the Royal Society of London, Series A, Mathematical and Physical Sciences 400(1818), 97–117 (1985)
7. Feynman, R.P.: Quantum mechanical computers. Foundations of Physics 16(6), 507–531 (1986)
8. Lloyd, S., Braunstein, S.L.: Quantum Computation over Continuous Variables. Physical Review Letters 82(8), 1784–1787 (1999)
9. Pati, A.K., Braunstein, S.L., Lloyd, S.: Quantum searching with continuous variables. arXiv:quant-ph/0002082v2 (June 2000), http://arxiv.org/abs/quant-ph/0002082v2
10. Pati, A.K., Braunstein, S.L.: Deutsch-Jozsa algorithm for continuous variables. arXiv:quant-ph/0207108v1 (July 2002), http://arxiv.org/abs/quant-ph/0207108v1
11. Lomonaco, S.J., Kauffman, L.H.: A Continuous Variable Shor Algorithm. arXiv:quant-ph/0210141v2 (June 2004), http://arxiv.org/abs/quant-ph/0210141v2
12. Ralph, T., Gilchrist, A., Milburn, G., Munro, W., Glancy, S.: Quantum computation with optical coherent states. Physical Review A 68(4), 042319 (2003)
13. Spiller, T.P., Nemoto, K., Braunstein, S.L., Munro, W.J., van Loock, P., Milburn, G.J.: Quantum computation by communication. New Journal of Physics 8(2), 30 (2006)
14. Bartlett, S., Sanders, B., Braunstein, S., Nemoto, K.: Efficient Classical Simulation of Continuous Variable Quantum Information Processes. Physical Review Letters 88(9), 097904 (2002)
15. Kok, P., Braunstein, S.L.: Multi-dimensional Hermite polynomials in quantum optics. Journal of Physics A: Mathematical and General 34(31), 6185–6195 (2001)
16. Gottesman, D., Kitaev, A., Preskill, J.: Encoding a qubit in an oscillator. Physical Review A 64(1), 012310 (2001)
17. Meschede, D., Walther, H., Müller, G.: One-Atom Maser. Physical Review Letters 54(6), 551–554 (1985)
18. Jaynes, E.T., Cummings, F.W.: Comparison of quantum and semiclassical radiation theories with application to the beam maser. Proceedings of the IEEE 51(1), 89–109 (1963)
19. Walther, H., Varcoe, B.T.H., Englert, B.G., Becker, T.: Cavity quantum electrodynamics. Reports on Progress in Physics 69(5), 1325–1382 (2006)
20. Barnett, S., Radmore, P.: Methods in Theoretical Quantum Optics, 1st edn. Oxford Series on Optical and Imaging Sciences, vol. 15. Oxford University Press, USA (January 2003)
21. Englert, B.G.: Elements of Micromaser Physics. arXiv:quant-ph/0203052 (March 2002), http://arxiv.org/abs/quant-ph/0203052
22. Brattke, S., Varcoe, B.T.H., Walther, H.: Generation of Photon Number States on Demand via Cavity Quantum Electrodynamics. Physical Review Letters 86(16), 3534–3537 (2001)
23. Kuhr, S., Alt, W., Schrader, D., Müller, M., Gomer, V., Meschede, D.: Deterministic Delivery of a Single Atom. Science 293(5528), 278–280 (2001)
24. Weidinger, M., Varcoe, B.T.H., Heerlein, R., Walther, H.: Trapping States in the Micromaser. Physical Review Letters 82(19), 3795–3798 (1999)
25. Slosser, J.J., Meystre, P.: Tangent and cotangent states of the electromagnetic field. Physical Review A 41(7), 3867–3874 (1990)
26. Agarwal, G.S., Lange, W., Walther, H.: Intense-field renormalization of cavity-induced spontaneous emission. Physical Review A 48(6), 4555–4568 (1993)
27. Lange, W., Walther, H.: Observation of dynamic suppression of spontaneous emission in a strongly driven cavity. Physical Review A 48(6), 4551–4554 (1993)
28. Engen, G.F.: Amplitude Stabilization of a Microwave Signal Source. IEEE Transactions on Microwave Theory and Techniques 6(2), 202–206 (1958)
29. Orszag, M., Ramírez, R., Retamal, J.C., Roa, L.: Generation of highly squeezed states in a two-photon micromaser. Physical Review A 45(9), 6717–6720 (1992)
30. Lam, P.K., Ralph, T.C., Buchler, B.C., McClelland, D.E., Bachor, H.A., Gao, J.: Optimization and transfer of vacuum squeezing from an optical parametric oscillator. Journal of Optics B: Quantum and Semiclassical Optics 1(4), 469–474 (1999)
Quantum Computation with Devices Whose Contents Are Never Read

Abuzer Yakaryılmaz¹, Rūsiņš Freivalds², A.C. Cem Say¹, and Ruben Agadzanyan²

¹ Boğaziçi University, Department of Computer Engineering, Bebek 34342 İstanbul, Turkey
[email protected], [email protected]
² Institute of Mathematics and Computer Science, University of Latvia, Raiņa bulvāris 29, Riga, LV-1459, Latvia
[email protected], [email protected]
Abstract. In classical computation, a “write-only memory” (WOM) is little more than an oxymoron, and the addition of a WOM to a (deterministic or probabilistic) classical computer brings no advantage. We demonstrate a setup where a quantum computer using a WOM can solve problems that neither a classical computer with a WOM nor a quantum computer without a WOM can solve, when all other resource bounds are equal. We also show that resource-bounded quantum reductions among computational problems are more powerful than their classical counterparts.
1
Introduction
It is well known that many physical processes that violate human "common sense" are in fact sanctioned by quantum theory. Quantum computation as a field is interesting precisely because it demonstrates that quantum computers can perform resource-bounded tasks which are (in some cases, provably) beyond the capabilities of classical computers. In this paper, we demonstrate a case where the usage of "write-only memory" (WOM), a computational component that is used exclusively for being written to and never being read (little more than a joke in the classical setup), improves the power of a quantum computer significantly. In that setup, we prove that a quantum computer using a WOM can solve problems that neither a classical computer with a WOM nor a quantum computer without a WOM can solve, when all other resource bounds are equal. As a separate contribution, we show that resource-bounded quantum reductions among computational problems are more powerful than their classical
Yakaryılmaz and Say were partially supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) with grant 108142. Freivalds and Agadzanyan were partially supported by Grant No. 09.1570 from the Latvian Council of Science and by Project 2009/0216/1DP/1.1.2.1.2/09/IPIA/VIA/004 from the European Social Fund.
C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 164–174, 2010. Springer-Verlag Berlin Heidelberg 2010
counterparts. For this purpose, we use a programming technique that is also employed in the demonstration of the superiority of quantum machines with WOM. The rest of the paper is structured as follows: The computational model we use is reviewed in Section 2. Section 3 describes how a quantum computer with a WOM can outperform its classical and quantum rivals, and examines the significance of this result from the point of view of quantum function computation. In Section 4, we prove that deterministic reductions are strictly less powerful than classical probabilistic reductions, which are in turn outperformed by quantum reductions. Section 5 is a conclusion.
2
The Underlying Model
We use standard definitions (involving a read-only input tape and one read/write work tape) [1] for deterministic and probabilistic Turing machines (TM and PTM, respectively). Our definition of a quantum Turing machine (QTM) is a modification of the one found in [13]¹. Technically, we allow the halting register, which is observed after each step of the computation to decide whether to accept, reject, or continue, to have multiple symbols in its alphabet corresponding to each of these alternatives, and to be refreshed to its initial symbol after each observation. This small modification² allows our QTM's to implement general quantum operations, and therefore to simulate their classical counterparts precisely and efficiently. This result was shown for QTM's with classical tape head position by Watrous [14]. The configuration of a QTM for a given input string consists of the following elements: 1. the head position of the input tape, 2. the contents and head position of the work tape, 3. the internal state. The content of the halting register is not included in the configuration description, since it is refreshed after each step, as described above. For any standard machine model, say, M, we use the name M-WOM to denote M augmented with a WOM component. A TM-WOM has an additional write-only tape associated with a finite alphabet Υ. In each step of the computation, either a symbol from the alphabet, υ ∈ Υ, is printed on the current tape square and the head moves one square to the right, or the empty string, ε, is "printed," and the head remains at the same position. The computational power of the

¹ The QTM model is appropriate for studying the effect of space bounds on computational power. (The algorithm we present in the next section has such a bound imposed on the amount of read/write memory that it can use.) See [15] for an alternative model of quantum computation.
² Unlike [13], we also allow efficiently computable irrational numbers as transition amplitudes in our QTM's. This simplifies the description of some algorithms in the remainder of this paper.
166
A. Yakaryılmaz et al.
PTM-WOM is easily seen to be the same as that of the PTM; since the machine does not use the contents of the WOM in any way when it decides what to do in the next move, every write-only action can just as well be replaced with a write-nothing action. However, this is not the case for the QTM-WOM, as will be shown in the next section. We will focus on quantum finite automata with WOM (QFA-WOM’s), which are just QTM-WOM’s which do not use their work tapes and move the input tape head to the right in every step. The configuration of a QFA-WOM is a pair (q, w), where q is an internal state, and w is the string written in the WOM. In the reducibility results in Section 4, we will use slightly different TM models. In m-reductions, the associated TM (with an output tape with write-only head) does not report a decision, but just terminates with some string written on its output tape. In more general reductions using oracles, the TM has an oracle tape. When it enters a special internal state, the string w currently written on the oracle tape is replaced by either “yes” or “no”, indicating the membership of w in the language of the oracle.
3
Quantum Computation with Write-Only Memory
We start by presenting the first known example of a quantum finite-state machine which traverses its input with a one-way head and is able to recognize a non-context-free language with bounded error. It is a well-known fact [10, 6, 5] that no (classical or quantum) one-way finite automaton (without WOM) can recognize a nonregular language with bounded error.

Theorem 1. There exists a QFA-WOM that recognizes the language Ltwin = {w2w | w ∈ {0, 1}*} with probability 2/3.

Proof. We construct a QFA-WOM having Q = {q1, q2, q3, q4} as the set of internal states, Γ = { , 0, 1, 2, $} as the tape alphabet, Ω = {n, a, r} as the halting register alphabet, and Υ = {0, 1} as the WOM alphabet. The halting register is refreshed to n before each transition. The computation continues whenever n is observed in that register after the transition. On the other hand, the computation halts with acceptance if a is observed, and halts with rejection if r is observed. The transition details are shown in Figure 1.
1. The computation splits into three paths, path1, path2, and path3, with equal probability at the beginning. (Note in Figure 1 that states q1 and q2 implement path1, and states q3 and q4 implement path2.)
2. path3 rejects immediately.
3. path1 (path2) scans the input and copies w1 (w2) to the WOM if the input is of the form w1 2 w2, where w1, w2 ∈ {0, 1}*.
   (a) If the input is not of the form w1 2 w2, both paths reject.
   (b) Otherwise, at the end of the computation, path1 and path2 perform the following Hadamard transform:

       path1: |n, q2, w1⟩ → (1/√2) |a, q1, w1⟩ + (1/√2) |r, q1, w1⟩
       path2: |n, q4, w2⟩ → (1/√2) |a, q1, w2⟩ − (1/√2) |r, q1, w2⟩,
In the table below, the amplitude of the transition that takes place when the machine scans tape symbol σ while it is in state q, causing it to set the halting register to symbol ω, switch to state q′ and append υ to the string in the WOM, can be read in the row labeled by q at the column labeled by (ω, q′, υ). Empty boxes indicate zero amplitude. The columns corresponding to the "missing" elements of Ω × Q × (Υ ∪ {ε}) contain all zeros, and have been omitted. In order for the machine to be well-formed, the rows of this table corresponding to the same tape symbol must be orthonormal to each other.
[The transition-amplitude table is not reproduced here: its rows are indexed by pairs of state and tape symbol, and its columns by triples (ω, q′, υ). The legible entries include the initial 1/√3 three-way split and the final ±1/√2 Hadamard amplitudes.]
Fig. 1. Transitions of the QFA-WOM of Theorem 1
where |ω, q, w⟩ refers to the configuration (q, w) having ω in the halting register. The configurations at the ends of path1 and path2 interfere with each other, i.e., the machine accepts with probability 2/3, if and only if the input is of the form w2w, w ∈ {0, 1}*. Otherwise, each of path1 and path2 contributes just 1/6 to the overall acceptance probability, and the machine accepts with probability 1/3.

Lemma 1. No PTM (or PTM-WOM) using o(log(n)) space can recognize Ltwin with bounded error.

Proof. Any PTM using o(log(n)) space to recognize Ltwin with bounded error can be used to construct a PTM recognizing the palindrome language Lpal = {w | w = w^R, w ∈ {a, b}*} with bounded error using the same amount of space. (One would only need to modify the Ltwin machine to treat the right end-marker on the tape as the symbol 2, and switch its head direction when it attempts to go past that symbol.) It is however known [3] that no PTM using o(log(n)) space can recognize Lpal with bounded error.
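The acceptance probabilities in the proof of Theorem 1 come entirely from interference between configurations that agree on both the internal state and the WOM contents, so they can be reproduced with a few lines of bookkeeping over configuration amplitudes (our own sketch of the final superposition, not a full simulation of the machine):

```python
from collections import defaultdict

def acceptance_probability(w1, w2):
    """Acceptance probability of the Theorem 1 machine on input w1 '2' w2.

    path3 rejects at the very first step (amplitude 1/sqrt(3)), so it never
    interferes with the final configurations; path1 and path2 each reach the
    final Hadamard with amplitude 1/sqrt(3), having written w1 and w2 on the
    WOM.  Configurations interfere only if state AND WOM string agree.
    """
    s3, s2 = 3 ** -0.5, 2 ** -0.5
    amp = defaultdict(complex)
    amp[('r', 'q1', None)] += s3          # path3 (tagged None: halts early)
    amp[('a', 'q1', w1)] += s3 * s2       # path1 after the Hadamard
    amp[('r', 'q1', w1)] += s3 * s2
    amp[('a', 'q1', w2)] += s3 * s2       # path2 after the Hadamard
    amp[('r', 'q1', w2)] -= s3 * s2
    return sum(abs(a) ** 2 for (h, _q, _w), a in amp.items() if h == 'a')

print(round(acceptance_probability('0110', '0110'), 3))   # 0.667: w1 == w2
print(round(acceptance_probability('0110', '0111'), 3))   # 0.333: w1 != w2
```

When w1 == w2 the two accepting amplitudes add (4/6) and the two rejecting ones cancel; when they differ, no interference occurs and each path contributes 1/6 to acceptance.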
Corollary 1. QTM-WOM's are strictly superior to PTM-WOM's for any space bound o(log(n)) in terms of language recognition with bounded error.

Remark. If one changes the model in Theorem 1 so that the WOM is now an output tape, the machine becomes a quantum finite state transducer (QFST) [4] computing the function [11]

    f(x) = { w,         if x = w2w, where w ∈ {0, 1}*,
           { undefined,  otherwise,

with bounded error. The arguments above can then be rephrased in a straightforward way to show that conventional QTM's are strictly superior to PTM's in function computation for any common space bound that is o(log(n)).
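For reference, the function computed by the QFST in the remark is easy to state classically; the helper below is only a specification of f (the quantum machine computes it with bounded error, which this snippet does not model):

```python
def f(x):
    """f(x) = w if x = w2w with w over {0, 1}; None stands in for 'undefined'."""
    left, sep, right = x.partition('2')
    # Accept only strings with exactly one '2' and identical halves.
    if sep and '2' not in right and left == right:
        return left
    return None

print(f('01201'))   # '01'
print(f('01210'))   # None
```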
4
Reduction Results
The concept of reducibility among problems is of fundamental importance in theoretical computer science. In this section, we compare the relative power of deterministic, probabilistic and quantum reductions. The interference technique used in the previous section has an interesting application in the demonstration of the superiority of quantum over classical reducibilities. In keeping with the conventions of the reducibility literature, all languages mentioned in this section will be sets of nonnegative integers. 4.1
Deterministic versus Probabilistic Reducibility
Definition 1. Language A is probabilistically m-reducible to language B with probability p > 1/2, denoted by A ≤prob(m),p B, if there is a PTM which outputs y1, ..., yk with probabilities p1, ..., pk, respectively, for a given input x, satisfying the following conditions: Σ_{yi∈B} pi ≥ p when x ∈ A, and Σ_{yi∉B} pi ≥ p when x ∉ A.

Deterministic m-reducibility, denoted A ≤m B, is the special case of probabilistic m-reducibility where the PTM in Definition 1 is replaced by a deterministic TM.

Theorem 2. There exist recursively enumerable languages A and B such that
1. A ≰m B,
2. A ≤prob(m),2/3 B.

Proof. To visualize the concepts, we will use the figure below. The bottom and top rows in the figure are called lineA and lineB, respectively. On lineB, numbers are grouped into triples, and any two members of a triple are said to be the relatives of the remaining member.

    B ⊂  0 1 2   3 4 5   6 7 8   ···   3x  3x+1  3x+2   ···
           ↑       ↑       ↑                  ↑
    A ⊂    0       1       2     ···          x          ···
169
FOR n = 1, 2, . . .   ## STAGE n
    MARK the first free number, say y, with marker-(n − 1), which becomes active
    MARK y at lineA and 3y, 3y + 1, and 3y + 2 at lineB with “ − ”
    LOOP   ## markers with higher priority will be simulated earlier
        SIMULATE each active marker (ϕi) for n steps with the associated number (x) as input
        IF ϕi(x) returns a value, say t
            CALL UPDATE SIGNS(x, t)
            MAKE marker-i inactive
            FOR j = i + 1, . . . , n − 1
                MOVE marker-j to the first free number on lineA
                MAKE marker-j active
            END
            GOTO NEXT STAGE
        END
    END
END
Fig. 2. MAIN ALGORITHM (Theorem 2)
We will give an effective construction for the languages A and B. Initially, they contain no elements. During the construction, we will label the numbers on lineA and lineB using the signs “ + ” or “ − ”, which indicate, respectively, the members and nonmembers of the corresponding language. Note that the “ − ” signs will be tentative in nature, but a number marked with “ + ” will never change sign. This will ensure that the languages are recursively enumerable. The PTM P for A ≤prob(m),2/3 B simply realizes the transformation

P : x → 3x with probability 1/3, 3x + 1 with probability 1/3, 3x + 2 with probability 1/3.

Let ϕ0, ϕ1, . . . be an enumeration of deterministic TM's with output tapes. In the following, ϕi will be named marker-i. ϕi has higher priority than ϕj if i < j. The algorithm described in Figures 2 and 3, which is based on Friedberg and Muchnik's priority method [12], effectively constructs the languages A and B so that
1. each marker fails to be a deterministic reduction on at least one input, and
2. P is indeed an m-reduction from A to B with probability 2/3.
In the algorithm, the first free number refers to 0 in Stage 1, and to the successor of the greatest number having either a sign or a marker on lineA in all later stages. The main idea of the algorithm is that each marker can be moved only a finite number of times, and so any marker ϕi ultimately remains at some number x on lineA. Thus, it is easy to make sure that the signs of x and ϕi(x) = t contradict, so as to get x ∉ A ⇔ ϕi(x) = t ∈ B,
170
A. Yakaryılmaz et al.
If t ∉ {3x, 3x + 1, 3x + 2}, then we have two cases:
– If t has no sign, then mark t and its relatives at lineB and ⌊t/3⌋ at lineA with “ − ”. Mark x at lineA and at least two of {3x, 3x + 1, 3x + 2} at lineB with “ + ”.
– If t has a sign, say S: If S is “ + ”, there is no need for marking, since x is already marked with “ − ”. If S is “ − ”, then mark x at lineA and at least two of {3x, 3x + 1, 3x + 2} at lineB with “ + ”.
If t ∈ {3x, 3x + 1, 3x + 2}: Mark t at lineB with “ + ”.
Fig. 3. UPDATE SIGNS(x,t) (Theorem 2)
to render the function ϕi ineligible for being an m-reduction, while employing the additional numbers in the triples on lineB to ensure that P works correctly. Note that some markers may never halt, but this is not a problem, since such markers are not proper reductions by definition. It is straightforward to construct two deterministic TM's that recognize A and B, respectively, by simulating the algorithm of Figure 2, so both A and B are recursively enumerable. It should also be noted that one can easily augment the success probability of the probabilistic reduction above to any desired value p < 1, by increasing the size of the blocks in the division of lineB from three to a suitably greater number.

4.2 Probabilistic versus Quantum Reducibility
Definition 2. Language A is probabilistically (respectively, quantumly) Turing reducible with k queries to language B with probability p > 1/2, denoted A ≤prob(T-k),p B (respectively, A ≤quan(T-k),p B), if there exists a PTM (respectively, QTM), which is restricted to query the oracle for B at most k times, that recognizes A (that is, responds correctly to all questions of membership in A) with probability at least p.

Theorem 3. There exist recursively enumerable languages A and B such that
1. A ≰prob(T-1),2/3 B,
2. A ≤quan(T-1),1 B.

Proof. We will use an idea similar to the one in the proof of Theorem 2.

    lineB (B ⊂): 0 1 | 2 3 | 4 5 | · · · | 2x 2x+1 | · · ·
    lineA (A ⊂): 0   | 1   | 2   | · · · | x       | · · ·
In the current setting, the numbers on lineB are grouped as pairs. We will again put “ + ” and “ − ” signs on the numbers to indicate their membership status.
We define the QTM M for A ≤quan(T-1),1 B on a given input x as follows:
1. The computation splits into two paths, path1 and path2, each with amplitude 1/√2.
2. Each path sets a clock depending on x so that, just before the last step of the computation, the paths evolve to configurations which are identical to each other except for their internal states (call these the twin configurations).
3. path1 and path2 respectively prepare 2x and 2x + 1 on the oracle tape for their single query.
4. If the answer of the oracle is negative, the amplitude of that path is multiplied by −1.
5. Both paths enter the twin configurations, and then make the following Hadamard transformation:

path1 → (1/√2)|Reject⟩ + (1/√2)|Accept⟩
path2 → (1/√2)|Reject⟩ − (1/√2)|Accept⟩.
A straightforward calculation shows that M decides on input x as described in Table 1 with probability 1 by querying the B oracle about 2x and 2x + 1.

Table 1. The decision of M (“ + ”: member, “ − ”: nonmember)

    2x    2x + 1    x
    −     −         −
    −     +         +
    +     −         +
    +     +         −
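The table can be checked by tracing the two amplitudes through the final Hadamard step. The following sketch (our own encoding, not from the paper) reproduces it:

```python
from math import sqrt

def decision(in_B_2x: bool, in_B_2x1: bool) -> bool:
    """Accept/reject of M given the oracle's answers for 2x and 2x+1."""
    s1 = 1.0 if in_B_2x else -1.0     # path1's sign after its query about 2x
    s2 = 1.0 if in_B_2x1 else -1.0    # path2's sign after its query about 2x+1
    # Each path carries amplitude 1/sqrt(2); the final Hadamard sends
    # path1 -> (|Reject> + |Accept>)/sqrt(2), path2 -> (|Reject> - |Accept>)/sqrt(2).
    accept_amp = (s1 / sqrt(2)) * (1 / sqrt(2)) - (s2 / sqrt(2)) * (1 / sqrt(2))
    reject_amp = (s1 / sqrt(2)) * (1 / sqrt(2)) + (s2 / sqrt(2)) * (1 / sqrt(2))
    assert abs(accept_amp**2 + reject_amp**2 - 1.0) < 1e-12  # unitarity check
    return accept_amp**2 > 0.5        # the outcome occurs with probability 1

# Reproduces Table 1: x is accepted iff exactly one of 2x, 2x+1 is in B.
for a in (False, True):
    for b in (False, True):
        assert decision(a, b) == (a != b)
```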
Let ϕ0, ϕ1, . . . be an enumeration of PTM's having the capability of querying the oracle for B at most once. The algorithm described in Figures 4 and 5 constructs the languages A and B, ensuring that they satisfy the following two conditions:
1. each probabilistic reduction errs with probability exceeding 1/3 on at least one input, and
2. M runs correctly, according to Table 1.
The algorithm in Figure 4 can be used to obtain two deterministic TM's that recognize A and B, respectively, as in the previous subsection.
5 Conclusion
In this paper, we showed that write-only memory devices can increase the computational power of quantum computers, by demonstrating a language, which is
FOR n = 1, 2, . . .   ## STAGE n
    MARK the first free number, say y, with marker-(n − 1), which becomes active
    MARK y at lineA and 2y and 2y + 1 at lineB with “ − ”
    LOOP   ## the markers with higher priority are simulated earlier
        SIMULATE the first n levels of the probabilistic computation tree of each active marker (ϕi) with the associated number (x) as input
            ## the oracle for B is assumed to respond with “no” to queries about numbers at lineB which are not signed yet
        LET T = {t1, t2, . . . , tm} be the set of numbers for which ϕi queries B's oracle in the various branches of its simulation
        FIND all subsets of T, 𝒯 = {T′ | T′ = {t′1, t′2, . . . , t′l}, l ≤ m}, such that all branches of ϕi(x) that query the oracle about the numbers in T′ halt with the same decision, say D, and the total probability of those branches exceeds 1/3
        LET T″ be the biggest subset of T whose elements are associated with “no”, and contain both 2x and 2x + 1
        IF T″ = T and T″ ≠ ∅
            PUT a temporary sign “ ∗ ” on 2x at lineB
            RE-SIMULATE ϕi for the first n levels on this new lineB, and RE-FIND 𝒯 based on this new simulation
                ## the oracle for B is assumed to respond with “yes” to queries about numbers with sign “ ∗ ”
            SET T″ to ∅
        END
        IF there is a T′ ∈ 𝒯 \ {T″} (pick one arbitrarily if there exists more than one such set)
            CALL UPDATE SIGNS(x, D, T′)
            MAKE marker-i inactive
            FOR j = i + 1, . . . , n − 1
                MOVE marker-j to the first free number on lineA
                MAKE marker-j active
            END
            GOTO NEXT STAGE
        END
        REPLACE any “ ∗ ” with “ − ”
    END
END

Fig. 4. MAIN ALGORITHM (Theorem 3)
known to be unrecognizable by both classical and quantum computers with certain restrictions, to be recognizable by a quantum computer employing a WOM under the same restrictions. As a separate contribution, we proved that quantum reductions among computational problems are more powerful than probabilistic reductions, which are in turn superior to deterministic reductions. Note that it is already known that adding a WOM to a reversible classical computer may increase its computational power, since it enables one to embed irreversible tasks into “larger” reversible tasks by using the WOM as a trashcan. As a simple example, reversible finite automata (RFA’s) can recognize a proper
1. Mark x at lineA with the sign that contradicts D. (Note that x could not have had the sign “ + ” before this step.)
2. Mark all t ∈ T having no sign with “ − ”, and accordingly mark ⌊t/2⌋ at lineA and the relative of t at lineB with “ − ”.
3. Update the signs of 2x and/or 2x + 1 at lineB if needed. All possible cases for this update are shown below. In case 2, since 2x + 1 ∉ T, it is safe to change the sign of 2x + 1.

    case   condition     D     before step 3 (x, 2x, 2x+1)   after step 3 (x, 2x, 2x+1)
    1                    yes   − − −                         − − −
    2                    yes   − ∗ −                         − + +
    3.a    2x + 1 ∉ T    no    + − −                         + − +
    3.b    2x + 1 ∈ T    no    + − −                         + + −
    4                    no    + ∗ −                         + + −

Fig. 5. UPDATE SIGNS(x, D, T) (Theorem 3)
subset of the regular languages [9], but RFA's with WOM can recognize exactly the regular languages. In the quantum case, a WOM can have a similar effect. For example, the computational power of the most restricted type of quantum finite automata (MCQFA's) [7] is equal to that of RFA's, but it has been shown [2, 8] that MCQFA's with WOM can recognize all and only the regular languages, attaining the power of the most general quantum finite automata (QFA) without WOM. In all these examples, the addition of a WOM to a specifically weak model raises it to the level of the most general classical (deterministic) automaton. On the other hand, in this work, we show that adding a WOM to the most general type of QFA results in a much more powerful model that can achieve a task that is impossible for all sublogarithmic-space PTM's. Some remaining open problems related to this study can be listed as follows:
1. Does a WOM add any power to quantum computers which are allowed to operate at logarithmic or even greater space bounds?
2. How would having several separate WOM's, each of which would contain different strings, affect the performance?
3. Can analogues of Theorems 2 and 3 be proven for general Turing reductions, where the number of queries is not restricted, or for even weaker kinds of reduction?
References
1. Arora, S., Barak, B.: Computational Complexity: A Modern Approach. Cambridge University Press, New York (2009)
2. Ciamarra, M.P.: Quantum reversibility and a new model of quantum automaton. In: Freivalds, R. (ed.) FCT 2001. LNCS, vol. 2138, pp. 376–379. Springer, Heidelberg (2001)
3. Freivalds, R., Karpinski, M.: Lower space bounds for randomized computation. In: Shamir, E., Abiteboul, S. (eds.) ICALP 1994. LNCS, vol. 820, pp. 580–592. Springer, Heidelberg (1994)
4. Freivalds, R., Winter, A.J.: Quantum finite state transducers. In: Pacholski, L., Ružička, P. (eds.) SOFSEM 2001. LNCS, vol. 2234, pp. 233–242. Springer, Heidelberg (2001)
5. Jeandel, E.: Topological automata. Theory of Computing Systems 40(4), 397–407 (2007)
6. Kondacs, A., Watrous, J.: On the power of quantum finite state automata. In: FOCS 1997: Proceedings of the 38th Annual Symposium on Foundations of Computer Science, Miami, Florida, pp. 66–75 (1997)
7. Moore, C., Crutchfield, J.P.: Quantum automata and quantum grammars. Theoretical Computer Science 237(1-2), 275–306 (2000)
8. Paschen, K.: Quantum finite automata using ancilla qubits. Technical report, University of Karlsruhe (2000)
9. Pin, J.-E.: On the language accepted by finite reversible automata. In: Ottmann, T. (ed.) ICALP 1987. LNCS, vol. 267, pp. 237–249. Springer, Heidelberg (1987)
10. Rabin, M.O.: Probabilistic automata. Information and Control 6, 230–243 (1963)
11. Say, A.C.C., Yakaryılmaz, A.: Quantum function computation using sublogarithmic space (2010) (Poster presentation at QIP 2010)
12. Schöning, U., Pruim, R.: Gems of Theoretical Computer Science. Springer, Heidelberg (1998)
13. Watrous, J.: Space-bounded quantum computation. PhD thesis, University of Wisconsin–Madison, USA (1998)
14. Watrous, J.: On the complexity of simulating space-bounded quantum computations. Computational Complexity 12(1/2), 48–84 (2004)
15. Yao, A.C.-C.: Quantum circuit complexity. In: Proceedings of the 34th Annual Symposium on Foundations of Computer Science, pp. 352–361 (1993)
The Extended Glider-Eater Machine in the Spiral Rule

Liang Zhang

International Center of Unconventional Computing and Department of Computer Science, University of the West of England, Bristol
[email protected]
Abstract. We investigate the glider-eater interaction in a 2-dimensional reaction-diffusion cellular automaton, the Adamatzky-Wuensche Spiral Rule. We present the complete state transition table of such interactions, with which one can build the extended glider-eater machine, composed of multiple instances of gliders and eaters, to perform computation on specific problems. We demonstrate the implementation of asynchronous counters with the extended glider-eater machine. Since the counter can be understood as the part of the Minsky register machine with only the INC (increment) function implemented, we envisage that the extended glider-eater machine could be essential if one intends to build a complete Minsky register machine in the Spiral Rule and thereby prove that the rule is Turing-universal.

Keywords: Spiral Rule, Cellular automata, Collision-based computing, Reaction-diffusion computing, Asynchronous counters.
1 Introduction
In Wolfram's classification [18], class 4 cellular automata are those with emergent localized structures and complex dynamics, which suggests they may be capable of Turing-universal computation. Some well-known cellular automata have been proved to exhibit such universality. For example, Rule 110, a one-dimensional elementary cellular automaton, can emulate the cyclic tag system [7], a universal computational model. In the two-dimensional cellular automata rule space, there is Conway's Game of Life [11]. Berlekamp et al. [5] first provided an outline of a proof that “Life is universal” by presenting the necessary units to construct a universal computer. Subsequent works of Rendell [15] and Chapman [6] separately implemented a Turing machine and a universal Minsky register machine, proving the computational universality of Life. In addition, Margolus' BBMCA (Billiard Ball Model Cellular Automaton) [12] was also proved to be Turing-universal by emulating the billiard ball model [10], which had been shown to be universal. In all the examples above, the computation is performed by interactions of localized patterns, or localizations. Especially in the last two examples (Life and BBMCA), the appearance of mobile localizations is encoded as logical values: a logical Truth is assigned when a mobile localization is present at a specific

C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, pp. 175–186, 2010. © Springer-Verlag Berlin Heidelberg 2010
176
L. Zhang
location in a specific time step, while a logical False corresponds to the absence of that localization. In this way, the interaction of localizations can be interpreted as a logic gate: all incoming localizations before the interaction represent the inputs of the gate, and all outgoing localizations after the interaction are the outputs. Such concepts constitute the fundamental basis of collision-based computing [1], which is not only applicable to cellular automata [21], [22], but also to other physical systems [10], [8], among which we are particularly interested in the light-sensitive sub-excitable Belousov–Zhabotinsky medium, where compact wave-fragments emerge and interact with each other [2], [16], [17]. In the present paper, we focus our investigation on the Spiral Rule cellular automaton [19], [3], [20], a two-dimensional cellular automaton discovered by Adamatzky and Wuensche. The Spiral Rule has come to our attention for two main reasons. Firstly, the rule has many emergent structures such as gliders, eaters (stationary localizations), mobile glider guns and stationary spiral glider guns, which show it has great potential to be universal. Secondly, the Spiral Rule is a cellular automaton model of a reaction-diffusion chemical system, so one may apply results obtained from the rule directly to real chemical media, such as the light-sensitive BZ medium. For instance, De Lacy Costello et al. [9] experimentally constructed spiral glider guns in a heterogeneous BZ network (i.e., a BZ medium consisting of chessboard-like alternating excitable and non-excitable regions defined by different light intensities). More specifically, we investigate interactions between gliders and one type of the eaters.
Previous work on the Spiral Rule has shown that the eater can act as a memory device when interacting with gliders: such glider-eater interactions can result in a glider-eater machine in which the eater has four states modified by gliders passing by [3], and the glider-eater interactions can be used to manipulate 2-bit, 4-bit, and 6-bit binary strings [4]. The present paper will briefly introduce the Spiral Rule in Sect. 2, and provide more details on the glider-eater interaction in Sect. 3. In Sect. 4 we present the extended glider-eater machine, and in Sect. 5 we use it to demonstrate how to build useful circuits such as counters. Future work will be discussed in Sect. 6.
2 The Spiral Rule: A Reaction-Diffusion Cellular Automaton
The Spiral Rule is a 3-state k-totalistic cellular automaton on a 2-dimensional lattice with hexagonal tiling [19], [3]. Each cell has a 7-cell neighborhood consisting of the central cell itself and its six closest neighbors, and at any given time-step it is in one of three states: 0, 1 and 2, corresponding to dot, white circle and black disc in all following figures. Like all other cellular automata, cells update their states simultaneously at each time-step. Cell-state transitions follow the k-totalistic rule, where the next state of a cell depends on the numbers of cells in each state in its neighborhood in the current time-step. The rule table of the Spiral Rule is: 000200120021220221200222122022221210. Moreover, the Spiral Rule can be seen as a discrete reaction-diffusion chemical system,
The Extended Glider-Eater Machine in the Spiral Rule
(a) G1  (b) G2  (c) G3  (d) G4  (e) G5
Fig. 1. Five basic gliders in the Spiral Rule [3]. Three of them (G1, G4, and G5) have a period of 2, and the other two (G2 and G3) remain the same configuration at all times.
(a) E1, without and with memory cells set
(b) E2
Fig. 2. Two types of eaters in the Spiral Rule [3]. We will focus on E1 in this paper.
with three types of chemical reactants: inhibitor I, activator A and substrate S, corresponding to cell-states 2, 1 and 0 respectively. The Spiral Rule exhibits a rich dynamics of localized patterns [19], [3], among which there are five basic types of gliders, shown in Fig. 1. All gliders have a 1-cell head in cell-state 1 and a multi-cell tail in cell-state 2. Stationary localizations in the Spiral Rule (Fig. 2) are called eaters because when they interact with gliders, most of the time the latter vanish as if they were eaten by the eaters. The first type of eater, E1 (Fig. 2a), has a special feature: in between the six cells with cell-state 2 on the outer circle, there are six memory cells which can be set to cell-state 2 and reset to cell-state 0 by interacting with gliders, and which retain those states until further gliders arrive. Such cell behavior is like a binary memory bit being set to the values 1 and 0. Other emergent structures in the Spiral Rule include mobile and stationary glider guns, which we will not describe in detail in the present paper.
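The k-totalistic update quoted in the rule table above can be sketched in code. Note that the 36-entry table fixes one next-state per multiset of neighborhood states; the particular ordering of the count-triples used below is our own assumption (tools such as DDLab may order the entries differently), so only the general mechanism, not the exact indexing, should be read off this sketch:

```python
# 36-entry k-totalistic rule table of the Spiral Rule, one entry per multiset
# of states over the 7-cell hexagonal neighborhood (center + 6 neighbors).
RULE = "000200120021220221200222122022221210"

# All count-triples (n2, n1, n0) with n2 + n1 + n0 == 7, in an ASSUMED
# canonical order; the ordering actually used by the authors may differ.
TRIPLES = [(a, b, 7 - a - b) for a in range(8) for b in range(8 - a)]

def next_state(cells):
    """Next state of the center cell, given the 7 neighborhood states."""
    assert len(cells) == 7 and all(c in (0, 1, 2) for c in cells)
    counts = (cells.count(2), cells.count(1), cells.count(0))  # (n2, n1, n0)
    return int(RULE[TRIPLES.index(counts)])

assert len(RULE) == len(TRIPLES) == 36   # C(9, 2) multisets of size 7 over 3 states
assert next_state([0] * 7) == 0          # an all-substrate neighborhood stays quiescent
```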
3 Glider-Eater Interaction

3.1 Types of Interactions
Hereafter, when we mention the eater, we mean the eater E1. In this section we describe in detail the interactions between all types of basic gliders and the eater. Firstly, whether or not they interact, and what type of interaction it is, depend on the relative positions of the glider and the eater. More specifically, they depend on the vertical distance of the two structures, Dv.

Def. 1. The vertical distance of the glider and the eater, Dv, is the distance between the center cell of the eater and the traveling trajectory of the glider.

When the center cell of the eater is on the traveling trajectory of the glider, Dv = 0; otherwise Dv equals the offset distance of the center cell to the trajectory
Fig. 3. The vertical distance between the glider and the eater. Each line represents the traveling trajectory of a glider moving from West to East, and the number to the right of the line represents the vertical distance between that glider to the eater in the middle. The critical value of the vertical distance is 4.
Fig. 4. An example of Glider-Eater interaction when Dv = 4. (a-d) A glider G1 traveling East toward an eater E1, where no memory cells are set to cell-state 2 yet, (e-h) the glider interacts with the eater while passing by, and (i-j) a glider G5 moves away from the eater and leaves it with its top-right memory cell set to cell-state 2.
Fig. 5. States of the eater, from left to right, α, β, γ and δ
Fig. 6. An example of the same glider-eater input combination ⟨G2, γ⟩ resulting in different output combinations: (a) ⟨G3, δ⟩, and (b) ⟨G1, γ⟩. Note that the input gliders in these cases are mirror images of each other, which shows that reflected configurations of the same type of glider should be treated differently when dealing with glider-eater interactions
(Fig. 3). The critical value of Dv is 4, for that value determines what type of interaction occurs between the glider and the eater. When Dv < 4, the glider is annihilated by the eater; and when Dv > 4, the glider and the eater do not interact with each other at all. In both situations, the outcome is uninteresting. However, when Dv = 4, the glider can survive the interaction and may transform into another type of glider, and the eater may change into another configuration as well. Computationally speaking, such a glider-eater interaction can be roughly seen as a gate with a pair of input values of the form ⟨glider, eater-state⟩ and a pair of output values in the same form. For example, in Fig. 4, the input pair consists of the incoming glider G1 and the eater with no memory cells set, and the output pair consists of the outgoing glider G5 and the eater with its top-right memory cell set. From now on, we will focus our investigation only on the glider-eater interaction when Dv = 4.
3.2 States of the Eater
If we allow gliders to move toward the eater along all 12 possible entry routes, as may naturally happen if we start evolving the Spiral Rule from a random initial configuration, then all memory cells of the eater are accessible to be set and reset; thus the eater can be in one of 64 configurations at any time-step. However, in our investigation we limit the glider to move East on top of the eater, just as in Fig. 4; as a result, only the two memory cells at the top of the eater can be modified by the glider during the interaction. And if we deliberately choose the other four memory cells to have cell-state 0, then all eaters in our investigation can only be in one of the four configurations shown in Fig. 5, which we name states α, β, γ and δ.
3.3 States of the Gliders
Now that we have chosen states of the eater, what about those of the gliders? Intuitively, one may choose the five basic glider types as their states, as we did in Sect. 3.1. However, closer inspection tells us that such an arrangement does not work very well, for the same pair of input values could lead to different pairs of output values. In other words, the input-output relation is not well defined as a function. This problem is more clearly demonstrated with the following two examples. The first example (Fig. 6) shows two distinct interactions between a glider G2 and an eater in state γ. The output values of such an interaction could be a glider G3 with the eater in state δ (Fig. 6a), or a glider G1 with the eater remaining in state γ (Fig. 6b). The reason we get multiple output values here is that, although both gliders used in this example are G2, they are actually mirror-image configurations of each other. And in the glider-eater interaction, reflected configurations of the same glider tend to behave differently. The second example (Fig. 7) shows interactions between a glider G4 and an eater in state α. In both cases the output glider is G2, although the two outputs are reflections of each other. Nonetheless, the eaters are left in different states, δ and α. The reason for this difference is that the gliders G4 are in different phases at the initial time-step. Further investigations show that configurations of different phases in gliders G4
Fig. 7. The second example of the same glider-eater input combination resulting in different output combinations: the output glider is G2 in both cases, but the eater is left in state δ in one case and state α in the other. This example shows that different phases of the same type of glider should be treated differently.
Fig. 8. Reflected configurations of gliders (a) G2, and (b) G5
and G5 behave differently, while in G1, configurations of different phases tend to behave similarly to each other. The findings from the above two examples lead us to look into the behavior of each and every possible basic glider configuration, rather than the general five basic types of gliders, while investigating glider-eater interactions. There are in total 11 distinct glider configurations in the Spiral Rule, considering different phases and the factor of reflection, with eight of them shown in Fig. 1 and the rest, the reflected configurations of G2 and G5, shown in Fig. 8. Moreover, in the second example we have seen that gliders in different phases may behave differently. When gliders G1, G4 and G5 change from one phase to another, they also move one step from West to East, which shows us another fact: the horizontal distance Dh between the glider and the eater matters when they interact.

Def. 2. The horizontal distance of the glider and the eater, Dh, is the distance between the head of the glider and the central line of the eater perpendicular to the trajectory of the glider.

In other words, the horizontal distance shows how many time-steps the head of the glider is away from that specific central line of the eater. If the head of the glider is on that line, Dh = 0. For example, in a previous figure, Fig. 4, from (a) to (j), Dh = 4, 3, 2, 1, 0, 1, 2, 3, 4 and 5 respectively. Clearly, before the interaction, the glider remains in one phase when Dh is even (Fig. 4a and Fig. 4c), and is in another phase when Dh is odd (Fig. 4b and Fig. 4d). This finding is very important because, together with the finding from the second example above (Fig. 7), it tells us that if we initiate the glider-eater interaction with a glider (it must be G1, G4 or G5) in the same phase but at a different horizontal distance (we only care about even or odd) to the eater, we may have different results. Such a fact is easily neglected, as it was in previous
Fig. 9. The eleven basic glider configurations can be categorized into four groups, A, B, C and D, based on their behavior when they interact with eaters. Glider configurations in the same group have the same behavior.
works [3], and the consequence of such neglect is that the resulting state transition table of the glider-eater interaction is somewhat limited. So we ran through all possible interactions between gliders in each basic configuration, starting from places with either odd or even horizontal distances, and eaters in all possible states. We have found that all 11 basic configurations can be categorized into four groups based on their behavior when they interact with eaters, as shown in Fig. 9. These four groups fit well into the newly developed state transition tables, and are the states of the gliders in our further investigations of the glider-eater interaction.
3.4 State Transition Tables
Up till now, from the above sections, we know that (a) in the glider-eater interaction in which we are interested (when Dv = 4), a pair of values will be transformed into another pair of values in the same form; (b) we have four glider-states A, B, C and D, categorized from the eleven basic glider configurations, shown in Fig. 9; (c) we have four eater-states α, β, γ and δ, shown in Fig. 5; and (d) when we describe the transition, we need to take the horizontal distance (even or odd) into consideration, both before and after the interaction. The complete state transition tables are upgraded from the one in [3] developed by Adamatzky and Wuensche, who did not emphasize the difference between even and odd horizontal distances of the glider and the eater. The newly developed state transition table T consists of four tables, shown in Fig. 10. The subscripts i and j of T represent the parity of the horizontal distance between a glider and the eater before and after the interaction: if the distance is an even number, i or j equals 0; if it is odd, i or j equals 1. In each table, the first row shows the glider-state and the first column shows the eater-state before the interaction. The values of the table represent the state of the glider and the eater after the interaction. For example, in Table T10, the top-left value Dα means that when a glider with state A has an odd distance to an eater with state α before their interaction, the glider will be transformed to another glider with state D after the interaction, while the eater remains in state α.
    T00    A    B    C    D
    α      Dα   Cβ   Bγ   Aδ
    β      Aβ   Bα   Cδ   Dγ
    γ      Aδ   Bγ   Cβ   Dα
    δ      Dγ   Cδ   Bα   Aβ

    T01    A    B    C    D
    α      Cα   Dβ   Bγ   Aδ
    β      Aβ   Bα   Dδ   Cγ
    γ      Aδ   Bγ   Dβ   Cα
    δ      Cγ   Dδ   Bα   Aβ

    T10    A    B    C    D
    α      Dα   Cβ   Aδ   Bγ
    β      Aβ   Bα   Dγ   Cδ
    γ      Aδ   Bγ   Dα   Cβ
    δ      Dγ   Cδ   Aβ   Bα

    T11    A    B    C    D
    α      Cα   Dβ   Aδ   Bγ
    β      Aβ   Bα   Cγ   Dδ
    γ      Aδ   Bγ   Cα   Dβ
    δ      Cγ   Dδ   Aβ   Bα

Fig. 10. State transition tables of glider-eater interactions. T = Tij (0 ≤ i ≤ 1, 0 ≤ j ≤ 1), where i = (the horizontal distance between the glider and the eater before the interaction) mod 2, and j = (the horizontal distance between the glider and the eater after the interaction) mod 2.
As we can see, the state transition tables provide a unique pair of output values for each input value, once the observed horizontal distances (before and after the interaction) are chosen. Beyond that, these four tables are closely related to each other:
– All values in columns A and B in tables T0j are the same as those in table T1j, since gliders in states A and B either have only one phase (G2 and G3) or have configurations of their two phases in the same state-group (G1); therefore even or odd horizontal distances before the interaction make no difference.
– All values in column C in T0j are the same as those in column D in T1j, and vice versa. This is because all gliders in these two states have two phases (G4 and G5), and the configuration of one phase belongs to state-group C while the other belongs to state-group D.
– For all corresponding values in tables Ti0 and Ti1, if the glider-state is A or B in one table, it will be the same in the other one. If the glider-state is C or D in one table, it will change into D or C, respectively, in the other one. Again, this is because only gliders G4 and G5, in state-groups C and D, may change from one glider-state to another in consecutive time-steps.
– All eater-states in corresponding positions in tables Ti0 and Ti1 are the same, because once the interaction is over, the state of the eater never changes.
In a word, as long as we have the values of one table, we can deduce the others.
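These relations are mechanical enough to machine-check. The sketch below encodes the tables of Fig. 10 column by column (the dictionary layout is our own) and asserts each of the four relations:

```python
# Entries (next glider-state, eater-state) of the Fig. 10 tables, indexed as
# table[eater][glider]; each column string lists entries for eaters α, β, γ, δ.
G, E = "ABCD", "αβγδ"

def table(cols):
    return {e: {g: cols[g].split()[i] for g in G} for i, e in enumerate(E)}

T00 = table({"A": "Dα Aβ Aδ Dγ", "B": "Cβ Bα Bγ Cδ", "C": "Bγ Cδ Cβ Bα", "D": "Aδ Dγ Dα Aβ"})
T01 = table({"A": "Cα Aβ Aδ Cγ", "B": "Dβ Bα Bγ Dδ", "C": "Bγ Dδ Dβ Bα", "D": "Aδ Cγ Cα Aβ"})
T10 = table({"A": "Dα Aβ Aδ Dγ", "B": "Cβ Bα Bγ Cδ", "C": "Aδ Dγ Dα Aβ", "D": "Bγ Cδ Cβ Bα"})
T11 = table({"A": "Cα Aβ Aδ Cγ", "B": "Dβ Bα Bγ Dδ", "C": "Aδ Cγ Cα Aβ", "D": "Bγ Dδ Dβ Bα"})

swap = str.maketrans("CD", "DC")
for e in E:
    for (t0, t1) in ((T00, T10), (T01, T11)):
        # columns A and B agree between T0j and T1j
        assert t0[e]["A"] == t1[e]["A"] and t0[e]["B"] == t1[e]["B"]
        # column C of T0j equals column D of T1j, and vice versa
        assert t0[e]["C"] == t1[e]["D"] and t0[e]["D"] == t1[e]["C"]
    for (ta, tb) in ((T00, T01), (T10, T11)):
        for g in G:
            # the eater-state never differs between Ti0 and Ti1 ...
            assert ta[e][g][1] == tb[e][g][1]
            # ... and the glider-state differs only by the C <-> D swap
            assert ta[e][g][0].translate(swap) == tb[e][g][0]
```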
4 The Extended Glider-Eater Machine
As briefly introduced in [3] by Adamatzky and Wuensche, one can feed an eater with a series of gliders with certain glider-states to form a glider-eater machine,
The Extended Glider-Eater Machine in the Spiral Rule
(a) Distance between eaters: odd. Eater-states are α and δ, and glider-states are A and C after 20 time-steps
(b) Distance between eaters: even. Eater-states are α and γ, and glider-states remain B and B after 20 time-steps Fig. 11. The extended glider-eater machine, where two gliders interact with two eaters. Here we demonstrate that depending on whether the distances between corresponding eaters are all odd or all even, the final states of the gliders and eaters may differ. In both examples, the left picture is the initial configuration of the machine, and the right picture is the configuration after 20 time-steps.
then we can observe state transitions of both the gliders and the eater. In fact, all state transitions can be predicted using the above state transition tables. In addition, we can readily extend such a glider-eater machine to include multiple eaters, as shown in Fig. 11, where two gliders interact with two eaters. The states of the gliders and eaters at any time-step can likewise be predicted using the state transition tables. However, we must emphasize an important fact: in the extended glider-eater machine, the distances between adjacent eaters can influence the final results. In our examples in Fig. 11, simply because the distances between the two eaters differ (one is odd and the other is even), we obtain different states of the eaters and gliders after the interaction.
5
Asynchronous Binary Counters
The extended glider-eater machine shows that by setting the initial states and arranging the relative positions of the gliders and eaters, one can manipulate the states of the gliders and, in turn, the states of particular eaters. Therefore we should be able to build gates and circuits using the machine. Here we implement asynchronous binary counters with it. A counter is an arithmetic circuit that counts 0, 1, 2, and so on. It can also be seen as an implementation of the INC (increment) function of a register in the Minsky register machine [13], [14]. With the extended glider-eater machine, we can count how many gliders (of a particular state) have interacted with the eaters, and show the result in some of the eaters. We call our counters “asynchronous” because we do not care how far away the gliders are from the eaters, as long as, in our implementation, the gliders have even horizontal distances to the first eater.
L. Zhang
Fig. 12. A snapshot of the initial configuration of the binary (3,2)-counter implemented using the extended glider-eater machine in the spiral rule
Fig. 13. A snapshot of the configuration of a binary (7,3)-counter implemented using the extended glider-eater machine in the spiral rule
From the state transition tables, we know that gliders in state B change an eater in state α to state β and vice versa. This is consistent with an INC function on a binary bit, if we regard an eater in state α as the binary digit 0 and an eater in state β as the binary digit 1. We now demonstrate how to build a binary (3,2)-counter using the extended glider-eater machine. Here the number ‘3’ means the counter can count at most 3 gliders, and ‘2’ means the counting result is shown in 2 registers or, in our case, eaters. The first step is to identify the states the registers (eaters) will take after each glider is counted. Since our three gliders are all in state B, we know from the state transition tables that after they interact with the first register, their states are transformed into C, B and C, with the state of the first register changing in the order α → β → α → β. In order to change the state of the second register in the order α → α → β → β after interacting with the gliders, the states of the gliders arriving at the second register have to be A, B and A. The only remaining problem is therefore to place one or more eaters between these two registers to transform the gliders into the required states. The solution is to use an eater in state δ, placed at an odd distance from the first register and an even distance from the second, as shown in Fig. 12. We can also implement counters with more capacity. A snapshot of the configuration of a binary (7,3)-counter is shown in Fig. 13, where the counting result can be read from the leftmost eater, the fifth eater from the left, and the rightmost eater, corresponding to three binary bits from the least significant to the most significant.
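The logical behaviour of the counter can be sketched in a few lines, abstracting away the glider physics: each register is an eater read as α = 0 and β = 1, every counted B-glider acts as an INC, and the intermediate δ-eaters (not modelled here) are what let the carry reach the next register only when the previous bit wraps.

```python
def count_gliders(n_gliders, n_registers):
    regs = ["alpha"] * n_registers          # all registers start at 0
    for _ in range(n_gliders):
        for i in range(n_registers):        # ripple the increment to the left
            if regs[i] == "alpha":
                regs[i] = "beta"            # 0 -> 1, carry absorbed
                break
            regs[i] = "alpha"               # 1 -> 0, carry continues
    return regs

# A (3,2)-counter after 3 gliders: both registers read beta, i.e. binary 11,
# matching the state sequences alpha->beta->alpha->beta and
# alpha->alpha->beta->beta described above.
print(count_gliders(3, 2))  # ['beta', 'beta']
```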
It is very likely that, using the extended glider-eater machine, we can implement counters of sufficient capacity to count arbitrarily large finite numbers, at the cost of a correspondingly large number of eaters, although we have yet to prove this.
6
Discussion
In this paper we have analyzed the glider-eater interaction in the Spiral Rule in great detail, yielding the complete state transition
tables of this interaction. With the help of the transition tables, we are able to build extended glider-eater machines capable of computing. One straightforward choice is to construct a counter, as we have done here. The counter can be seen as part of a Minsky register machine, since it implements the INC function of the Minsky machine exactly. The rest of the job, if we were to build the complete Minsky machine, is to implement the DEC (decrement) function, which is the reverse of the INC function, together with a module that checks whether the value in a register is zero. Another question worth investigating is whether the extended glider-eater machine can build counters that count any finite number, which is a prerequisite for implementing a Minsky register machine. The extended glider-eater machine also has its limitations. For example, we have found that, with a single extended machine, one cannot build binary adders with more than one bit. This is because gliders representing binary bits of different significance carry different weights, so when interacting with eaters, the eaters would have to transform into different states. To solve this problem, we cannot, at the very least, use the same gliders to represent all bits. Acknowledgments. The research is undertaken as part of the EPSRC-funded project “Dynamical logical circuits in subexcitable chemical media” (principal investigators: Andrew Adamatzky and Ben De Lacy Costello).
References 1. Adamatzky, A. (ed.): Collision-Based Computing. Springer, London (2002) 2. Adamatzky, A., De Lacy Costello, B.: Binary collisions between wave-fragments in a sub-excitable Belousov–Zhabotinsky medium. Chaos, Solitons & Fractals 34(2), 307–315 (2006) 3. Adamatzky, A., Wuensche, A.: Computing in spiral rule reaction-diffusion hexagonal cellular automaton. Complex Systems 16(4) (2007) 4. Adamatzky, A., Martinez, G., Zhang, L., Wuensche, A.: Operating binary strings using gliders and eaters in reaction-diffusion cellular automaton. Mathematical and Computer Modeling (2010) (in Press) 5. Berlekamp, E.R., Conway, J.H., Guy, R.K.: What is Life? In: Winning Ways: For Your Mathematical Plays, Games in Particular, ch. 25, vol. 2, pp. 817–850. Academic Press, London (1982) 6. Chapman, P.: Life Universal Computer, http://www.igblan.free-online.co.uk/igblan/ca/index.html 7. Cook, M.: Universality in Elementary Cellular Automata. Complex Systems 15(1), 1–40 (2004) 8. De Lacy Costello, B., Adamatzky, A.: Experimental implementation of collisionbased gates in Belousov–Zhabotinsky medium. Chaos, Solitons & Fractals 25(3), 535–544 (2005) 9. De Lacy Costello, B., Toth, R., Stone, C., Adamatzky, A., Bull, L.: Implementation of glider guns in the light-sensitive Belousov-Zhabotinsky medium. Physical Review E 79(2), 026114 (2009)
10. Fredkin, E., Toffoli, T.: Conservative logic. Int. J. Theor. Phys. 21(3-4), 219–253 (1982) 11. Gardner, M.: The fantastic combinations of John Conway’s new solitaire game “life”. Scientific American 223, 120–123 (1970) 12. Margolus, N.: Physics-like models of computation. Physica D: Nonlinear Phenomena 10(1-2), 81–95 (1984) 13. Minsky, M.: Recursive Unsolvability of Post’s Problem of ‘Tag’ and Other Topics in Theory of Turing Machines. Annals of Math. 74, 437–455 (1961) 14. Minsky, M.: Computation: Finite and Infinite Machines. Prentice-Hall, Inc., Englewood Cliffs (1967) 15. Rendell, P.: Turing Universality of the Game of Life. In: Adamatzky, A. (ed.) Collision-Based Computing, pp. 513–539. Springer, London (2002) 16. Toth, R., Stone, C., Adamatzky, A., De Lacy Costello, B., Bull, L.: Experimental validation of binary collisions between wave fragments in the photosensitive Belousov–Zhabotinsky reaction. Chaos, Solitons & Fractals 41(4), 1605–1615 (2009) 17. Toth, R., Stone, C., De Lacy Costello, B., Adamatzky, A., Bull, L.: Simple Collision-Based Chemical Logic Gates with Adaptive Computing. Int. J. Nanotechnology and Molecular Computation 1(3), 1–16 (2009) 18. Wolfram, S.: Universality and Complexity in Cellular Automata. Physica D: Nonlinear Phenomena 10(1-2), 1–35 (1984) 19. Wuensche, A., Adamatzky, A.: On spiral glider-guns in hexagonal cellular automata: Activator-inhibitor paradigm. Int. J. of Modern Physics C 17(7), 1009– 1026 (2006) 20. Wuensche, A.: Discrete Dynamics Lab (DDLab), http://www.cogs.susx.ac.uk/users/andywu/multi_value/spiral_rule.html 21. Zhang, L., Adamatzky, A.: Collision-based implementation of a two-bit adder in excitable cellular automaton. Chaos, Solitons & Fractals 41(3), 1191–1200 (2009) 22. Zhang, L., Adamatzky, A.: Towards arithmetical chips in sub-excitable media: Cellular automaton models. Int. J. Nanotechnology and Molecular Computation 1(3), 63–81 (2009)
Formalizing the Behavior of Biological Processes with Mobility Bogdan Aman and Gabriel Ciobanu Romanian Academy, Institute of Computer Science and A.I. Cuza University of Iași, Romania [email protected], [email protected]
New formal approaches and software tools are required to cope with ensembles and quantities in biology. We model spatial and dynamic biological processes using a rule-based model of computation called mobile membranes [1,2], in which mobility is the key issue. The distinguishing feature of this formalism is that rules are applied in parallel, a realistic description of biology that is not possible in process calculi involving mobility. The parallel application of rules depends on the available resources. The model is characterized by two essential features: • A spatial structure consisting of a hierarchy of membranes (which are either disjoint or included in one another) with multisets of objects on their surfaces. • Biologically inspired rules describing the evolution of the structure: endocytosis and exocytosis (moving a membrane inside a neighboring membrane, or outside the membrane where it is placed). A “computation” is performed as follows: starting from an initial structure, the system evolves by applying the rules in a nondeterministic and maximally parallel manner. Applying the rules in a maximally parallel way means that in each step we apply a maximal multiset of rules, namely a multiset of rules to which no further rule can be added. A halting configuration is reached when no rule is applicable. A formal encoding of mobile membranes into colored Petri nets [3] is provided so that a mature software tool can be used to automatically verify behavioral properties of the biological systems: reachability, boundedness, liveness, and fairness. These properties are of great help when studying similar properties of mobile membranes. In this way we start from a biological system, describe it using a complex rule-based formalism, and then simulate and analyze its evolution automatically using Colored Petri Nets Tools. Following these steps we return to biology with relevant solutions provided by the formal approaches.
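Maximally parallel rule application can be sketched in a flat multiset setting (membrane structure and the endocytosis/exocytosis rules are omitted): at each step we nondeterministically add rule instances until no further rule can be added, then apply them all at once. The rule and object names are illustrative only.

```python
from collections import Counter
import random

def step(objs, rules):
    """One maximally parallel step; returns None at a halting configuration.
    rules: list of (lhs, rhs) multisets given as dicts."""
    produced = Counter()
    available = Counter(objs)
    applied = False
    while True:
        applicable = [(l, r) for (l, r) in rules
                      if all(available[o] >= n for o, n in l.items())]
        if not applicable:
            break                              # multiset of rules is maximal
        lhs, rhs = random.choice(applicable)   # nondeterministic choice
        available -= Counter(lhs)
        produced += Counter(rhs)
        applied = True
    return available + produced if applied else None

# The rule a -> bb is applied in parallel to every available copy of a:
rules = [({"a": 1}, {"b": 2})]
print(step(Counter({"a": 3}), rules))  # Counter({'b': 6})
```

A halting configuration is signalled by `None`: no rule instance could be added, so the computation stops.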
References 1. Aman, B., Ciobanu, G.: Describing the Immune System Using Enhanced Mobile Membranes. Electronic Notes in Theoretical Computer Science 194, 518 (2008) 2. Aman, B., Ciobanu, G.: Simple, Enhanced and Mutual Mobile Membranes. In: Priami, C., Back, R.-J., Petre, I. (eds.) Transactions on Computational Systems Biology XI. LNCS (LNBI), vol. 5750, pp. 26–44. Springer, Heidelberg (2009) 3. Jensen, K.: Coloured Petri Nets; Basic Concepts, Analysis Methods and Practical Use. In: Monographs in Theoretical Computer Science, vol. 1-3. Springer, Heidelberg (1992-1997)
C.S. Calude et al. (Eds.): UC 2010, LNCS 6079, p. 187, 2010. c Springer-Verlag Berlin Heidelberg 2010
Quantum Finite State Automata over Infinite Words Ilze Dzelme-Bērziņa Institute of Mathematics and Computer Science, University of Latvia, Raina 29, Riga, LV-1459, Latvia
The study of finite state automata working on infinite words was initiated by Büchi [1], who discovered a connection between formulas of the monadic second-order logic of infinite sequences (S1S) and ω-regular languages, the class of languages over infinite words accepted by finite state automata. A few years later, Muller proposed an alternative definition of finite automata on infinite words [4]. McNaughton proved that, with Muller's definition, deterministic automata recognize all ω-regular languages [2]. Later, Rabin extended Büchi's decidability result for S1S to the monadic second-order theory of the infinite binary tree (S2S) [5]. Rabin's theorem can be used to settle a number of decision problems in logic. The theory of automata over infinite words grew out of these studies. The above results inspired us to study quantum finite state automata over infinite words. We have adapted the definition of measure-once quantum finite state automata [3] for infinite words, and have also given a definition of group ω-automata. The Büchi, Streett, and Rabin acceptance conditions have been formulated for quantum finite state automata over infinite words. We study the class of languages accepted by quantum finite state automata over infinite words. It has been proved that measure-once quantum Büchi automata with bounded error accept a proper subset of the limit languages. We have also shown that the Streett acceptance condition is more powerful than the Büchi acceptance condition in the quantum case as well.
References 1. Büchi, J.R.: On a decision method in restricted second order arithmetic. Z. Math. Logik Grundlag. Math. 6, 66–92 (1960) 2. McNaughton, R.: Testing and generating infinite sequences by a finite automaton. Inform. Control 9, 521–530 (1966) 3. Moore, C., Crutchfield, J.: Quantum automata and quantum grammars. Theoretical Computer Science 237, 275–306 (2000) 4. Muller, D.E.: Infinite sequences and finite machines. In: Proc. 4th IEEE Symp. on Switching Circuit Theory and Logical Design, pp. 3–16 (1963) 5. Rabin, M.O.: Decidability of second order theories and automata on infinite trees. Trans. AMS 141, 1–37 (1969)
A Geometrical Allosteric DNA Switch Anthony J. Genot, Jon Bath, and Andrew J. Turberfield Clarendon Laboratory, Department of Physics, University of Oxford, Parks Road, Oxford OX1 3PU, U.K.
Using the programmable interactions of DNA, it is possible to design small circuits capable of processing information [1, 2, 3]. Such DNA circuits may be constituted of logic gates [4,5] connected by signal restoration modules [6]. In most DNA logic gates, the sequences of the output and input strands are not independent: a sequence modification of a gate's input must be passed on to its output. As this output is usually the input of another downstream gate, this interdependence of input and output greatly limits the scalability of circuits. In order to implement higher-order programming languages [7], uncoupling inputs and outputs will be essential. Here we show how geometry can be leveraged to uncouple inputs and outputs in DNA gates. Building on a recently introduced mechanism, the remote toehold [8], we have constructed and tested a robust YES gate whose input and output are unrelated in sequence.
References [1] Adleman, L.M.: Molecular computation of solutions to combinatorial problems. Science 266(5187), 1021–1024 (1994) [2] Seelig, G., Soloveichik, D., Zhang, D.Y., Winfree, E.: Enzyme-free nucleic acid logic circuits. Science 314(5805), 1585–1588 (2006) [3] Stojanovic, M.N., Stefanovic, D.: A deoxyribozyme-based molecular automaton. Nature Biotechnology 21(9), 1069–1074 (2003) [4] Kameda, A., Yamamoto, M., Ohuchi, A., Yaegashi, S., Hagiya, M.: Unravel four hairpins! In: Mao, C., Yokomori, T. (eds.) DNA Computing (2006) [5] Zhang, D.Y., Winfree, E.: Dynamic allosteric control of noncovalent DNA catalysis reactions. Journal of the American Chemical Society 130(42), 13921–13926 (2008) [6] Zhang, D.Y., Turberfield, A.J., Yurke, B., Winfree, E.: Engineering entropy-driven reactions and networks catalyzed by DNA. Science 318(5853), 1121–1125 (2007) [7] Phillips, A., Cardelli, L.: A programming language for composable DNA circuits. Journal of the Royal Society Interface 6 (2009) [8] Genot, A.J., Zhang, D.Y., Bath, J., Turberfield, A.J.: The remote toehold, a mechanism to dynamically control DNA hybridization (in preparation)
Properties of “Planar Binary (Butchi Number)” Yuuki Iwabuchi and Junichi Akita Division of Electrical Engineering and Computer Science, Kanazawa University [email protected], [email protected]
1
Introduction
Conventional binary numbers have a carry rule of leftward propagation in a linear array. In this paper, we propose a number representation with a carry rule of both leftward and upward propagation in a planar array, which we call “planar binary (Butchi number),” and we describe its properties.
2
Property of “Planar Binary”
In the increment procedure of the Butchi number, the carry paths branch both leftward and upward at the reversing bit according to the propagating carries. The arithmetical operations of addition and multiplication can be defined for the Butchi number, and they satisfy both the commutative law and the distributive law. The Butchi number representations of large natural numbers generally have recursive triangular shapes. The Butchi number representation can be regarded as an extension of the binary number from a one-dimensional representation to a two-dimensional one. The possibility of representing operations with simple rules in the two-dimensional domain will be discussed in future work.
Fig. 1. Examples of the planar binary representation's properties: (a) representation of 40000 and (b) the increment operation.
Characterising Enzymes for Information Processing: Microfluidics for Autonomous Experimentation Gareth Jones, Chris Lovell, Hywel Morgan, and Klaus-Peter Zauner School of Electronics and Computer Science, University of Southampton, UK, SO17 1BJ {gj07r,cjl07r,hm,kpz}@ecs.soton.ac.uk
Information processing within biological systems relies upon the interactions of numerous protein macromolecules with one another, and with their environment [1]. Recognising this, enzymes have been applied in the implementation of Boolean logic gates. However, given the structural complexity of enzymes, it would appear that enzyme behaviour is not limited to simple Boolean logic. Instead, by characterising the response behaviour of enzymes, new modes of information processing could be supported, ultimately facilitating the application of enzymatic computers [2]. Typically, resources are very limited compared to the large parameter spaces, preventing detailed investigation of behaviours. An effective choice of experiments, and a physical platform that minimises the resource requirements per experiment, would therefore be desirable. We propose an autonomous experimentation system consisting of a microfluidic platform coupled to an artificial experimenter [3]. A new electrohydraulic interface has been developed, together with low-cost fabrication techniques. This allows for many chemical channels and enables complex reaction mixtures. Consequently, enzymatic computing studies that were previously unaffordable should become achievable.
References 1. Bray, D.: Protein molecules as computational elements in living cells. Nature 376, 307–312 (1995) 2. Zauner, K.-P., Conrad, M.: Enzymatic computing. Biotechnol. Prog. 17, 553–559 (2001) 3. Lovell, C., Jones, G., Gunn, S.R., Zauner, K.-P.: Characterising enzymes for information processing: Towards an artificial experimenter. In: Calude, C.S., et al. (eds.) UC 2010. LNCS, vol. 6079, pp. 81–92. Springer, Heidelberg (2010)
Inference with DNA Molecules Alfonso Rodríguez-Patón, José María Larrea, and Iñaki Sainz de Murieta Universidad Politécnica de Madrid, Departamento de Inteligencia Artificial, Madrid, Spain [email protected] http://www.lia.upm.es
We have designed a simple model that implements the inference rules Modus Ponens and Modus Tollens using DNA. The model is inspired by [1,2], but is implemented using DNA strand displacement as a reinterpretation of [3,4]. Our work introduces two main differences from [1]: the absence of restriction enzymes, and an explicit representation of logical negation using a Watson-Crick complementary strand. The latter feature enables an implicit error-cancellation mechanism that leads to better robustness and scalability, allows bidirectional inference, and accepts negated propositions as valid outputs. The model also allows composition of implications, such as P → (Q → R). As this is equivalent to P ∧ Q → R, the model can represent rules with conjunctions in the antecedent.
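The gate's logical behaviour (with no strand-level detail) can be sketched as a rule P → Q answered bidirectionally, with negation represented explicitly, mirroring the Watson-Crick complement encoding described above:

```python
def infer(rule, fact):
    p, q = rule                       # the implication p -> q
    if fact == p:                     # Modus Ponens: from p, infer q
        return q
    if fact == ("not", q):            # Modus Tollens: from not-q, infer not-p
        return ("not", p)
    return None                       # the gate does not fire

rule = ("P", "Q")
print(infer(rule, "P"))            # 'Q'
print(infer(rule, ("not", "Q")))   # ('not', 'P')
```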
Fig. 1. The rule P → Q is encoded inside the circle. On the left (right) side, P (¬Q) is given as input and Q (¬P ) is inferred using Modus Ponens (Modus Tollens).
References 1. Ran, T., Kaplan, S., Shapiro, E.: Molecular implementation of simple logic programs. Nature Nanotech 4(10), 642–648 (2009) 2. Benenson, Y., Gil, B., Ben-Dor, U., Adar, R., Shapiro, E.: An autonomous molecular computer for logical control of gene expression. Nature (2004) 3. Seelig, G., Soloveichik, D., Zhang, D.Y., Winfree, E.: Enzyme-free nucleic acid logic circuits. Science 314(5805), 1585–1588 (2006) 4. Takahashi, K., Yaegashi, S., Kameda, A., Hagiya, M.: Chain reaction systems based on loop dissociation of DNA. In: Carbone, A., Pierce, N.A. (eds.) DNA 2005. LNCS, vol. 3892, pp. 347–358. Springer, Heidelberg (2006)
Research was partially supported by project BACTOCOM funded by grant from the European Commission, Spanish MICINN under project TIN2009-14421, Comunidad de Madrid and Universidad Polit´ecnica de Madrid.
A Network-Based Computational Model with Learning Hideaki Suzuki 1, Hiroyuki Ohsaki 2, and Hidefumi Sawai 1 1 National Institute of Information and Communications Technology, 588-2 Iwaoka, Iwaoka-cho, Nishi-ku, Kobe 651-2492, Japan 2 Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan
As is well known, from a nanoscopic point of view a natural neuron is made up of a huge number of biomolecules. A conventional ‘artificial neural network’ (ANN) [1] consists of nodes with static functions, but a more realistic model of the brain could be implemented with functional molecular agents that move around the neural network and change the neural functionality. One such network-based computational model with movable agents is ‘program-flow computing’ [4], wherein programs (agents) move from node to node and bring different functions to CPUs (nodes). This model is also closely related to the ‘active network’ [5], which enables a router (node) to take on various functions by delivering packets (agents) with encapsulated programs. Building on these previous studies, this paper proposes a novel network-based computational model named the “Algorithmically Transitive Network (ATN)”. The distinctive features of the ATN are: – [Calculation]. A program is represented by a ‘data-flow network’, as in the ‘dataflow computer’ [3]. During calculation, a node with an arithmetic/logic operation reads the input ‘tokens’ on its incoming edges, fires, and creates output tokens on its outgoing edges, causing forward propagation of the calculated data. – [Learning]. After the calculation, triggered by the teaching signals, the network propagates differential coefficients of the energy function backward [2,6] and adjusts node parameters with the steepest descent method. – [Topological Reformation]. The network topology (algorithm) can be modified or improved during execution by the programs carried by movable agents. To demonstrate the learning capability, the ATN is applied to several symbolic regression problems.
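The calculation/learning cycle can be sketched on a single arithmetic node: a token propagates forward through y = w·x, the teaching signal t triggers backward propagation of dE/dw for the quadratic energy E = (y − t)²/2, and w is adjusted by steepest descent. The values and learning rate below are illustrative, not taken from the ATN itself.

```python
def train(w, x, t, lr=0.1, steps=50):
    for _ in range(steps):
        y = w * x                 # forward token propagation through the node
        dE_dw = (y - t) * x       # backward propagation of dE/dw, E = (y-t)^2/2
        w -= lr * dE_dw           # steepest descent on the node parameter
    return w

w = train(w=0.0, x=2.0, t=6.0)
print(round(w, 3))  # converges toward 3.0, since 3.0 * 2.0 = 6.0
```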
References 1. Haykin, S.: Neural networks and learning machines. Prentice-Hall, Englewood Cliffs (2009) 2. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323, 533–536 (1986) 3. Sharp, J.A. (ed.): Data flow computing: Theory and practice. Ablex Publishing Corp., Norwood (1992) 4. Suzuki, H.: A network cell with molecular agents that divides from centrosome signals. BioSystems 94, 118–125 (2008) 5. Tennenhouse, D.L., Wetherall, D.J.: Towards an active network architecture. ACM Computer Communication Review 26(2), 5–18 (1996) 6. Werbos, P.J.: The roots of backpropagation: From ordered derivatives to neural networks and political forecasting. In: Adaptive and Learning Systems for Signal Processing. Communications and Control Series. Wiley-Interscience, Hoboken (1994)
Image Processing with Neuron-Like Branching Elements (POSTER) Hisako Takigawa-Imamura 1 and Ikuko N. Motoike 1,2
1 iCeMS, Kyoto University, Kyoto 606-8501, Japan 2 PRESTO, Japan Science and Technology Agency, Saitama 332-0012, Japan [email protected] Abstract. The dendritic shape of neurons may be responsible for functional characteristics of information processing in the brain. Since neurons have redundant inputs and excitability thresholds for their outputs, logic operations with neurons can be considered a kind of AND operation. Here, we propose two-dimensional excitable media with radial dendritic patterns that perform the AND operation under a time constraint. Further, we apply the dendritic elements to construct an image processing circuit. Investigating this bio-inspired computation may give insight into how combinations of AND operations process information. It is known that excitation waves fail to propagate from a narrow path into a broad area, as in a diode. Through this mechanism, called the curvature effect, a signal given on a terminal branch of a dendritic pattern disappears at the broad central area where the branches converge. However, when a sufficient number of neighboring branches receive signals at around the same time, the accumulated penetration of excitation at the central area can revive the excitation waves, leading to signal transmission to the unstimulated side of the dendrite. By computer simulation, we determine the conditions on the excitation properties and the geometry of the dendrites, including branching frequencies, for the curvature effect and for the deep penetration of excitation waves. To describe dendritic pattern formation, we adopt a cellular automaton model of self-organization, with which a multilayer single-electron device generating dendritic patterns has been proposed [1]. In the present study, we design a circuit in which a number of random dendrites are tiled. One-dimensionalized signals of image information are fed in from one side of a queue of dendrites, and signal outputs are detected on the other side.
Each element receives data from several pixels via the terminals of its branches, and only elements in areas of dense signals can output signals, resulting in a function similar to feature characterization of the image. Interaction between adjacent elements tends to facilitate clustering in the output image. We will discuss the correlation between the spatial features of input/output images and the geometry of the circuits.
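The time-constrained AND behaviour of one dendritic element can be sketched as follows: excitation survives the curvature effect at the centre only if enough branches are stimulated within a short time window of each other. The threshold and window values below are illustrative, not measured from the simulations.

```python
def element_fires(arrival_times, threshold=3, window=2):
    """arrival_times: time-step at which each stimulated branch's wave
    reaches the central area (one entry per stimulated branch)."""
    times = sorted(arrival_times)
    # Look for `threshold` arrivals within `window` time-steps of each other.
    for i in range(len(times) - threshold + 1):
        if times[i + threshold - 1] - times[i] <= window:
            return True               # enough near-simultaneous excitation
    return False                      # waves die at the broad central area

print(element_fires([5, 5, 6]))       # True: three arrivals within 2 steps
print(element_fires([0, 4, 9]))       # False: arrivals too spread out
```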
Reference 1. Oya, T., Motoike, I.N., Asai, T.: Single-electron circuits performing dendritic pattern formation with nature-inspired cellular automata. Int. J. Bifurcat. Chaos 17, 3651–3655 (2007)
Author Index
Adamatzky, Andrew 93
Agadzanyan, Ruben 11, 164
Akita, Junichi 190
Alhazov, Artiom 21, 45
Aman, Bogdan 187
Amari, Shun-ichi 1
Aono, Masashi 69
Bath, Jon 189
Cardelli, Luca 2
Chatelin, Françoise 3
Ciobanu, Gabriel 187
Costa, José Félix 6
Dinneen, Michael J. 32
Dzelme-Bērziņa, Ilze 188
Everitt, Mark S. 152
Freivalds, Rūsiņš 11, 115, 164
Freund, Rudolf 21
Genot, Anthony J. 189
Gunn, Steve R. 81
Hara, Masahiko 69
Imai, Katsunobu 45
Iwabuchi, Yuuki 190
Jones, Gareth 81, 191
Jones, Martin L. 152
Kendon, Viv M. 152
Kim, Kyoung Nan 56
Kim, Song-Ju 69
Kim, Yun-Bum 32
Lāce, Lelde 115
Larrea, José María 192
Lieberman, Marya 56
Lovell, Chris 81, 191
Margenstern, Maurice 93
Mark, Lesli 56
Martínez, Genaro J. 93
Mischenko-Slatenkova, Taisia 140
Morgan, Hywel 191
Morita, Kenichi 21, 93
Motoike, Ikuko N. 194
Nicolescu, Radu 32
Ohkubo, Jun 105
Ohsaki, Hiroyuki 193
Rodríguez-Patón, Alfonso 192
Sainz de Murieta, Iñaki 192
Sarveswaran, Koshala 56
Sawai, Hidefumi 193
Say, A.C. Cem 164
Scegulnaja-Dubrovska, Oksana 115
Suzuki, Hideaki 193
Tadaki, Kohtaro 127
Takigawa-Imamura, Hisako 194
Turberfield, Andrew J. 189
Vasilieva, Alina 140
Wagner, Rob C. 152
Yakaryılmaz, Abuzer 164
Zauner, Klaus-Peter 81, 191
Zhang, Liang 175