FLSI Soft Computing Series — Volume 6
Brainware: Bio-Inspired Architecture and its Hardware Implementation
Editor: Tsutomu Miki
Fuzzy Logic Systems Institute (FLSI)
Brainware: Bio-Inspired Architecture and its Hardware Implementation
Fuzzy Logic Systems Institute (FLSI) Soft Computing Series
Series Editor: Takeshi Yamakawa (Fuzzy Logic Systems Institute, Japan)

Vol. 1: Advanced Signal Processing Technology by Soft Computing, edited by Charles Hsu (Trident Systems Inc., USA)
Vol. 2: Pattern Recognition in Soft Computing Paradigm, edited by Nikhil R. Pal (Indian Statistical Institute, Calcutta)
Vol. 3: What Should be Computed to Understand and Model Brain Function? — From Robotics, Soft Computing, Biology and Neuroscience to Cognitive Philosophy, edited by Tadashi Kitamura (Kyushu Institute of Technology, Japan)
Vol. 4: Practical Applications of Soft Computing in Engineering, edited by Sung-Bae Cho (Yonsei University, Korea)
Vol. 5: A New Paradigm of Knowledge Engineering by Soft Computing, edited by Liya Ding (National University of Singapore)
FLSI Soft Computing Series — Volume 6
Brainware: Bio-Inspired Architecture and its Hardware Implementation
Editor
Tsutomu Miki Kyushu Institute of Technology, Japan
World Scientific
Singapore • New Jersey • London • Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd. P O Box 128, Farrer Road, Singapore 912805 USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
BRAINWARE: BIO-INSPIRED ARCHITECTURE AND ITS HARDWARE IMPLEMENTATION FLSI Soft Computing Series — Volume 6 Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-02-4547-5
Printed in Singapore by Fulsland Offset Printing
Series Editor's Preface
The IIZUKA conferences originated from the Workshop on Fuzzy Systems Application held in 1988 in Iizuka, a small city in the center of Fukuoka Prefecture on Kyushu, the southernmost of Japan's main islands, a city famous for coal mining until about forty years ago and now redeveloped as a science research park. The first IIZUKA conference was held in 1990, and the conference has been held every two years since. This series of conferences has played an important role in modern artificial intelligence. The workshop in 1988 proposed the fusion of fuzzy concepts and neuroscience, and this proposal encouraged research on neuro-fuzzy systems and fuzzy neural systems, producing significant results. The conference in 1990 was dedicated to the special topic of chaos, and nonlinear dynamical systems came into the interests of researchers in the field of fuzzy systems. The fusion of fuzzy, neural and chaotic systems was familiar to the conference participants in 1992. This new paradigm of information processing, including genetic algorithms and fractals, has spread over the world as "Soft Computing". The Fuzzy Logic Systems Institute (FLSI) was established in 1989, under the supervision of the Ministry of Education, Science and Sports (MOMBUSHOU) and the Ministry of International Trade and Industry (MITI), for the purpose of proposing brand-new technologies, collaborating with companies and universities, educating university students in soft computing, and so on. FLSI is the major organization promoting the so-called IIZUKA Conference, and this series of books edited from the IIZUKA Conferences is therefore named the FLSI Soft Computing Series. The Soft Computing Series covers a variety of topics in Soft Computing and anticipates the emergence of post-digital intelligent systems.
Takeshi Yamakawa, Ph.D. Chairman, IIZUKA 2000 Chairman, Fuzzy Logic Systems Institute
Volume Editor's Preface

The brain is the ultimate intelligent processor: it processes the huge amount of information acquired from the sensory systems in a flash and comes up with the most appropriate answer immediately. Furthermore, the human brain can handle ambiguous and uncertain information adequately. An implementation of such human-brain architecture and function is what we call "Brainware". Brainware is a candidate for the new tool that realizes a human-friendly computer society, rather than a computer-friendly human society. To make this tool practical, silicon implementation of Brainware is indispensable. The enormous hardware capacity of silicon VLSI technology offers the potential to implement very complex systems on a chip. However, a microprocessor based on even today's VLSI technology is insufficient for silicon implementation of Brainware, because its architecture is inherently different from that of bio-systems. To realize Brainware on a chip, a new hardware paradigm is needed. One of the new streams is so-called "bio-inspired" hardware, to which principles and mechanisms of bio-systems have been applied. In this book, hardware implementing bio-inspired systems is discussed in terms of devices, architecture and systems. The book consists of eight enriched versions of papers selected from IIZUKA'98. First, several silicon realizations of nerve function are introduced. In Chap. 1, a new functional device with multiple inputs, the neuron MOS (vMOS), is described. The vMOS works like the McCulloch & Pitts model and turns on when the weighted sum of the input signals exceeds the threshold voltage of the device. In this chapter, an association processor architecture based on a psychological brain model is proposed and implemented in a vMOS hardware-computing scheme. Chap. 2 presents the realization of nerve function using new device technology. The neuron circuit is composed of a metal-ferroelectric-semiconductor FET (MFSFET) and a complementary unijunction circuit; it realizes an adaptive learning function by modulating its output frequency. In Chap. 3, a PWM approach is introduced as an effective signal transformation in neural systems. From the viewpoint of designing large networks, signal transformation, simple weight circuits and nonlinear function circuits are
discussed. Since visual information processing must handle enormous amounts of data, real-time processing is difficult with ordinary computing methods. So, in the following three chapters, neuromorphic vision systems are discussed. In Chap. 4, methodologies for application-specific design of vision circuits based on bio-inspired architecture are introduced. Complex visual processing, such as image processing, eye-tracking and visual inspection, is described. Chap. 5 presents simple MOS circuits with a correlation model based on insect motion detectors, aiming at realizing motion sensing. This model offers real-time computation of the optical flow. Chap. 6 focuses on morphological picture processing, which requires a long computing time with ordinary methods. A cellular automaton using vMOS is proposed for its real-time processing; in this chapter, noise reduction, edge detection, thinning and shrinking are explained in detail. The human brain, from a different viewpoint, can be regarded as a huge-scale dynamical complex system. One of the keys to explicating its function is to investigate the chaotic behavior found in bio-systems in order to implement it in hardware. Thus, Chap. 7 describes how to create such chaotic dynamics using simple circuitry. According to recent neuroscience research, a neuron is expected to process information efficiently by using the complex physiological properties of its dendrites. In the last chapter, the computational function of neuronal dendrites is explored based on a mathematical model. This model describes the dendrites' essential responsiveness and offers an effective design for hierarchical large-scale neural networks. I hope that this book will arouse interest and be of help to researchers working in the field of Brainware. I would like to express sincere thanks to Prof. Tadashi Shibata for his contribution in organizing the "Bio-inspired Hardware" sessions at IIZUKA'98.
Tsutomu Miki Volume Editor Iizuka, Japan March, 2000
Contents

Series Editor's Preface ... v
Volume Editor's Preface ... vii
Chapter 1: Neuron MOS Transistor: The Concept and Its Application (Tadashi Shibata) ... 1
Chapter 2: Adaptive Learning Neuron Integrated Circuits Using Ferroelectric-Gate FETs (Sung-Min Yoon, Eisuke Tokumitsu, Hiroshi Ishiwara) ... 33
Chapter 3: An Analog-Digital Merged Circuit Architecture Using PWM Techniques for Bio-Inspired Nonlinear Dynamical Systems (Takashi Morie, Makoto Nagata, Atsushi Iwata) ... 61
Chapter 4: Application-Driven Design of Bio-Inspired Low-Power Vision Circuits & Systems (Andreas König, Jan Skribanowitz, Michael Eberhardt, Jens Döge, Thomas Knobloch) ... 89
Chapter 5: Motion Detection with Bio-Inspired Analog MOS Circuits (Hiroo Yonezu, Tetsuya Asai, Masahiro Otani, Naoki Ohshima) ... 123
Chapter 6: vMOS Cellular-Automaton Circuit for Picture Processing (Masayuki Ikebe, Yoshihito Amemiya) ... 135
Chapter 7: Semiconductor Chaos-Generating Elements of Simple Structure and Their Integration (Koichiro Hoh, Tatsuo Tsujita, Takahiro Irita, Yuichiro Aihara, Jun-ya Irisawa, Akira Imamura, Minoru Fujishima) ... 163
Chapter 8: Computation in Single Neuron with Dendritic Trees (Norihiro Katayama, Mitsuyuki Nakao, Mitsuaki Yamamoto) ... 179
Appendix A ... 197
About the Authors ... 207
Keyword Index ... 229
Chapter 1 Neuron MOS Transistor: The Concept and Its Application Tadashi Shibata The University of Tokyo
Abstract

A multiple-input transistor has been developed by a simple modification of the regular MOSFET structure. The transistor turns on when the weighted sum of the input signals exceeds the threshold voltage of the device. Due to its functional similarity to the McCulloch and Pitts model of a neuron [1], it is named the neuron MOS transistor, or vMOS (neuMOS) for short. Such a functionality enhancement in the elementary device has had a great impact on the way circuits and systems are constructed. In applications to binary logic circuits, the number of transistors and interconnects has been remarkably reduced. The concept of soft hardware, i.e., real-time reconfigurable logic gates, has also been developed using vMOS. Several hardware-computing schemes have been developed in which algorithms are carried out directly in circuits using vMOS as key elements. This allows real-time response of a system in real-world data processing. In order to implement intelligent systems on silicon, the association processor architecture has been developed based on a psychological brain model and implemented in the vMOS hardware-computing scheme. Applications of the association processor architecture to recognition problems as well as to practical problems are presented.

Keywords: neuron MOS transistor, soft hardware logic, hardware algorithm, analog/digital merged processing, psychological brain model, association processor, winner-take-all, analog EEPROM, vector quantization
1.1 Introduction

Over the past several decades the progress of semiconductor technology, as represented by the number of components on a chip, has grown exponentially following Moore's Law [2]. It has dramatically impacted digital computer technology, and we can now enjoy the supercomputer performance of the eighties on our laptop PCs. Present-day computers are dedicated machines for ultra-fast numerical calculations. Although their
computing powers are enormous, they are not very good at tasks like seeing, recognizing, and taking immediate action, even though such tasks are effortless for humans, and for biological systems in general. A question arises whether such a performance gap can be narrowed just by increasing the clock frequencies of MPUs and the integration densities of memories, and by further sophistication in software programs. The scaling of device dimensions and the resultant enhancement in integration density and speed performance have been the sole success scenario of silicon technology. However, this scenario is encountering the fundamental limitations of material properties and device physics [3] as well as severe economic issues. New paradigms in computing are now in critical demand. This article proposes an approach to the problem based on functionality enhancement in an elementary device. This allows some elemental information processing to be conducted at the very hardware level, thus lessening the burden on software in the total system. In addition, the hardware cost is greatly reduced because elemental computations are carried out in very simple electronic circuits. The concept of such a hardware-computation scheme has been extended to a hardware-intensive recognition system based on a psychological brain model. The association processor architecture, a hardware maximum-likelihood search engine, has been developed for such recognition systems. In §1.2, a naive comparison between electronic systems and biological systems is made. The concept of vMOS is introduced in §1.3 and its application to binary logic circuits is presented in §1.4. As an example of hardware computation using vMOS, a center-of-mass tracker circuit is described in §1.5. In §1.6, a psychological brain model is presented and the association processor architecture is described as its prototype hardware implementation.
Applications of the architecture to some practical problems are presented in §1.7 and concluding remarks are given in §1.8.

1.2 Bio Processing vs. Electronic Processing

A comparison between the biological computing system and the electronic system is given in Fig. 1.1. A frog finds a fly passing by and catches it in a moment. This action is produced by a series of information-processing steps carried out within this small creature. The processing would include the capture
of the fly's image on the retina, identification of the object as food, and computation of its expected motion, followed by the activation of motor neurons to catch the fly. Such a real-time action, however, is impossible even with the most advanced supercomputers. The switching speed of a short-channel transistor, the very basic element of a computer, is about 10^9 times faster than its biological counterpart. Signals on metal interconnects travel at about the speed of light (~10^8 m/sec), while nerve impulses propagate at only 2~3 m/sec in the brain. Why, with such an overwhelming speed advantage, is real-time response not possible in electronic systems? One of the major reasons, we believe, is the difference in the functionality of an elementary device. The transistor is a simple switch, while the neuron is a multiple-input thresholding device. The difference does not seem very great at this level, but produces a big difference at the system level. Functionality enhancement in a transistor could be a solution. In the next section we describe a functional device which mimics the basic behavior of a neuron at the single-transistor level.

Fig. 1.1 Comparison between the bio-computing system and the electronic system.
1.3 Neuron MOS and Elemental Logic Gate
The concept of the Neuron MOS Transistor (neuMOS, or vMOS for short) [4] is shown in Fig. 1.2. The floating-gate potential is determined by the multiple input signals via capacitive coupling and controls the on and off states of the transistor. Due to its functional similarity to the neuron model [1], the device bears the name. Applications of vMOS to binary digital circuits [5-9], real-time reconfigurable logic gates [5, 7], self-learning neural networks [10, 11], image processing [12-14], and analog multipliers [15] have been demonstrated. Fig. 1.3(a) shows the basic logic gate using vMOS. It is a regular CMOS inverter having a common floating gate whose potential is determined by capacitive coupling with multiple input terminals. The on and off states of the inverter are determined by whether the common-gate potential exceeds the inversion threshold or not. The common floating gate need not necessarily be electrically floating in the sense of EPROM and EEPROM floating gates, but can be refreshed in synchrony with the clock signal, as depicted in Fig. 1.3(b) for instance. By introducing the well-known autozeroing technique of chopper comparators, the accuracy of the thresholding operation is enhanced. This also allows the subtraction operation to be carried out directly on the floating gate, further enhancing the functionality of an elemental gate. In order to avoid the increase in power dissipation due to the DC pass current flowing during autozeroing, new circuit schemes have been developed [16-19].
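The weighted-sum-and-threshold behavior described above can be sketched in a few lines. The capacitance values, input voltages and threshold below are illustrative assumptions, not values from the chapter.

```python
# Sketch of the vMOS principle: the floating-gate potential is the
# capacitance-weighted average of the input voltages, and the transistor
# turns on when that potential exceeds its threshold (cf. McCulloch & Pitts).

def floating_gate_potential(voltages, capacitances, c0=0.0):
    # phi_F = sum(Ci * Vi) / (C0 + sum(Ci)); C0 is the floating-gate-to-ground
    # capacitance, neglected (c0 = 0) unless specified.
    return sum(c * v for c, v in zip(capacitances, voltages)) / (c0 + sum(capacitances))

def vmos_on(voltages, capacitances, vth, c0=0.0):
    # The device conducts when the floating-gate potential exceeds VTH.
    return floating_gate_potential(voltages, capacitances, c0) > vth

# A two-input vMOS with equal coupling capacitors turns on when the
# average of the two inputs exceeds the threshold:
print(vmos_on([5.0, 0.0], [1.0, 1.0], vth=2.0))  # average 2.5 V > 2.0 V -> True
```

With unequal capacitors the same function models an arbitrarily weighted sum, which is the property the binary-logic applications below exploit.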
[Fig. 1.2 schematic: inputs V1, V2, ... couple capacitively to a common floating gate between source and drain; the transistor turns on when φF > VTH.]
Fig. 1.2 Concept of neuron MOSFET (neuMOS or vMOS for short).
[Fig. 1.3 panels (a)-(c): binary decision ("1" or "0") following analog computation on the (temporary) floating gate.]
Fig. 1.3 The vMOS logic gate is composed as a CMOS inverter having a common floating gate capacitively coupled to multiple input terminals.

As shown in Fig. 1.3(c), the circuit receives either analog or digital signals as inputs and the computation is carried out in the analog domain through charge sharing among the multiple input capacitors on the floating gate. The computation is immediately followed by the inverter action and is given a "Yes-or-No" binary decision. This is the principle of analog/digital-merged processing, implemented at the very elemental gate level. We believe this is an important scheme in dealing with real-world data, because a large part of the voluminous analog data is discarded and only the essence is
Fig. 1.4 Measured I-V characteristics of a two-input vMOS. By applying VA = −5~5 V to the control terminal, the apparent threshold of the transistor as seen from the VG terminal is arbitrarily altered.
retained. Moreover, this differs from conventional analog processing in that the noise and errors produced in each analog processing step are discarded by the binary decision at each stage and do not propagate to the following stages to accumulate into a large error.
1.4 Application to Binary Logic Gate

Fig. 1.4 shows a vMOS having two input terminals with identical coupling capacitors. The operation of the device is very simple: it turns on when the average of the two inputs exceeds the threshold. However, if we look at one of the terminals as a gate terminal and the other as a control terminal, the apparent threshold as seen from the gate is controlled by the other, as demonstrated in the
[Fig. 1.5 schematic: pre-inverter and main inverter; VP takes the values 1/8 VDD, 3/8 VDD, 5/8 VDD and 7/8 VDD for (X1, X2) = (0, 0), (0, 1), (1, 0) and (1, 1), respectively.]
Fig. 1.5 vMOS logic gate representing Exclusive OR. VP (principal variable) takes the multilevel values shown in the insert according to the binary inputs X1 and X2. The ratios of the coupling capacitors are indicated by the fractional numbers 1/2, 3/8, and 1/8. On the right is a floating-gate potential diagram representing φF (the floating-gate potential) as a function of VP. Here the effect of the floating-gate-to-ground capacitance C0 is neglected for simplicity. In order to include the effect of C0, the ordinate needs to be multiplied by γ, as in γVDD and (1/2)γVDD etc., where γ = (C1 + C2 + C3) / (C0 + C1 + C2 + C3), with C1, C2, C3 being the three input gate capacitors.
figure. If we apply a large positive voltage to the control terminal, for instance, it is easy to turn on the transistor from the gate terminal, because the floating-gate potential is already boosted by the control terminal, making it behave as a depletion-mode transistor. If a negative bias is applied, it behaves as a high-threshold-voltage enhancement-mode transistor. The variable-threshold nature of vMOS plays an essential role in introducing flexibility into the function of electronic circuits. A typical form of a binary logic gate implemented by vMOS technology is illustrated in Fig. 1.5. VP (called the principal variable) is an input variable taking multiple-level values according to the binary inputs X1 and X2, as indicated in the figure. It should be noted that the conversion of X1 and X2 to VP does not need any special circuitry, because it is easily done by directly transferring X1 and X2 to the vMOS floating gate via coupling capacitors having the ratio 2:1, as is the case in Fig. 1.6 (for design details, see Refs. [5, 6]). However, the explanation below is given using VP for simplicity of discussion. The circuit in Fig. 1.5 is an Exclusive OR gate, and the diagram on the right represents the floating-gate potential φF as a function of VP increasing from 0 V to VDD. Assume for the moment that the input terminal of "3/8" is grounded and only VP on the "1/2" terminal is increasing. The contribution of VP to φF via the "1/2" input terminal is represented by the shaded triangle in the diagram on the
Fig. 1.6 Soft hardware logic circuit for two binary inputs X1 and X2, composed of three pre-inverters and a main inverter (all vMOS inverters). Any of all 16 possible Boolean functions can be specified by the three control signals VA, VB, VC [5, 6].
right. Since the coupling capacitance of the gate is 1/2 of the total capacitance, φF never exceeds VDD/2, the CMOS inverter threshold. Namely, the direct VP input alone cannot flip the inverter, and the contribution from the indirect input via the pre-inverter is essential for logic operation. If the pre-inverter has an inversion threshold of 3/4 VDD, the overall variation of φF is the one shown in the diagram. (The contribution from the pre-inverter output, indicated by the parallelepiped, is superimposed on the triangular contribution from the direct VP input.) φF exceeds the inverter threshold (VDD/2) at VP = 3/8 VDD and 5/8 VDD, namely at (X1, X2) = (0, 1) and (1, 0). This means that the circuit works as an Exclusive OR. Any Boolean logic function can be implemented in a similar circuit configuration consisting of a main inverter and pre-inverter(s). The number of pre-inverters and their inversion thresholds, as well as their coupling capacitances, are determined according to the specific function to be implemented in the circuit. The essential feature of the circuit is the way VP couples to the main inverter: directly to the main inverter and indirectly via the pre-inverter(s).
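The Exclusive OR behavior can be verified with a small numerical model. The VP levels, the capacitance ratios (1/2 and 3/8) and the pre-inverter threshold of 3/4 VDD follow the text; treating the main inverter's thresholding decision itself as the gate output, and neglecting C0, are simplifying assumptions of this sketch.

```python
# Behavioral model of the Fig. 1.5 XOR gate (C0 neglected, ideal inverters).
VDD = 1.0

def inverter(v_in, vth):
    # Ideal CMOS inverter: output high when the input is below the
    # inversion threshold, low otherwise.
    return VDD if v_in < vth else 0.0

def vmos_xor(x1, x2):
    # Principal variable VP: 1/8, 3/8, 5/8, 7/8 VDD for
    # (X1, X2) = (0,0), (0,1), (1,0), (1,1).
    vp = (1 + 4 * x1 + 2 * x2) / 8 * VDD
    # Pre-inverter with inversion threshold 3/4 VDD drives the main floating gate.
    pre = inverter(vp, 0.75 * VDD)
    # Floating-gate potential: direct coupling 1/2, pre-inverter coupling 3/8.
    phi_f = 0.5 * vp + 0.375 * pre
    # phi_F exceeds VDD/2 exactly for the XOR-true input pairs.
    return 1 if phi_f > 0.5 * VDD else 0

for a in (0, 1):
    for b in (0, 1):
        print((a, b), vmos_xor(a, b))  # (0,0)->0, (0,1)->1, (1,0)->1, (1,1)->0
```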
Fig. 1.7 Demonstration of soft hardware logic circuit operation for two binary inputs X1 and X2. Real-time alteration among five Boolean functions is shown by measured results [6]. The slow operation is due to the capacitive loading of direct probing on the unbuffered output terminal.
If the inversion thresholds of the pre-inverters are made alterable by control signals, using the variable-threshold characteristics of vMOS, so-called Soft Hardware Logic (real-time reconfigurable logic gates) is implemented [5, 7]. Fig. 1.6 shows the two-input soft hardware logic representing all 16 Boolean functions, in which the selection is conducted by the three control signals VA, VB, and VC. Measured waveforms are demonstrated in Fig. 1.7 along with the corresponding floating-gate potential diagrams. The detailed design procedure is described in Refs. [5, 6]. The characteristic feature of vMOS binary logic gates is a remarkable reduction in transistor count and number of interconnects as compared to a regular CMOS implementation. However, due to the multivalued nature of its logic operation, noise margins as well as DC pass currents arising from the floating-gate bias in the transition region need to be taken into careful consideration. Several new circuit schemes have been developed for vMOS to deal with these issues [8, 16-19].

1.5 Example of Hardware Algorithm: Center of Mass Circuit

Detecting the center of mass (COM) of the image captured on a 2D image sensor is important in target tracking. In this section, it is demonstrated that very simple vMOS circuitry can do this task [20, 22]. This is an example illustrating the concept of the vMOS hardware-computing scheme. Computation is based on quasi-two-dimensional processing, where the pixel data are projected onto the horizontal and vertical axes and then processed in parallel by a linear array
Fig. 1.8 Calculation of center of mass (XG) based on pixel data projected onto x and y axes.
of elemental circuits. The calculation of the COM is then reduced to finding XG satisfying the equation XG · ΣMi = ΣXiMi, as shown in Fig. 1.8. The vMOS circuit solving this equation is shown in Fig. 1.9. In the first stage there are two complementary vMOS source followers (a push-pull configuration of depletion-mode NMOS and PMOS sharing a common floating gate). The top source follower calculates ΣXiMi. Here Mi is represented by Yi, the pixel-data sum in the i-th column, and Xi by the magnitude of the coupling capacitor of the i-th input terminal. ΣMi is calculated by the bottom source follower, which has all identical coupling capacitors. In the second stage are vMOS inverters, each having a different input-capacitance ratio for the ΣXiMi and ΣMi inputs, proportional to the location Xi. The inverters turn on where the inequality Xi · ΣMi > ΣXiMi holds. Therefore the location of the COM is identified as the 0-to-1 transition point in the inverter array. In this manner, the substantial computation is carried out in the analog domain, but the result is represented by a digital flag. This is an example of the analog/digital-merged computation. Fig. 1.10 shows a photomicrograph of the fabricated test circuit and the measured results in comparison with simulation. The accuracy is limited by the smallest coupling capacitance representing the location. The problem was solved by introducing a two-step computation scheme [20]. A solution to the problem of nonlinearity in the source-follower characteristics is also described in Ref. [20].

Fig. 1.9 vMOS circuit solving the equation XG · ΣMi = ΣXiMi. The solution is given by a digital flag appearing at the location of XG.

Fig. 1.10 Fabricated test circuit of that shown in Fig. 1.9. Measurement results are shown in comparison with simulation.
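The projection-and-compare scheme can be paraphrased in software. Indexing the columns by integers and using a ≥ comparison at the transition point are modeling choices of this sketch, not details from the chapter; the input list stands for the projected column sums Mi.

```python
# Behavioral model of the Fig. 1.9 center-of-mass circuit.
def center_of_mass_flags(column_sums):
    # First stage (two source followers): weighted sum and plain sum of the
    # projected pixel data Mi.
    sum_m = sum(column_sums)
    sum_xm = sum(i * m for i, m in enumerate(column_sums))
    # Second stage: the inverter at location Xi turns on (flag = 1) where
    # Xi * sum(Mi) >= sum(Xi * Mi); the COM appears as the 0-to-1 transition.
    flags = [1 if i * sum_m >= sum_xm else 0 for i in range(len(column_sums))]
    return flags, flags.index(1)

flags, xg = center_of_mass_flags([0, 0, 4, 0, 0])  # all mass in column 2
print(flags, xg)  # [0, 0, 1, 1, 1] 2
```

When the true COM falls between two columns, the flag appears at the first column at or beyond it, mirroring the finite resolution set by the smallest coupling capacitance in the real circuit.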
Fig. 1.11 "Seeing" is not mere imaging of objects onto the retina but the recalling of past memory triggered by the stimuli produced on the retina.
"Seeing and recognizing objects" is a very intelligent function of our brains. What, then, does "seeing" mean? "Seeing" is not mere optical imaging of objects onto the retina screen; rather, memorized images in the brain are recalled with their full richness of detail, triggered by the stimuli produced on the retina (Fig. 1.11). Reconstruction of a three-dimensional structure from a two-dimensional image on the retina is known as an ill-posed problem because a unique solution cannot be derived. Does the brain manipulate sophisticated mathematical equations to draw conclusions? Probably this is not the case. We recognize the spherical form of an illustrated soccer ball based on our past experience. Recalling past memory in immediate response to current sensory inputs is the very basis of recognition. Based on this postulate, a psychological brain model so to speak, we are tackling the subject of building "intelligent" electronic systems on silicon [21]. Our hardware recognition model is schematically illustrated in Fig. 1.12 [22]. An image captured on a two-dimensional pixel array is represented by a characteristic vector composed of a relatively small number of scalar quantities. Then the association processor performs a parallel search for the most similar vector in the vast memory where past experience is stored in the form of
[Fig. 1.12 block diagram: sensor with vMOS processor → characteristic vector → association processor with vast memory → winner-take-all → maximum-likelihood event.]
Fig. 1.12 Hardware recognition system based on the psychological brain model in Fig. 1.11.
template vectors. The association is conducted by calculating the distances between the input code vector and the stored template vectors and searching for the minimum-distance vector with a winner-take-all (WTA) circuit [23]. In building such systems, the analog/digital-merged computation scheme using vMOS circuits is employed as a guiding principle.
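The distance calculation and minimum search can be sketched as follows. The Manhattan metric and the WTA flag output follow the text; the template values are made-up examples, and modeling the ramp-down WTA as a simple argmin is an idealization of the circuit dynamics.

```python
# Behavioral model of the association step: matching cells + WTA.
def manhattan(x, z):
    # Matching-cell array: accumulate |Xi - Zi| over the vector elements.
    return sum(abs(a - b) for a, b in zip(x, z))

def associate(x, templates):
    # Dissimilarity of the input vector to every stored template vector.
    distances = [manhattan(x, z) for z in templates]
    # WTA: as the common voltage ramps down, the inverter fed the smallest
    # distance switches first; modeled here as an argmin flag.
    winner = min(range(len(distances)), key=distances.__getitem__)
    flags = [1 if i == winner else 0 for i in range(len(distances))]
    return winner, flags

templates = [[0, 0, 0], [5, 5, 5], [9, 0, 9]]   # illustrative stored vectors
winner, flags = associate([4, 6, 5], templates)
print(winner, flags)  # 1 [0, 1, 0]
```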
1.6.2
Non-Volatile Vast Memory Technology
In conducting the analog/digital-merged computation, storage of analog or multivalued data is essential. In our recognition system, the mass storage of knowledge in the form of analog template vectors is particularly important. For this reason, a high-precision analog EEPROM technology [24] has been developed. The chip does not require time-consuming write/verify cycles [25], and is thus compatible with real-time knowledge capture. The memory-cell structure is shown in Fig. 1.13(a). It is a regular floating-gate EEPROM cell, but the tunneling electrode was made floating and the programming voltage is applied through a capacitor, while the main control electrode coupled to the floating gate is grounded. By making the control-electrode capacitance large enough compared to the tunnel-oxide capacitance, the floating-gate potential becomes dependent only upon the net charge on the floating gate. The memory content (the floating-gate potential) during data writing is monitored in real time through the source-follower action of the memory transistor. When the memory-cell content reaches the target value, the vMOS comparator turns the control transistor on and terminates data writing. The comparator is composed of multiple inverters (Fig. 1.13(b)). The target value is memorized in the vMOS inverter during autozeroing. Positive feedback to one of the vMOS input terminals stabilizes the operation.

Fig. 1.13 Analog EEPROM cell with real-time writing control.

Fig. 1.14 Measured operation of the analog EEPROM cell (in Fig. 1.13) during data writing.

Measured waveforms of the memory cell during writing are demonstrated in Fig. 1.14. As a high programming voltage is applied to the tunneling electrode, the memory value increases due to electron extraction from the floating gate. When the memory content arrives at the target value, the comparator turns on and terminates data writing. Improvements were made to the cell structure shown in Fig. 1.13(a) to enhance the accuracy of data read/write. The source-follower readout was replaced by an op-amp voltage-follower circuit in which the memory transistor is incorporated as one of the pair transistors in the differential pair. A tight cell layout resulted in a higher cell density. Furthermore, merging the analog EEPROM transistors into the matching-cell circuitry (absolute-value circuit) is
Neuron MOS Transistor: The Concept and Its Application
15
under study.

1.6.3 vMOS Association Processor

The architecture of the vMOS association processor is shown in Fig. 1.15, where X is an input vector and A-Z are template vectors downloaded from the vast memory. At each matching cell, the absolute value of the difference |Xi - Zi| is calculated, transferred to the floating gate of a vMOS source follower, and accumulated. The output of the vMOS source follower therefore yields the Manhattan distance, the dissimilarity measure between the input vector and the template vector. The WTA is composed of vMOS inverters having two equally weighted inputs. At time t = 0, all vMOS inverters are in the on state. This is because VDD is fed to one of the inputs and a non-zero distance value to the other, thus biasing the inverter above the threshold of VDD/2. When the common voltage is ramped down, the vMOS inverter receiving the smallest distance value turns off first. At this moment, the feedback loop in each inverter is closed and the state of the inverter is frozen. The location of the smallest distance vector is identified by a flag appearing at the off-state inverter.
Fig. 1.15 Architecture of vMOS association processor. Manhattan distances are calculated by the matching cell array and the minimum distance vector is searched by a winner-take-all circuit.
Substantial computation is conducted by analog processing, which is immediately followed by a binary decision. This analog/digital-merged decision making is an essential feature of the vMOS circuitry.
The operation of the absolute value circuit [26] is explained in Fig. 1.16. The circuit is composed of two floating-gate NMOS transistors connected at their source terminals. While V1 and V2 are fed to the input terminals, the floating gates are first grounded (1) and then disconnected from the ground to make them electrically floating (2). Then the input voltages are exchanged as shown in (3), resulting in floating-gate voltages of V2 - V1 on the left and V1 - V2 on the right. Here the floating-gate-to-ground capacitance is assumed to be negligibly small compared to the input-gate-to-floating-gate capacitance for simplicity of explanation. When the source-follower operation is activated as shown in (4), the output follows the larger of V2 - V1 or V1 - V2, namely |V1 - V2| (Here it
Fig. 1.16 Operation principle of absolute value circuit.
is also assumed that the NMOS threshold ≈ 0.)
Fig. 1.17 (a) Photomicrograph of a test circuit of vMOS association processor. (b) Measurement results of the test circuit.
Fig. 1.17 demonstrates a photomicrograph of a test circuit and the measurement results. An input vector of three components was compared with eight template vectors A-H. Although there is no template pattern exactly matching the input, the circuit automatically recalls pattern C as the most similar to the input.
Fig. 1.18 Two types of absolute value circuit for matching cell: (a) data are downloaded from the analog EEPROM (13 transistors per cell); (b) data are embedded in vMOS transistors in the cell (5 transistors per cell).
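In software terms, the matching-cell array and WTA described in this section amount to a nearest-template search under the Manhattan (city-block) distance. The following behavioral sketch is illustrative only; the function name and the toy vectors are ours, not part of the chip:

```python
import numpy as np

def associate(x, templates):
    """Behavioral model of the vMOS association processor: each matching
    cell computes |x_i - z_i|, the source follower accumulates them into
    a Manhattan distance, and the WTA flags the smallest distance."""
    x = np.asarray(x, dtype=float)
    distances = np.abs(templates - x).sum(axis=1)  # one Manhattan distance per template
    winner = int(np.argmin(distances))             # winner-take-all
    return winner, distances

# Toy example: a 3-element input against four stored template vectors.
templates = np.array([[0.0, 0.5, 1.0],
                      [1.0, 1.0, 1.0],
                      [0.2, 0.4, 0.9],
                      [0.9, 0.1, 0.0]])
winner, d = associate([0.25, 0.45, 0.95], templates)
print(winner)   # -> 2 (the template nearest the input)
```

On the chip these distances are computed fully in parallel and the minimum is found by the analog WTA ramp rather than by an explicit argmin.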
The matching cell (the absolute value circuit of Fig. 1.16) consumes a large chip area due to the crossbar switches used to exchange V1 and V2 and the interconnects for downloading template data from the analog EEPROM. This is illustrated in Fig. 1.18(a). In Fig. 1.18(b), a new ROM-version cell is presented [27] in which the template data are merged into the matching cell using the concept of vMOS multivalued ROM technology [28]. The analog memory value is represented by the ratio of the two input-terminal capacitances of a vMOS, and the memory content is recalled by applying 0 V and VDD to the respective gates. If the template data are established by off-line computation, this new cell yields a higher integration density of matching cells, resulting in further enhanced association capability of the chip.

1.7 Applications of Association Processor Architecture

1.7.1 Vector Quantization Processor for Motion Picture Compression

As a straightforward application of the association processor architecture, vector quantization (VQ) chips have been developed for motion picture compression, and about three orders of magnitude faster performance has been demonstrated as compared to typical CISC processors. The VQ chips were implemented in conventional CMOS digital circuitry employing a fully parallel SIMD architecture [29, 30] as well as in vMOS circuitry [31], the latter achieving an eight-times-higher integration density. This is briefly described in the following.
The vector quantization (VQ) [32] algorithm employed in the system is explained in Fig. 1.19. A fragment taken from the original picture (4×4 pixels, for instance) is an abstract pattern of gray patches, which can be approximated by one of the template patterns stored in the code book. Thus the pixel data are compressed to the code number of the template. Although the algorithm is straightforward, the template matching is an extremely expensive computation.
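The block-matching scheme of Fig. 1.19 can be sketched in a few lines of Python. This is an illustrative model, not the chip's circuitry; the function names and the tiny two-entry codebook are ours:

```python
import numpy as np

def vq_encode(image, codebook):
    """Sketch of the VQ algorithm of Fig. 1.19: tile the image into 4x4
    patches and replace each by the index of the nearest codebook
    template (Manhattan distance, as in the matching cells)."""
    h, w = image.shape
    codes = []
    for r in range(0, h, 4):
        for c in range(0, w, 4):
            patch = image[r:r+4, c:c+4].reshape(-1)
            dist = np.abs(codebook - patch).sum(axis=1)
            codes.append(int(np.argmin(dist)))
    return codes

def vq_decode(codes, codebook, shape):
    """Reconstruct the picture by pasting the template patches back."""
    out = np.empty(shape)
    it = iter(codes)
    for r in range(0, shape[0], 4):
        for c in range(0, shape[1], 4):
            out[r:r+4, c:c+4] = codebook[next(it)].reshape(4, 4)
    return out

# 8x8 toy image, 2-entry codebook: a flat dark and a flat bright patch.
codebook = np.array([np.zeros(16), np.full(16, 200.0)])
img = np.block([[np.zeros((4, 4)), np.full((4, 4), 190.0)],
                [np.full((4, 4), 10.0), np.zeros((4, 4))]])
codes = vq_encode(img, codebook)
print(codes)   # -> [0, 1, 0, 0]
```

Each 16-pixel patch is thus compressed to a single code index; the expensive part is exactly the nearest-template search that the association processor performs in parallel.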
Fig. 1.19 Vector quantization (VQ) algorithm for image compression.
However, this is the task that the association processor can carry out most efficiently.

1.7.1(a) Digital VQ Processor

In order to prove that the VQ algorithm is effective for motion picture compression, we first implemented a VQ processor in a pure digital CMOS technology. The most important concern of the system is the real-time encoding of motion
Fig. 1.20 Organization of digital VQ system composed of eight VQ processor chips; the input vector has 16 elements of 8 bits each. The shortest distance vector (winner) is searched in three steps of competition.
pictures. In order to encode a 640×480 full-color picture in a 4:1:1 format within 33 msec, a single VQ operation must be completed within 1.1 μsec. Our strategy toward this end is as follows. Firstly, a fully parallel SIMD architecture is employed. Secondly, a single VQ operation is conducted in two pipeline stages, each pipeline segment consisting of 19 cycles. As a result, a single VQ operation is finished every 1.1 μsec at a clock frequency of 17 MHz. Thirdly, the chip is extendible to an 8-chip master-slave configuration, enabling us to perform a fully parallel search over a maximum of 2048 template vectors in 1.1 μsec. Fig. 1.20 shows the block diagram of the VQ chip module, which is composed of eight VQ chips, namely one master chip and seven slave chips. Each VQ chip stores 256 template vectors in the embedded SRAM. The input vector is given to all the chips at the same time and stored in the input buffers. The template vector having the minimum distance to the input vector is searched in three stages of competition by using digital winner-take-all (WTA) circuits. The first stage is performed in each 64-vector matching block, where the distances between the input vector and 64 template vectors are calculated and the winner (the shortest distance vector) is selected in each block. The second stage is conducted on each chip and the chip winner is selected by the 2nd
Fig. 1.21 Photomicrograph of digital VQ processor chip (7.98 mm) fabricated in 0.6-μm single-polysilicon triple-metal CMOS technology.
WTA. The distance of each chip winner is sent to the master chip, where the final competition is carried out to find the global winner. Fig. 1.21 shows a photomicrograph of the chip fabricated in a 0.6-μm single-poly triple-metal CMOS technology. The search time for 2K template vectors in the eight-chip master-slave configuration is 1.1 μsec at a clock frequency of 17 MHz, and the power dissipation of a chip is 0.29 W under a 3.3 V power supply. A single VQ operation for 2K template vectors on typical CISC processors requires roughly 1.2 M operations. This number was derived from the estimation: (38 operations/element) × (16 elements/vector) × (2048 vectors/VQ) = 1.2 M operations/VQ. The present VQ system in the eight-chip configuration can do this job in 1.1 μsec, which is equivalent to a CISC processor performance of about 1000 GOPS (1.2 M operations / 1.1 μsec).

1.7.1(b) vMOS VQ Processor

An analog vector quantization processor has also been developed using the neuron-MOS (vMOS) technology [31]. In order to achieve a high integration density, the template-merged matching cell shown in Fig. 1.18(b) is employed in
Fig. 1.22 Self-convergent vMOS WTA circuit employed in vMOS VQ processor.
the absolute value circuitry. A new-architecture vMOS winner-take-all (WTA) circuit has been developed to resolve the trade-off between the search speed and the discrimination accuracy. The WTA architecture is illustrated in Fig. 1.22. All 256 comparator outputs are fed to an OR gate and its output is fed back to the reference voltage terminal of each comparator, thus forming a multiple-loop ring oscillator. The loop gain is controlled by the variable resistance inserted in the loop. At the start of WTA activation, all the vMOS comparators turn on and the OR output starts a 1-to-0 transition. This transition is fed back to all comparators and provides them with a descending reference voltage. If one of the comparators upsets, the OR gate also upsets and starts a 0-to-1 transition. Detecting this transition, the controller increases the value of the variable resistance. In this manner the feedback gain is reduced step by step and the winner-search accuracy is gradually increased, from a coarse search with a low scan rate to a fine search with a high scan rate. The fixed-value resistances were made by MOS transistors and the value was altered by changing the resistor connections. In this manner, the new WTA performs a multi-resolution winner search under automatic control. The circuit was designed to achieve a discrimination accuracy of 5 mV after five scan steps.
Fig. 1.23 vMOS VQ processor chip fabricated in 1.5-μm double-poly CMOS technology.
A photomicrograph of the analog VQ processor chip is shown in Fig. 1.23. The chip was built in a 1.5-μm double-polysilicon CMOS technology and has a chip size of 7.2 mm × 7.2 mm. A single chip contains 256 16-element template vectors. This is equivalent to one eighth of the chip size of the digital CMOS implementation (built in a 0.6-μm CMOS technology) if it is assumed that the chip size scales with the minimum feature size of the technology.
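The throughput figures quoted for the digital VQ processor can be cross-checked with simple arithmetic. A sketch, assuming (as is conventional but not stated explicitly here) that "4:1:1" means a full-resolution luminance plane plus two chroma planes each carrying one quarter of the samples:

```python
# Back-of-envelope check of the digital VQ processor throughput figures.

FRAME_W, FRAME_H = 640, 480
BLOCK = 4 * 4                       # 16 pixels per input vector
FRAME_PERIOD = 33e-3                # seconds per frame (about 30 fps)

samples = FRAME_W * FRAME_H * (1 + 0.25 + 0.25)   # Y + Cb + Cr samples
vq_ops_per_frame = samples / BLOCK                # vectors to quantize
budget = FRAME_PERIOD / vq_ops_per_frame          # time allowed per VQ op

print(f"{vq_ops_per_frame:.0f} VQ ops/frame, {budget * 1e6:.2f} usec each")
# about 28800 VQ ops/frame and a 1.15 usec budget, which the chip
# meets at 1.1 usec per operation.

# Equivalent CISC performance for one VQ over 2048 16-element templates:
cisc_ops = 38 * 16 * 2048           # about 1.2 M operations per VQ
gops = cisc_ops / 1.1e-6 / 1e9
print(f"{cisc_ops / 1e6:.2f} M ops -> {gops:.0f} GOPS equivalent")
# about 1130 GOPS, consistent with the "about 1000 GOPS" quoted.
```

Both quoted numbers (1.1 μsec per VQ operation, roughly 1000 GOPS equivalent) fall out of the stated frame size, frame period, and operation count.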
1.7.2 Fully Parallel Motion-Vector Detection Circuitry [33, 22]

The basic architecture of the vector matching circuitry in the vMOS association processor (Fig. 1.15) has been applied to motion vector detection, or motion compensation, the most time-consuming processing in MPEG-2 coding. The motion of an object in two successive frames is obtained from the image data projected onto the x- and y-axes. The circuit configuration is shown in Fig. 1.24. The x-projection data at time t are intentionally shifted by up to ±4 pixels and matched with the data at time t + Δt. The absolute value of the difference is calculated and summed up for each shift, and the best match is searched by the WTA. The x-component of the motion vector is identified by a flag at one of the WTA outputs. In this manner, the computationally expensive motion compensation can be conducted in a very short time on very simple hardware. The HSPICE simulation results are shown in Fig. 1.25. In Fig. 1.26, a photomicrograph of the test circuit designed for ±2 pixel shifts is demonstrated, and the measured data are presented in Fig. 1.27. In this manner, the basic
Fig. 1.24 vMOS motion-vector detection circuit composed of matching cell array and WTA.
Fig. 1.25 HSPICE simulation results for the circuit of Fig. 1.24 with test input data shown in the figure; the winner flag appears at the cell for a +3 pixel shift. The circuit was simulated assuming 0.5-μm technology.
operation of the circuit is experimentally verified. (The chip was built using a 3-μm Tohoku University laboratory process.)
Fig. 1.26 A photomicrograph of the test circuit designed for ±2 pixel shifts (fabricated by a CMOS process with 3-μm layout rules).
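The projection-shift matching just described can be modeled in a few lines. In this illustrative sketch (names are ours) a circular shift stands in for the chip's shifted projection taps:

```python
import numpy as np

def motion_component(proj_t, proj_t1, max_shift=4):
    """Sketch of the projection-based motion detection of Fig. 1.24:
    shift the x- (or y-) projection of frame t by -max_shift..+max_shift,
    sum |difference| against frame t+dt for each shift, and let the
    WTA-style minimum pick the best-matching shift."""
    best_shift, best_cost = None, None
    for s in range(-max_shift, max_shift + 1):
        shifted = np.roll(proj_t, s)          # circular shift as a simple stand-in
        cost = np.abs(shifted - proj_t1).sum()
        if best_cost is None or cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift

# An object's projection moves 3 pixels to the right between frames.
proj_t  = np.array([0, 0, 5, 9, 5, 0, 0, 0, 0, 0], dtype=float)
proj_t1 = np.roll(proj_t, 3)
print(motion_component(proj_t, proj_t1))   # -> 3
```

This matches the simulation of Fig. 1.25, where the winner flag appears at the +3 pixel shift cell; the chip evaluates all nine shifts in parallel rather than in a loop.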
Fig. 1.27 Measured waveforms of the circuit in Fig. 1.26, showing that only the output of the +1 pixel shift is falling. The data were monitored by direct probing of the negative-logic outputs.
1.7.3 CDMA Matched Filter [34]

The self-correlation matching technique developed for motion vector detection has been extended to build a matched filter, one of the key components in next-generation WB-CDMA wireless communication systems. In this application the templates are binary vectors representing short PN (pseudorandom noise) codes with varying phase shifts. The chip architecture is shown in Fig. 1.28. An input signal train captured by sample-and-hold circuits is simultaneously matched with a group of templates having all possible phase shifts of an identical PN code. The maximum correlation is detected by fully parallel matching using the binary-search vMOS winner-take-all circuit. Such a parallel architecture enables us to perform very fast peak detection as well as detection of the second and third correlation peaks arising from multi-path delays. The matching cell used in the matched filter is given in Fig. 1.29. Since a template vector is a certain length of a PN code composed of ±1, it is easily
Fig. 1.28 Block diagram of vMOS matched filter.
implemented as a pattern of switching states, either to VREF or to Vi, in the reset and evaluate cycles. A photomicrograph of the test chip fabricated in a 0.6-μm double-poly triple-metal CMOS technology is shown in Fig. 1.30, and the fundamental operation of the system has been experimentally demonstrated [34].
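The phase search performed by the matched filter amounts to correlating the sampled input against every cyclic shift of the PN code and picking the maximum. A behavioral sketch (illustrative only; the 7-chip example code and the names are ours):

```python
import numpy as np

def matched_filter(samples, pn_code):
    """Sketch of the vMOS matched filter: correlate the sampled input
    with every cyclic phase shift of a +/-1 PN code and report the
    phase with the largest correlation (the WTA 'winner code')."""
    n = len(pn_code)
    corr = [float(np.dot(samples, np.roll(pn_code, p))) for p in range(n)]
    return int(np.argmax(corr)), corr

# 7-chip m-sequence as the short PN code (+/-1 valued).
pn = np.array([1, 1, 1, -1, 1, -1, -1], dtype=float)
# Received signal: the code delayed by 2 chips, plus a little noise.
rx = np.roll(pn, 2) + 0.1 * np.array([1, -1, 1, 1, -1, -1, 1])
phase, corr = matched_filter(rx, pn)
print(phase)   # -> 2 (the delay is recovered from the correlation peak)
```

On the chip all phase correlations are evaluated simultaneously by the matching-cell array, so secondary peaks from multi-path delays are available in the same pass.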
Fig. 1.29 Binary matching cell used in vMOS CDMA matched filter; its output is proportional to Σi (Vi − VREF)·(PN)i. The PN code of ±1 is determined by the switching pattern in the reset and evaluate cycles.
Fig. 1.30 Photomicrograph of a test chip of vMOS matched filter fabricated in a 0.6-μm double-polysilicon triple-metal CMOS technology.
1.8 Conclusions
It has been discussed that the functionality enhancement in an elementary device has a number of impacts on circuits and systems. The concept of the neuron MOS transistor and the analog/digital-merged hardware computation scheme implemented by vMOS circuits have shown a number of interesting features. These include a remarkable simplification of the circuit configuration as well as the introduction of flexibility into the functionality when applied to binary logic circuits. The scheme has also been successfully applied to motion detection as well as to association processor implementation. Various other interesting applications have been exploited, such as fuzzy processors [35], a fingerprint identification chip [14], a differential-of-Gaussian filtering chip [36], and so forth. Aiming at building real-time recognition systems, a psychological brain model has been proposed and the vMOS association processor architecture has been developed as its hardware implementation model. In order to apply the association processor architecture to real recognition problems, how to represent the input image by a characteristic vector is of primary importance. Namely, the
dimensionality reduction of the input image data while retaining their essential features is the most essential step. A hardware-friendly vector representation algorithm has been developed and its versatile characteristics have been proven by simulation in applications to handwritten character and hand-drawn pattern recognition and medical radiograph analysis [37]. The study of the hardware implementation of the characteristic vector extraction algorithm, as well as of the total system integration on silicon, is now in progress.
Acknowledgment

The major part of the work presented in this article was done in collaboration with Prof. T. Ohmi when the author was at Tohoku University. The contributions of Prof. K. Kotani, Ning Mei Yu, Y. Yamashita, M. Konda, and T. Nakai of Tohoku University and A. Nakada, formerly at The University of Tokyo, are acknowledged, and the author would like to express his sincere thanks to all of these people. This work was partially supported by the Ministry of Education, Science, Sports, and Culture under a Grant-in-Aid for Scientific Research on Priority Areas, "Ultimate Integration of Intelligence on Silicon Electronic Systems" (1995-98), and also by the Semiconductor Technology Academic Research Center (STARC) under the research project "Right-Brain Computing Integrated Circuits and Their Application to Real-Time Image Processing" (1997-1998). Some of the chips presented here were fabricated in the chip fabrication program of the VLSI Design and Education Center (VDEC), the University of Tokyo, in collaboration with Nippon Motorola Ltd., Dai Nippon Printing Corporation, and KYOCERA Corporation, and also in collaboration with Rohm Corporation and Toppan Printing Corporation.
References

[1] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," Bull. Math. Biophys., vol. 5, p. 115 (1943).
[2] G. E. Moore, "Progress in digital integrated electronics," IEDM Tech. Dig., 1975, pp. 11-13.
[3] J. D. Meindl, "Low power microelectronics: Retrospect and prospect," Proc. IEEE, Vol. 83, No. 4, pp. 619-635 (1995).
[4] T. Shibata and T. Ohmi, "A functional MOS transistor featuring gate-level weighted sum and threshold operations," IEEE Trans. Electron Devices, Vol. 39, No. 6, pp. 1444-1455 (1992).
[5] T. Shibata and T. Ohmi, "Neuron MOS binary-logic integrated circuits: Part I, Design fundamentals and soft-hardware-logic circuit implementation," IEEE Trans. Electron Devices, Vol. 40, No. 3, pp. 570-576 (1993).
[6] T. Shibata and T. Ohmi, "Neuron MOS binary-logic integrated circuits: Part II, Simplifying techniques of circuit configuration and their practical applications," IEEE Trans. Electron Devices, Vol. 40, No. 5, pp. 974-979 (1993).
[7] T. Shibata, K. Kotani, and T. Ohmi, "Real-time reconfigurable logic circuits using neuron MOS transistors," in ISSCC Dig. Technical Papers, Feb. 1993, FA 15.3, pp. 238-239.
[8] W. Weber, S. J. Prange, R. Thewes, E. Wohlrab, and A. Luck, "On the application of the neuron MOS transistor principle for modern VLSI design," IEEE Trans. Electron Devices, Vol. 43, No. 10, pp. 1700-1708 (1996).
[9] K. Ike, K. Hirose, and H. Yasuura, "A module generator of 2-level neuron MOS circuits," in Proceedings of the 4th International Conference on Soft Computing: Methodologies for the Conception, Design, and Application of Intelligent Systems (World Scientific, Singapore, 1996), pp. 109-112.
[10] T. Shibata, H. Kosaka, H. Ishii, and T. Ohmi, "A neuron MOS neural network using
self-learning-compatible synapse circuits," IEEE J. Solid-State Circuits, Vol. 30, No. 8, pp. 913-922 (1995).
[11] H. Kosaka, T. Shibata, H. Ishii, and T. Ohmi, "An excellent weight-updating-linearity EEPROM synapse memory cell for self-learning neuron-MOS neural networks," IEEE Trans. Electron Devices, Vol. 42, No. 1, pp. 135-143 (1995).
[12] Jun-ichi Nakamura and E. R. Fossum, "Image sensor with image smoothing capability using a neuron MOSFET," in Charge-Coupled Devices and Solid State Optical Sensors IV, Proc. SPIE Vol. 2172, pp. 30-37 (1994).
[13] M. Ikebe, M. Akazawa, and Y. Amemiya, "vMOS cellular-automaton devices for intelligent image sensors," Proceedings of the 5th International Conference on Soft Computing and Information/Intelligent Systems, 16-20 October 1998, Iizuka, Fukuoka, "Methodologies for the Conception, Design and Applications of Soft Computing," Vol. 1 (T. Yamakawa and G. Matsumoto, Eds.), pp. 113-117.
[14] S. Jung, R. Thewes, T. Scheiter, K. F. Goser, and W. Weber, "A Low-Power and High-Performance CMOS Fingerprint Sensing and Encoding Architecture," IEEE Journal of Solid-State Circuits, Vol. 34, No. 7, pp. 978-984 (1999).
[15] H. R. Mehrvarz and C. Y. Kwok, "A large-input-dynamic-range multi-input floating-gate MOS four-quadrant analog multiplier," IEEE Journal of Solid-State Circuits, Vol. 31, No. 8, pp. 1123-1131, August 1996.
[16] K. Kotani, T. Shibata, M. Imai, and T. Ohmi, "Clocked-neuron-MOS logic circuits employing auto-threshold-adjustment," in Digest of Technical Papers, 1995 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, FP 19.5, pp. 320-321 (1995).
[17] K. Kotani, T. Shibata, and T. Ohmi, "DC-Current-Free Low-Power A/D Converter Circuitry Using Dynamic Latch Comparators with Divided-Capacitance Voltage Reference," 1996 IEEE International Symposium on Circuits and Systems (ISCAS 96), Vol. 4, Atlanta, pp. 205-208, May (1996).
[18] K. Kotani, T. Shibata, and T.
Ohmi, "CMOS Charge-Transfer Preamplifier for Offset-Fluctuation Cancellation in Low-Power, High-Accuracy Comparators," Digest of Technical Papers, 1997 VLSI Circuits Symposium, Kyoto, June, pp. 21-22 (1997).
[19] Ho-Yup Kwon, K. Kotani, T. Shibata, and T. Ohmi, "Low Power Neuron MOS Technology for High-Functionality Logic Gate Synthesis," IEICE Trans. Electronics, Vol. E80-C, No. 7, pp. 924-930 (July, 1997).
[20] Ning Mei Yu, Tadashi Shibata, and Tadahiro Ohmi, "A Real-Time Center-Of-Mass Tracker Circuit Implemented by Neuron MOS Technology," IEEE Transactions on
Circuits and Systems II, Vol. 45, No. 4, pp. 495-503 (1998).
[21] T. Shibata and T. Ohmi, "Neural Microelectronics," Technical Digest, International Electron Devices Meeting (IEDM) 1997, Washington, D.C., pp. 337-342.
[22] T. Shibata, T. Nakai, N. M. Yu, Y. Yamashita, M. Konda, and T. Ohmi, "Advances in neuron-MOS applications," in ISSCC Dig. Technical Papers, Feb. 1996, SA 18.4, pp. 304-305.
[23] T. Yamashita, T. Shibata, and T. Ohmi, "Neuron MOS winner-take-all circuit and its application to associative memory," in ISSCC Dig. Technical Papers, Feb. 1993, FA 15.2, pp. 236-237.
[24] Y. Yamashita, T. Shibata, and T. Ohmi, "Write/Verify Free Analog Non-Volatile Memory Using a Neuron-MOS Comparator," 1996 IEEE International Symposium on Circuits and Systems (ISCAS 96), Vol. 4, Atlanta, pp. 229-232, May (1996).
[25] G. J. Hemink, T. Tanaka, T. Endoh, S. Aritome, and R. Shirota, "Fast and accurate programming method for multi-level NAND EEPROMs," in 1995 Symp. VLSI Technology, Kyoto, Dig. Technical Papers, pp. 129-130.
[26] M. Konda, T. Shibata, and T. Ohmi, "Neuron-MOS Correlator Based on Manhattan Distance Computation for Event Recognition Hardware," 1996 IEEE International Symposium on Circuits and Systems (ISCAS 96), Vol. 4, Atlanta, pp. 217-220, May (1996).
[27] M. Konda, T. Shibata, and T. Ohmi, "A Compact Memory-Merged Vector-Matching Circuitry for Neuron-MOS Associative Processor," IEICE Transactions on Electronics, Vol. E82-C, No. 9, pp. 1715-1721 (1999).
[28] A. Rita, T. Yamashita, T. Shibata, and T. Ohmi, "Neuron-MOS multiple-valued memory technology for intelligent data processing," in ISSCC Dig. Technical Papers, Feb. 1994, FA 16.3, pp. 270-271 (1994).
[29] T. Shibata, A. Nakada, M. Konda, T. Morimoto, T. Ohmi, H. Akutsu, A. Kawamura, and K. Marumoto, "A fully-parallel vector quantization processor for real-time motion picture compression," in ISSCC Dig. Tech. Papers, Feb. 1997, pp. 236-237.
[30] A. Nakada, T. Shibata, M. Konda, T. Morimoto, and T.
Ohmi, "A fully-parallel vector quantization processor for real-time motion picture compression," IEEE Journal of Solid-State Circuits, Vol. 34, No. 6, pp. 822-830, June 1999.
[31] A. Nakada, M. Konda, T. Morimoto, T. Yonezawa, T. Shibata, and T. Ohmi, "Fully-Parallel VLSI Implementation of Vector Quantization Processor Using Neuron-MOS Technology," IEICE Transactions on Electronics, Vol. E82-C, No. 9, pp. 1730-1737 (1999).
[32] A. Gersho and R. M. Gray, "Vector Quantization and Signal Compression," Kluwer
Academic Publishers, Boston, 1992.
[33] T. Nakai, T. Shibata, and T. Ohmi, "Neuron-MOS Quasi-Two-Dimensional Image Processor for Real-Time Motion Vector Detection," in Proceedings of the 4th International Conference on Soft Computing: Methodologies for the Conception, Design, and Application of Intelligent Systems (World Scientific, Singapore, 1996), pp. 833-836.
[34] A. Okada and T. Shibata, "A Neuron-MOS Parallel Associator for High-Speed CDMA Matched Filter," The 1999 IEEE International Symposium on Circuits and Systems (ISCAS '99), Vol. 2, Orlando, Florida, May 30 - June 2, 1999, pp. II-392-395.
[35] Ning Mei Yu, Tadashi Shibata, and Tadahiro Ohmi, "An Analog Fuzzy Processor Using Neuron-MOS Center-of-Mass Detector," Proceedings of the 6th International Conference on Microelectronics for Neural Networks, Evolutionary & Fuzzy Systems (MicroNeuro '97), 24-26 September 1997, Dresden, pp. 121-128.
[36] T. Sunayama, M. Ikebe, and Y. Amemiya, "A vMOS Cellular-Automaton Device for Differential-of-Gaussian Filtering," Extended Abstracts of the 1999 International Conference on Solid State Devices and Materials, Tokyo, 1999, pp. 110-111.
[37] T. Shibata, M. Yagi, and M. Adachi, "Soft-Computing Integrated Circuits for Intelligent Information Processing," Proceedings of The Second International Conference on Information Fusion, Vol. 1, pp. 648-656, Sunnyvale, California, July 6-8, 1999.
Chapter 2

Adaptive Learning Neuron Integrated Circuits Using Ferroelectric-Gate FETs

Sung-Min Yoon, Eisuke Tokumitsu, and Hiroshi Ishiwara
Tokyo Institute of Technology
Abstract

An adaptive-learning neuron circuit composed of an MFSFET and a complementary unijunction transistor (CUJT) oscillation circuit was fabricated on an SOI (silicon-on-insulator) structure as the first step toward a next-generation neural network. SrBi2Ta2O9 (SBT) was selected as the ferroelectric gate material and patterned by a newly developed selective etchant, NH4F:HCl. It was demonstrated that the fabricated MFSFET showed good memory operation and a gradual learning effect, in which the drain current was changed gradually by applying a number of input pulses with a sufficiently short duration time. It was also demonstrated that the output pulse frequency of the neuron circuit increased gradually as the number of input pulses was increased. Finally, the problem of the small output pulse height of the neuron circuit was solved by replacing the CUJT oscillation circuit with a CMOS Schmitt-trigger circuit.

Keywords: adaptive learning, neuron circuit, MFSFET (metal-ferroelectric-semiconductor field effect transistor), synapse array, PFM (pulse frequency modulation), SOI (silicon-on-insulator), SrBi2Ta2O9, selective etchant, ferroelectric film, CMOS Schmitt trigger, complementary unijunction transistor (CUJT)
2.1 Introduction
Artificial neural networks, which execute distributed parallel information processing and an adaptive-learning function, have attracted much attention for the future highly developed information-oriented society. In a human brain, a huge quantity of information is processed in parallel and stored as one's past experiences. In this system, neurons accept many weighted input signals and generate output pulses when the total value of
S.-M. Yoon, E. Tokumitsu & H. Ishiwara
input signals exceeds a threshold value. The weighting operation for input signals is conducted by synapses which are attached to the neurons. Thus, synapses and neurons can be realized using memory devices and processors in an artificial neural network. However, the hardware implementation of a large-scale network is rather difficult, since the number of synaptic connections becomes huge as the number of neurons increases. One possible solution to this hardware problem is to use electrically rewritable, nonvolatile analog memories, by which an electrically modifiable synapse array can be implemented in a small size. Actually, floating-gate MOS devices are used for this purpose [1]-[4]. In these devices, the data are stored as an amount of electrical charge injected through a tunnel oxide into the floating gate [5]. However, precise control of the quantity of injected carriers is rather difficult unless a well-designed control circuit is used [3]. We have proposed a new concept of synaptic connection which is composed of an array of MFSFETs (metal-ferroelectric-semiconductor field effect transistors) and is applicable to an adaptive-learning neural network [6]. In an MFSFET, the gate dielectric film of an MOSFET is replaced with a ferroelectric film, and it is used as an analog memory device for storing past experiences through partial polarization of the ferroelectric film. The MFSFET array has the following merits over the floating-gate MOS device. First, the control circuit for modifying the synaptic weight values is much simpler in the MFSFET neuron circuit, since the polarization reversal phenomenon in a ferroelectric film is not as nonlinear as electron tunneling to a floating gate, which reduces the synaptic connection area greatly. Secondly, the endurance for "rewrite" cycles in a ferroelectric film is more than 10^12, which is much higher than the value (about 10^6) in a floating-gate FET.
For these reasons, an MFSFET array is very promising for hardware implementation of neural networks. However, it is difficult at present to obtain a good interface between a ferroelectric film and a Si substrate in the fabrication of MFSFETs, and it is even more difficult to integrate MFSFETs with Si circuitry using conventional LSI technology without degradation of device performance. In this paper, we discuss some problems and solutions in fabricating PFM (pulse frequency modulation)-type neuron circuits integrated with MFSFETs and demonstrate the adaptive-learning function of the neuron circuit. In these circuits, a CUJT (complementary unijunction transistor) and a CMOS Schmitt-trigger circuit are used as the switching components for the oscillation circuits.
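The weighted-sum-and-threshold behaviour of a neuron described above can be captured in a few lines; the PFM aspect is modeled crudely as an output pulse rate that grows with the above-threshold excess. This is a toy illustration, not a model of the MFSFET circuit; the names and numbers are ours:

```python
# Toy model of a weighted-sum-and-threshold neuron with a crude
# PFM-style output: silent below threshold, firing rate proportional
# to the excess above it.

def neuron_rate(inputs, weights, threshold, gain=10.0):
    """Return an output pulse frequency (arbitrary units)."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 0.0 if total <= threshold else gain * (total - threshold)

weights = [0.25, 0.5, 1.0]   # synaptic weights (e.g. stored in MFSFETs)
print(neuron_rate([1, 1, 0], weights, threshold=1.0))   # -> 0.0 (below threshold)
print(neuron_rate([1, 1, 1], weights, threshold=1.0))   # -> 7.5 (fires)
```

In the circuits discussed in this chapter, the weights are the analog polarization states of the MFSFET synapses, and the firing rate is set by the CUJT or Schmitt-trigger oscillation circuit rather than by an explicit gain constant.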
2.2 Operation Principles of Adaptive-Learning Neuron Circuits
2.2.1

In our neuron circuit, the key device for realizing the adaptive-learning function is the MFSFET. Recently, the MFSFET has attracted considerable attention as a promising device for nonvolatile memory applications, because it has crucial advantages compared with ferroelectric random access memories (FeRAMs) using ferroelectric capacitors [7]-[10]. Since MFSFETs exploit the ferroelectric field effect, which is the modulation of conductivity by the electrostatic charges induced by ferroelectric polarization, the stored data can be read out nondestructively. Moreover, since MFSFETs obey the scaling rule for device miniaturization, unlike DRAMs and capacitor-type FeRAMs, they are a very desirable configuration for a single-transistor-cell-type high-density nonvolatile memory. Actually, we have proposed a prototype of the single-transistor-cell-type digital memory, in which MFSFETs are arranged in a matrix form on an SOI (silicon-on-insulator) structure [11]. However, fabrication of MFSFETs is very difficult. When a ferroelectric film is deposited on a Si substrate, generation of interfacial traps and interdiffusion of constituent elements occur easily and the electrical properties as an FET become very poor. That is the main reason why no commercial nonvolatile memories loading MFSFETs have yet appeared, although the original idea of the MFSFET dates back to the 1950s [12] and several prototype devices have been fabricated so far [13]-[15]. In order to realize MFSFETs, various ferroelectric materials and structures have been researched. In this sub-section, we will review the MFS-related structures briefly. Although PbZrxTi1-xO3 (PZT) is a typical ferroelectric material with a high remnant polarization (Pr), it is well known that the PZT/Si structure causes severe interdiffusion of Pb and Si atoms, even if the annealing temperature is as low as 500°C.
Thus, various buffer materials (SrTiO3 [16], CeO2 [17], Y2O3 [18], YSZ [19], MgO [20], etc.) to prevent the interdiffusion are being investigated. One of the most successful fabrications using a PZT-related material as a ferroelectric gate is the Ir/IrO2/PZT/Ir/IrO2/Poly-Si/SiO2/Si structure, in which a memory window (shift of threshold voltage) of 3.3 V was achieved in the ID - VG (drain current vs. gate voltage) characteristics for a bias sweep
S.-M. Yoon, E. Tokumitsu & H. Ishiwara
of ±15 V [21]. However, in these MFIS or MFMIS (I: insulator) structures, a high operation voltage is required to provide sufficient voltage to the PZT gate film, since PZT has a relatively high dielectric constant. SrBi2Ta2O9 (SBT) is another important material in nonvolatile memory applications because of its reasonably large remnant polarization and superior fatigue-free properties [22]. Since a sol-gel-derived SBT film with a polycrystalline structure can be deposited directly on Si without significant degradation of the interface, the SBT/Si structure can be used for MFSFET applications. We have confirmed the memory operations of an MFSFET using this structure [23]. However, the memory window width is much narrower than the expected value, which is ascribed to the existence of a transition layer with a low dielectric constant, such as SiO2, formed at the interface. Although CeO2 and Y2O3 have also been selected as buffer layers for forming MFIS-FETs [24]-[25], they are not very effective for solving this problem. Another Bi-layered ferroelectric material is Bi4Ti3O12 (BiTO). This material was often used in the early studies on the MFSFET [13] and MFISFET [14]. Recently, excellent interface properties and data retention characteristics were obtained in MFIS diodes, in which Bi2SiO5 (BSO) was used as an interfacial buffer layer and both BSO and BiTO were epitaxially grown on a Si (100) substrate [26]. Other interesting materials are Sr2(Ta,Nb)2O7 and BaMgF4. The Pt/Sr2(Ta,Nb)2O7/Pt/IrO2/Poly-Si/SiO2/Si structure showed excellent FET characteristics, in which a memory window of 3.6 V was obtained for a bias sweep of ±5 V [27]. Fluoride ferroelectric materials such as BaMgF4 (BMF) have low dielectric constants and can be deposited directly on Si substrates without formation of unintended transition layers, such as SiOx, at the interface [28]-[29].
Furthermore, the interface state density of MFS diodes formed in a simple process seems to be relatively small [30]. However, the ferroelectricity of BMF is easily degraded during the fabrication process, especially by both dry and wet etching, and the data retention time of an MFSFET using a BMF film is still very short. In these MFS-related devices, one of the most important properties for realizing nonvolatile memories with nondestructive "read-out" operation is the data retention characteristics. Ideally speaking, the data stored in an MFSFET through the "write" operation must be retained for years. However, in actual cases, the remnant polarization of the ferroelectric film is prone to decrease with time due to the depolarization field and leakage current
through the ferroelectric film. Therefore, in order to improve the device performance of MFSFETs, it is very important to establish a better ferroelectric/semiconductor interface and to develop a device structure more robust against retention failure. Recently, it was found that FETs with the gate structure Pt/SBT/Pt/SrTa2O6/SiON/Si showed good data retention characteristics when the area ratio of the top electrode to the floating gate electrode was optimized [31].
2.2.2

Figure 2.1 shows a basic neuron circuit proposed as an elementary component of pulse frequency modulation (PFM) type adaptive-learning neural networks. The term "adaptive-learning" means a function in which the electrical properties of a device are changed partially or totally by applying a certain number of ordinary signals to the device. In this circuit, the MFSFETs correspond to synapses, and the other devices (C, R and CUJT) form the neuron part, which generates output pulses when the total input charge exceeds a threshold value. In order to realize the adaptive-learning function, the polarization state of the ferroelectric gate in the MFSFET is partially reversed by applying input pulses to the gate terminal, and thus the channel resistance of the MFSFET is gradually changed according to the polarization state. In other words, the synaptic weights stored in the MFSFETs are gradually changed by applying an adequate number of input signals. For this reason, the duration of the input pulses must be sufficiently shorter than the switching time for polarization reversal of the ferroelectric film. This is the main reason why the PFM system is used in the proposed neuron circuit. In the neuron part, the CUJT is used as a switching component to discharge the capacitor C, which corresponds to the threshold processing in a neuron. Since the output pulse interval of the circuit is proportional to the product of C and the channel resistance of the MFSFET, the output pulse frequency can be gradually changed as the number of input pulses is increased. This operation is similar to the information processing in a human brain, in which current pulses generated in neurons propagate through nerve membranes and axons.
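The relations described above — an output pulse interval proportional to the product of C and the MFSFET channel resistance, and parallel synapses summing their drain currents — can be sketched numerically. This is only a behavioral illustration of those stated proportionalities; all component values and the constant k below are hypothetical, not taken from the authors' circuit.

```python
# Behavioral sketch of the PFM neuron described in the text (hypothetical
# values). The output pulse interval is taken to be proportional to the
# product of the capacitance C and the MFSFET channel resistance, so a lower
# channel resistance (a "stronger" synapse) gives a higher pulse frequency.

def pfm_frequency(r_channel_ohm, c_farad, k=1.0):
    """Output pulse frequency for a single synapse; k is a dimensionless
    proportionality constant of the oscillator (hypothetical)."""
    return k / (r_channel_ohm * c_farad)

def weighted_sum_frequency(r_channels_ohm, c_farad, k=1.0):
    """Parallel MFSFETs: their conductances (1/R) add, so the total drain
    current -- and hence the output frequency -- reflects the weighted sum
    of the synaptic values."""
    g_total = sum(1.0 / r for r in r_channels_ohm)
    return k * g_total / c_farad

# Hypothetical sweep: C = 10 pF, with the channel resistance lowered step by
# step as input pulses gradually reverse the gate polarization.
c = 10e-12
for r in (1e6, 5e5, 1e5):
    print(f"R = {r:.0e} Ohm -> f = {pfm_frequency(r, c):.3e} Hz")
```

With these numbers, each halving of the channel resistance doubles the output pulse frequency, which is the behavior the PFM scheme relies on.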
Fig. 2.1 A basic neuron circuit for an adaptive-learning neural network.
2.2.3
In neural networks, each neuron has many synapses, which are connected to the neurons in the previous layer. Figure 2.2 shows the schematic diagram of a two-layered neural network, in which the outputs of m neurons are fully connected to the n neurons in the next layer. In this neural network, m x n synapses are required, which can be realized by parallel connection of the MFSFETs, as shown in Fig. 2.1. In this structure, each MFSFET is differently polarized and accepts pulse signals from a different neuron. Therefore, the total drain current summed over all MFSFETs determines the output behavior of the neuron circuit. The "weighted-sum" operation of synaptic values in a neuron is performed in this way. The prototype layout of the synapse array fabricated on an SOI structure is shown in Fig. 2.3, where Si stripes with a lateral npn structure are placed on an insulating layer and covered with a ferroelectric film, and common metal stripes for the gate electrodes are placed on the film perpendicular to the Si stripes. Since there are no via-holes across the ferroelectric film in this structure, the packing density of synapses is expected to be very high. Furthermore, the synapses fabricated on an SOI structure are completely electrically isolated from one another, which enables us to give different weight values to the individual synapses with ease. We have demonstrated the "weighted-sum" operations of an electrically modifiable synapse array using MFSFETs with a 3x3 array structure [32].

Fig. 2.2 Schematic diagram of a two-layered neural network.

2.3 Neuron Integrated Circuits Composed of MFSFETs and CUJT Oscillation Circuits
Fig. 2.3 Synapse array using an MFSFET matrix fabricated on an SOI structure.

We selected the SBT/Si structure, from among the various MFS-related structures discussed above, for fabricating the synapse device. Although this structure is not perfect in its interface properties, the FET
behaviors are sufficiently good for synapse device applications [22]. Furthermore, the simplicity of this structure is expected to increase the yield of the circuit after the full fabrication process. It was found that parasitic ferroelectric effects of the unnecessary SBT film, which was deposited over the whole area of the substrate, prevented the normal oscillation operation of the circuit, although the individual devices in the circuit operated normally. The parasitic capacitors seem to be formed particularly in the areas of the metal interconnections and the electrode pads, and have a significant effect on the normal operation of the neuron circuit. In order to solve this problem, the unnecessary SBT film must be selectively etched. Therefore, we newly developed a selective etchant for the SBT film in order to integrate MFSFETs with the other components of the circuit. The fabrication procedure and the operations of the circuit are explained in detail below.

2.3.1

All devices of the neuron circuit were designed with a 5 μm design rule and fabricated on an SOI structure with a 3-μm-thick p-type Si layer. The channel length and width of the MFSFET are 5 μm and 50 μm, respectively. The device structure and electrical characteristics of the CUJT used in the neuron circuit were described in our previous paper [33]. The capacitors were designed to be 3 pF, 10 pF and 30 pF and fabricated with an Al/SiO2/n+-Si structure. RL was designed to be 60 Ω-80 Ω. The fabrication procedures are as follows. First, the device region was separated into islands of rectangular shape using a plasma etching system. The reaction gases and their ratio were CF4:O2 = 45:5. Then, the ion implantation processes for forming the active regions of the devices and the contact regions were performed under the optimum conditions examined in Ref. [33]. After the Si islands were oxidized by dry oxidation for passivation, gate windows for deposition of the SBT films were formed by wet chemical etching.
The thickness of the passivating SiO2 layer was 50 nm. SBT films were deposited using liquid source misted chemical deposition (LSMCD), in which the same type of sol-gel precursors was used as in the spin-coating method. A better coverage at surface steps and a good thickness uniformity are expected to be obtained by the LSMCD method [34]. Figure 2.4 shows the schematic diagram of the LSMCD apparatus used in this study. The deposited SBT films were dried at 150°C for 5 min and prefired at 500°C for 20 min to remove residual organics. The deposition process
Fig. 2.4 Schematic diagram of the LSMCD apparatus used in this study.
by LSMCD was repeated until the desired film thickness was obtained, and the films were annealed for crystallization at 750°C for 30 min in an O2 atmosphere using a rapid thermal annealing (RTA) system. The final thickness of the SBT gate film was about 150 nm. Then, a Pt film was deposited by the e-gun evaporation method to form the gate electrode, and it was patterned by a lift-off process. In order to obtain good ferroelectricity of SBT, it is generally desirable to use a Pt electrode. We can also expect that the Pt gate electrode, formed right after the deposition of SBT, acts as a protection layer for the SBT gate during the subsequent fabrication processes. As mentioned above, it is essential for obtaining the normal oscillation operation of the circuit that the unnecessary SBT films be removed. To remove the unnecessary SBT film, various etching methods, which give a sufficient etching selectivity between SBT and the underlying SiO2 films, were attempted. In a reactive ion etching (RIE) process, it is very difficult to etch the SBT film alone because of the similar etching rates of SBT and SiO2, either in a gas mixture of Ar:Cl2 or in CF4-based gas mixtures. On the other hand, it was found in wet chemical etching using an HF:HCl solution that the SBT was etched off very quickly compared with SiO2, the etching rate for SBT being about 10 times faster than that for SiO2. However, the absolute value of the etching rate for SiO2, about 70 nm/min, was too high even when the HF concentration was decreased to 2.5%, and hence the remaining SiO2 was seriously damaged.
Fig. 2.5 Etching characteristics of NH4F:HCl; (a) dependence of the etching rates on the concentration of NH4F, (b) dependence on the crystallization temperature of SBT.
After many trials, we developed a new wet selective etchant for the SBT film, an NH4F:HCl solution. Figure 2.5(a) shows the comparison of the etching rates of SBT and SiO2 in the NH4F:HCl solution as a function of the concentration of NH4F, which shows a good etching selectivity between SBT and SiO2. When the concentration of NH4F is 0.7 M/l, an etching selectivity of about 14:1 is obtained. It was also found that the etching rate of the SBT film in this etchant depends on the crystallization temperature of the SBT, as shown in Fig. 2.5(b). Using this etchant, the SBT film deposited over the entire area of the circuit was removed thoroughly, leaving only the gate area of the MFSFET. Contact holes were then easily formed by wet etching using a BHF solution,
Fig. 2.6 A photograph of the integrated MFSFET neuron circuit.
since the unnecessary SBT film no longer existed. Finally, the Al interconnections and electrode pads were formed by a lift-off process, which was essential for successful Al patterning, since wet chemical etching of Al in a hot H3PO4 solution degraded the SBT film severely. Ten sheets of photomask were used in the fabrication of the neuron circuit. A photograph of the integrated neuron circuit is shown in Fig. 2.6.

2.3.2
Figure 2.7 shows the drain current (ID) - gate voltage (VG) characteristics of the fabricated MFSFET. A counterclockwise hysteresis was obtained, as indicated by the arrows, and the memory window was about 0.57 V for a VG sweep from 0 V to 6 V. To confirm that this shift of the threshold voltage is attributable to the ferroelectric nature of the SBT film, the dependence of the memory window width on the sweep rate of the gate voltage was measured. In Fig. 2.8(a), the ID - VG characteristic for the fastest sweep rate (6x10^4 V/s) is compared with that for the normal case shown in Fig. 2.7 (0.5 V/s); the sweep rate of the gate voltage for the fastest case is about 10^5 times faster than that of the normal case. In this measurement, the gate voltage was applied using a 5 kHz triangular wave of 0 to 6 V in the virtually grounded circuit shown in Fig. 2.8(b). Although the existence of mobile ions in the gate film may in principle cause hysteretic behavior
Fig. 2.7 ID - VG characteristics of the fabricated MFSFET.

Fig. 2.8 (a) Comparison of the ID - VG characteristic for a slow sweep rate of the gate voltage with that for a gate voltage sweep rate of 6x10^4 V/s. (b) A virtually grounded circuit for the measurement, in which the gate voltage was applied using a 5 kHz triangular wave and the drain current was measured.
in the drain current, no evidence of this was found in the experiment with the fast sweep rate of the gate voltage. The measured value of the memory window was practically independent of the sweep rate of the gate voltage, as shown in Fig. 2.9. From these results, it is concluded that the MFSFET was successfully fabricated under the new wet etching condition, and that the obtained memory window is clearly due to the ferroelectricity of the SBT film. The memory effect was also demonstrated in the drain current (ID) - drain voltage (VD) characteristics of the same FET, as shown in Fig. 2.10. First,
Fig. 2.9 Dependence of the memory window width on the sweep rate of the gate voltage.
the "write" pulse signals of -6 V, +4 V or +6 V were applied to the gate terminal for 2-3 s. Then, the "read-out" voltage of 1.4 V was applied and the drain current was measured. As can be seen in the figure, the drain current changed from the "off" to the "on" state when the "write" voltage was changed from -6 V to +6 V. On the other hand, the saturated drain current was reduced when the "write" voltage was +4 V. This indicates that the ferroelectric polarization of the SBT gate film is partially reversed when a relatively small voltage is applied as the "write" signal.
Fig. 2.10 "Write" and "read-out" operations in the ID - VD characteristics of the fabricated MFSFET.
Next, the adaptive-learning effect due to gradual polarization reversal was examined by applying input pulses to the gate terminal. The height and duration of the applied input pulses were 6 V and 20 ns, respectively. Figure 2.11 shows the variation of the "read-out" drain current (ID) with the number of input pulses. As can be seen from the figure, the drain current gradually increases as the number of applied pulses increases. The "read-out" gate voltage (VG) was fixed at 1.4 V in each case. The behavior of the drain current in this figure can be interpreted as follows: the ferroelectric polarization of the SBT gate is gradually reversed by the application of input pulses, and the channel resistance of the MFSFET changes according to the polarization state. This analog-like change of the "read-out" drain current is essential for realizing the adaptive-learning function in the proposed neuron circuit.
Fig. 2.11 Gradual learning effect in the MFSFET. ID in "read-out" operation changes by applying input pulse signals to the gate.
2.3.3
First, the normal operation of the neuron circuit was examined by measuring the oscillation frequency as a function of the DC input voltage applied to the gate terminal of the MFSFET. As shown in Fig. 2.12, it was found that the output pulse frequency changed with hysteretic characteristics, reflecting the ferroelectric memory operation of the MFSFET as discussed
in Fig. 2.7. Then, the oscillation characteristics for pulsed input signals were measured, which corresponds to the operation of determining the weight value of each synapse in the synaptic connection. Figure 2.13 shows the variation of the output pulse frequency after sufficiently long single pulses with different heights were applied to the gate terminal. During this measurement, the gate voltage of the MFSFET was kept at 1.4 V. As discussed for Fig. 2.10, since "write" signals with different heights induce different values of "read-out" drain current in the MFSFET, the output pulse frequency of the neuron circuit changes accordingly.
Fig. 2.12 Oscillation characteristics of the neuron circuit as a function of the DC input voltage applied to the gate terminal.

Fig. 2.13 Variation of output pulse frequency with input pulse height.
In order to realize the adaptive-learning function in the PFM system, it is necessary to change the number of input pulses, each of which has the same width and height (20 ns, 6 V). Typical output pulse waveforms are shown in Fig. 2.14, which correspond to the waveform after the application of a single pulse and that after sixty pulses. During this measurement, a constant DC voltage of 1.65 V was applied to the gate terminal. Under this condition, the neuron circuit did not show oscillatory behavior before the first pulse was applied to the gate. Figure 2.15 shows the variation of the output pulse frequency of the neuron circuit with the number of input pulses, which clearly demonstrates the adaptive-learning function of the neuron circuit. It is concluded from these results that the output characteristics of the neuron circuit are changed by a kind of experience imposed in the past.
Fig. 2.14 Output pulse waveforms of the integrated MFSFET neuron circuit.
Fig. 2.15 Variation of output pulse frequency with the number of input signals.
2.3.4
Although we successfully obtained the adaptive-learning function of an MFSFET neuron circuit fabricated on a single SOI wafer, the small output pulse height of the CUJT oscillation circuit remains an undesirable problem for practical applications. In other words, the height of the output pulses must be high enough to reverse the ferroelectric polarization of an MFSFET, since
the output pulse of a neuron is used as an input signal to the next-layer neuron. However, the pulse height obtained in the CUJT oscillation circuit was as small as 0.1 V, as shown in Fig. 2.14. In practical networks, this value is too small for the pulses to be used as input signals to the next-layer neuron, and an amplifier would have to be connected to the output of each neuron circuit. Integrating an amplifier circuit is not a desirable solution to this problem, since it requires additional chip area and makes the fabrication process more complex. Thus, we introduce a new circuit configuration to solve the pulse height problem effectively, as described in the next section.

2.4
Neuron Circuit Using CMOS Schmitt-Trigger Oscillator
In order to solve the small pulse height problem in the output characteristics, we introduce a new oscillation circuit using a CMOS Schmitt-trigger. In this section, the basic operation of the neuron circuit using the CMOS Schmitt-trigger oscillator is first discussed, and then the fundamental oscillation characteristics of the circuit are examined. Finally, a new neuron circuit is fabricated by integrating an MFSFET with a CMOS Schmitt-trigger oscillator, and the adaptive-learning function of the circuit is demonstrated.

2.4.1

The basic circuit diagram of a neuron circuit with a CMOS Schmitt-trigger oscillator is shown in Fig. 2.16, in which the oscillation operation using an MFSFET as a synapse device is basically identical to that of the neuron circuit using the CUJT oscillation circuit. The CMOS Schmitt-trigger circuit, enclosed by the dotted line in Fig. 2.16, has hysteretic voltage transfer characteristics between input and output; hence, charging and discharging of the capacitance C can be performed through the p-ch FET connected in parallel. The threshold voltages for increasing and decreasing input signals can be changed by varying the ratio of the two feedback resistors, R1/R2.

2.4.2
In order to examine the fundamental oscillation characteristics, a simple oscillation circuit using a CMOS Schmitt-trigger configuration with fixed
value resistors and a conventional MOSFET was first fabricated. The circuit was designed using a 5-μm rule. The channel width to channel length ratios (W/L) of the n-ch and p-ch FETs in the CMOS Schmitt-trigger were designed to be 10 and 20, respectively, so that the driving current would be sufficiently large for high-speed operation of the circuit. R1, R2 and C were designed to be 500 kΩ, 1 MΩ and 10 pF, respectively. The full fabrication process was conducted using conventional CMOS technology on an SOI structure. A photograph of the fabricated circuit is shown in Fig. 2.17.

2.4.3
Figure 2.18 shows the output pulse waveforms of the fabricated circuit for fixed value resistors of 50 kΩ, 100 kΩ and 500 kΩ. The power supply voltage (VDD) was 5 V. Clear rectangular pulses with a height of 5 V were obtained, and the output pulse frequency was modulated by the value of RE. Similar oscillation operations with an output pulse height of 5 V were obtained in the circuit using a MOSFET, as shown in Fig. 2.19. Figure 2.20 shows the dependence of the output pulse frequency on the gate voltage applied to the MOSFET. The figure shows that the output
Fig. 2.16 Basic operation of the neuron circuit composed of MFSFETs and CMOS Schmitt-trigger oscillator.
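The oscillation described above can be sketched with a textbook RC relaxation model: the capacitor charges through the synapse resistance toward the supply until the upper Schmitt threshold is crossed, is then discharged to the lower threshold, and the cycle repeats. This is an idealized sketch with hypothetical values (thresholds at 1/3 and 2/3 of VDD, exponential discharge), not a simulation of the authors' layout, in which the discharge path is the parallel p-ch FET.

```python
import math

# Idealized Schmitt-trigger RC relaxation oscillator (hypothetical values).
# In the real circuit the thresholds VH/VL are set by the feedback resistor
# ratio R1/R2; here they are simply given as 2/3 and 1/3 of VDD.

def schmitt_period(r_ohm, c_farad, vdd, vh, vl):
    """Oscillation period: exponential RC charging from VL up to VH toward
    VDD, plus an exponential RC discharge from VH down to VL."""
    t_charge = r_ohm * c_farad * math.log((vdd - vl) / (vdd - vh))
    t_discharge = r_ohm * c_farad * math.log(vh / vl)
    return t_charge + t_discharge

# R = 1 MOhm synapse resistance, C = 10 pF, VDD = 5 V (hypothetical).
period = schmitt_period(1e6, 10e-12, 5.0, 10.0 / 3.0, 5.0 / 3.0)
print(f"period = {period:.3e} s, frequency = {1.0 / period:.3e} Hz")
```

With these symmetric thresholds the period reduces to 2RC ln 2, about 13.9 μs (roughly 72 kHz), and lowering the synapse resistance raises the frequency, as in the measured circuit.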
Fig. 2.17 A photograph of the CMOS Schmitt-trigger oscillation circuit using fixed value resistors and MOSFETs as RE.

Fig. 2.18 Output pulse waveforms of the CMOS Schmitt-trigger oscillation circuit using fixed value resistors; (a) 50 kΩ, (b) 100 kΩ, and (c) 500 kΩ.
pulse frequency is well modulated by the variation of the S-D channel resistance of the MOSFET. However, the hysteretic behavior shown in Fig. 2.12 was not obtained, since an MFSFET was not used in this circuit. It is concluded from these results that the output pulse signals generated by the CMOS Schmitt-trigger circuit can be used as input signals for neurons in the next layer without connecting an additional amplifier.

2.4.4
In the previous subsection, it was demonstrated that the small output pulse height problem of the neuron circuit using the CUJT oscillator could be solved by replacing it with the CMOS Schmitt-trigger oscillator. Thus, in this subsection, the CMOS Schmitt-trigger oscillator is integrated with an MFSFET, and the adaptive-learning characteristics of the neuron circuit, with an improved pulse height property, are demonstrated. In the fabrication of the modified neuron circuit, the oscillation part was
Fig. 2.19 Output pulse waveforms of the CMOS Schmitt-trigger oscillator using a MOSFET as RE.

Fig. 2.20 Variation of output pulse frequency with the value of applied gate voltage.
fabricated using the same process as that of the CMOS Schmitt-trigger oscillator described in 2.4.2, and then the fabrication process of the MFSFET was incorporated, which was almost the same as that used in the fabrication of the CUJT neuron circuit discussed in Section 2.3. A photograph of the fabricated circuit is shown in Fig. 2.21, in which the capacitance value is fixed at 10 pF. All the fabricated devices, including an MFSFET and the MOSFETs composing the inverter circuits, were confirmed to operate normally by optimizing the fabrication conditions, especially the etching condition of the SBT. To examine the improved behavior of the modified neuron circuit, similar measurements were performed. The power supply voltage (VDD) was 5 V. Figure 2.22 shows the output pulse waveforms after a single pulse (20 ns, 6 V) and 60 pulses were applied to the gate terminal of the MFSFET. As can be seen in this figure, the adaptive-learning characteristics were similarly obtained in this modified neuron circuit. Furthermore, as expected, the height of the output pulses was almost the same as VDD. The output pulse frequency was also successfully modulated with the number of input pulses, as shown in Fig. 2.23. It is concluded from these results that the implementation of the neuron circuit using a CMOS Schmitt-trigger as the oscillation component is a very desirable solution to the small pulse height problem of the neuron circuit using a CUJT oscillator.

Fig. 2.21 A photograph of the modified neuron circuit.
2.5
Conclusions
A novel PFM-type adaptive-learning neuron circuit using an MFSFET as a synapse device was successfully fabricated on an SOI structure after optimizing the fabrication process. The main results obtained are summarized as follows.
(1) MFSFETs, which act as analog memories storing the synaptic weights in the neuron circuit, were fabricated using the SBT/Si structure, and good nonvolatile memory operations were demonstrated.
(2) Gradual change of the drain current of the MFSFET due to partial polarization reversal of the SBT gate film was demonstrated by applying a number of input pulses with a sufficiently short duration of 20 ns.
(3) In the integrated neuron circuit using an MFSFET and a CUJT oscillation circuit, the output pulse frequency was gradually changed as the number of input pulses applied to the MFSFET was increased.
(4) The problem of the small output pulse height of the CUJT neuron circuit was solved by replacing the CUJT oscillation circuit with the CMOS
Fig. 2.22 Output pulse waveforms in the adaptive-learning function of the modified neuron circuit.

Fig. 2.23 Gradual change of output pulse frequency in the modified neuron circuit (64.7 kHz to 154.2 kHz at a constant gate voltage of 1.85 V).
Schmitt-trigger circuit.

However, there are still problems which must be solved in the future. First, it was found that the retention time of the stored synaptic weight is not sufficiently long for use in practical systems. In a retention measurement of the circuit, the output pulse frequency obtained by application of input signals decreased to about one-half of the initial value after 500 s, which is much shorter than the memory retention time measured in individual MFSFETs (typically 5000 s). Secondly, the modulation range of the output pulse frequency is still too narrow. In order to carry out the learning operation in a neural network using the back-propagation method, it is generally said that the minimum resolution of the weight value in a synaptic connection is not less than 10 bits. In other words, the output frequency must be modulated over a range of three orders of magnitude. Therefore, further research on improving the overall performance of the oscillation circuit, as well as the device characteristics of the MFSFET, must be continued. Actually, since the retention time of an MFSFET is expected to be improved by optimizing the device structure, the next version of the neuron circuit, with a better retention property, is under fabrication. Finally, we conclude that this novel neuron circuit with an adaptive-learning function is very promising for the large-scale neural networks of the next generation, particularly once the above-mentioned problems are solved.
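The arithmetic behind the stated resolution requirement can be checked directly: 10-bit weight resolution means 2^10 distinguishable levels, and mapping each level to a distinct output frequency indeed calls for a modulation range of about three orders of magnitude.

```python
import math

# 10-bit synaptic resolution -> 2**10 = 1024 distinguishable weight levels,
# i.e. a required frequency range of about 10**3.
levels = 2 ** 10
orders_of_magnitude = math.log10(levels)
print(levels, round(orders_of_magnitude, 2))  # prints: 1024 3.01
```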
References
[1] O. Fujita and Y. Amemiya, "A floating-gate analog memory device for neural networks," IEEE Trans. Electron Devices, 40, pp.2029-2035, 1993.
[2] K. Nakajima, S. Sato, T. Kitaura, J. Murota, and Y. Sawada, "Hardware implementation of new analog memory for neural networks," IEICE Trans. Electron., E78-C, pp.101-105, 1995.
[3] T. Shibata, H. Kosaka, H. Ishii, and T. Ohmi, "A neuron-MOS neural network using self-learning-compatible synapse circuits," IEEE J. Solid-State Circuits, 30, pp.913-922, 1995.
[4] C. Diorio, P. Hasler, B. A. Minch, and C. A. Mead, "A single-transistor silicon synapse," IEEE Trans. Electron Devices, 43, pp.1972-1980, 1996.
[5] S. M. Sze, Physics of Semiconductor Devices, Wiley, New York, 1981.
[6] H. Ishiwara, "Proposal of Adaptive-Learning Neuron Circuit with Ferroelectric Analog-Memory Weights," Jpn. J. Appl. Phys., 32, pp.442-446, 1993.
[7] T. Fukushima, A. Kawahara, T. Nanba, M. Matsumoto, T. Nishimoto, N. Ikeda, Y. Judai, T. Sumi, K. Arita, and T. Otsuki, "A Microcontroller Embedded with 4Kbit Ferroelectric Non-Volatile Memory," 1996 Symp. on VLSI Circuits Tech. Dig., pp.46-47, 1996.
[8] D. J. Jung, N. S. Kang, S. Y. Lee, B. J. Koo, J. W. Lee, J. H. Park, Y. S. Chun, M. H. Lee, B. G. Jeon, S. I. Lee, T. E. Shim, and C. G. Hwang, "A 1T/1C Ferroelectric RAM using a Double-level Metal Process for Highly Scalable Nonvolatile Memory," 1997 Symp. on VLSI Technol. Tech. Dig., pp.139-140, 1997.
[9] K. Amanuma, T. Tatsumi, Y. Maejima, S. Takahashi, H. Hada, H. Okizaki, and T. Kunio, "Capacitor-on-Metal/Via-stacked-Plug (CMVP) Memory Cell for 0.25 μm CMOS Embedded FeRAM," IEDM Tech. Dig., pp.363-366, 1998.
[10] S. Tanaka, R. Ogiwara, Y. Itoh, T. Miyakawa, Y. Takeuchi, S. Doumae, H. Takenaka, and H. Kamata, "FRAM Cell Design with High Immunity to Fatigue and Imprint for 0.5 μm 3 V 1T/1C 1 Mbit FRAM," IEDM Tech. Dig., pp.359-362, 1998.
Adaptive Learning Neuron Integrated Circuits ...
Chapter 3

An Analog-Digital Merged Circuit Architecture Using PWM Techniques for Bio-inspired Nonlinear Dynamical Systems

Takashi Morie, Makoto Nagata and Atsushi Iwata
Hiroshima University
Abstract

This chapter presents an analog-digital merged neural circuit architecture using pulse width modulation (PWM) signals. In particular, circuits implementing bipolar-weighted summation and arbitrary nonlinear transformation are described. The weighted summation circuit attains 8-bit precision in SPICE simulation by compensating for parasitic capacitance effects. Measurement results of a prototype chip fabricated in a 0.6 μm CMOS process demonstrate an overall precision of 5 bits. A neural network has been constructed using the prototype chips, and the experimental results for realizing the XOR function have successfully verified the basic neural operation. The arbitrary nonlinear nonmonotone transformation is achieved in the conversion from analog voltage to PWM signals using plural comparators. Using this technique, we have fabricated a CMOS chaos chip and have succeeded in generating chaotic signals that exhibit bifurcation behaviors closely similar to those predicted by numerical simulation.

Keywords: analog-digital merged architecture, VLSI implementation, nonlinear dynamical system, pulse width modulation, PWM, switched-current source, bipolar-weighted summation, nonlinear transformation, neural networks, chaos, bifurcation
3.1 Introduction

Neural networks, an information processing paradigm inspired by biological nervous systems, have been recognized as a useful approach in such applications as vision, acoustics, robotics, and control systems. However,
the conventional neural network models used in many applications have only simple dynamics. Backpropagation networks, the most famous model, have a layered feed-forward structure and no dynamics. Boltzmann machines and Hopfield networks, which are also well-known models, have only symmetrical connections; their dynamics therefore always leads to fixed-point steady states, and chaotic behavior or oscillation is never observed. However, many recent studies in brain physiology and artificial neural network theory have revealed that nonlinear analog dynamics plays an important role in intelligent information processing. Chaotic neural networks [1; 2; 3], associative memory with nonmonotone dynamics [4; 5; 6], and nonlinear oscillator networks [7] are typical such models. In order to use these models in real-time, real-world applications, massively parallel nonlinear dynamical systems have to be constructed; thus, their VLSI implementation is essential. There have been many reports of VLSI implementation of neural networks, but the conventional implementation approaches can hardly realize such nonlinear dynamical systems. The aim of this chapter is to propose a VLSI circuit architecture and related circuit techniques for constructing massively parallel nonlinear dynamical systems. We have developed an analog-digital merged architecture using pulse width modulation (PWM) approaches, which differ from conventional ones. We evaluate the performance of our architecture and circuits by circuit (SPICE) simulation and by measurement results of fabricated prototype VLSI chips. This chapter is organized as follows. In Sec. 3.2, the VLSI implementation approaches for neural networks are compared, and the advantages of our new approach using PWM signals are clarified. Next, our basic circuit architecture is proposed. In Sec. 3.3, a neural circuit based on the PWM approach is described [8].
A new bipolar weighted summation method using PWM signals is proposed, and circuit techniques that achieve high calculation precision are introduced. The performance of the circuit is evaluated using SPICE simulation and measurement results of a prototype chip. In Sec. 3.4, a new circuit technique for arbitrary nonlinear transformation using PWM signals is proposed [9]. A chip that can generate arbitrary one-dimensional chaos is presented, and its measurement results are shown. Finally, we give the conclusion in Sec. 3.5.
3.2 A New VLSI Implementation Approach Using PWM Signals

3.2.1 Comparison between Various Implementation Approaches
VLSI implementations of neural networks are mainly classified into digital, analog, and pulse modulation approaches. The digital approach has high controllability and expandability. Digital systems are stable and robust against the various disturbances arising in real VLSI systems. Recently, some practical high-performance digital neural VLSI chips have been developed [10; 11; 12] that can implement large-scale neural networks. The digital approach, however, cannot essentially implement analog dynamics, although it can obtain high calculation precision in exchange for a large circuit area. Because the circuit components occupy a large area, massively parallel operation is difficult; instead, time-sharing operation is performed. This feature is suitable for implementing simple feedforward networks such as backpropagation networks, but not for realizing massively parallel analog dynamical neural systems. The analog approach, on the other hand, is obviously suitable for implementing analog dynamics. It is very powerful and effective for implementing recurrent networks that have analog dynamics. In addition, the circuit size can be reduced drastically compared with the digital approach. Thus, there have been many attempts to develop analog neural VLSI chips [13; 14; 15]. However, the calculation precision is limited by various non-idealities in circuit components, noise, and crosstalk [16; 17]. Moreover, it is not easy to perform arbitrary nonlinear, nonmonotone transformations; therefore, analog VLSI chips are designed for specific dynamics. The third approach, pulse modulation, achieves time-domain analog information processing using pulse signals. It includes several information representation methods, such as pulse density (pulse frequency) modulation (PDM), pulse width modulation (PWM), and pulse phase modulation (PPM). The pulse modulation approach has almost the same advantages as the digital approach.
The PDM approach has often been used because of its similarity to the behavior of biological neurons. It can approximately perform continuous-time, continuous-state dynamics. A digital system using the PDM approach has been developed for large-scale neural network implementation [18].
However, the PDM approach has the drawback of large power consumption because of its large transition rate. This chapter focuses on the PWM approach. A PWM signal represents information by its pulse width. The PWM approach is used in an analog-digital merged circuit architecture, where signals have digital values in the voltage domain and analog values in the time domain [19; 20]. PWM circuits mainly consist of digital circuit components; thus they match the scaling trend of Si CMOS technology and low-voltage operation. They operate with lower power consumption than traditional digital or PDM circuits because each datum is represented by only one state transition in the PWM approach. This is an important advantage over the PDM approach in VLSI systems. Thus, the PWM approach seems suitable for constructing bio-inspired VLSI systems. The PWM approach implements continuous-state, discrete-time dynamics. Obviously, discrete-time dynamics is not used in biological systems, but it has been thoroughly examined, and equivalent functions can be achieved in most cases. The PPM approach also has the same features as the PWM approach. However, it requires a reference (clock) signal defining the start time for measuring the phase, whereas a PWM signal includes all of the information in itself. Therefore, PWM signals can be transmitted efficiently, whereas PPM methods may effectively be used in local circuits.
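To make the transition-count argument concrete, here is a small illustrative sketch of our own (not from the chapter; the helper names are hypothetical). Since dynamic power in CMOS scales with the number of signal transitions, representing a value as a single pulse width (PWM) is far cheaper than representing it as a pulse count (PDM):

```python
# Hypothetical illustration of the transition-rate argument: in PDM a value
# v is encoded as v pulses (two edges each), while in PWM the same value is
# one pulse whose width encodes v (two edges total, independent of v).
def pdm_transitions(value: int) -> int:
    return 2 * value                    # one rising + one falling edge per pulse

def pwm_transitions(value: int) -> int:
    return 2 if value > 0 else 0        # a single pulse, whatever the value

v = 200                                 # an 8-bit sample value
print(pdm_transitions(v), pwm_transitions(v))   # prints: 400 2
```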
3.2.2 Basic Architecture Using PWM Signals
A basic neural architecture using PWM signals is shown in Fig. 3.1 [8]. The operation is as follows: (1) PWM signals are transmitted from other neurons. (2) Weighted summations are performed by converting the PWM signals into charges stored in a capacitor using switched-current sources (SCSs). (3) The voltage between the nodes of the capacitor, Vout, is compared with the reference signal, and is transformed into a PWM signal. In neural circuits, bipolar (positive and negative) weights are required corresponding to excitatory and inhibitory synapses. Therefore, this PWM neural architecture must be expanded in order to perform bipolar weighted summation. This is described in Sec. 3.3 [8].
Fig. 3.1 Basic neural architecture using PWM signals: switched current sources (SCSs) gated by PWM inputs charge a capacitor to Vout = Σ Ii Ti / C; a comparator with nonlinear reference Vref = f(t) produces the PWM output Tout = f⁻¹(Vout).
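The charge-integration step can be sketched numerically as follows (a behavioral model of our own, not the authors' circuit): each PWM pulse of width Ti gates a current source Ii, and the charge accumulated on the capacitor gives Vout = Σ Ii Ti / C.

```python
# Behavioral sketch of the weighted summation in Fig. 3.1 (our own model):
# currents in microamps, pulse widths in nanoseconds, capacitance in
# picofarads; since 1 uA * 1 ns = 1 fC and 1 fC / 1 pF = 1 mV, the result
# comes out directly in millivolts.
def pwm_weighted_sum_mV(currents_uA, widths_ns, C_pF):
    charge_fC = sum(i * t for i, t in zip(currents_uA, widths_ns))
    return charge_fC / C_pF

# Three synapses: weights (currents) 5, 3, 2 uA; inputs (widths) 100, 40, 250 ns.
vout = pwm_weighted_sum_mV([5, 3, 2], [100, 40, 250], C_pF=10)
print(vout)   # prints: 112.0
```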
Nonlinear transformation is another key point in the neural circuits. The PWM output signal is made by comparing Vout with a ramped reference signal. A nonlinear transformation can be performed in this comparison process by supplying a nonlinear reference waveform. If the reference signal voltage Vref varies nonlinearly in the time domain, i.e., Vref = f(t), where f is a nonlinear function, the pulse width of the output signal, Tout, is given by Tout = f⁻¹(Vout), where f⁻¹ is the inverse function of f. In this method, although f is limited to a monotone function, a sigmoidal function, which is often used in many neural network models, is easily generated. However, arbitrary nonlinear nonmonotone transformation is required for constructing general nonlinear dynamical systems. A new approach for arbitrary nonlinear transformation is described in Sec. 3.4 [9].
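The Vout-to-PWM conversion can be sketched in a time-stepped model (our own, with an assumed sigmoid reference; not the authors' comparator): the output pulse ends when the rising reference f(t) crosses Vout, so the resulting width is Tout = f⁻¹(Vout).

```python
import math

# Time-stepped sketch (our own) of voltage-to-PWM conversion with a
# nonlinear monotone reference Vref = f(t): the comparator flips at the
# first time step where f(t) >= Vout, giving a pulse width Tout ~ f^-1(Vout).
def to_pwm_width(vout, f, t_step=1e-3, t_max=1.0):
    t = 0.0
    while t < t_max and f(t) < vout:
        t += t_step
    return t

# An assumed sigmoidal reference waveform, centered at t = 0.5.
sigmoid = lambda t: 1.0 / (1.0 + math.exp(-10.0 * (t - 0.5)))
print(round(to_pwm_width(0.5, sigmoid), 2))   # prints: 0.5
```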
Fig. 3.2 PWM neuron circuit with four synapses.
Since the PWM-voltage-PWM transformations are analog operations, much attention should be paid to designing the corresponding circuits. However, establishing the design criteria is easier than in the pure analog approach because the analog parts in PWM circuits are localized.

3.3 A Neural Circuit Using PWM Signals
A neuron circuit based on the above PWM method is shown in Fig. 3.2, where a neuron with four synapses is assumed. PWM input pulses are fed into the synapse circuits Si (i = 1, ..., 4), weighted by the preset synaptic weights, and summed in the bipolar summation circuit SUM. The summation result is obtained as the voltage Vout. The reason why this summation circuit configuration is chosen is described below. Here, we assumed digital memory (4 bits and a sign bit) as the weight memory because it can easily be fabricated using ordinary VLSI fabrication technology. A synapse circuit configuration is shown in Fig. 3.3. However, analog memory is desirable in practical neural chips. One of the authors has developed a practical analog memory device and applied it to an analog neural VLSI chip [15]. Thus, PWM neural chips with analog synaptic memory can be realized.

3.3.1 Bipolar Weighting Methods
PWM signals turn on the current sources, and the capacitor is charged up. The synaptic weights are expressed as the current values of the current sources.

Fig. 3.3 Synapse circuit with 4-bit digital memory.

Bipolar weighting is achieved by either of the two methods illustrated in Fig. 3.4: (A) charging or discharging a single capacitor, or (B) preparing two identical capacitors, charging the corresponding capacitors with the absolute values of the positive and negative inputs, and subtracting the charges of the negative part from those of the positive part. Method A is simple, and it is based on the same idea as Kirchhoff's current law, which has been used in analog neural circuits [14]. Previously reported PWM neural chips [21; 22] use this method without any evaluation of its effectiveness in the PWM approach. However, in this method it is difficult to attain symmetric charging and discharging operation because PMOS and NMOS FETs are used as the charging and discharging current sources, respectively. In addition, the linear summation voltage range is smaller than in method B because of the non-ideal characteristics of both FETs. This small summation range is not so serious in analog neural circuits using the voltage or current domain: threshold operation near zero is most important, while large input values matter less because of the saturation characteristics of sigmoidal transfer functions. However, this is not the case in PWM neural circuits. As shown in Fig. 3.5, when unbalanced
Fig. 3.4 PWM methods for bipolar-weighted summation.

Fig. 3.5 Saturation effect in method A when unbalanced inputs are applied.
inputs are applied, saturation occurs in the bipolar weighted summation operation because of the small summation range, which leads to an error. In method B, it is rather easy to obtain high calculation precision because identical MOSFETs and capacitors are used for the positive and negative summations, and the relative precision between identical devices on a chip is very high. There exists an upper limit on the summation
Fig. 3.6 Two ways of performing the subtraction operation: (a) serial connection, (b) parallel connection.
of absolute values, as in method A, but the linear range is larger than in method A. As a result, we adopt method B to obtain high accuracy.

3.3.2 Subtraction Operation
In method B, there are two ways of subtracting the charges of the negative part from those of the positive part, as shown in Fig. 3.6: (a) serial connection and (b) parallel connection. If two identical capacitors C are used, the maximum voltage integrable in a capacitor with the parallel connection is twice as large as with the serial connection. Therefore, the calculation precision of the parallel connection is higher than that of the serial connection. However, the operation speed of the parallel connection is slower because charge redistribution occurs. When C = 10 pF, SPICE simulation using 0.6 μm CMOS parameters at a 5 V supply voltage showed an operation speed of 20-30 ns for the serial connection and 50-60 ns for the parallel connection. Because we consider calculation precision more important than operation speed, we adopt the parallel connection configuration.

Fig. 3.7 Parasitic capacitance effect on the summation operation.

3.3.3 Summation Accuracy
Even when method B is used, a calculation error arises due to parasitic capacitance. As shown in Fig. 3.7, charge ΔQn is stored in the parasitic source/drain junction capacitance in the absolute-value summation mode, while it is discarded in the subtraction mode. Because of this effect, the
Fig. 3.8 Weighted summation accuracy with and without parasitic capacitance compensation (SPICE results).
SPICE simulation results showed that the weighted summation error is ±0.6% (6 bits), as shown in Fig. 3.8. To improve the calculation precision, we add a compensation capacitor Cc to the positive part, as shown in Fig. 3.9. We assume that the compensation capacitance is equal to the parasitic capacitance ΔC, and that the storage capacitance in the positive part is C − ΔC, while that in the negative part is C. In the absolute-value summation mode, charge Qp is stored separately in the three capacitances C − ΔC, ΔC, and Cc (= ΔC), while Qn is stored in C and ΔC. Then, in the subtraction mode, the charges in Cc and those in ΔC in the negative part are discarded. As a result, almost the same ratio of charges Qp and Qn is used for subtraction. The value of the compensation capacitance was determined by simulation to be 0.4 pF, assuming C = 10 pF. By using this improved circuit,
Fig. 3.9 Parasitic capacitance compensation.

Fig. 3.10 Micro-photograph of the prototype chip (synapses, comparator, source follower, switches, and capacitors).
the precision is improved, and 8-bit precision was obtained in SPICE simulation, as shown in Fig. 3.8.

3.3.4 A Prototype Neural Chip and Its Measurement Results
We fabricated a prototype VLSI chip using a 0.6 μm double-poly, double-metal CMOS fabrication process. The chip includes four weights (synapses) and one summation and nonlinear transformation block (neuron). For the nonlinear transformation, we used a high-performance differential-type latch comparator [8]. A chip photograph is shown in Fig. 3.10. Figure 3.11 shows typical waveforms indicating neural processing in the prototype chip. The power supply voltage is 5 V. In the absolute-value
Fig. 3.11 Typical waveforms indicating neural processing in a prototype chip.
summation mode, currents driven by the PWM input pulses are integrated on the capacitors; the subtraction mode follows; and in the comparison mode, the weighted summation result is compared with the reference voltage and a PWM pulse is generated. We measured the PWM input-output relationship with a sigmoidal reference waveform. The four synaptic weights were set at (0, +8, +7, −15). Under these conditions, the power consumption of the chip was 12 mW. The sigmoidal transformation result and the overall precision are shown in Fig. 3.12. The precision is determined not only by the weighted summation circuit but also by the comparator, which includes a source-follower buffer for converting the output voltage to the PWM signal. We obtained an overall precision of 5 bits because the calculation error is less than ±2%. Although the calculation precision required in neural networks depends on the application, many studies have demonstrated that networks with a feedforward calculation precision of 5 bits can manage to learn using the backpropagation learning method [23; 24; 25; 26]. Thus, this prototype chip can be used for various applications. However, it is obviously important to improve the calculation precision further. The origins of the calculation error are as follows: (1) The error arising at a summation result of zero in Fig. 3.12 is caused by a mismatch of the MOSFETs
Fig. 3.12 Measurement results of neural processing accuracy.
in the SCSs. This can be reduced by improving the circuit configuration and layout, and it is also corrected in the neural network learning operation. (2) The error symmetric with respect to the vertical axis (a V-shaped error) is probably attributable to the fact that the value of the compensation capacitance described in Sec. 3.3.3 was not optimal. (3) The large variation in the negative output region is due to nonlinearity in the source-follower buffer circuit. If these errors are reduced by optimizing the circuit design, the total error will be reduced to 0.5%, which is attained when the weighted summation result is more than 10. This means that nearly 8-bit precision will be obtained.

3.3.5 Network Operation Test
We constructed a three-layer feedforward network using the prototype chips to evaluate the performance of the PWM neural circuits. The network configuration is 2-2-1 (input-hidden-output neurons) with a bias neuron, implementing the exclusive-OR (XOR) function. The synaptic weights were calculated by numerical simulation and loaded into the weight memories. The experimental results are shown in Fig. 3.13. The outputs show the correct results. Although it takes several microseconds to obtain the output
Fig. 3.13 Experimental results of neural network operation implementing the XOR function.
in this experiment, this interval can be reduced to less than 1 μs because the charge integration completes in less than 0.1 μs and the comparator can operate at more than 50 MHz. However, the width of the PWM pulses depends on the required calculation precision.

3.4 Arbitrary Nonlinear Transformation Using PWM Signals
Various nonlinear transformations, especially nonmonotone transformations, are required for advanced neural network models [1; 2; 3; 4; 5; 6; 7]. If a nonmonotone transformation is achieved, chaotic signals are easily generated by feeding back the output to the input of the nonmonotone function generator. There have been several reports of nonmonotone function and chaotic signal generator circuits using ordinary analog circuits in the voltage or current
domains. A voltage-mode chaos circuit [6] consists of op-amps, diodes, and resistors. A current-mode chaos circuit [27] is designed using bipolar transistors as well as MOSFETs. The use of these circuit components is not suitable for CMOS VLSI chips. The shapes of the transfer functions generated by these analog circuits are strongly restricted by the characteristics of the circuits or component devices; it is difficult to change the function shapes arbitrarily, and accurate control of the transfer functions is also difficult.

3.4.1 Basic Idea
In this section, we describe an arbitrary nonlinear transformation method that compares the input voltage with a nonlinear reference waveform during voltage-to-pulse conversion. In Sec. 3.2.2, such a method was applied to achieve arbitrary monotone transformation. It can be expanded to nonmonotone transfer function generation by using plural comparators [28]. Figures 3.14 and 3.15 show examples of constructing second-order and third-order nonlinear functions, respectively. This approach is useful because it is fairly easy to generate arbitrary nonlinear waveforms as a function of time by D/A conversion of digitally stored waveform data, whereas it is very difficult to generate arbitrary nonlinear transfer functions using ordinary analog circuits in the voltage or current domains. Furthermore, plural transformation circuits can share a common nonlinear waveform generator, provided synchronous operation is assumed. Thus, the proposed circuits are well suited to VLSI implementation of large-scale nonlinear dynamical systems.

3.4.2 Circuit Design
We use a clocked CMOS comparator, as shown in Fig. 3.16. This comparator has a simpler configuration, consumes less power, and is more suitable for low-voltage operation than comparators using a differential pair. Clocked operation is suitable for making PWM signals because it is a discrete-time operation. This comparator consists of CMOS inverters U1 to U3, a capacitor C1, and switches SW1 to SW3 controlled by clocks φ1 and φ2. The input voltage is sampled onto C1 during the φ1 period. This operation compensates for
Fig. 3.14 Second-order nonlinear transformation circuit using PWM signals.

Fig. 3.15 Third-order nonlinear transformation circuit.
the fluctuation of the threshold voltage in inverter U1. In the φ2 period, a monotonically ramped reference voltage VA is supplied to capacitor node N1. When VA reaches Vin, the voltage of the other capacitor node (the input node of inverter U1), N2, reaches Vth and the inverters invert; thus, an output pulse is generated. In this comparator operation, capacitor C1 and switch SW1 operate as a sample-and-hold (S/H) circuit.

Fig. 3.16 Clocked CMOS comparator circuit including the S/H mechanism.

Thanks to the S/H mechanism of the clocked CMOS comparator, iterated operation can be achieved with a simple circuit configuration. As a simple example of a nonlinear dynamical system, Fig. 3.17 shows an arbitrary chaos generator with second-order nonlinearity. In this circuit, the scheme shown in Fig. 3.14 is used for generating the second-order nonlinearity. By the S1 signal, the terminal voltage of capacitor C1, Vout, is held as the state value x(t), and at the same time it is transferred through a buffer to the nodes of capacitors C2 and C3 in the clocked comparators. Then, Vout is reset by the S2 signal. Next, by the S3 signal, the PWM output, Tout, drives the SCS, and the voltage Vout is updated. Thus, this circuit implements the following dynamics:
x(t + 1) = f(x(t)),    Tout(t + 1) = f(Tout(t)),        (1)

where f is a second-order nonlinear function defined by the voltage waveforms supplied as VA and VB. It is noted that both the voltage and the PWM signals following the given dynamics can be obtained simultaneously. This is a unique feature of this circuit.
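Equation (1) can be emulated with a few lines of code (a behavioral model of our own, not a circuit simulation): the state is held, mapped through the second-order nonlinearity f, and written back once per clock cycle.

```python
# Behavioral model (ours) of the iterated dynamics x(t+1) = f(x(t)) in
# Eq. (1). Here f is an example second-order (quadratic) nonlinearity of
# the kind realizable through the VA/VB reference waveforms.
def iterate(f, x0, steps):
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))       # one clocked update per step
    return xs

f = lambda x: 4.0 * x * (1.0 - x)  # assumed quadratic map on [0, 1]
print([round(x, 3) for x in iterate(f, 0.2, 4)])   # prints: [0.2, 0.64, 0.922, 0.289, 0.822]
```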
Fig. 3.17 Arbitrary chaos generator circuit with second-order nonlinearity designed using clocked CMOS comparators.
3.4.3 A CMOS Chip Generating Arbitrary Chaos
We fabricated an arbitrary chaos generator chip using a 0.4 μm CMOS process based on the circuit shown in Fig. 3.17. The capacitances are C1 = 5 pF and C2 = C3 = 1 pF. As the buffer, either a voltage follower built from an on-chip op-amp or a CMOS source follower was used. The former gives high accuracy but occupies a large chip area; the latter makes the whole circuit compact, but its input-output characteristic is not completely linear. However, this nonlinearity can be compensated for by modifying the reference waveforms. The following results were obtained using the voltage follower. A micro-photograph of the arbitrary chaos generator circuit without the buffer part is shown in Fig. 3.18. Typical chaotic behavior can be observed using a tent map:
x(t + 1) = 2a x(t)        (0 ≤ x(t) ≤ 0.5)
         = 2a (1 − x(t))  (0.5 < x(t) ≤ 1),    (2)
Fig. 3.18 Microphotograph of the arbitrary chaos generator circuit without the buffer part (scale bar: 100 μm).
or using a logistic map:
x(t + 1) = 4a x(t)(1 − x(t))    (0 ≤ x(t) ≤ 1),    (3)
where a is a parameter ranging from 0 to 1. These dynamics can be implemented by the chaos generator chip with the corresponding reference waveforms. In the following measurements, the reference voltage waveforms VA and VB were supplied by external arbitrary waveform generators. Chaotic behavior of Vout in the logistic map was observed as shown in Fig. 3.19, where the clock period was 4 μs. The disturbances observed in the plateau regions are attributed to the S1 control signal. Figure 3.20 shows return maps for the tent map and the logistic map obtained from the observed waveforms, where a = 1. The sampling timing is just before the disturbance by the S1 clock. These results demonstrate that the calculation precision is 6 bits. Figure 3.21 shows bifurcation diagrams observed on the oscilloscope
Fig. 3.19 Chaotic behavior observed in the logistic map using the CMOS chaos generator chip. The starting time (t = 0) is arbitrary.
screen. In order to observe them on the screen, Vbias was ramped linearly at around 20 Hz. The oscilloscope was set to the X-Y and point-plot modes. The observed bifurcation diagrams are similar to those obtained by numerical simulation. The relatively large noise is mainly due to the sampling timing for display on the screen.
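The bifurcation-diagram measurement can be mimicked in software: sweep the map parameter a (set on the chip via the Vbias ramp) and, for each value, record the attractor after discarding a transient. The sketch below uses Eqs. (2) and (3); the function names and sweep range are illustrative, not taken from the chip's measurement setup.

```python
# Software sketch of the bifurcation-diagram sweep for the tent map
# (Eq. 2) and the logistic map (Eq. 3). The parameter sweep stands in
# for the linear Vbias ramp used in the measurement.

def logistic(x, a):
    return 4.0 * a * x * (1.0 - x)

def tent(x, a):
    return 2.0 * a * x if x < 0.5 else 2.0 * a * (1.0 - x)

def bifurcation(mapping, a_values, transient=200, keep=50):
    """For each a, discard a transient, then keep the remaining orbit."""
    diagram = []
    for a in a_values:
        x = 0.3
        for _ in range(transient):
            x = mapping(x, a)
        points = []
        for _ in range(keep):
            x = mapping(x, a)
            points.append(x)
        diagram.append((a, points))
    return diagram

# Sweep a from 0.50 to 0.99; plotting (a, points) reproduces the
# familiar period-doubling route to chaos seen in Fig. 3.21(b).
diag = bifurcation(logistic, [i / 100.0 for i in range(50, 100)])
```

Plotting every (a, x) pair of `diag` as dots yields the software counterpart of the oscilloscope diagrams.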
Fig. 3.20 Return maps obtained from observed waveforms. (a) tent map, (b) logistic map.

3.5 Conclusion
An analog-digital merged circuit architecture using the PWM approach was presented. In our architecture, conversion from PWM pulses to analog voltage, or from analog voltage to PWM, is effectively used for weighted summation and arbitrary nonlinear transformation. The proposed bipolar-weighted summation circuit attained 8-bit precision in SPICE simulation by compensating for parasitic capacitance effects. We fabricated a prototype neural chip including one neuron and four synapses
Fig. 3.21 Bifurcation diagrams observed on the oscilloscope screen. (a) tent map, (b) logistic map.
using a 0.6 μm CMOS process. The measurement results demonstrated that the overall precision of the weighted summation and the sigmoidal transformation is 5 bits. Although the precision achieved in the prototype chip is lower than expected from SPICE simulation, the analysis indicates that the precision can be improved to around 8 bits by optimizing the circuit design. We have also attained arbitrary nonlinear analog transformation using PWM signals. This cannot be realized in the ordinary analog approach, nor in the digital approach. We fabricated a CMOS chaos generator chip using a 0.4 μm CMOS process. This chip exhibited chaotic behaviors as predicted
by the numerical simulation. In the future, practical VLSI systems implementing arbitrary analog nonlinear dynamics will be constructed using this architecture. It will provide new hardware for various bio-inspired models such as advanced associative memory, chaotic neural networks, and nonlinear oscillator networks. Moreover, we can also provide new hardware that can dynamically change its analog dynamics. We expect that such hardware will lead to proposals of innovative information processing models.

Acknowledgments

The authors would like to thank Jun Funakoshi and Souta Sakabayashi for their contributions to this work. This work was supported by the Ministry of Education, Science, Sports, and Culture under a Grant-in-Aid for Scientific Research on Priority Areas, "Ultimate Integration of Intelligence on Silicon Electronic Systems" (Head Investigator: Tadahiro Ohmi, Tohoku University). This work was also supported in part by a Mazda Foundation Research Grant.
References

[1] K. Aihara, T. Takabe, and M. Toyoda, "Chaotic Neural Networks," Phys. Lett. A, 144, pp. 333-340, 1990.
[2] H. Nozawa, "A Neural Network Model as a Globally Coupled Map and Applications Based on Chaos," Chaos, 2, pp. 377-386, 1992.
[3] S. Ishii, K. Fukumizu, and S. Watanabe, "A Network of Chaotic Elements for Information Processing," Neural Networks, 9, pp. 25-40, 1996.
[4] M. Morita, "Associative Memory with Nonmonotone Dynamics," Neural Networks, 6, pp. 115-126, 1993.
[5] H. Kakeya and T. Kindo, "Hierarchical Concept Formation in Associative Memory Composed of Neuro-window Elements," Neural Networks, 9, pp. 1095-1098, 1996.
[6] T. Miki, M. Shimono, and T. Yamakawa, "A Chaos Hardware Unit Employing the Peak Point Modulation," Proc. Int. Symp. Nonlinear Theory and its Applications, pp. 25-30, 1995.
[7] D. L. Wang and D. Terman, "Image Segmentation Based on Oscillatory Correlation," Neural Computation, 9, pp. 805-836, 1997.
[8] T. Morie, J. Funakoshi, M. Nagata, and A. Iwata, "An Analog-Digital Merged Neural Circuit Using Pulse Width Modulation Technique," IEICE Trans. Fundamentals, E82-A, pp. 356-363, 1999.
[9] T. Morie, S. Sakabayashi, M. Nagata, and A. Iwata, "Nonlinear Dynamical Systems Utilizing Pulse Modulation Signals and a CMOS Chip Generating Arbitrary Chaos," Proc. 7th Int. Conf. on Microelectronics for Neural, Fuzzy and Bio-inspired Systems (MicroNeuro '99), pp. 254-260, Granada, 1999.
[10] C. Park, K. Buckmann, J. Diamond, U. Santoni, S. The, M. Holler, M. Glier, C. Scofield, and L. Nunez, "A Radial Basis Function Neural Network with On-chip Learning," Proc. Int. Joint Conf. on Neural Networks, pp. 3035-3038, 1993.
[11] Y. Kondo, Y. Koshiba, Y. Arima, M. Murasaki, T. Yamada, H. Amishiro, H. Shinohara, and H. Mori, "A 1.2GFLOPS Neural Network Chip Exhibiting Fast Convergence," IEEE Int. Solid-State Circuits Conf. Dig., pp. 218-219, 1994.
[12] O. Saito, K. Aihara, O. Fujita, and K. Uchimura, "A 1M Synapse Self-Learning Digital Neural Network Chip," IEEE Int. Solid-State Circuits Conf. Dig., pp. 94-95, 1998.
[13] C. R. Schneider and H. C. Card, "Analog CMOS Deterministic Boltzmann Circuits," IEEE J. Solid-State Circuits, 28, pp. 907-914, 1993.
[14] T. Morie and Y. Amemiya, "An All-analog Expandable Neural Network LSI with On-chip Backpropagation Learning," IEEE J. Solid-State Circuits, 29, pp. 1086-1093, 1994.
[15] T. Morie, O. Fujita, and K. Uchimura, "Self-Learning Analog Neural Network LSI with High-Resolution Non-Volatile Analog Memory and a Partially-Serial Weight-Update Architecture," IEICE Trans. Electron., E80-C, pp. 990-995, 1997.
[16] R. C. Frye, E. A. Rietman, and C. C. Wong, "Back-Propagation Learning and Nonidealities in Analog Neural Network Hardware," IEEE Trans. Neural Networks, 2, pp. 110-117, 1991.
[17] T. Morie, O. Fujita, and Y. Amemiya, "Analog VLSI Implementation of Adaptive Algorithms by an Extended Hebbian Synapse Circuit," IEICE Trans. Electron., E75-C, pp. 303-311, 1992.
[18] Y. Hirai and M. Yasunaga, "A PDM Digital Neural Network System with 1,000 Neurons Fully Interconnected via 1,000,000 6-bit Synapses," Proc. ICONIP, pp. 1251-1256, 1996.
[19] A. Iwata and M. Nagata, "A Concept of Analog-Digital Merged Circuit Architecture for Future VLSI's," IEICE Trans. Fundamentals, E79-A, pp. 145-157, 1996.
[20] M. Nagata, J. Funakoshi, and A. Iwata, "A PWM Signal Processing Core Circuit Based on a Switched Current Integration Technique," IEEE J. Solid-State Circuits, 33, pp. 53-60, 1998.
[21] E. I. El-Masry, H. K. Yang, and M. A. Yakout, "Implementations of Artificial Neural Networks Using Current-Mode Pulse Width Modulation Technique," IEEE Trans. Neural Networks, 8, pp. 532-548, 1997.
[22] J. C. Bor and C. Y. Wu, "Realization of the CMOS Pulsewidth-Modulation (PWM) Neural Network with On-Chip Learning," IEEE Trans. Circuits & Syst. II, 45, pp. 96-107, 1998.
[23] P. W. Hollis, J. S. Harper, and J. J. Paulos, "The Effect of Precision Constraints in a Backpropagation Learning Network," Neural Computation, 2, pp. 363-373, 1990.
[24] D. D. Caviglia, M. Valle, and G. M. Bisio, "Effects of Weight Discretization on the Back Propagation Learning Method: Algorithm Design and Hardware Realization," Proc. Int. Joint Conf. on Neural Networks, pp. II-631-637, 1990.
[25] J. L. Holt and J. Hwang, "Finite Precision Error Analysis of Neural Network Electronic Hardware Implementations," Proc. Int. Joint Conf. on Neural Networks, pp. I-519-525, Seattle, 1991.
[26] B. W. Lee and S. W. Kim, "Required Dynamic Range and Accuracy of Electronic Synapses for Character Recognition Applications," IEEE Proc. of Int. Symp. Circuits and Systems, pp. 1545-1548, San Diego, 1992.
[27] K. Eguchi and T. Inoue, "A Current-Mode Analog Chaos Circuit Realizing a Henon Map," IEICE Trans. Electron., E80-C, pp. 1063-1066, 1997.
[28] T. Morie, S. Sakabayashi, M. Nagata, and A. Iwata, "Nonlinear Function Generators and Chaotic Signal Generators Using a Pulse-Width Modulation Method," Electron. Lett., 33, pp. 1351-1352, 1997.
Chapter 4

Application-Driven Design of Bio-Inspired Low-Power Vision Circuits & Systems

Andreas König, Jan Skribanowitz, Jens Döge, Michael Eberhardt, and Thomas Knobloch

Dresden University of Technology
Abstract

Natural vision systems are yet unrivaled with regard to parameters such as performance, size, and power consumption in comparison to today's technical and predominantly digital implementations. This especially holds for complex vision tasks, e.g., in image sequence analysis. Application-specific constraints, imposed by many real-time vision tasks, can be met by an opportunistic design of bio-inspired circuits and systems employing analog and mixed-signal design techniques. Consequently, a plethora of vision chips exploiting basic principles have been designed, but only few can actually serve in real applications. Today, the modeling of complete application systems requires a hybrid approach and an appropriate design methodology to assure the viability of the resulting integrated system. This paper reports on a research activity that tackles the development of a corresponding design methodology. Several application projects, e.g., OCR, automotive image processing, eye tracking, and visual inspection, will be introduced that were subject to this design methodology and gave feedback to advance the methodology for the systematic design of integrated cognitive systems.

Keywords: bio-inspired VLSI systems, systematic low-power mixed-signal design, design methodology, CMOS image sensors, vision chips, automotive applications, overtake monitoring, 3D-displays, eye-trackers, image coding, OCR
4.1 Introduction
Numerous machine vision problems, e.g., complex surveillance tasks, automotive applications, or automated visual inspection and visual process
control, impose high demands on viable solutions in terms of size, speed, performance, and power consumption. Furthermore, cost and turnaround time are critical factors. Today's predominantly digital systems cannot always provide an adequate problem solution with available state-of-the-art hardware with regard to all the constraints specified above. In contrast, biological systems frequently surpass man-made structures with respect to these requirements. Therefore, the systematic, technological exploitation of salient features from the wealth of biological and physiological evidence by bio-inspired algorithms and circuit implementations is of relevance for advanced microelectronic application solutions. This especially holds for issues of power dissipation and the related high input currents and heat dissipation. The SIA roadmap [30] points out that with ongoing feature size reduction and technological advances, power consumption increases or, at best, stagnates. In addition to benefits in power consumption, bio-inspired systems mimicking the fault tolerance by graceful degradation as well as the adaptation and learning capability found in biological systems can alleviate problems met in the design, yield, and test of today's complex integrated circuits and systems. However, the lessons learned from neural network hardware design versus the development and applicability of general-purpose (GP) hardware, e.g., from the surging communication market and the respective low-voltage and low-power implementations, have to be taken into account to achieve technically and economically sound and viable dedicated system solutions with regard to competing GP solutions. Especially systems that employ complex spatio-temporal processing principles observed in biological systems ([10], [9]) are still not within reach for GP digital hardware under the constraints of size, power consumption, and cost.
Complex problems that benefit from such principles and their respective implementation are thus the most attractive candidates for dedicated implementation efforts. In conjunction with an opportunistic design style [36], biological principles and bio-inspired circuits employing analog and mixed-signal design techniques in a potentially massively parallel architecture make it possible to deal with computationally burdensome tasks in a very efficient way. In particular, the combination of image acquisition and early vision processing is attractive for system solutions. Salient, yet complex phenomena such as recurrent feature maps, selective attention, temporal binding of features as well as habituation and adaptation processes can thus be efficiently implemented and exploited in technical applications. Generic system solutions as well as complete bio-inspired systems in the
described domain are still out of the question as, with today's technology, the implementation of the required complexity under the given constraints is not yet within reach. Nevertheless, hybrid systems in CMOS technology with advanced bio-inspired, spatio-temporal processing and dynamics in analog technology, including CMOS-compatible sensors, can be saliently combined with dedicated digital processing for competitive, dedicated, integrated system solutions. Remarkable examples are reported, for instance, in [23], [15], and [2] on implementations of an "artificial retina" chip and related systems for 3D human motion recognition as well as general image preprocessing, or CSEM's motion detector chip for pointing devices [1], which is used in Logitech's Marble trackballs [25] as part of a commercial product. To widen the scope to additional application domains, to exploit the described potential, and to meet constraints of turnaround time, development and design costs as well as overall system validity and performance, an efficient design methodology is required. In our work, we introduce such a methodology and enhance the standard design flow by a level for fast behavioral modeling of the vision task. Our objective is to alleviate and advance the design of hybrid application-specific vision systems incorporating bio-inspired algorithms in an opportunistic, low-power design style, so that today's industrial application needs, constraints, and requirements are met. Special emphasis is put on the realization of high-performance, yet extremely power-conserving circuits and systems with optimum exploitation of today's microelectronics potential. In the following section, our design methodology is presented. Then, examples of application-specific integrated vision-based recognition systems, their modeling, and their implementation are described. Concluding, we assess the current state and future aims of our design methodology.
4.2 Methodology for application-specific design of vision circuits and systems
The introduction pointed out the need for an efficient top-down design methodology tailored to the design flow of vision and recognition systems. Typically, the design of such a general intelligent system starts with a coarse specification of the problem and the desired solution. Based on available examples and/or knowledge, a first-cut reference system must be designed
to assure the viability of the solution by simulations. After system optimization and tests of robustness, the VLSI design effort can be started using the simulation system as a reference and its results as benchmarks. It is evident that the flexibility of the design platform as well as the existence and availability of suitable system performance measures, e.g., in terms of discriminance and recognition ability, are crucial for the success and rapid advance of the design effort.

Fig. 4.1 Enhanced Y diagram dedicated to mixed-signal cognitive systems design.

Employing the simulation system as the QuickCog baseline together with a behavioral description, the design process now advances by repeated partitioning and elaboration of building blocks of lower complexity. In the process, the design description is both detailed and advanced from behavioral to functional and, finally, to geometrical representation. Design decisions and compromises take place, e.g., the choice of a certain fixed-point computational accuracy as well as the selection of a specific circuit technology. These design options, though they might bring benefits concerning area or power consumption, can be extremely detrimental to overall system performance and, thus, can put the viability of
the overall design into question. Following the basic idea and methodology applied in the systematic design of neural network hardware, neurochips, and neurocomputers [17], and extending this experience to the system level, a methodology for the systematic and optimized design of integrated cognitive systems can be introduced here. Figure 4.1 visualizes the approach, enhancing the well-known Y-diagram of Gajski and Kuhn [7] to address the issues implied by the mixed-signal implementation of cognitive systems. The top-down design process described above, from the concept level to algorithmic representations in both C/C++ and hardware description languages such as Verilog/Verilog-A and VHDL/VHDL-A and the conversion into structural and, finally, geometrical representations, is complemented by introducing feedback and assessment paths from the various levels of description and representation to the reference system. Thus, in principle, chosen design options can systematically and rapidly be validated, and the viability of the chip design can thus be assured, while minimizing design time, effort, and related costs. It is obvious from the discussion that the properties of the tool for reference system modeling and design state assessment are crucial for the overall success of the proposed methodology. For this aim, the QuickCog system has been devised in a concurrent research project (cf., e.g., [21], [20], [19]). The general architecture of the adaptive QuickCog system is given in Fig. 4.2. It meets the needs of rapid reference system modeling by providing the following key features:

Fig. 4.2 QuickCog adaptive system architecture.

(1) Visual programming of block diagrams for system modeling.
(2) Sample-set-oriented processing.
(3) A large collection of significant and proven methods for image processing, pattern recognition, and artificial neural networks. Currently, bio-inspired information processing methods are included in the system.
(4) A convenient and intuitive graphical user interface (GUI) that supports data acquisition (e.g., images or image sequences), sample set creation for learning from examples, region of interest (ROI) definition and object partitioning as well as preclassification.
(5) Feature space visualization based on multivariate data projection and interactive visualization techniques. This visualization gives insight into the current problem characteristics, e.g., feature discriminance, separability, class overlap, or the number of modes per class. Gradual degradation in system performance due to chosen design options can be detected in the feature space visualization.
(6) Assessment functions related to feature space visualization. These measures can also serve to detect and assess degradation in system performance due to chosen design options.
(7) Automatic feature selection, method selection, and method parameter optimization.
(8) A comprehensive classifier toolbox from simple centroid to powerful nonparametric classifiers, comprising statistical approaches as well as neural networks.

Thus, QuickCog provides a platform for fast and efficient modeling of integrated cognitive systems from reference system modeling to implementation
evaluation, using QuickCog's unique and powerful modules for the assessment of a system's efficacy. The adaptive features of the architecture, which considerably facilitate and accelerate the reference system modeling and reduce overall turnaround time, can also be exploited for pure software system design. Therefore, QuickCog also serves as a commercial tool for general visual inspection tasks. One example is given in Fig. 4.3.

Fig. 4.3 QuickCog applied in an electronics manufacturing task.

In the following, the methodology will be elucidated by a very simple design example. For the well-known Iris data, a classifier was designed in CMOS technology operating in the subthreshold mode. The classifier implements the recall structure required for Learning Vector Quantization (LVQ) [22] or Nearest-Neighbor techniques (kNN) [6] (cf. Fig. 4.4). The training reference system for this simple example is given in Fig. 4.5. The training system is complemented by tools for feature space visualization and assessment [18]. In an actual application, the feature input of the classifier would stem from an image processing and feature extraction hierarchy. Figure 4.6 shows the reference test system and the modified test system, which incorporates the hardware model of the classifier. In the block Stimuli In/Out, stimuli are converted and handed down the design hierarchy for simulations, and the achieved simulation results are fed back to the system for ongoing processing as well as result analysis and assessment. Figure 4.7 shows the
Fig. 4.4 LVQ and kNN classifier recall architecture.
Fig. 4.5 LVQ and kNN classifier training in QuickCog.
comparison of the reference and the hardware test system, employing confusion matrices for classifier recall result analysis. Evidently, the classifier's performance deteriorated from 93% to 73% due to design and circuit imperfections. For directed optimization, which is a substantial part of our design methodology, the statistical performance on its own does not provide sufficient information. In addition to the overall classification rate and the confusion rates between individual classes, the location of misclassified patterns in feature space as well as the respective feature values are of significant interest. For this purpose, our methodology uses multivariate data
Fig. 4.6 LVQ and kNN classifier recall by reference and hardware model.

Fig. 4.7 Confusion matrices of classification results. (a) Reference, (b) Hardware model.
projection and advanced interactive feature space visualization offered by the QuickCog system [18]. These can be employed to understand the problem and optimize the circuit. Figure 4.8 shows the feature space with the imposed class labels of the reference and the hardware model, respectively. At each projection point, the feature values are plotted in a radial representation.

Fig. 4.8 Feature space with classification results. (a) Reference, (b) Hardware model.

From the feature space visualization, it becomes obvious that misclassifications strongly correlate with very large feature values. Thus, the current circuit and its dimensioning still suffer from a saturation problem for large feature and metric values.

Fig. 4.9 Classifier circuits. (a) Subtraction, (b) Absolute value computation.

This system-oriented analysis sustains
the consistency and information processing properties of the design. It can be employed on the behavioral level, e.g., in Verilog or VHDL, on the functional level, as given in Fig. 4.9 for two standard subcircuits of the classifier, as well as on the geometrical level, e.g., in simulations based on the extracted layout of the regarded circuits (cf. Fig. 4.10). The simple classifier example gives an idea of the design methodology and the assessment functions
Fig. 4.10 Classifier circuit layout. (a) Subtraction, (b) Absolute value computation.
provided in QuickCog. More visual and numeric assessment functions, e.g., for image comparison and assessment, feature space assessment as well as classifier performance assessment based on estimated a posteriori values, are available. These allow failures to be detected and gradual degradations caused by design decisions to be disclosed, and provide a means to systematically correct and optimize the design. In ongoing extensions of our methodology, we have begun to exploit adaptation and learning mechanisms for the compensation of circuit imperfections. Similar to the standard approach used for training analog neural network chips, e.g., the Intel ETANN chip, which is denoted as hardware-in-the-loop learning, models of the imperfect circuits will be generated and incorporated in the system configuration and learning phase. Extending QuickCog to a true self-learning system, imperfections of innovative devices, circuits, and resulting structures can be overcome by:

(1) Compensation employing learning of the following stages.
(2) Learning in the same stage with a model of the imperfect circuit, e.g., optimizing design parameters, coefficients, or degree of parallelism.
(3) Learning of all stages for optimization and compensation.

So far, our methodology provides a way of systematic, consistent system design. The issue of design automation with a focus on synthesis techniques has not yet been tackled in our project work. However, many research activities on analog and mixed-signal design synthesis can be observed that can be exploited and integrated with our work in the future. In the next section, several chip and system design examples will be presented that profited from the described design methodology. These implementations were not only subject to the design methodology but actively contributed to it by feeding back experience, algorithms, methods, and simulation techniques. The salient essence of the realized project work was extracted and integrated into the QuickCog system to alleviate and advance the design of future application-specific integrated cognitive systems.
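The recall structure of the LVQ/kNN classifier example (Fig. 4.4) can be sketched in software: each input vector is assigned the class of its nearest stored reference vector, and a confusion matrix summarizes the recall results, as in Fig. 4.7. The Manhattan distance is used here because it is hardware-friendly (no multipliers); the reference vectors and samples below are toy placeholders, not the Iris data.

```python
# Minimal software model of the LVQ/kNN recall structure: nearest-
# reference classification under the Manhattan distance, plus a
# confusion matrix for result analysis. All data are hypothetical.

def manhattan(u, v):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))

def nn_classify(x, references):
    """references: list of (vector, class_label); return nearest label."""
    return min(references, key=lambda r: manhattan(x, r[0]))[1]

def confusion_matrix(samples, references, n_classes):
    """samples: list of (vector, true_label); rows = true, cols = predicted."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for x, true_label in samples:
        cm[true_label][nn_classify(x, references)] += 1
    return cm

refs = [((0.0, 0.0), 0), ((1.0, 1.0), 1)]          # toy codebook
samples = [((0.1, 0.2), 0), ((0.9, 0.8), 1), ((0.6, 0.7), 1)]
cm = confusion_matrix(samples, refs, 2)
```

In the methodology above, the same recall is run once on the reference model and once on the hardware model, and the two confusion matrices are compared.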
4.3 Design examples of integrated low-power vision systems

4.3.1 OCR chip for consumption meter read-out
In cooperation with an industrial partner, we developed a reference system and an embedded image sensor for automated visual consumption meter read-out [31]. Such a device is commercially interesting for utility companies because a large number of mechanical meters have to be read out manually today. Our approach consists of a smart sensor system snap-attached to conventional mechanical meters. The OCR chip features bio-inspired algorithms in its digital part and was the first choice for the further development of our design methodology. Processing methods as well as design parameters, e.g., sensor size and resolution, have been determined systematically. Starting with a clear system specification, an algorithm has been derived, validated, and subsequently optimized. Because errors can be detected at an early stage of the design hierarchy, time-consuming and expensive redesigns are avoided. Due to the moderate processing speed requirements, it was possible to optimize the OCR algorithm for low complexity and high discriminance. Exploiting the salient properties of QuickCog, an appropriate system configuration to cope with the problem of partly occluded characters, caused by the gradual digit transition in the meters, could be rapidly determined. A template matching approach as given in Fig. 4.11 was successfully employed. For the software prototype, the Euclidean distance was initially used as a similarity measure. However, following our design methodology, systematic simulations showed that the Manhattan distance measure could be used equally well, which is much more convenient for VLSI implementation. Our system is capable of providing both a classification of the dominantly visible digit and the exact meter wheel position. The achieved performance was only one error in 70,000 digit images. A prototype of the recognition system has been developed comprising a PC and a CCD camera connected to a commercial frame grabber.
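The template-matching step described above can be sketched as follows. The Manhattan distance replaces the Euclidean one without loss of recognition ability in the reported simulations, while avoiding multipliers in hardware; the tiny templates and patch below are toy examples, not the actual meter fonts.

```python
# Sketch of template matching for digit recognition: compare an image
# patch against stored templates under either metric and return the
# best-matching label. Templates and patch are toy 2x2 "digits".

def euclidean_sq(img, tpl):
    """Squared Euclidean distance (monotone in the Euclidean distance)."""
    return sum((p - t) ** 2 for p, t in zip(img, tpl))

def manhattan(img, tpl):
    """Manhattan (L1) distance: only subtraction, absolute value, sum."""
    return sum(abs(p - t) for p, t in zip(img, tpl))

def match(img, templates, metric):
    """templates: dict label -> flat pixel list; return nearest label."""
    return min(templates, key=lambda lbl: metric(img, templates[lbl]))

templates = {0: [1, 1, 1, 1], 1: [0, 1, 0, 1]}   # hypothetical templates
patch = [0, 1, 0, 0]
best_manhattan = match(patch, templates, manhattan)
best_euclidean = match(patch, templates, euclidean_sq)
```

On this toy input both metrics agree, mirroring the simulation finding that the cheaper Manhattan metric performs on par with the Euclidean one.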
The camera can be attached to consumption meters by means of an adapter that also contains the LED illumination. This setup has successfully been
Fig. 4.11 Recognition system. (a) Template matching, (b) Neuro-inspired approach.
tested by our industrial partner. One drawback of the system is the large memory required for storing the templates, which have to be defined for each new meter and the respective font type. Therefore, a multistage neuro-inspired architecture (Fig. 4.11) was proposed. This hierarchical, stroke-based recognition approach is described in detail in [8]. The architecture of our OCR sensor chip is depicted in Fig. 4.12 [8]. Behavioral simulations of the recognition system showed that a resolution of six bits is sufficient for the system requirements. In order to enhance the field of application, local storage has been integrated into each pixel cell, allowing random-access read-out. Each pixel cell contains an active, integrating core cell employing a diffusion-substrate diode as the photosensitive element. An SC amplifier allows both simple read-out and correlated double sampling (CDS). Due to the moderate speed requirements, a two-step flash architecture with a resolution of three bits has been chosen for the A/D converter. The on-chip finite state machine generates all internal clock signals and is controlled by five input signals. Figure 4.13 shows the pixel cell layout, featuring a size of (31.6 μm)². The photosensitive area amounts to (15.0 μm)². By predominantly using minimum-size structures, a fill factor of 22.5% could be achieved. The source follower and reset transistors have been designed with twice the minimum structure size in order
Fig. 4.12 Architecture of the simple CMOS OCR sensor.
to achieve a better matching and to reduce fixed pattern noise.

Fig. 4.13 OCR sensor chip. (a) Layout of pixel cell. (b) Chip photograph.

The chip has been fabricated and successfully tested. Figure 4.13 depicts a chip photograph. The sensor features a frame rate of up to 20 images per second. A QuickCog-based system for the read-out task with a captured sample image and matching templates is shown in Fig. 4.14. As our OCR algorithm has to be able to deal with poor illumination conditions, the local adaptation concept discussed in Section 4.3.4 has been used as an improvement.

Fig. 4.14 QuickCog OCR system employing CMOS sensor images.

Summarizing, a QuickCog-based reference system has been developed and assessed. Crucial VLSI design parameters, e.g., the sensor's spatial and pixel-value resolution as well as the metric for the recognition process, have been determined by systematic simulations and used in a mixed-signal design effort. The image sensor's operation could be verified, and acquired meter images were correctly classified by the reference system, thus proving the viability of the design approach as well as the validity of the design itself. The neuro-inspired OCR algorithm itself was modeled in Verilog and simulated [8], but due to funding limitations it has not been manufactured so far.
4.3.2 Overtake monitor and eye-tracker
As outlined in the introduction, vision problems requiring complex tasks such as spatio-temporal image sequence processing impose high demands on a viable solution and are thus first-choice candidates for a dedicated VLSI implementation. For instance, in the automotive area, autonomous vehicle guidance, driver assistance (collision avoidance, control of the distance to vehicles in front, detection of drowsy drivers, overtake monitoring), as well as the surveillance of the car interior are challenging tasks with significant economic relevance. For our design activities, we regarded overtake monitoring as especially suited. Passing vehicles are to be monitored by the system, and the driver should be warned not to change lanes in dangerous situations. After a comprehensive study of motion detection and tracking methods (such as optical flow, [12] and [29]), we focused on feature-based schemes (using corners and edges) with bio-inspired preprocessing and developed a prototype system simulator for overtake monitoring. Inspired by the ASSET-2 system introduced by Smith [33], we developed a smart, hardware-oriented algorithm that tracks and clusters these features as well as the resulting objects. The system simulator carries out a risk assessment for changing lanes and computes the driver warning in dangerous situations. Figure 4.15 (a) shows the implemented processing steps, while Fig. 4.15 (b) gives a demonstration of the OTM simulator capabilities, based on a two-lane highway scene. Spatio-temporal smoothing as preprocessing, as used by Nagel in his work on optical flow [29], was found salient for the stabilization of reliable corner detection. Processing steps that are shown gray-colored in Fig. 4.15 (a) are suitable for an implementation in analog hardware. Furthermore, the corner detection stage can also be integrated on the CMOS image sensor, reducing the required complexity and data throughput of the digital part.
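As an illustration of the kind of corner detection such a feature-based scheme relies on, the following sketch computes a Harris-style corner response; this is a generic stand-in under our own assumptions, not the detector actually used in the system (which builds on ASSET-2 and bio-inspired preprocessing):

```python
import numpy as np

def harris_response(img: np.ndarray, k: float = 0.05) -> np.ndarray:
    """Harris-style corner response R = det(M) - k * trace(M)^2,
    where M is the 3x3-box-averaged structure tensor of the image
    gradients (all parameters here are illustrative defaults)."""
    img = img.astype(float)
    gy, gx = np.gradient(img)          # central-difference gradients
    def box(a: np.ndarray) -> np.ndarray:
        p = np.pad(a, 1, mode="edge")  # 3x3 box average
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0
    sxx, syy, sxy = box(gx * gx), box(gy * gy), box(gx * gy)
    det = sxx * syy - sxy * sxy
    tr = sxx + syy
    return det - k * tr * tr

# A bright square on a dark background: the response peaks near its
# corners, is negative along the edges, and zero in flat regions.
img = np.zeros((16, 16)); img[4:12, 4:12] = 255.0
r = harris_response(img)
print(np.unravel_index(np.argmax(r), r.shape))
```

Corners respond because both gradient directions are strong there, which is exactly the property that makes them stable features to track from frame to frame.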
Fig. 4.15 Overtake monitor. (a) Flow diagram. (b) Simulation results.

Detection and tracking of faces in image sequences, e.g., of user eye pairs for the control of three-dimensional displays and graphical user interfaces, is an excellent application in the field of multimedia and surveillance. In particular, autostereoscopic 3D displays are the subject of intensive research and development due to their significant market potential. For a correct stereoscopic impression, the exact pupil positions of the observers' eyes must be known in order to control the 3D display hardware and software. In contrast to [5], we chose a monofocal approach, which makes higher demands on the vision system in terms of processing power. The system integration into one single device together with image-plane processing as well as advanced techniques such as local adaptation (cf. Section 4.3.4) have the potential to compensate for this, resulting in a low-cost, compact system. The prototype system has been developed using the same design approach as for the overtake monitor [32]. The implemented processing steps of our hardware-friendly algorithm are shown in Fig. 4.16 (a). Again, the gray-colored boxes emphasize the processing steps that are the most attractive candidates for an analog, bio-inspired, mixed-signal VLSI implementation. As in the overtake monitor, the image sequence is first spatio-temporally smoothed. Next, contour maps are derived using a Difference-of-Gaussian (DoG) filter and zero-crossing evaluation. Approximations to DoG filters can conveniently be implemented in analog hardware. Their successful application in modeling the receptive fields of the human visual system [37] made them even more appealing. The identified contours are represented by polylines and polygons for easier and faster processing. Eye
region candidates are detected and tracked by a rule-based approach. The algorithm employs pairwise matching of eye-region candidates and filter-based extraction of pupils within the eye-shape regions to determine and output the pupil coordinates of valid eye pairs. Figure 4.16 (b) demonstrates our current eye-tracker system simulator. Figure 4.17 shows the modular QuickCog implementation of the eye-tracker system, which will serve for step-by-step hardware modeling according to our design methodology.

Fig. 4.16 Eye tracker. (a) Flow diagram. (b) Simulation results.

The salient image preprocessing features common for both applications presented above comprise spatio-temporal image smoothing as well as edge detection by a Laplacian operator. These computationally expensive tasks have been implemented in our vision chip. Unlike other vision chip designs that also integrate spatial smoothing and DoG filtering with subsequent edge detection or image segmentation, respectively (cf., e.g., [16] and [38]), we regarded spatio-temporal smoothing as imperative for our applications.
Fig. 4.17 Eye tracker reference model in QuickCog.
It has been implemented in the analog domain, and the smoothing strength is programmable. The digital processor not only generates the necessary control signals, but also governs the different modes of operation. Depending on various register entries, the vision chip either outputs a smoothed or unsmoothed image area, respectively, or performs a zero-crossing search using the difference of two differently smoothed images. In the latter case, both the zero-crossing strength and the coordinate can be output. Finally, all output data (except the edge coordinates) can optionally be thresholded and binarized. For all modes of operation, an arbitrary layer of the Gaussian pyramid is selectable for read-out. The block diagram of our vision chip is shown in Fig. 4.18.
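The zero-crossing mode can be sketched in software: the difference of a weakly and a strongly smoothed image approximates a DoG filter, and sign changes of that difference mark contour points. Box smoothing and the pass counts below are our illustrative simplifications of the chip's programmable smoothing:

```python
import numpy as np

def smooth(img: np.ndarray, n: int) -> np.ndarray:
    """n passes of 3x3 box smoothing -- a crude stand-in for one layer
    of a Gaussian pyramid / a programmable smoothing strength."""
    out = img.astype(float)
    for _ in range(n):
        p = np.pad(out, 1, mode="edge")
        out = sum(p[i:i + out.shape[0], j:j + out.shape[1]]
                  for i in range(3) for j in range(3)) / 9.0
    return out

def zero_crossings(img: np.ndarray, weak: int = 1, strong: int = 4):
    """Difference of a weakly and a strongly smoothed image (a DoG
    approximation), then mark horizontal sign changes as contours."""
    dog = smooth(img, weak) - smooth(img, strong)
    zc = np.zeros_like(dog, dtype=bool)
    zc[:, 1:] = np.signbit(dog[:, 1:]) != np.signbit(dog[:, :-1])
    return dog, zc

# A vertical step edge produces zero crossings along the edge.
img = np.zeros((8, 8)); img[:, 4:] = 255
_, zc = zero_crossings(img)
print(zc[4].nonzero()[0])  # crossing columns in row 4 -> [4]
```

In hardware, the two smoothing levels come from the same resistive/SC network at different settings, so the subtraction is the only extra operation needed.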
Fig. 4.18 Block diagram of the vision chip.
After modeling our vision system in the behavioral domain, we derived functional descriptions. While the schematic of the analog part has been derived manually, the digital schematics have been automatically synthesized. In accordance with our design methodology we used the same stimuli for validation both in the behavioral and the functional domain. Thus, the correct functioning could be assured at this intermediate design stage.
4.3.2.1 Pixel core cell with photosensitive element
We used a high dynamic range pixel core cell introduced by Klinke, Brockherde, Hosticka, and Zimmer [3]. This two-transistor cell, depicted in Fig. 4.19 (a), employs a parasitic well-substrate diode as a photosensitive element. Due to the logarithmic change of the floating well potential with respect to illumination variations, a high dynamic range can be achieved. Furthermore, this nonlinear behavior can be programmed by varying the reference current of the pixel cell (since the same reference current is used for all pixel core cells, global adaptation is achieved; cf. Section 4.3.4).
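The logarithmic pixel characteristic can be illustrated with a simple diode-law model; the ideality factor, thermal voltage, and current values below are generic assumptions, not measurements of the actual cell:

```python
import math

def pixel_response(i_photo: float, i_ref: float,
                   n: float = 1.0, v_t: float = 0.0259) -> float:
    """Illustrative logarithmic pixel model: the output voltage changes
    by n*V_T per factor e of photocurrent, referred to a programmable
    reference current (all values are assumptions for illustration)."""
    return n * v_t * math.log(i_photo / i_ref)

# Six decades of illumination compress into a few hundred millivolts.
for decade in range(6):
    i = 1e-12 * 10 ** decade          # 1 pA ... 100 nA photocurrent
    print(f"{i:.0e} A -> {pixel_response(i, i_ref=1e-12) * 1e3:6.1f} mV")
```

Raising `i_ref` shifts the whole characteristic, which is exactly how the shared reference current implements global adaptation.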
4.3.2.2 Pixel cell and sensor matrix
As shown in Fig. 4.19 (b), each pixel cell consists of the core cell described above as well as circuitry for temporal smoothing, spatial smoothing, and storing the pixel values of two differently smoothed images. The temporal smoothing has been accomplished by a recursive weighted summation, the spatial smoothing by employing a multirate switched-capacitor (SC) network as introduced by Umminger and Sodini [34]. Since we have integrated only one SC network, a lower circuit complexity as well as a lower device count could be achieved. The network's time-discrete behavior enables a spatial smoothing of up to
Fig. 4.19 Vision chip. (a) Pixel core cell and its characteristic. (b) Block diagram of the pixel cell.
A fully differential signal approach has been used. The pixel cell's amplified analog output signal is fed into an eight-bit successive-approximation ADC. Since we used a converter from a standard cell library for convenience, the maximum conversion rate is unfortunately limited to 111 kHz. A speed-up of more than one order of magnitude is easily accomplished by an improved A/D converter design. The following digital processor not only handles the ADC's output as described above, but also controls the sensor matrix, the pixel read-out SC amplifier, and the ADC itself. It has also been synthesized using standard cells and comprises 1181 gates (12,251 transistors). We used Austria Mikro Systeme International AG 0.8 µm CYE CMOS technology, featuring double poly and double metal. Again, the layout generation has been carried out independently for the digital and analog parts. Due to the large number of pixel cells, their area is critical for the layout. Therefore, the pixel cells have been generated using minimum design rules. In the first layout design for our vision chip, the pixel cell features a side length of 149.5 µm and a fill factor of 1.7% (Fig. 4.20). Due to funding constraints, this pixel size only allows the manufacturing of a rather small test circuit. Figure 4.20 shows a matrix of 32 × 32 pixels, already covering an area of ≈23 mm². The resulting manufacturing costs of such a test matrix would be inadequate with respect to the chip's actual usefulness. So, we adapted the concept for a linear sensor, which is described in the following.

Fig. 4.20 Matrix sensor. (a) Layout of the pixel cell. (b) 32 × 32 sensor matrix.

4.3.3 Linear sensor for visual inspection applications
In the field of visual inspection, many applications can profit from linear CMOS image sensors comparable to our design. Especially in-line inspection tasks require mechatronic solutions able to deal with harsh environments such as irregular and erratic illumination as well as uneven object
Fig. 4.21 Linear sensor. (a) Layout of redesigned pixel cell. (b) Complete 224 × 3 linear sensor.
surface. The pixel core cell chosen for our vision chip can be adapted to a high dynamic range of image light levels and is thus ideally suited for such applications. Dedicated linear image sensors can be utilized in textile production, wood inspection, and other high-speed, continuous production processes. For example, in wire mills the width as well as the speed of the wire have to be measured. A feasible solution would consist of a smart vision system featuring edge detection capabilities comparable to our sensor. Additionally, the integration of multiple inspection units into one mechatronic system is feasible, resulting in a compact and low-priced solution for industrial in-line inspection tasks. Other potential applications are the blade control of harvesting machines, proximity sensors in the automotive area, or linear sensors for automatic alignment tasks. Benefiting from the previous design experience, we thus implemented a linear sensor. In order to achieve a small pitch, we redesigned the pixel cell. The new layout is depicted in Fig. 4.21 (a). The cell now features a width of 33.2 µm, while the height amounts to 360.3 µm. The photosensitive area covers approximately 630 µm². We discarded spatial smoothing across rows to save chip area. The complete sensor matrix comprises three rows of 224 pixels each. The matrix width is limited by the cavity size of the DIL-48 package used. Figure 4.21 (b) shows the bonding diagram of the complete chip, which consumes ≈15.26 mm² of chip area. It has been manufactured using the EUROPRACTICE MPW services, and first sample tests showed that the chip is operational. Currently, it is being integrated with a TI 320C50 DSP to form an intelligent camera for demonstration and application.
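The wire-width task reduces, per scan line, to locating two edges in a one-dimensional intensity profile. A minimal sketch follows, in which the threshold is an assumption and the pitch mirrors the 33.2 µm pixel pitch of the redesigned cell:

```python
import numpy as np

def measure_width(profile: np.ndarray, pitch_um: float = 33.2,
                  thresh: float = 0.5) -> float:
    """Estimate object width from one line-scan profile: binarize at a
    fraction of the dynamic range and measure the span between the
    first and last 'dark' pixel. The threshold is illustrative."""
    lo, hi = profile.min(), profile.max()
    dark = profile < lo + thresh * (hi - lo)
    idx = np.flatnonzero(dark)
    if idx.size == 0:
        return 0.0
    return (idx[-1] - idx[0] + 1) * pitch_um

# A dark wire covering pixels 100..111 of a 224-pixel line.
line = np.full(224, 200.0); line[100:112] = 20.0
print(measure_width(line))  # 12 dark pixels x 33.2 um/pixel, i.e. ~398.4 um
```

Tracking the span's centroid over successive scans would additionally yield the wire speed mentioned above.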
4.3.4 Local adaptation
The dynamic range of image sensors is a crucial figure for coping with real-world environments from darkness to full sunlight, especially in machine vision and visual inspection tasks. From biological evidence, various mechanisms for adjusting a light receptor's sensitivity to varying intensities of light [4] can be exploited for engineering solutions. For instance, IMEC's Fuga chips [14] and the HDRC chip from IMS Stuttgart [13] obtain a high dynamic range using pixel cells with a static logarithmic characteristic. However, the local contrast resolution is very limited in this approach. Various other means of pixel characteristic adaptation by electronic photoplane shutter control or multiple-readout schemes have been invented to achieve
local adaptation with acceptable local contrast resolution. In our work, we focused on local adaptation schemes that exploit the same operations applicable for image preprocessing in CMOS technology. For instance, in [35] a programmable bias current is subtracted from the actual photocurrent in each pixel. The division of the pixel intensity by the local average is even better suited for the reduction of the influence of brightness variations, which have a multiplicative effect [11], [24]. The required computation can well be implemented in analog VLSI; cf., e.g., [28], where multiplicative flicker caused by artificial light has been removed. Actually, the local mean and the local contrast could be subject to A/D conversion and read-out, similar to mantissa and exponent in a high-resolution floating-point representation. However, for machine vision applications local detail and contrast carry the decisive information and are thus solely computed in our design effort.

Fig. 4.22 Simulated effect of local adaptation. (a) Original image. (b) Local adaptation with isotropic smoothing. (c) Illumination step to be multiplied with the original image. (d) Result of an integrating sensor with short integration time. (e) Local adaptation with isotropic smoothing. (f) Local adaptation with anisotropic smoothing.

After reviewing various local adaptation schemes, we conducted behavioral simulations in order to assess their ability to preserve local contrast details in the presence of high-contrast global illumination changes. For this purpose, we implemented a simulator based on our QuickCog system. We extended our simulations on artificial data [32] to real-world data from the eye-tracking task. Figure 4.22 (a) shows the original image and (b) its processing by local adaptation with isotropic smoothing. Figure 4.22 (c) shows an illumination step function as encountered in the real world by the masking of strong sunlight, e.g., by bridges in the highway scene or the window frame in the eye-tracker scene, which is multiplied with the original image. Figure 4.22 (d) shows the result of applying an image sensor with global adaptation. Figure 4.22 (e) shows the result of processing by local adaptation with isotropic smoothing. The smoothing and local average computation in the direction orthogonal to the step edge and the ensuing division by this local average cause a loss of local contrast and thus of image details. This becomes visible as a black stripe in the result image. Figure 4.22 (f) shows the result of processing by local adaptation with anisotropic smoothing. The result image only shows a thin stripe along the illumination step edge, and the local contrast is preserved on both sides of the image. The simulations have been carried out on available CCD images with eight-bit resolution. Our dedicated image sensor, presented in the following, will be able to extract local contrast from more than six orders of magnitude of illumination range and return the local contrast information in eight-bit resolution. So, much better quality than the presented simulation results can be expected from the sensor implementation.
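The principle of dividing each pixel by its local average can be sketched as follows; isotropic box smoothing stands in for the anisotropic diffusion network of the actual design, and the texture and step data are our own toy inputs:

```python
import numpy as np

def local_adapt(img: np.ndarray, passes: int = 2,
                eps: float = 1e-6) -> np.ndarray:
    """Divide each pixel by its local average (isotropic 3x3 box
    smoothing here, an illustrative simplification). Because
    illumination acts multiplicatively, the quotient suppresses global
    brightness changes while keeping local contrast."""
    avg = img.astype(float)
    for _ in range(passes):
        p = np.pad(avg, 1, mode="edge")
        avg = sum(p[i:i + img.shape[0], j:j + img.shape[1]]
                  for i in range(3) for j in range(3)) / 9.0
    return img / (avg + eps)

# A fine checkerboard texture under a 10x illumination step.
texture = 100.0 + 20.0 * (np.indices((8, 8)).sum(0) % 2)
step = np.ones((8, 8)); step[:, 4:] = 10.0
raw = texture * step
adapted = local_adapt(raw)
# Left/right mean ratio: the raw image shows the step, the quotient does not.
print(round(raw[:, :2].mean() / raw[:, 6:].mean(), 2))       # -> 0.1
print(round(adapted[:, :2].mean() / adapted[:, 6:].mean(), 2))  # -> 1.0
```

The black stripe seen in Fig. 4.22 (e) corresponds to the columns right at the step, where the isotropic average mixes both brightness levels; anisotropic smoothing avoids averaging across the step edge.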
After validation of the concept by our simulations, a pixel cell was derived that implements the desired behavior. The anisotropic smoothing network could also serve for image preprocessing, e.g., smoothing and segmentation. The local average division has been realized by a one-quadrant, translinear multiplication/division circuit [32]. The pixel cell comprises a pseudolinear anisotropic diffusion network [26] employing the concept of weak springs or resistive fuses, which are controlled by a globally adjustable threshold. The translinear loop behaves approximately according to I_out = (I_pix / I_avg) · I_ref. Transistor mismatch, resulting in a multiplicative fixed pattern noise in the image, can be tackled by adjusting I_ref. In an
Fig. 4.23 Local adaptive image sensor. (a) Layout of pixel cell. (b) Layout of image sensor. (c) Bonding diagram for CLCC68 package.
auto-calibration step, the reference current for each pixel cell can be determined and stored in a current memory, which has to be refreshed regularly due to leakage currents [32]. Floating-gate techniques could overcome this drawback. However, for machine vision tasks, especially those with individual learning or adaptation capabilities, correction might not be mandatory. The pixel cell layout was designed using the same CMOS technology as
for the other designs [32]. After some revisions, it was used as part of a novel local adaptive image sensor chip, given in Fig. 4.23 along with the current pixel cell layout and the bonding diagram. The cell dimensions are 66.9 µm × 66.5 µm, about half of the real estate is used for wiring, and the fill factor is 3.5%. The first image sensor implementation comprises 96 × 32 pixels and 22.2 mm² of chip area. A dynamic correction mechanism for fixed-pattern-noise compensation was included in addition to the read-out and control circuitry. This sensor matrix with 130,000 transistors can serve for testing and measurement purposes as well as for simple machine vision applications, e.g., for the OCR task described as the first design example, replacing the current software implementation of the local adaptation. Acquired images will also serve to advance the eye-tracker project as well as the automotive image processing activities.
4.4 Conclusions and Future Work
In this contribution we have presented and discussed a novel design methodology for the systematic and focused design of application-specific, bio-inspired, integrated vision and recognition systems under the constraints of size, cost, power dissipation, performance, and turnaround time. We have outlined the role of bio-inspiration and the potential benefits of our design approach, reported on its state of implementation, and given four application examples. Our design efforts, which were all driven by real-world applications with significant commercial interest and potential, were both subject to the developed design methodology and in turn contributed methods and experience to the concept and its state of implementation. Our system-oriented approach provides the unique opportunity of a vertical optimization in the overall design, exploiting redundancies and dependencies across the individual processing stages of the system. Our work has the potential to be immediately applicable to other application domains beyond vision, e.g., electronic noses, acoustic systems, biometric systems, mechatronic tasks, or other CMOS-compatible intelligent microsystems. So, the long-term objective is to achieve a design methodology and a corresponding design-flow implementation for integrated cognitive system design in general. In future work, the QuickCog system for reference modeling and design validation will be considerably extended to allow the fast modeling of
diverse applications and the optimum exploitation of salient bio-inspired information processing techniques. This also comprises the creation of a cell library containing, e.g., various image sensor cells, from simple passive or active pixel cells to local adaptive cells, as well as image preprocessing cells, to support the faster modeling of a circuit or system after reference system manifestation. A similar effort will be made with regard to classifier systems and their implementation, migrating to the Austria Mikro Systeme International AG 0.6 µm CMOS technology. Floating-gate technology, e.g., for deviation compensation techniques and for advanced preprocessing applications, as well as resistive MOS and other interesting device concepts will be investigated with the objective to extend the concept of opportunistic design in our methodology down to the device level. As the main demonstrator of our design methodology, we will focus on the face and eye-tracking task, which is, in addition to 3D-display control, also applicable to semantic image coding, surveillance, and advanced user interface tasks. For a robust, monofocal, gray-value-based eye-tracker solution, a mixed-signal, low-power single-chip system comprising image sensor, image processing, and tracking capability is aspired to as a milestone in our work.
Acknowledgment

We would like to thank Austria Mikro Systeme International AG, Austria, for providing Cadence "HIT-Kit v3.01", various design libraries, as well as the CYE CMOS technology. Our work on the OCR system and chip has in part been funded by a research contract with Phytec Meßtechnik GmbH, Mainz. The advance of the design methodology and the work on optimized and consistent low-power recognition circuits and systems design is pursued within the GAME project (SPP 1076) under a grant of the Deutsche Forschungsgemeinschaft - DFG (German Research Foundation). The parts of our work related to image sensors have been funded by DFG's Graduiertenkolleg Sensorik (graduate college for sensor technology). The responsibility for this contribution is with the authors. Further, complementing student contributions of Sascha Thoss, André Günther, and André Kröhnert to this work are gratefully acknowledged.
References

[1] Xavier Arreguit, André van Schaik, François V. Bauduin, Eric Raeber, "A CMOS motion detector system for pointing devices," IEEE Journal of Solid-State Circuits, 31, pp. 1916-1921, Dec. 1996.
[2] Brian Carlson, "An 'Artificial Retina' Ready for App Developers," Advanced Imaging, p. 72, Sept. 1998.
[3] Werner Brockherde, Bedrich J. Hosticka, Roland Klinke, G. Zimmer, "Eine Photodetektor-Matrix mit Ausleseelektronik in Standard-CMOS-Technologie," Gesellschaft Mikroelektronik (Fachtagung): Mikroelektronik '91, VDE-Verlag, Berlin, pp. 175-180, 1991.
[4] Vicki Bruce, Patrick R. Green, Mark A. Georgeson, "Visual Perception - Physiology, Psychology, and Ecology," 2nd ed., Psychology Press, 1996.
[5] Department of Computer Science, Dresden University of Technology, "3D LCD-Display," http://ddd001.inf.tu-dresden.de/-3ddisp/index.html, 1996.
[6] Keinosuke Fukunaga, "Introduction to Statistical Pattern Recognition," Academic Press, Inc., Harcourt Brace Jovanovich, 1990.
[7] D. Gajski, R. H. Kuhn, "Guest editors' introduction: New VLSI tools," IEEE Computer, pp. 14-17, Dec. 1983.
[8] Stefan Getzlaff, Jörg Schreiter, Andreas König, "Systematic design of an embedded neural system for automated visual consumption acquisition," 7th Int'l Conference on Microelectronics for Neural, Fuzzy & Bio-inspired Systems, Granada/Spain, April 1999.
[9] Rainer Goebel, "Biologically inspired neurons and networks: The functional role of temporal coding," Proceedings of the 6th International Conference on Microelectronics for Neural Networks, Evolutionary & Fuzzy Systems (MicroNeuro '97), Technische Universität Dresden, pp. 65-74, 1997.
[10] C. M. Gray, P. König, K. Engel, W. Singer, "Oscillatory responses in cat visual cortex exhibit inter-columnar synchronization which reflects global stimulus properties," Nature, 338, pp. 334-337, 1989.
[11] Berthold K. P. Horn, "Robot Vision," The MIT Press, 1990.
[12] Berthold K. P. Horn, B. G. Schunck, "Determining optical flow," Artificial Intelligence, 17, pp. 185-203, 1981.
[13] Institut für Mikroelektronik, Stuttgart, "IMS home page," http://www.imschips.de/, 1999.
[14] Interuniversity Microelectronics Center (IMEC), "IMEC image sensors," http://www.imec.be/fuga/, 1999.
[15] Hiroshi Kage, Eiichi Funatsu, Kenichi Tanaka, Kazuo Kyuma, "Artificial retina chips for 3D human motion reconstruction," Proceedings of the 5th International Conference on Soft Computing and Information/Intelligent Systems, Iizuka/Japan, World Scientific Publishing Co. Pte. Ltd., 1, p. 76, Oct. 1998.
[16] H. Kobayashi, L. White, A. Abidi, "An active resistor network for Gaussian filtering of images," IEEE Journal of Solid-State Circuits, 26, pp. 738-748, May 1991.
[17] Andreas König, "Neural Structures for Visual Surface Inspection of Objects in an Industrial Environment," PhD thesis, TU Darmstadt, http://www.iee.et.tu-dresden.de/-koeniga, 1995.
[18] Andreas König, "Dimensionality Reduction Techniques for Interactive Visualisation, Exploratory Data Analysis, and Classification," Feature Analysis, Clustering, and Classification by Soft Computing, Edited Books from IIZUKA'98, FLSI Soft Computing Series, Nikhil R. Pal (Ed.), Vol. 3, pp. 1-37, 1999.
[19] Andreas König, Michael Eberhardt, Robert Wenzel, "QuickCog Self-Learning Recognition Systems - Exploiting machine learning techniques for transparent and fast industrial recognition system design," Image Processing Europe, PennWell, Vol. 5, pp. 10-19, 1999.
[20] Andreas König, Andreas Herenz, Klaus Wolter, "Application of Neural Networks for Automated X-Ray Image Inspection in Electronics Manufacturing," Proceedings of the International Work-Conference on Biological and Artificial Neural Networks IWANN'99, Vol. 2, pp. 588-595, 1999.
[21] Andreas König, Michael Eberhardt, Robert Wenzel, "A transparent and flexible development environment for rapid design of cognitive systems," Proc. of the EUROMICRO '98 Conference, Workshop Computational Intelligence, Västerås/Sweden, IEEE CS, pp. 655-662, 1998.
[22] Teuvo Kohonen, "Self-Organization and Associative Memory," Springer-Verlag, 1989.
[23] Kazuo Kyuma, Eiichi Funatsu, Yoshikazu Nitta, "Concept, design, performance, and applications of artificial retina chips," Proceedings of the 6th International Conference on Microelectronics for Neural Networks, Evolutionary & Fuzzy Systems (MicroNeuro '97), Technische Universität Dresden, pp. 2-8, 1997.
[24] Peter J. Lawrence, "Lecture notes CS488/688: Introduction to computer graphics," http://www.greenwich.ac.uk/-lp03/Lectures/Graphics/html, 1999.
[25] Logitech, Fremont/CA/U.S.A., "Logitech home page," http://www.logitech.com/, 1999.
[26] Carver Mead, "Analog VLSI and Neural Systems," Addison-Wesley VLSI System Series, Addison-Wesley Publishing Company, 1989.
[27] Alireza Moini, "Vision Chips or Seeing Silicon, chapter Advantages and disadvantages of vision chips," Centre for High Performance Integrated Technologies and Systems (CHIPTEC), http://www.eleceng.adelaide.edu.au/Groups/GAAS/Bugeye/visionchips/vision_chips/advantages.html, 1997.
[28] Alireza Moini, Andrew Blanksby, Abdesselam Bouzerdoum, Kamran Eshraghian, Richard Beare, "Multiplicative noise cancellation (MNC) in analog VLSI vision sensors," ETD2000, Electronics Technology Directions for the Year 2000, pp. 253-257, 1995.
[29] Hans-Hellmut Nagel, "On the estimation of optical flow: Relations between different approaches and some new results," Artificial Intelligence, 33, pp. 299-324, 1987.
[30] Semiconductor Industry Association, The National Technology Roadmap for Semiconductors, http://www.sematech.org/public/roadmap/doc, 1997.
[31] Jörg Schreiter, Stefan Getzlaff, H. Fendrich, Andreas König, "Systemstudie für ein integriertes Sensor/Prozessorsystem zur automatischen visuellen Ablesung von Verbrauchszählern," Tagungsband Fachtagung Informations- und Mikrosystemtechnik, Magdeburg, pp. 109-116, 1998.
[32] Jan Skribanowitz, Thomas Knobloch, Jörg Schreiter, Andreas König, "VLSI implementation of an application-specific vision chip for overtake monitoring, real time eye tracking, and visual inspection," 7th Int'l Conference on Microelectronics for Neural, Fuzzy & Bio-inspired Systems, Granada/Spain, 1999.
[33] Stephen M. Smith, "ASSET-2: Real-time motion segmentation and shape tracking," Technical Report TR95SMS2, Defence Research Agency, UK, 1995.
[34] Christopher B. Umminger, Charles G. Sodini, "Switched capacitor networks for focal plane image processing systems," IEEE Transactions on Circuits and Systems for Video Technology, 2, pp. 392-400, Dec. 1992.
[35] Oliver Vietze, Peter Seitz, "Active pixels for image sensing with programmable, high dynamic range," Proceedings of AT: Advanced Technologies Intelligent Vision, pp. 15-18, 1995.
[36] Eric Vittoz, "Present and Future Applications of Bio-Inspired Systems," Proceedings of the International Conference on Microelectronics for Neural, Fuzzy, and Bio-Inspired Systems MicroNeuro '99, pp. 2-11, 1999.
[37] Brian A. Wandell, "Foundations of Vision," Sinauer Associates, Inc., Sunderland/MA, 1995.
[38] Chang-Han Yi, Robert Schlabbach, Holger Kroth, Heinrich Klar, "A bio-inspired multiplexed analog circuit for early vision edge detection and image segmentation," Proceedings of the 6th International Conference on Microelectronics for Neural Networks, Evolutionary & Fuzzy Systems (MicroNeuro '97), Technische Universität Dresden, pp. 149-153, 1997.
Chapter 5

Motion Detection with Bio-Inspired Analog MOS Circuits

Hiroo Yonezu, Tetsuya Asai, Masahiro Ohtani, and Naoki Ohshima

Toyohashi University of Technology
Abstract  We propose simple analog MOS circuits based on a correlation model of insect motion detectors, aiming at the realization of fundamental motion-sensing systems. The model makes the circuit structure quite simple compared with conventional velocity-sensing circuits. SPICE simulation results indicate that the proposed circuits compute local velocities of a moving light spot and have direction selectivity for the spot, which implies that a high-resolution motion-sensing chip can be realized with current analog VLSI technology.

Keywords: motion detection, optical flow, analog integrated circuit, neural network
5.1 Introduction
Early visual processing elements in biological and artificial visual systems facilitate subsequent higher-order visual processing. Neuromorphic vision chips, which have recently been developed and fabricated in the literature, act as powerful visual preprocessors due to their analog, parallel and real-time operations [1; 2; 3]. Those operations spontaneously arise from mimicking the structure of biological early vision systems. Analog VLSI technology seems particularly useful for implementing such neuromorphic systems, since a large number of unit circuits can be integrated on a small chip area, as in biological systems.
Among the early visual functions of biological systems, the ability to detect moving objects is believed to be an important and fundamental visual modality, since the eyeball frequently makes small visual shifts in order to capture an object of interest on the retinal fovea (saccadic drift) [4]. Although the movement of the eyeball or of visual objects induces optical flows on the retina, the real-time computation of the optical flows requires massive computational power if the computation is based on traditional algorithms from the research field of computer vision [5]. Recently, an optical flow chip has been attempted by J. Kramer et al. [6]; however, the unit circuit occupies a relatively large area of the chip, which results in low spatial resolution. We have tried to simplify the fundamental analog integrated circuit for motion detection by combining biological systems with a simplified algorithm of the optical flow. The results of SPICE simulation of the one dimensional network showed that it can detect the direction and speed of motion and that it can be expanded into a two dimensional network.
5.2 Correlation Neural Networks for Motion Detection
One plausible biological motion extractor is Reichardt's correlation neural network [7]. Figure 5.1(a) shows the one dimensional correlation network, which includes three fundamental layers: an input, a delay and an output layer. We call a set of neurons in a dashed-line square in Fig. 5.1(a) a cell. Each neuron in the input layer receives a visual stimulus and produces three undelayed signals. One of these signals is connected to a neuron in the output layer, while the rest are connected to neurons in the delay layer. The neurons in the delay layer produce delayed signals with a time constant τ. The delayed signals are given to neighboring neurons in the output layer through excitatory (right-hand side) and inhibitory (left-hand side) presynaptic connections. The light spot moves in the x direction with a velocity s. It passes the (i−1)th cell at time t and the ith cell at (t + dt). When τ is equal to dt, the output of the cell I_out becomes the highest, since the delayed signal from the (i−1)th cell coincides with the undelayed output. When τ is shorter or longer than dt, the output I_out is low, since the correlation of the undelayed output with the delayed output is small. Thus, depending on the correlation, the absolute value of the output I_out varies, as shown in Figs. 5.1(b) and (c). Since the neurons in the output layer receive inhibitory and
Fig. 5.1 The correlation model: (a) network structure; (b) input signals; (c) outputs of the ith cell.
excitatory synaptic connections from the neighboring delay neurons, the sign of the output I_out depends on the direction of moving objects.
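The behavior of this correlation network can be sketched in a few lines of code. The following Python model is illustrative only: it assumes a first-order low-pass filter as the delay layer (time constant τ), a binary light-spot stimulus, and the simplification that each output neuron multiplies its undelayed input by the difference between the delayed signals from its left (excitatory) and right (inhibitory) neighbors; the function names and parameter values are our own.

```python
# Behavioral sketch of the correlation network of Fig. 5.1 (assumptions:
# binary light-spot inputs, first-order low-pass "delay" neurons with
# time constant tau; all names and values are illustrative).

def simulate(stimulus, tau, dt=1e-3):
    """stimulus: list of per-timestep input rows, one value per cell.
    Returns per-timestep outputs; out[i] = in[i] * (delayed[i-1] - delayed[i+1])."""
    n = len(stimulus[0])
    delayed = [0.0] * n
    outputs = []
    alpha = dt / (tau + dt)            # first-order low-pass coefficient
    for row in stimulus:
        out = [0.0] * n
        for i in range(n):
            left = delayed[i - 1] if i > 0 else 0.0       # excitatory (from i-1)
            right = delayed[i + 1] if i < n - 1 else 0.0  # inhibitory (from i+1)
            out[i] = row[i] * (left - right)  # correlate with undelayed input
        # update the delay layer after computing the outputs
        delayed = [d + alpha * (x - d) for d, x in zip(delayed, row)]
        outputs.append(out)
    return outputs

def moving_spot(n_cells, steps, direction):
    """A light spot crossing the array, one cell per time step."""
    rows = []
    for t in range(steps):
        pos = t if direction > 0 else n_cells - 1 - t
        rows.append([1.0 if i == pos else 0.0 for i in range(n_cells)])
    return rows

right = simulate(moving_spot(8, 8, +1), tau=2e-3)
left = simulate(moving_spot(8, 8, -1), tau=2e-3)
# Direction selectivity: the summed output is positive for rightward
# motion and negative for leftward motion.
print(sum(map(sum, right)) > 0, sum(map(sum, left)) < 0)  # -> True True
```

For a rightward-moving spot the delayed signal from the (i−1)th cell is the larger one, so the summed output is positive; for leftward motion the sign reverses, reproducing the direction selectivity described above.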
5.3 Velocity Sensing Circuits and Networks for the Correlation Model
A velocity sensing circuit (VSC) for the ith cell in Fig. 5.1(a) is shown in Fig. 5.2. The input I_in,i produces the source current I_b,i. The delayed outputs from the neighboring delay neurons are received as V_in+,i and V_in−,i, respectively. They divide the source current I_b,i into I_M4,i and I_M5,i. The output I_out,i is obtained as the difference of the divided currents, I_M4,i − I_M5,i. Thus, the output I_out,i becomes approximately proportional to the value of V_in+,i − V_in−,i. The output of the delay neuron is generated at a capacitance C connected in parallel with a transistor M2,i. It should be noticed that the output I_out,i can be obtained only when the input I_in,i is applied and that the sign of the output is the same as that of (V_in+,i − V_in−,i). The one dimensional network is constructed by connecting VSCs, as
Fig. 5.2 Velocity sensing circuit (VSC).

Fig. 5.3 One dimensional VSC network.
shown in Fig. 5.3. Since the local velocities must be spatially and temporally continuous, the output nodes are connected with pass transistors. The response of the ith VSC is shown in Fig. 5.4, where G = 0. When a light spot moves to the right-hand side, a positive output current I_out,i is obtained, as shown in Fig. 5.4(a). A negative output current I_out,i is
Fig. 5.4 Responses of the VSC network when the light spot moves to the right (a) and left-hand side (b).
obtained when a light spot moves to the left-hand side, as shown in Fig. 5.4(b).
5.4 Simulation Results
A primitive network was evaluated by SPICE simulation. The input current I_in,i was applied during the period when the light spot was located in the area of a VSC with a length L. No input current was applied when the light spot was located in the spacing d between VSCs.
Fig. 5.5 SPICE simulation results of the one dimensional VSC network for positive (a) and negative (b) velocities.
The results of SPICE simulation are shown in Fig. 5.5 for the primitive network with L = 100 μm, d = 10 μm and C = 50 pF. The output current I_out,i was approximately proportional to the logarithm of the velocity of the light spot in the range between the positive and negative maximum values ±s_0. This result indicates that the sign of the output current changed when the direction of motion changed, as expected. The absolute value of s_0 means
Fig. 5.6 Micrograph of the fabricated VSC chip.
the highest measurable velocity. It can be increased by reducing C and increasing d. The chip micrograph is shown in Fig. 5.6; the chip was recently fabricated at the Electron Device Research Center in our university. A photodiode generates an input current I_in,i. The capacitance C is formed with a pn junction. Thus, the maximum speed s_0 can be controlled by varying the bias voltage applied to the pn junction.
5.5 VSC Networks and Computational Algorithm for Optical Flows
In this section, we show that the velocities produced by the proposed circuits are qualitatively equivalent to the local velocities obtained from computational algorithms of the optical flow.

A local velocity u (= (u, v) = (ẋ, ẏ)) in the two dimensional plane (x, y) can be obtained by minimizing the spatial integration of the following equation [5]:

E(x, y) = (∇f · u + ∂f/∂t)² + λ(u_x² + u_y² + v_x² + v_y²),  (1)
where f, t, λ, u_{x,y} and v_{x,y} represent the light intensity at a position (x, y), the time, the regularization parameter and the spatial derivatives of the local velocities (u, v), respectively. For one dimensional motion, the local velocity u_i can be obtained from

−(λ/h²)(u_{i−1} + u_{i+1} − 2u_i) + f_x f_t = 0,  (2)
where f_x = ∂f/∂x, f_t = ∂f/∂t, h is the spatial constant, and i indexes the positions in the x direction, as long as 2λ/h² ≫ f_x². The second term, f_x f_t, in eq. (2) has a large positive or negative value around the edge of moving objects. Thus, the term f_x f_t is qualitatively the same as the output of Reichardt cells. The sign of the term depends on the direction of motion. The first term in eq. (2) represents a spatial smoothing. From Fig. 5.3, the following equation is obtained around the ith node:

−G(V_{i−1} + V_{i+1} − 2V_i) + I_out,i = 0,  (3)
where V_i is the potential at the ith node and G is the conductance of the pass transistor. Equation (3) has the same form as eq. (2) of the optical flow. Thus, the potential V_i at the ith node represents the local velocity at the position i.
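The smoothing role of the node equation can be checked numerically. The sketch below is our own illustration (unit-free values of G and I_out,i; boundary nodes held at zero potential; Gauss-Seidel relaxation as the solver):

```python
# Numerical sketch of eq. (3): -G(V[i-1] + V[i+1] - 2 V[i]) + I_out[i] = 0,
# solved by Gauss-Seidel relaxation. Values of G and I_out are illustrative,
# and the two boundary nodes are held at zero potential.

def solve_velocity_field(i_out, g, iters=2000):
    n = len(i_out)
    v = [0.0] * (n + 2)                    # v[0] and v[n+1]: boundary nodes
    for _ in range(iters):
        for i in range(1, n + 1):
            # rearranged eq. (3): V[i] = (V[i-1] + V[i+1] - I_out[i]/G) / 2
            v[i] = 0.5 * (v[i - 1] + v[i + 1] - i_out[i - 1] / g)
    return v[1:-1]

# A single VSC injecting I_out at the center node: the pass-transistor
# conductance G spreads the resulting potential to the neighboring nodes.
field = solve_velocity_field([0.0, 0.0, 1.0, 0.0, 0.0], g=0.5)
print(field)  # magnitude peaks at the center node and decays to the edges
```

With a single driven node, the magnitude of the potential peaks at that node and decays linearly toward the boundaries, which is the spatial smoothing that the first term of eq. (2) provides.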
5.6 Summary and Discussion
It was clarified for motion detection that the analog network based on a biological system works in a manner similar to a simplified algorithm of the optical flow. The proposed network is constructed by combining simple fundamental circuits called VSCs. The direction and speed of motion can be detected with the network. The proposed network is expandable to large scale integration. In a biological motion detection system such as the monkey's, the edge of a target in an image is detected at the primary visual cortex V1 through the retina. Then the orientation and speed are detected in the middle temporal area MT. Silicon retinae produce the edge of a target in an image [8; 3]. The
orientation of the edge could be detected in a self-organized network like the orientation columns in V1 [9]. Two dimensional motion detection can be done by arranging the one dimensional networks of Fig. 5.3 along the x and y axes. Namely, the vector summation of the velocities u and v for the x and y axes, respectively, enables one to measure two dimensional velocities. However, it is not sensible to assume that biological systems require such individual velocities for two dimensional motion detection. Let us consider instead the scalar summation of the velocities u and v. The weighted velocity s is expressed as

s = αu + βv,  (4)
where α and β represent the weight strengths. When α = β, s has its maximum and minimum values for movement angles π/4 and 5π/4, respectively. This implies that the weighted velocity s has a direction selectivity according to the ratio of α to β. For given α and β, the angle of movement which produces the maximum s can be represented by

tan⁻¹(β/α),      (α > 0),
tan⁻¹(β/α) + π,  (α < 0).  (5)
The weighted velocity s does not represent the absolute local velocity; however, s is proportional to the velocity of the moving object when α and β are fixed. Thus, the weighted velocity represents a relative velocity of the moving object. It should be noticed that the direction of motion could be detected by neurons receiving the local velocities for the x and y axes through weighted synaptic connections, that is, α(x, y) and β(x, y). If the fields of α(x, y) and β(x, y) are continuous, a network could form topological maps between the direction-selective columns and the direction of the movement, and respond selectively to the preferred direction of the movement. This implies that the orientation and direction columns in V1 and MT could be formed according to a possible mechanism of self-organization for the α and β fields.
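The direction selectivity of eqs. (4) and (5) is easy to verify numerically. In the sketch below (our own illustrative setup), a unit-speed movement at angle θ gives u = cos θ and v = sin θ, and we scan θ to find the angle that maximizes s:

```python
# Numerical check of eqs. (4) and (5): for a unit-speed movement at angle
# theta, u = cos(theta) and v = sin(theta), so s = a*cos(theta) + b*sin(theta).
import math

def weighted_velocity(a, b, theta):
    return a * math.cos(theta) + b * math.sin(theta)

a, b = 1.0, 1.0                      # equal weights, the case alpha = beta
angles = [2 * math.pi * k / 3600 for k in range(3600)]
s_values = [weighted_velocity(a, b, th) for th in angles]
best = angles[s_values.index(max(s_values))]
worst = angles[s_values.index(min(s_values))]
print(best, math.pi / 4)             # approximately equal
print(worst, 5 * math.pi / 4)        # approximately equal
```

For α = β the maximum falls at π/4 and the minimum at 5π/4, matching the values stated above; changing the ratio β/α moves the preferred direction according to eq. (5).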
5.7 Acknowledgement
We would like to thank Mr. T. Miyashita for the arrangement of the manuscript for printing. This work was partially supported by a Grant-in-Aid for Scientific Research on Priority Areas from the Ministry of Education, Science, Sports and Culture of Japan.
References

[1] B. J. Sheu and J. Choi, "Neural Information Processing and VLSI," Kluwer Academic Publishers, 1995.
[2] R. Douglas, M. Mahowald, and C. Mead, "Neuromorphic Analogue VLSI," Annual Review of Neuroscience, Vol. 18, pp. 255-281, 1995.
[3] H. Ikeda, K. Tsuji, T. Asai, H. Yonezu and J.-K. Shin, "A Novel Retina Chip with Simple Wiring for Edge Detection," IEEE Photon. Technol. Lett., Vol. 10, pp. 261-263, 1998.
[4] E. R. Kandel, J. H. Schwartz, and T. M. Jessell, "Principles of Neural Science," Prentice Hall International, 1991.
[5] D. H. Ballard and C. M. Brown, "Computer Vision," Prentice-Hall, Inc., 1982.
[6] J. Kramer, R. Sarpeshkar and C. Koch, "Pulse-based Analog VLSI Velocity Sensors," IEEE Trans. Circuits and Systems II, Vol. 44, pp. 86-101, 1997.
[7] W. Reichardt, "Principles of Sensory Communication," Wiley, 1961.
[8] C. Mead, "Analog VLSI and Neural Systems," Addison-Wesley, 1989.
[9] H. Yonezu, K. Tsuji, D. Sudo and J.-K. Shin, "Self-organizing Network for Feature-map Formation: Analog Integrated Circuit Robust to Device and Circuit Mismatch," Computers Elect. Engng, to be published, 1998.
Chapter 6

νMOS Cellular-Automaton Circuit for Picture Processing

Masayuki Ikebe and Yoshihito Amemiya

Faculty of Engineering, Hokkaido University
Abstract  This chapter proposes a design of cell circuits for implementing cellular-automaton devices that perform morphological picture processing. To produce the morphological processing, we present the idea of using a silicon functional device, the νMOS FET. We designed sample cell circuits for several kinds of morphological processing (noise cleaning, edge detection, thinning and shrinking in an image). A low dissipation of about 10 μW per νMOS FET threshold logic circuit can be expected at 1 MHz operation; therefore, 10^5 or more cells that operate in parallel can be integrated into an LSI.

Keywords: cellular automaton, fully parallel, Game of Life, morphological picture processing, noise cleaning, dilation, erosion, majority black, edge detection, thinning, shrinking, template matching, νMOS FET, cell function, cell circuit, low-power dissipation, high threshold MOS FET, dynamic logic circuit, analog circuit
6.1 Introduction
The cellular automaton is a parallel processing system that is suitable for high-speed picture processing. To implement the cellular automaton in LSIs, we must first develop a cell circuit that can produce the required cell functions in a compact construction. This chapter proposes such a cell circuit: namely, a νMOS cellular-automaton circuit. The cellular automaton is a parallel, distributed data-processing system that consists of many identical processing elements (cells) regularly arrayed on a plane. Each cell changes its state in discrete time steps through interaction with its neighboring cells. The data that the cellular automaton
manipulates is a pattern of the cell states (i.e., a matrix whose elements represent the states of the arrayed cells). The cellular automaton receives an input pattern and converts the pattern into various differing patterns with time steps; at an opportune moment, the converted pattern is retrieved as an output. With proper interaction rules, we can obtain useful pattern transformations. The cellular automaton has potential applications especially to binary picture processing, because, if each cell and its state are regarded as a picture element and a black-white level of that element, the operation of the cellular automaton is just the same as morphological picture processing on binary (two-tone) pictures. The cellular automaton can be expected to provide high-speed morphological processing devices because its operation is inherently parallel. A difficulty in developing such cellular-automaton devices is that the device has to be implemented on one chip with a fully parallel construction (one processing circuit for each cell). Because picture-processing applications require a large number of elements (e.g., 500 x 500 elements for television pictures), a cell circuit must be compact in construction and small in area. But it is difficult to construct a compact cell circuit with existing transistors because many devices are required to implement the required cell functions. To overcome this problem, we will present the idea that a compact cell circuit for picture-processing applications can be constructed by use of a silicon functional MOS device known as the νMOS FET (the neuron MOS FET). In the following sections, we first review the great similarity between the cellular-automaton operation and morphological picture processing (Section 6.2). We then propose that a cell circuit for morphological picture processing can be constructed simply by using νMOS FETs (Section 6.3).
We present sample cell circuits and simulate their operation to show that the circuits can produce the cell functions required for morphological processing (Section 6.4). We propose processing algorithms adapted to the execution of image thinning and shrinking by means of the cellular automaton (Section 6.5). We suggest a method for constructing a νMOS-FET-based cell circuit designed to operate according to the algorithms (Section 6.6). We simulate the circuit operation in order to demonstrate that such circuits can process data at high rates (Section 6.7). We also discuss the power dissipation of the νMOS FET cell circuits
and present a method for designing low-power cell circuits (Section 6.8). Image-processing LSI structures usually consist of an image pickup unit, a processing unit, and an output unit. In this chapter, we discuss only the cellular-automaton circuit, which is the main component of the processing unit.
6.2 The Cellular Automaton for Morphological Picture Processing

6.2.1 Pattern transformation using cellular automata

Fig. 6.1 Cellular automaton: an information processing system consisting of a large number of identical processing cells with local interactions.
We start the discussion with a general description of a cellular automaton. The cellular automaton is a type of information processing system having a parallel and distributed architecture. Detailed explanations can be found in Refs. [1][2]. As shown in Fig. 6.1, the cellular automaton is configured as a matrix of unit cells that interact with each other. Each unit cell can assume a binary state (or a ternary or higher state) that changes synchronously in all cells at each unitary time step. It is assumed that each subsequent state of a cell is determined only by its current state and by the states of its neighboring cells. The principles governing these changes are called interaction rules (a cell function) and, depending on the rules, various changes in the cell states take place. Although the configuration of the cellular automaton is simple and it does not need a control center, its
overall behavior is rather complicated.
The Game-of-Life cell function of Fig. 6.2(a) is:

Current state 1: subsequent state 1 if the number of adjacent "1" cells is 2 or 3; subsequent state 0 if it is 0, 1, 4, 5, 6, 7 or 8.
Current state 0: subsequent state 1 if the number of adjacent "1" cells is 3; subsequent state 0 if it is 0, 1, 2, 4, 5, 6, 7 or 8.

Fig. 6.2 Example of pattern transformation in a cellular automaton. The Game-of-Life operation is illustrated: (a) cell function, (b) pattern transformation.
The cellular automaton operates as a transducer that produces an output information pattern in response to an input information pattern. As an example (Fig. 6.2), we assume a binary cell state (1 or 0), consider the eight neighbors in determining the subsequent state of each cell, and follow the cell function illustrated in Fig. 6.2(a) (called the Game-of-Life rule). We start with the initial cell-state pattern illustrated in Fig. 6.2(b) (step 0), and with time steps, we observe the transition of the cell-state pattern as shown in the figure (steps 1 through 7). Another initial pattern will produce a different pattern change. There are many other cell functions and therefore various pattern transformations.
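The Game-of-Life cell function of Fig. 6.2(a) can be stated compactly in code. The following Python sketch (with array boundaries treated as fixed "0" cells, our own assumption for the example) applies the rule synchronously to every cell:

```python
# The Game-of-Life cell function of Fig. 6.2(a), applied synchronously to
# every cell (a sketch; cells outside the array are taken as state 0).

def life_step(grid):
    h, w = len(grid), len(grid[0])
    def neighbors(r, c):
        return sum(grid[r + dr][c + dc]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if not (dr == 0 and dc == 0)
                   and 0 <= r + dr < h and 0 <= c + dc < w)
    # Subsequent state: 1 survives with 2 or 3 live neighbors;
    # 0 becomes 1 with exactly 3 live neighbors; otherwise 0.
    return [[1 if (grid[r][c] == 1 and neighbors(r, c) in (2, 3))
             or (grid[r][c] == 0 and neighbors(r, c) == 3) else 0
             for c in range(w)] for r in range(h)]

# A "blinker": a vertical triple becomes a horizontal triple in one step.
blinker = [[0, 0, 0, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 0, 0, 0]]
print(life_step(blinker)[2])  # -> [0, 1, 1, 1, 0]
```

A vertical triple of "1" cells becomes a horizontal triple after one step and returns to the original pattern after two steps, one of the simplest periodic patterns under this rule.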
6.2.2 Morphological picture processing
Morphological picture processing is a type of processing in which the special form or structure of objects within an image is modified. The morphological processing is often used as a preliminary to image analysis; it is used to condition and modify raw image signals in a way such that structural features of objects in an image will be enhanced or accentuated. For details, see Refs. [3][4][5]. The basic concept of the morphological processing for binary images is as follows. A small-sized mask (typically 3 x 3 pixels) is scanned over an image. If the binary pattern of the mask (called a template) matches the state of the pixels under the mask, then the center pixel of the 3 x 3 pixel window in the image will be subsequently set to some desired binary state. For a pattern mismatch, the center pixel will be set to the opposite state, or will be left as it is. After scanning over the entire image, all the pixels are converted simultaneously to their subsequent states. For the following discussions, we here give several known instances of morphological processing.
Fig. 6.3 Example templates for morphological picture processing: (a) dilation: Hit → SS = white, Miss → SS = black; (b) erosion: Hit → SS = black, Miss → SS = white; (c) majority black: Hit → SS = black, Miss → SS = white; (d) edge detection: Hit → SS = white, Miss → SS = as it is. "Hit" means the correspondence between the template and the pattern of the 3 x 3 pixel window, and "Miss" means the lack of correspondence. "SS" means the subsequent state of the center pixel.
(i) Dilation: Set the center pixel to white if all the pixels in a 3 x 3 window are white; otherwise, set the center pixel to black. The corresponding template is illustrated in Fig. 6.3(a). With dilation, an object grows uniformly by a single-pixel-width ring of exterior pixels. This is a basic operation for morphological processing and is frequently used together with erosion, described below.

(ii) Erosion: Set the center pixel to black if all the pixels in a 3 x 3 window are black; otherwise, set the center pixel to white. The corresponding template is illustrated in Fig. 6.3(b). With erosion, an object shrinks by a single-pixel-width ring of interior pixels.

(iii) Majority black: Set the center pixel to black if five or more pixels in a 3 x 3 window are black; otherwise, set the center pixel to white. There are 256 qualifying templates for this operation. Two instances of templates are shown in Fig. 6.3(c). The majority black is useful for filling small holes in objects and closing short gaps in strokes. (The Game-of-Life is somewhat similar to this operation but is more complex.)

(iv) Edge detection: Set the center pixel to white if all the pixels in a 3 x 3 window are black; otherwise, leave the center pixel as it is. The corresponding template is shown in Fig. 6.3(d). The edge detection converts the interior pixels of an object to white but leaves the periphery pixels black.
Fig. 6.4 An instance of morphological processing. The object in the picture is a letter G with noise. The noise is cleaned up by dilation and erosion; then the edge of the object is extracted by edge detection.
By selecting appropriate operations, we can perform various kinds of processing on images. An example is illustrated in Fig. 6.4: a noisy image of an object (a letter G) is cleaned by a combination of dilation and erosion; then the edge of the object is extracted by edge detection. For many other applications, see Refs. [3-5]. Here, we have discussed simple morphological processing. Complex operations such as thinning and shrinking are described later (Sect. 6.5).
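For these symmetric templates, the four operations (i)-(iv) reduce to simple tests on the number of black pixels in the 3 x 3 window. The Python sketch below is illustrative only (1 = black, 0 = white; pixels outside the array are treated as white, an assumption of this example):

```python
# Sketch of the four template operations (i)-(iv) on a binary image
# (1 = black, 0 = white); out-of-range pixels are treated as white.

def window_sum(img, r, c):
    h, w = len(img), len(img[0])
    return sum(img[r + dr][c + dc]
               for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if 0 <= r + dr < h and 0 <= c + dc < w)

def transform(img, rule):
    return [[rule(img[r][c], window_sum(img, r, c))
             for c in range(len(img[0]))] for r in range(len(img))]

# Each rule maps (current pixel, number of black pixels in the 3x3 window)
# to the subsequent state of the center pixel.
dilation = lambda p, s: 0 if s == 0 else 1        # all-white window -> white
erosion = lambda p, s: 1 if s == 9 else 0         # all-black window -> black
majority_black = lambda p, s: 1 if s >= 5 else 0
edge_detection = lambda p, s: 0 if s == 9 else p  # interior -> white

img = [[0, 0, 0, 0, 0],
       [0, 1, 1, 1, 0],
       [0, 1, 1, 1, 0],
       [0, 1, 1, 1, 0],
       [0, 0, 0, 0, 0]]
# Erosion shrinks the 3x3 square to its single interior pixel.
print(transform(img, erosion)[2])         # -> [0, 0, 1, 0, 0]
# Edge detection removes the interior pixel but keeps the periphery.
print(transform(img, edge_detection)[2])  # -> [0, 1, 0, 1, 0]
```

Applied to a solid 3 x 3 square, erosion keeps only the single interior pixel, while edge detection whitens that interior pixel and keeps the periphery, as described above.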
6.2.3 Utilization of cellular automata for morphological processing devices
If each cell of a cellular automaton is regarded as a pixel and its 1-0 state as a black-white level of the pixel, then the operation of the cellular automaton is the same as morphological processing on binary pictures; the cell functions of cellular automata correspond to the templates of morphological processing. The cellular automaton can be expected to provide high-speed morphological processing devices because of its inherently parallel operation. In application to actual devices, the cellular-automaton morphological processor has to be integrated with an image sensor into a chip (one cell circuit beside each pixel-sensor element) in order to receive the input image data in parallel. This chip receives an image input, binarizes the image signal, performs the morphological preprocessing on the binary image, and then outputs the processed data in time series for the subsequent image-analysis subsystems. To integrate a cellular automaton into a chip, a large number of cell circuits is required (e.g., 500 x 500 elements for ordinary television pictures and millions for high-definition pictures), so the cell circuit must be compact and small-sized in its construction. The essential operation of the cell circuit is to determine its subsequent state as a function of the current states of its own and its neighboring cell circuits. The cell function depends on what processing is required, and it is not always a simple, symmetric function that can be given by a brief Boolean representation ("symmetric" means that each adjacent cell makes an equal contribution toward determining the subsequent state of the center cell). The cell function may be a more complex function, such as a majority decision, a multithreshold, or a weighted-input function. To implement such functions in a compact construction, we will propose the use of a variable-threshold logic device known
as the νMOS FET. ("Cell function" is synonymous with "template (or a set of templates)".)
6.3 Construction of Cell Circuits Using νMOS FETs

6.3.1 The νMOS FET and its characteristics
A νMOS FET is a variant of the floating-gate MOS FET. There are n-channel and p-channel devices, and they are usually combined in series to form a νCMOS inverter (Fig. 6.5(a)). For details of the νMOS FETs, see Refs. [6][7][8][9]. When input voltages are applied to a νCMOS inverter, each input V_i induces a potential on the floating gate in proportion to C_iV_i. Hence the total potential of the floating gate represents the weighted sum of the inputs (ΣC_iV_i). The output voltage is "0" (V_out = 0) if the weighted sum of the inputs is greater than the inverter threshold, and the output is "1" (V_out = V_dd) if the weighted sum is less than the threshold. To show the variable-threshold operation of a νCMOS inverter, we simulated the transfer characteristics (i.e., the output voltage versus the normalized input sum (ΣV_i)/V_dd) for the circuit in Fig. 6.5(b). (In the following, simulated results were for 0.6-μm CMOS device parameters, where the gate width was set at 5 μm except for 10 μm in the output buffer inverters.) The result is illustrated in Fig. 6.5(c). In this instance, we assumed nine input gates and used eight of them for data input and the remaining one for threshold control. The threshold characteristic of a νCMOS inverter (solid curves) is not very steep, but if necessary, it can be reshaped by adding an inverter, as illustrated by the dashed curves. A single-threshold function is thus easily obtained with one νCMOS inverter, and its threshold can be changed by a control input. By combining a number of νCMOS inverters we can implement a variable-multithreshold logic operation.
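The variable-threshold behavior just described can be sketched behaviorally. In the Python model below, the floating-gate potential is the capacitance-weighted average of the input voltages, and the inverting output flips when that potential crosses a fixed switching threshold; the 50 fF and 130 fF capacitances mirror the figures quoted for Fig. 6.5, but the supply value and the switching threshold of 0.5 V_dd are illustrative assumptions of ours:

```python
# Behavioral sketch of the variable-threshold operation of a vCMOS inverter:
# the floating-gate potential is the capacitively weighted sum of the inputs,
# and the (inverting) output switches when it crosses the inverter threshold.
# Supply voltage and switching threshold are illustrative assumptions.

def vcmos_inverter(inputs, caps, v_th):
    """inputs, caps: per-gate input voltages and coupling capacitances."""
    c_total = sum(caps)
    v_fg = sum(c * v for c, v in zip(caps, inputs)) / c_total  # floating gate
    return 0 if v_fg > v_th else 1       # inverting threshold element

vdd = 3.3
caps = [50e-15] * 9 + [130e-15]          # nine data inputs + one control input

def cell(nine_pixels, v_control, v_th=0.5 * vdd):
    return vcmos_inverter([vdd * p for p in nine_pixels] + [v_control],
                          caps, v_th)

# With the control input low, more data inputs must be high before the
# weighted sum crosses the threshold and the output falls to 0.
print(cell([1] * 5 + [0] * 4, v_control=0.0))  # -> 1 (below threshold)
print(cell([1] * 5 + [0] * 4, v_control=vdd))  # -> 0 (above threshold)
```

Raising the control-input voltage adds a fixed offset to the floating-gate potential, so fewer data inputs are needed to switch the output: the control input shifts the effective logic threshold, which is exactly the variable-threshold operation used below for template circuits.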
6.3.2 Construction of templates using νCMOS inverters
The basic operation of morphological processing is to check the window pattern of an image against a given template to see whether they are consistent with each other. This operation can be implemented simply by using a νCMOS inverter, as illustrated in Fig. 6.6. As an example, we present a circuit construction for the majority black template. A ten-input νCMOS inverter is used for this purpose; nine pixel signals are applied to
Fig. 6.5 νMOS FET and its transfer characteristics: (a) νCMOS inverter, (b) simulated circuit, and (c) transfer characteristics (i.e., the output voltage versus the normalized input sum (ΣV_i)/V_dd) of the νCMOS inverter of (b). For simplicity all input gates are set to the same input voltage. Simulated assuming 0.6-μm CMOS device parameters, C_I = 50 fF for each signal input, and C_I = 130 fF for the control input.
the nine inputs, and a bias voltage is applied to the tenth input to control the threshold such that the template circuit will change its output from 0 to 1 if five or more pixel signals are 1. The majority black corresponds to a majority-decision cell function and requires a set of many templates, but it can be implemented compactly by using the νCMOS inverter. If necessary, we can give a differing weight to each pixel signal by changing the value of each input capacitance. We will refer, in Sect. 6.6, to the complex template circuits that can recognize positions of black and white pixels in the window.

6.3.3 Cell circuit for dilation-erosion plus edge detection
The functions dilation and erosion (Fig. 6.3(a) and (b)) are frequently used for morphological processing in combination with each other. They are single-threshold, symmetric functions. The operation of edge detection
Fig. 6.6 Implementation of templates by using νMOS circuits: a νCMOS circuit for the majority black template.

Fig. 6.7 Cell circuit for dilation-erosion plus edge-detection operation. A sample set of parameters is: C1 = 50 fF, C2 = 400 fF, C3 = 150 fF, and 0.6-μm CMOS device parameters.
(Fig. 6.3(d)) is also a single-threshold function, like the erosion. We considered merging an edge-detection logic gate with the dilation and erosion logic gate. The resultant construction is illustrated in Fig. 6.7. It consists of a threshold logic gate and a memory, and the threshold
logic gate consists of a single νCMOS inverter. The circuit has two control signal inputs, CS1 and CS2. If CS1 = 1 (V_dd), the circuit operates as an edge-detection cell, and if CS1 = 0, it operates as a dilation-erosion cell. The dilation and erosion can be switched by CS2. For the input-capacitance parameters, see the figure caption. The memory consists of a D-type latch (two inverters and two CMOS switches) and an output buffer (one inverter and a switch) that are driven by the clock CLK.
6.4 Properties of Cell Circuits Using νMOS FETs
We designed a sample cellular automaton by arranging the cell circuits of Fig. 6.7 (a dilation-erosion plus edge-detection cell) and operated it in simulation to perform noise cleaning and edge detection. The result is illustrated in Fig. 6.8(a). To define a boundary condition, we enclosed the cell matrix with 64 bias cells, each of which was fixed at 0. Assuming the illustrated initial cell pattern, we simulated the morphological processing with clocks. As shown in the figure, the noisy picture (step 0) is cleaned up by dilation and erosion (steps 1 through 4); then the edge of the object is extracted successfully (step 5). Plotted in Fig. 6.8(b) are the waveforms of the clock CLK, the control signals CS1 and CS2, and the cell-output transitions for the two cells marked A and B (in step 5 of Fig. 6.8(a)). This cellular-automaton circuit operated at clock frequencies up to 110 MHz.
6.5 Image Thinning and Shrinking by a Cellular Automaton

6.5.1 Cellular-automaton thinning rule
Thinning is one of the techniques used to represent binary images by simple lines. It can be used for finding intersection points, end points, or branching points in characters or in line drawings. We here consider a thinning rule that can be carried out in the cellular automaton. Because all cells in a cellular automaton operate in parallel, it is impossible to thin a given line image into a one-pixel-width line if the width of the image is an even number of pixels (i.e., if a line image is erased evenly from both sides, it eventually disappears).
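The parity problem can be seen with simple arithmetic: a synchronous rule that peels both sides of a line removes two pixels of width per step, so an even starting width passes from 2 straight to 0 and never reaches the one-pixel goal. A tiny sketch (our own illustration, not the chapter's circuit):

```python
def peel(width, pixels_per_step):
    """Track the width of a line that loses pixels_per_step pixels of
    width per CA step; stop once width 1 (or nothing) is left."""
    trace = [width]
    while width > 1:
        width = max(width - pixels_per_step, 0)
        trace.append(width)
    return trace

both_sides = peel(4, 2)   # symmetric erasure: 4 -> 2 -> 0, the line vanishes
one_side = peel(4, 1)     # erasing one side per step stops at width 1
```

Erasing from only one direction per step, which is what the rotating templates described below effectively achieve, always terminates at a one-pixel line.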
Fig. 6.8 Collective operation of the dilation-erosion plus edge-detection cell circuit of Fig. 6.7. A 15 x 15 cell array is simulated: (a) pattern transition of the cell array (the object in the picture is a letter A with noise); (b) voltage waveforms of the clock CLK, the control signals CS1 and CS2, and the output voltages of the cells marked A and B in step 5 of (a). Simulated assuming a 10 MHz clock frequency and the same parameters as in Fig. 6.7.
Therefore, it is necessary to carry out the thinning operation by successively changing the interaction rule, so we developed a modified rule with template rotation, as described below. The thinning rule is illustrated in Fig. 6.9. Each of the three templates in the figure indicates the condition of the 8-neighbor cell states under which the center pixel can be changed from black to white. By rotating
Fig. 6.9 Thinning rule for cellular-automaton processing. The three templates are rotated 90 degrees at each time step.
Fig. 6.10 Example of thinning-rule operation: initial pattern and pattern after 12 steps.
each template by 90 degrees at each thinning step, we can successfully thin any given image to a one-pixel-width line. An example is illustrated in Fig. 6.10.

6.5.2 Shrinking rule
Shrinking is a process that reduces a given image to its one-pixel center point, leaving an indication that an image was present at that location. It can be used for such applications as particle counting. To carry out shrinking on the cellular automaton, we use the above-mentioned thinning templates to obtain the one-pixel-width line for a given image and then repeat a process of erasing the end points of the line. The shrinking rule based on this method is shown in Fig. 6.11, and a simulated result in Fig. 6.12. According to the shrinking rule, a given image can be reduced to a single point (images of an enclosure type are reduced to a circle). Because the same templates can be used for both thinning and shrinking, the circuits designed for thinning can also be used for shrinking applications.
Fig. 6.11 Shrinking rule for cellular-automaton processing. The three templates are rotated 90 degrees at each time step.
Fig. 6.12 Example of shrinking-rule operation: processing of the initial pattern shown in Fig. 6.10 (pattern after 64 steps).
6.5.3 Comparison with other thinning algorithms
Thinning algorithms in which templates are used can be separated into two general types. (1) Algorithms in which the final-point conditions (pixels that may not be erased) are presented in template form (see, for example, Stefanelli and Rosenfeld [9] or Tamura [11]). (2) Algorithms in which the erased-point conditions (pixels that can be erased) are presented in template form (see, for example, Arcelli et al. [10]). With algorithms of the first type, various modifications can be obtained by proper use of the conditions (thinning, shrinking, and other types of image processing by four or by eight neighbor cells). In addition, by combining four directions into two, the processing speed can be increased (one full rotation can be done in two steps). However, because multiple conditions are necessary to define the final point, multiple templates are needed for the processing (for example, 9 template circuits are needed for each cell in thinning according to the Tamura algorithm using 8 neighboring cells). With algorithms of the second type, the processing can be performed with a smaller number of templates by combining the erasure conditions. (In thinning according to the Arcelli algorithm using eight neighbor cells, two templates per cell are sufficient; however, special attention must be paid to the cell design to avoid the occurrence of noisy branches.) The algorithm used in this study is of the second type. It is similar to the Arcelli algorithm, but the processing method is different: in our algorithm, 3 template circuits are used for each cell and one full rotation is carried out in 4 steps, while in the Arcelli algorithm there are 2 template circuits per cell but one full rotation takes 8 steps.
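The trade-off quoted above can be put in numbers. Taking cell area as proportional to the number of template circuits and one full rotation as the time unit, a crude area × time figure of merit (our own simplification, using only figures quoted in the text) compares the two second-type algorithms:

```python
# templates per cell and steps per full rotation, as quoted in the text
algorithms = {
    "Arcelli":   {"templates": 2, "steps": 8},
    "this work": {"templates": 3, "steps": 4},
}

# crude figure of merit: area (template count) x time (steps per rotation)
merit = {name: a["templates"] * a["steps"] for name, a in algorithms.items()}
```

By this crude measure, the present algorithm trades a third template circuit for halving the rotation time.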
When special hardware is designed for such processing (one operator per cell), its area depends on the number of templates and its processing speed is determined by the number of steps. Since in a cellular automaton all pixels are processed in parallel, its overall processing speed can be high, even if the operation of individual cell circuits is slow. It is also important that the area of the cell circuit is small, thus making it possible to use
the cellular automaton in the method employing point-erasure conditions. Therefore, one can say that this algorithm is suited to the special circuits of the cellular automaton (thinning accuracy depends on the number of noisy branches and on how closely the extracted lines coincide with the center lines of the original graphic features). It should also be noted that this algorithm is not much different from existing algorithms.
6.6 Hardware Embodiment of the Cellular Automaton for Thinning and Shrinking

6.6.1 The νMOS FET-based template circuit
Since all input signals are added into a total signal in the νMOS FET, it is impossible to distinguish the positional relation of 0 and 1 (white, black) pixels in the 8-neighbor-cell template. We therefore put an inverter before each input terminal that receives a pixel signal from a neighboring cell, making it possible to judge the template simply by adding the input signals from the 8 neighbor cells. A template circuit for this purpose is shown in Fig. 6.13(a). The static characteristics of this template circuit were simulated and are shown in Fig. 6.13(b) (a set of 0.6-μm CMOS parameters was assumed). The desired discrimination characteristics are achieved.
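The inverter trick can be modeled directly: with an inverter on every input that must be 0, a template match reduces to checking that the transformed input sum reaches the number of cared-about positions — exactly the sum-and-threshold operation a νMOS FET computes. The template encoding (1, 0, None for don't-care) is our own illustrative convention.

```python
def template_match(neigh, template):
    """Threshold-logic template test: invert the inputs that must be 0,
    then a match means 'sum of transformed inputs == number of cared inputs'.
    neigh: 8 neighbour states (0/1); template: 0, 1, or None (don't-care)."""
    total = cared = 0
    for x, t in zip(neigh, template):
        if t is None:
            continue                      # don't-care position, not summed
        cared += 1
        total += x if t == 1 else 1 - x   # 1 - x models the input inverter
    return total == cared

tmpl = [1, 1, 1, 0, None, 0, None, None]
hit = template_match([1, 1, 1, 0, 1, 0, 0, 0], tmpl)    # all cared positions agree
miss = template_match([1, 1, 0, 0, 1, 0, 0, 0], tmpl)   # one cared position differs
```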
Fig. 6.13 Template circuit: (a) example of the template-circuit structure; (b) static characteristics of the template circuit (output voltage vs. number of neighboring cells that match the template).
6.6.2 Structure of the thinning/shrinking cellular-automaton circuit
A cellular-automaton circuit for thinning and shrinking can be obtained by combining the above template circuits. The circuit construction for a cell is shown in Fig. 6.14(a). The cell circuit consists of three template circuits, each of which uses a νMOS FET. Each signal from the neighboring cells is applied to the template circuits through a multiplexer circuit; rotation of the templates is carried out by means of these multiplexers. The cell state is stored in the latch (the voltage on node P corresponds to the cell state, and the same voltage is sent to the neighboring cells). All template circuits operate in parallel and discriminate the states of the eight neighboring cells. In the figure, C0 and C1 are the input coupling capacitances of the template circuits, and C2, C3, and C4 are the bias coupling capacitances. The outputs of the template circuits, together with the feedback signal from the cell-circuit output, are applied to an AND-logic circuit. Depending on the fed-back cell state (0 or 1), the interaction rule changes correspondingly. Output Q of the AND-logic circuit corresponds to the state the cell will take in the next processing step; the cell state changes in response to the clock signal applied to the MOS switches in the latch. Switching between thinning mode and shrinking mode is performed by changing the template-circuit operation, varying the control voltage from 0 V to VDD. In this cell circuit, the area occupied by the template circuits determines the entire area of the cell. For comparison, we designed a CMOS circuit with the same function as the νMOS template circuit above; the layout area of the νMOS template circuit is about 40% of that of the CMOS circuit.

6.7 Simulation of Operational Characteristics of a νMOS Cellular Automaton
We built a cellular automaton by assembling multiple thinning/shrinking cell circuits described in Sect. 6.6 and tested its performance by simulation. The circuit design assumed the same parameters as a 0.6-μm CMOS process. In the simulation, the cell circuits were arrayed and connected
Fig. 6.14 Cellular-automaton circuit for thinning and shrinking: (a) circuit structure (capacitance ratios: C1 = 1.2 C0, C2 = 6.5 C0, C3 = 5.0 C0, C4 = 9.0 C0, C5 = 2.0 C0); (b) switching between thinning and shrinking modes.
in a matrix configuration. After applying certain initial values to all cells, we observed the changes in the cell states. The state of each cell was determined by its output voltage (the state is 0 when the voltage is zero and 1 when the voltage is equal to VDD). Since the
simulation is time-consuming, there is a limit to the number of cells that can be analyzed; in this study we simulated a 6 x 6 cell array. As shown in Fig. 6.15, the array is framed by 28 bias cells (having a fixed state) located along its perimeter. To check the thinning/shrinking characteristics, we observed the changes in the states of all cells at each time step. Initial values were applied to each cell (here, cells were initialized as shown in step 0 of Fig. 6.15, and the peripheral bias cells were assigned the fixed 0 state). The initial pattern was processed first according to the thinning rule and, upon completing this process, the rule was switched to the shrinking mode by changing the control voltage. The resulting changes in cell states are illustrated in Fig. 6.15 (steps 1 through 18). The initial pattern shown in step 0 is thinned step by step; thinning is completed by step 5, and no further thinning takes place after that. On completion of step 8, the rule is switched to the shrinking mode, starting the contraction of the pattern. Once the pattern has contracted to a single point, no further change takes place. In this manner, switching of the functions and modification of the graphic patterns are carried out.
Fig. 6.15 Cell matrix and its state transition: simulation for a 6 x 6 cell matrix. The 28 peripheral cells are bias cells with a fixed state of 0.
Fig. 6.16 Simulation of cell-circuit operation: (a), (b) waveforms of cell outputs (at 10 MHz and 100 MHz, for the cells marked in steps 3 and 4 of Fig. 6.15); (c) waveform of the clock voltage.
Next, we examined the operation waveforms associated with the cell states by observing appropriate cells in the 6 x 6 array shown in Fig. 6.16 (the locations of the cells are indicated in the diagram for steps 3 and 4 in Fig. 6.15). The results of operation at a clock frequency of 10 MHz are shown in Fig. 6.16(a); Fig. 6.16(b) shows the results for the same array at a higher clock frequency (100 MHz). A comparison of Figs. 6.16(a) and (b) shows that the operation of each cell is basically the same, but the waveforms obtained at the higher clock frequency contain a time delay. This is explained by the charging and discharging time of the coupling capacitances at the νMOS FET input (the switching time of the transistor itself is less than 0.2 ns); the considerable triggering noise generated at the time of mode switching has the same cause. The value of the input coupling capacitances is an important factor in the design of the cell circuitry. With an increase in the capacitance value, the speed of operation decreases. On the other hand, if this value is too low, the floating-gate potential produced by the potential-divider action of the array of input capacitances may be lost in the errors caused by the capacitance of the νMOS FET gate
(the capacitance between the floating gate and the transistor), so there is a risk of operation errors as a result of fabrication discrepancies. These factors impose restrictions on the operating frequency of the cellular automaton. The upper frequency limit of the cellular automaton designed for this study is 150 MHz (the gates of the n- and p-channel output-inverter MOS FETs of the cell circuit were made 20 μm wide). Because the cellular automaton processes all pixels in parallel, its processing speed is far higher than that of conventional raster scanning; e.g., the cellular automaton can provide fast image processing of 1 M-frame/s at a low clock frequency of 1 MHz.
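The 1 M-frame/s figure follows from the parallelism alone: the automaton updates every pixel on each clock, whereas a raster-scan processor touches one pixel per clock. A back-of-the-envelope model (our own sketch, with an illustrative 256 x 256 frame):

```python
def frames_per_second(clock_hz, pixels_per_frame, parallel):
    """Crude throughput model: one CA step processes a whole frame in
    parallel; a raster-scan processor handles one pixel per clock."""
    return clock_hz if parallel else clock_hz / pixels_per_frame

# at a 1 MHz clock, the CA rate is frame-size independent
ca = frames_per_second(1e6, 256 * 256, parallel=True)
raster = frames_per_second(1e6, 256 * 256, parallel=False)
```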
6.8 Constructing a νMOS Circuit with Low Power Dissipation

6.8.1 Types of power dissipation in CMOS and νMOS circuits
We have implemented the cellular automaton compactly on an LSI chip by using the νMOS FET as a variable-threshold element for the unit cell circuit. In such an implementation, it is necessary to reduce the power dissipation of the cell circuit. There are three types of power dissipation in CMOS circuits: load-charging dissipation, caused by the charging and discharging of load capacitances; short-current dissipation, caused by the transient short current that passes through the n-type and p-type MOS FETs; and leakage dissipation, due to subthreshold leakage current in the MOS FETs. In ordinary CMOS circuits the first term is dominant, but in the present cell circuits the short-current dissipation can form the greater part of the total because the νCMOS inverters frequently operate with intermediate gate voltages. The short-current dissipation is transient and minute in ordinary CMOS inverters because their static gate voltage is always a full 1 or a full 0. This is not true of νCMOS inverters: they are used for threshold logic, so their floating gates are frequently set at intermediate voltages between 1 and 0. This section proposes low-power νMOS circuits for implementing cellular-automaton LSIs.
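The load-charging term, dominant in ordinary CMOS, follows the familiar P = a·C·Vdd²·f relation; a one-line estimator (our sketch, deliberately omitting the short-current and leakage terms that matter for νCMOS inverters):

```python
def load_charging_power(c_load, vdd, freq, activity=1.0):
    """Load-charging dissipation P = a * C * Vdd^2 * f, in watts.
    Short-current and leakage contributions are left out of this sketch."""
    return activity * c_load * vdd ** 2 * freq

# e.g. a 10 fF load at Vdd = 3.3 V switching every cycle at 1 MHz
p = load_charging_power(10e-15, 3.3, 1e6)   # about 0.11 microwatts
```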
6.8.2 A high-threshold νMOS circuit

Fig. 6.17 A high-threshold νMOS circuit: (a) circuit construction; (b) power dissipation vs. frequency curves (parameter: ΔVth; the ordinary type corresponds to ΔVth = 0 V).
To lower the short-current dissipation in the νMOS circuit, we set the threshold voltage of the constituent MOS FETs to a high value. The range of input voltages that turns on both MOS FETs is then narrow, so the short-current dissipation is reduced. The circuit construction and the power dissipation vs. frequency curves are illustrated in Fig. 6.17. This circuit has the same construction as an ordinary νMOS circuit, but the MOS FET threshold voltage is changed from its ordinary value to a high value. The ordinary thresholds assumed in the previous sections are 0.55 V for n-channel and -0.6 V for p-channel MOS FETs (see Appendix). The threshold is made more negative for the p-MOS FET and more positive for the n-MOS FET; the absolute value of the change is shown in the figure as the parameter ΔVth. With a frequency of 1 MHz, the dissipation can be as small as about 1% of that of the ordinary νMOS circuit. However, the load-driving capability of the νMOS circuit becomes small and its switching slow when the MOS FET threshold voltage is high, so the upper limit of the operating frequency is lowered.
Near the upper limit of the operating frequency, the output of the νMOS circuit (the input of the buffer inverter) stays at an intermediate value and stops moving. The short current of the buffer inverter then increases, and the total power rises rapidly as well.
6.8.3 A dynamic νMOS circuit

Fig. 6.18 A dynamic νMOS circuit: (a) circuit construction; (b) power dissipation vs. frequency curves (parameters: load capacitance CL and bias capacitance C1).
The dynamic νMOS circuit has an n-MOS FET (or p-MOS FET) logic circuit held between a clocked n-MOS FET and p-MOS FET. Synchronized with a clock signal, it alternately repeats a precharge and a logic operation. The period during which both clocked MOS FETs are turned on is short, so the short-current dissipation is as small as in ordinary CMOS circuits. The dynamic construction suits the νMOS cell circuit because the cellular automaton changes its state on the clock signal and is therefore well matched to dynamic logic. The circuit construction and the power dissipation vs. frequency curves are illustrated in Fig. 6.18. The threshold value is controlled by the voltage Vc and a bias capacitance C1. For correct operation, the load capacitance CL must be adjusted. To raise the operating frequency, the value of CL was adjusted first; its optimal value became smaller
with increasing frequency. When CL reached 10 fF, the bias capacitance C1 was adjusted instead. With a frequency of 1 MHz, the power dissipation can be about 5% of that of the ordinary circuit.

6.8.4 An analog νMOS circuit
Fig. 6.19 An analog νMOS circuit: (a) circuit construction; (b) power dissipation vs. frequency curve.
We also constructed the threshold logic circuit using a νMOS FET differential amplifier. The input voltage is applied to one side of the νMOS FET differential pair and the threshold voltage to the other; the output is a 1-0 signal resulting from comparison of the input voltage with the threshold voltage. This circuit was originally proposed for unsharp filtering of analog images (see Ref. [12]); we take up the method here from the viewpoint of low power dissipation. The circuit construction and the power dissipation vs. frequency curves are illustrated in Fig. 6.19. We used a νMOS FET for the differential pair of a wide-range differential amplifier. The threshold value is controlled by the voltage Vc. A current always flows in this circuit; however, it can be set to an arbitrarily small value by adjusting the current-source bias voltage Vb. If the power dissipation is lowered, the upper limit of the operating frequency is lowered as well, but the operating frequency itself does not greatly affect the power dissipation. With a frequency of 1 MHz, the dissipation can be about 1% of that of the ordinary circuit.
6.8.5 Comparison between the three circuits
Fig. 6.20 Comparison between the three circuits: power dissipation vs. operation frequency for the ordinary, dynamic, analog, and high-threshold types.
We compared the characteristics of the three circuit methods (Fig. 6.20). The curves display the best power-dissipation vs. operating-frequency trade-off achieved by each circuit. The curve of the analog νMOS circuit is not smooth because we did not calculate the characteristics of the MOS FETs in subthreshold operation. The analog νMOS circuit is the most suitable for the cell circuit of a cellular-automaton LSI: it has the lowest power dissipation of the three methods in the range above 1 MHz. In low-power operation its speed limit is not high, but this is no obstacle in cellular-automaton applications. The dynamic type requires careful attention to the capacitance values in LSI production, because fine adjustment of the capacitance parameters is necessary. The high-threshold type does not fit well with a standard complementary CMOS process, because a special process step that raises the threshold voltage is needed. Yet if we want circuits with ultralow power of about 1 μW (at speeds of 10 kHz-100 kHz), the high-threshold circuit offers the best balance of speed and power for the cell circuit.
6.9 Conclusions
In this chapter we proposed the construction of a cell circuit for implementing cellular-automaton devices that perform morphological processing. Morphological processing generally requires complex cell functions, such as variable-threshold, multithreshold, majority-decision, and weighted-input functions. To implement such functions compactly, we presented the idea of using a functional silicon device, the νMOS FET. We designed sample cellular-automaton circuits for several morphological processes and simulated their operation to show that the expected cell functions are adequately obtained. We also designed a sample cell circuit for morphological processing using the proposed circuits and demonstrated in simulation its noise-cleaning, edge-detection, thinning, and shrinking operations. The νMOS circuits can be designed for low power dissipation by using three methods; for instance, a low dissipation of about 10 μW per analog νMOS circuit can be expected at 1 MHz operation, so 10^5 or more cells operating in parallel can be integrated on an LSI. The proposed νMOS cell circuits can be used in compact cellular-automaton devices for high-speed morphological processing.
Appendix A Simulation Analyses of Low-Power νMOS Circuits
We analyzed the power dissipation of the three kinds of νMOS circuits in computer simulation. The circuit design assumed the same parameters as a 0.6-μm CMOS process. The νMOS threshold logic circuits have 9 inputs (corresponding to the outputs of the 8 neighboring cells plus the cell's own feedback), and the threshold is set so that when 5 or more inputs are "1", the output of the buffer inverter is "1"; otherwise it is "0". We set 4 of the 9 inputs to "1" and another 4 to "0", and toggled the remaining input between "1" and "0" while observing the circuit operation. The sum of the inputs then takes a middle value, so the power dissipation of the circuit is the highest possible. Because the power dissipation depends on the clock frequency, we calculated the dissipation of the three kinds of circuit as a function of clock frequency. The threshold voltage was set at 0.55 V for n-channel and -0.6 V for p-channel MOS FETs; we defined the threshold voltage as the gate voltage that gives a 1 μA drain current at a 3.3 V drain voltage and a 5 μm gate width. We did not calculate the characteristics of the MOS FETs in subthreshold operation.
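The worst-case stimulus described above is easy to replicate in a software model of the 9-input gate (threshold 5, buffered output); this sketch and its names are ours:

```python
def threshold_gate(inputs, theta=5):
    """9-input threshold gate: output 1 iff at least theta inputs are 1,
    modelling the buffered nuMOS behaviour described in the appendix."""
    return int(sum(inputs) >= theta)

fixed = [1, 1, 1, 1, 0, 0, 0, 0]   # 4 inputs at "1", 4 at "0"
# toggling the ninth input swings the sum across the threshold,
# the worst case for short-current dissipation in the nuMOS inverter
low = threshold_gate(fixed + [0])    # sum 4, below threshold
high = threshold_gate(fixed + [1])   # sum 5, at threshold
```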
References
[1] K. Preston, M. Duff, Modern Cellular Automata, Plenum Press, 1984.
[2] T. Toffoli, N. Margolus, Cellular Automata Machines, MIT Press, 1987.
[3] J. Serra, Image Analysis and Mathematical Morphology, Academic Press, 1982.
[4] W. K. Pratt, Digital Image Processing, A Wiley-Interscience Publication, 1991.
[5] M. Nadler, E. P. Smith, Pattern Recognition Engineering, Wiley, 1993.
[6] T. Shibata, T. Ohmi, "A functional MOS transistor featuring gate-level weighted sum and threshold operations", IEEE Trans. Electron Devices, 39, pp. 1444-1455, 1992.
[7] T. Shibata, T. Ohmi, "Neuron MOS binary-logic integrated circuits. Part I: Design fundamentals and soft-hardware logic circuit implementation", IEEE Trans. Electron Devices, 40, pp. 570-576, 1993.
[8] T. Shibata, T. Ohmi, "Neuron MOS binary-logic integrated circuits. Part II: Simplifying techniques of circuit configuration and their practical applications", IEEE Trans. Electron Devices, 40, pp. 974-979, 1993.
[9] R. Stefanelli, A. Rosenfeld, "Some parallel thinning algorithms for digital pictures", J. ACM, 18, pp. 255-264, 1971.
[10] C. Arcelli, L. Cordella, S. Levialdi, "Parallel thinning of binary pictures", Electron. Lett., 11, pp. 148-149, 1975.
[11] Tamura, "General comments on thinning methods", Proc. Reliability Soc. PRL, pp. 66-75, 1975.
[12] T. Sakai, T. Matumoto, "A Parallel Architecture for Intelligent Image Sensors Using Floating Gate Transistors", The Journal of the Institute of Image Information and Television Engineers, 51, pp. 263-269, 1997.
Chapter 7 Semiconductor Chaos-Generating Elements of Simple Structure and Their Integration Koichiro Hoh, Tatsuo Tsujita, Takahiro Irita, Yuichiro Aihara, Jun-ya Irisawa, Akira Imamura and Minoru Fujishima The University of Tokyo
Abstract Novel types of chaos-generating circuits, simple in structure and suitable for integration, have been developed for bio-inspired information processing: the capacitor-transistor pair (CTP), the return-map unit, and the CMOS chaos multivibrator (CMV). A noise source for the Boltzmann machine utilizing the CTP or CMV, and a coupling experiment with return-map units, are discussed. Operation of a pipelined AD converter as a chaos-generating circuit under the Bernoulli-shift map function is also demonstrated. Keywords: chaos, noise source, bipolar transistor, CMOS, multivibrator, pipelined AD converter, return map, Bernoulli shift
7.1 Introduction Chaotic oscillations in electronic circuits are expected to be utilized in novel architectures of bio-inspired information processing, because chaos has been observed in the neural functions of biological systems [1][2]. Aiming at the realization of high-density integrated circuits that handle chaos, we have developed chaos-generating elements with simple semiconductor device and circuit structures [3]. Here we introduce their recent progress, describe their characteristics, and discuss their use for bio-inspired information processing.
7.2 Capacitor-Transistor Pair The chaos-generating capacitor-npn-transistor pair (CTP) is an outgrowth of our preceding studies [4][5] on chaos generation in a single thyristor. The mechanism of chaos generation in thyristors and the CTP has been understood as a return-mapping behavior in the charging and discharging of junction capacitances during each cycle of the external ac driving source [5]-[7].
Fig.7.1 Differential coupling of two sets of the capacitor-transistor pair to obtain zero-centered random noise.

Fig.7.2 Outputs from each capacitor-transistor pair and the random noise obtained from their difference.
We can control the CTP by its base current to produce either chaotic or period-multiplied signals, the latter having period nT, where n is an integer and T is the period of the ac source that drives the CTP [5]-[7]. The chaotic output from the CTP may be utilized as an artificial noise source for applications such as the Boltzmann machine. For such applications, two identical CTPs were coupled in a differential manner, as shown in Fig.7.1, to eliminate the offset. The noise output from the coupled CTPs is then zero-centered, as shown in Fig.7.2, which is favorable for the application of the CTP in the Boltzmann machine.
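The offset-cancelling idea of Fig.7.1 can be illustrated with any two independent chaotic signals. Below, a logistic map stands in for the CTP's junction-capacitance return map (the map, the seeds, and the parameter r = 3.99 are our choices, not the CTP's physics); subtracting the two channels removes the common DC offset and yields a roughly zero-centered signal.

```python
def logistic(x, r=3.99):
    """Stand-in chaos source (not a model of the CTP itself)."""
    return r * x * (1 - x)

def differential_noise(n, x0=0.2, y0=0.7):
    """Difference of two independent chaotic channels, mimicking the
    differential CTP coupling that yields zero-centered noise."""
    out, x, y = [], x0, y0
    for _ in range(n):
        x, y = logistic(x), logistic(y)
        out.append(x - y)
    return out

noise = differential_noise(10000)
mean = sum(noise) / len(noise)   # each channel's offset cancels in the mean
```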
7.3 Return-Map Unit The system shown in Fig.7.3 can generate chaos by utilizing the I-V curve of a suitable device directly as a return-map function: the current is converted to a voltage and fed back as the input in the next cycle. Here the I-V curve of a CMOS inverter, shown in Fig.7.4(a), was used. The shape of the map function can be changed by adjusting the gain β of the I-V converter, thus producing periodic or chaotic states.

Fig.7.3 Return-map unit utilizing the device I-V curve as the map function.
The experimental chaotic output is shown in Fig.7.4(b). Figure 7.5 shows the bifurcation diagram and the Lorenz plot; the latter exactly reproduces the map function used. This unit, like the capacitor-transistor pair (CTP), is externally driven by the clock signal fed to the sample-and-hold circuit. The difference is that, while the state of the CTP is sensitive to the frequency of the driving ac source, the present unit operates insensitively to the clock frequency over a wide range.
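In software, the unit is just the iteration v ← β·I(v). The inverter I-V hump below is a qualitative stand-in (a sin² bump of our own devising, not a device model), and β plays the role of the I-V converter gain that reshapes the map:

```python
import math

def inverter_current(v, vdd=5.0):
    """Qualitative stand-in for the CMOS inverter I-V hump: largest near
    mid-supply, vanishing at the rails (not a device model)."""
    return math.sin(math.pi * v / vdd) ** 2

def return_map_unit(beta, v0=1.0, n=200, vdd=5.0):
    """Iterate v <- beta * I(v): the converter gain beta reshapes the
    map function and selects periodic or chaotic behaviour."""
    v, orbit = v0, []
    for _ in range(n):
        v = beta * inverter_current(v, vdd)
        orbit.append(v)
    return orbit

settled = return_map_unit(beta=1.0)   # low gain: contracts to a fixed point
bounded = return_map_unit(beta=4.8)   # high gain: richer dynamics
```

Sweeping beta and recording the late orbit points reproduces a bifurcation diagram in the style of Fig.7.5(a).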
Another advantage is that the selection of the map function for this unit is more intuitive.
Fig.7.4 (a) I-V curve of a CMOS inverter as the map function and (b) the resultant output (measured).
Fig.7.5 (a) Bifurcation diagram and (b) Lorenz plot constructed from the measured output.
Two identical return-map units, one with its map function f set to be chaotic and the other with its map function g set to period 3T, where T is the clock period, were coupled with coupling coefficient ε as

    X_{n+1} = f[X_n + ε(Y_n − X_n)],    (1)
    Y_{n+1} = g[Y_n + ε(X_n − Y_n)].    (2)
Figure 7.6 shows the measured bifurcation of these units in coupled operation. It is interesting to note that around ε ≈ 0.05 the chaotic unit was once pulled into the 3T state, but beyond ε ≈ 0.2 both units behaved chaotically until a 1T state appeared in common at ε ≈ 0.7. Beyond this, both units turned into period-adding chaos accompanied by several windows.
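The coupled dynamics of Eqs.(1)-(2) can be sketched directly. The maps below are logistic-type stand-ins chosen only to mimic the two regimes (the real f and g are the units' I-V-based map functions): r = 3.9 gives a chaotic orbit and r = 3.83 lies in the logistic map's period-3 window.

```python
def f(x):
    """Stand-in chaotic map; the real f is the unit's I-V-based map."""
    return 3.9 * x * (1.0 - x)

def g(x):
    """Stand-in period-3 map; the real g was tuned to a 3T state."""
    return 3.83 * x * (1.0 - x)  # logistic map inside its period-3 window

def coupled_step(x, y, eps):
    """One clock cycle of Eqs.(1)-(2) with coupling coefficient eps."""
    x_next = f(x + eps * (y - x))
    y_next = g(y + eps * (x - y))
    return x_next, y_next

x, y = 0.3, 0.6
for _ in range(500):
    x, y = coupled_step(x, y, eps=0.05)
print(x, y)  # both orbits remain bounded in (0, 1)
```

Sweeping `eps` from 0 to 1 and recording orbit tails for each unit would give a coupled bifurcation picture qualitatively like Fig.7.6.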
Fig.7.6 Coupled operation of two return-map units: (a) unit which is chaotic in single operation; (b) unit with the period 3T in single operation.
7.4 CMOS Chaos Multivibrator

In contrast to the externally driven types described above, a chaos-generating element of the self-oscillating type has also been developed.[8]-[10] It is based on a CMOS multivibrator in which the resistor that usually determines the pulse width has been replaced by an inverter, as shown in Fig.7.7. When the time constant of charging or discharging through this inverter is affected by the width of the previous pulse, a return-mapping relationship is expected between the widths of succeeding pulses.
Fig.7.7 Principle of the chaos-generating multivibrator.
The actual circuit is shown in Fig.7.8, in which the discharge through the inverter InR (in the subunit P3) is controlled by the voltage at node C. This voltage is determined by the balance of the conductances of mc and md in P2, the former of which is controlled by the voltage at node B. The voltage at node B is the height of the saw-tooth signal generated in P1, which, in turn, is determined by the inverter InR. Thus a feedback loop of return-mapping nature is constructed, and the state becomes chaotic or period-multiplied according to the shape of the map function. The variable voltages Vp1, Vp2 and Vbias in the circuit are the control parameters which determine the map function.
Fig.7.8 CMOS chaos multivibrator circuit.
In Fig.7.9, the pulse-height waveform, the Lorenz plot and the bifurcation diagram measured with this circuit are shown. The mechanism of chaos generation, qualitatively outlined above, has been rigorously analyzed and an analytical expression has been obtained.[9][10] The Lorenz plot calculated analytically was in good agreement with the measured result of Fig.7.9(c), as shown in Fig.7.10.[10] There the region was divided according to three types of circuit state, and not only did the curves fit quantitatively, but the boundaries between these regions agreed consistently with the experimental result. Figure 7.11 shows the phase diagram for the chaotic and period-multiplied states (denoted as 1T, 2T, ...) in the control-parameter space of Vp1, Vp2 and Vbias.
Fig.7.9 (A) Waveform, (B) Lorenz plot and (C) bifurcation diagram of the output from the CMOS chaos multivibrator.
Fig.7.10 Comparison of the analytical Lorenz plot with the measured one.
Fig.7.11 Measured phase diagram in parameter planes (a) (Vp1, Vbias) and (b) (Vp2, Vbias).
Fig.7.12 A CMOS full-custom chip containing the network of the chaos multivibrators.
This circuit has been implemented in a 1.2 μm CMOS full-custom chip, shown in Fig.7.12, as a network of mutually coupled chaos multivibrators.
Figure 7.13 shows the frequency spectrum of the chaotic output. The spectrum shows a wide and broad distribution, without the spikes that were observed[4] for externally driven chaos generators at the driving frequency and its harmonics. This is one of the advantages in constructing a random-signal source.
Fig.7.13 Frequency spectrum of the chaotic output.

Fig.7.14 Random noise distribution obtained by summing two outputs from chaos multivibrators.
Figure 7.14 shows the signal-height distribution of the random noise obtained by summing the outputs from two sets of chaos multivibrators.[12] The distribution resembles a Gaussian shape and is applicable to information processing such as the Boltzmann machine.
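The move toward a Gaussian shape follows from summing independent bounded signals, in the spirit of the central limit theorem. A minimal sketch, with the logistic map standing in for the multivibrator outputs (an assumption; the real circuit signals are pulse heights, not map iterates):

```python
import statistics

def chaotic_series(x0, n):
    """Logistic-map stand-in for one chaos-multivibrator output stream."""
    xs = []
    x = x0
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
        xs.append(x)
    return xs

n = 20000
a = chaotic_series(0.123, n)
b = chaotic_series(0.456, n)
summed = [u + v for u, v in zip(a, b)]

# A single map's amplitude distribution is strongly bimodal (arcsine-like);
# the sum of two streams is already noticeably more bell-shaped, cf. Fig.7.14.
print(statistics.mean(summed), statistics.pstdev(summed))
```

Histogramming `summed` against a single stream makes the smoothing effect of the summation visible.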
7.5 Chaos Generation from the Pipelined A-D Converter

Some conventional digital or analog circuits have an inherent ability to generate chaos in their principle of operation. With them, the interrelationship between normal and chaotic operation can be studied. Figure 7.15 shows a modified circuit of the pipelined A-D converter which was intentionally made to operate recursively through an unlimited number of stages. As the pipelined A-D converter utilizes a Bernoulli-shift (or dyadic-expansion) type of return-map function in its operation, extended operation beyond its LSB stage may result in a chaotic series of internal analog signals (the residuals in each stage) thereafter. This signal can be observed from the terminal labeled "chaos" in Fig.7.15.
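The stage recursion of an ideal 1-bit pipelined converter is exactly the Bernoulli-shift map: each stage outputs a bit by comparison with half scale, then doubles the residue (subtracting full scale when the bit is 1). A sketch under that idealization, omitting all circuit non-idealities:

```python
def pipeline_stages(v, n_stages, full_scale=1.0):
    """Ideal 1-bit pipelined A-D stages: bit = (residue >= FS/2),
    next residue = 2*residue - bit*FS, i.e. the Bernoulli-shift map."""
    bits, residues = [], []
    for _ in range(n_stages):
        bit = 1 if v >= full_scale / 2 else 0
        v = 2.0 * v - bit * full_scale
        bits.append(bit)
        residues.append(v)
    return bits, residues

bits, residues = pipeline_stages(2.0 / 3.0, 12)
print(bits)  # 2/3 has the binary expansion (101010...)_2: bits alternate 1,0,1,0,...
```

Incidentally, the finite mantissa of the floating-point residue plays a role loosely analogous to the comparator ambiguity discussed below: if the iteration is continued far enough, the simulated residue eventually degenerates, much as the measured chaos declined beyond the 190th stage.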
Fig.7.15 Chaos-generating version of a pipelined A-D converter.
Figures 7.16(a) and (b) show the simulated results with this circuit. Up to the 9th bit, which is the predetermined range of accuracy, the circuit generated the correct result of 2/3 = (101010101...)2.

Fig.7.16 Signals from successive stages of the pipelined A-D converter. (a) Correct A-D conversion in the 1st to 9th bit stages; the trace marked O shows the digital output and that marked X shows the internal analog signal. (b) Internal analog signal from the 1st to 200th bit stages; chaos was observed beyond the 9th bit but declined beyond the 190th bit.
Beyond the stage corresponding to the predetermined accuracy, the internal analog signal became chaotic, as seen in the extended plot in Fig.7.16(b). Such a series, however, did not continue indefinitely but declined beyond the 190th stage in the case of Fig.7.16(b). This decline is due to the ambiguity in the decision by the comparator at the critical points, i.e., the tips of the saw-tooth map curve of the Bernoulli shift. To avoid such decline and sustain chaos generation, peripheral margins must be introduced in the map, as shown in Fig.7.17. This inevitably reduces the slope of the map curve below 2 and, in turn, degrades the performance of the circuit as an A-D converter.
Fig.7.17 Modification of the map function to sustain chaos: (a) slope = 2, chaos ceases; (b) slope < 2; (c) slope = 1, chaos sustained.
Such reciprocity can be advantageously used in the on-chip adjustment of the parameters of the pipelined A-D converter by monitoring the sustainability of its chaotic output.[13] For example, starting with the slope below 2 and a fixed width of the margin in the map curve, we gradually increase the slope to the value just before chaos ceases. Then we decrease the width of the margin by one step and again try to increase the slope. In this way we can approach the optimum setup of the A-D converter after it has been installed in a chip. The chaos-generating version of the pipelined A-D converter was implemented in a full-custom CMOS chip as shown below.
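The adjustment procedure is a nested search loop. In the sketch below, `chaos_is_sustained` is a hypothetical stand-in for the on-chip chaos monitor (here reduced to an illustrative slope-plus-margin criterion), and the step sizes are arbitrary; only the loop structure mirrors the procedure in the text.

```python
def chaos_is_sustained(slope, margin):
    """Hypothetical stand-in for the on-chip chaos monitor. Illustrative
    criterion only: assume chaos survives while slope + margin <= 2."""
    return slope + margin <= 2.0

def tune(slope=1.5, margin=0.4, slope_step=0.01, margin_step=0.05):
    """Raise the slope until just before chaos ceases, then shrink the
    margin by one step and retry, as described in the text."""
    while margin > 0:
        while chaos_is_sustained(slope + slope_step, margin):
            slope += slope_step
        if margin - margin_step <= 1e-9:  # smallest allowed margin reached
            break
        margin -= margin_step
    return slope, margin

print(tune())  # converges toward the largest slope the margin permits
```

On silicon the monitor would observe the "chaos" terminal of Fig.7.15 rather than evaluate a formula, but the search strategy is the same.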
Fig.7.18 Layout of the pipelined A-D converter fabricated in a 1.2 μm CMOS process with 2 layers of metal and 2 layers of polysilicon.
7.6 Conclusions

Three types of chaos-generating elements, together with a conventional circuit closely related to chaos generation, have been introduced. The first is the capacitor-transistor pair, which is the simplest in configuration, externally driven, and sensitive to the frequency of the external excitation. The second is the return-map unit, which is also externally driven but frequency independent; the setup of the return-map function is the most intuitive with this circuit. The third is the self-oscillating chaos multivibrator, which can be most compactly implemented in a CMOS integrated chip. Artificial noise generation with the capacitor-transistor pairs and the multivibrators, and mutual-coupling experiments using the return-map units, were performed and their features discussed. All these elements are applicable in novel information processing systems, each utilizing its characteristic feature. In addition, chaos generation in the pipelined A-D converter was demonstrated and its utilization in the on-chip adjustment of the converter was introduced.

Acknowledgment

This work was partly supported by the Japan Ministry of Education, Science, Sports and Culture under the Grant-in-Aid for Scientific Research on the Priority Area, "Ultimate Integration of Intelligence on Silicon Electronic Systems". The CMOS chips in this work were fabricated through the chip fabrication program of the VLSI Design and Education Center (VDEC), the University of Tokyo, in collaboration with Nippon Motorola Ltd., Dai Nippon Printing Co., Ltd. and KYOCERA Corporation.
References

[1] Kazuyuki Aihara, Gen Matsumoto and Michinori Ichikawa, "An Alternating Periodic-Chaotic Sequence Observed in Neural Oscillators", Physics Letters, 111A, pp.251-255 (1985).
[2] Hatsuo Hayashi and Satoru Ishizuka, "Chaotic Nature of Burst Discharge in the Onchidium Pacemaker Neuron", Journal of Theoretical Biology, 156, pp.269-291 (1992).
[3] Koichiro Hoh, Tatsuo Tsujita, Takahiro Irita and Minoru Fujishima, "Generation of Chaos with Simple Sets of Semiconductor Devices", Proceedings of 2nd International Conference on Knowledge-Based Electronic Systems, pp.250-259 (1998).
[4] Koichiro Hoh and Yoh Yasuda, "Electronic Chaos in Silicon Thyristors", Japanese Journal of Applied Physics, 33, pp.594-598 (1994).
[5] Takahiro Irita, Tatsuo Tsujita, Minoru Fujishima and Koichiro Hoh, "Physical Mechanism of Chaos in Thyristors and Coupled-Transistor Structures", Japanese Journal of Applied Physics, 34, pp.1409-1412 (1995).
[6] Takahiro Irita, Minoru Fujishima and Koichiro Hoh, "Analysis of Chaos in Capacitance-npn-Transistor Pair and Its Application to Neuron Element", Extended Abstracts of 1995 International Conference on Solid-State Devices and Materials, pp.243-245 (1995).
[7] Takahiro Irita, Tatsuo Tsujita, Minoru Fujishima and Koichiro Hoh, "A Simple Chaos-Generator for Neuron Element Utilizing Capacitance-npn-Transistor Pair", Computers and Electrical Engineering, 24, pp.43-61 (1998).
[8] Tatsuo Tsujita, Takahiro Irita, Minoru Fujishima and Koichiro Hoh, "Self-Oscillating Chaos Generator Using CMOS Multivibrator", Proceedings of 2nd International Conference on Knowledge-Based Electronic Systems, pp.213-217 (1998).
[9] Tatsuo Tsujita, Yuichiro Aihara, Minoru Fujishima and Koichiro Hoh, "Design and Experiment of a Multivibrator-Based Simple CMOS Chaos Generator", Proceedings of 1998 International Symposium on Nonlinear Theory and Its Applications, pp.951-954 (1998).
[10] Tatsuo Tsujita, Yuichiro Aihara, Minoru Fujishima and Koichiro Hoh, "Analysis of a Multivibrator-Based Simple CMOS Chaos Generator", IEICE Transactions on Fundamentals of Electronics, E82-A, pp.1783-1788 (1999).
[11] Tatsuo Tsujita, Minoru Fujishima and Koichiro Hoh, "Integrated Random-Signal Source Utilizing CMOS Chaos Multivibrator", Extended Abstracts of 1999 International Conference on Solid-State Devices and Materials, pp.102-103 (1999).
[12] Tatsuo Tsujita, Minoru Fujishima and Koichiro Hoh, "A Compactly Integrated Random-Signal Source Using Chaos Multivibrator", Japanese Journal of Applied Physics, 39, No.4B (2000). Accepted for publication.
[13] Akira Imamura, Tatsuo Tsujita, Minoru Fujishima and Koichiro Hoh, "Accuracy Improvement of the Pipelined AD Converter by the Adjustment Using Its Chaotic Output", Extended Abstracts of 1999 International Conference on Solid-State Devices and Materials, pp.104-105 (1999).
Chapter 8

Computation in Single Neuron with Dendritic Trees

Norihiro Katayama, Mitsuyuki Nakao, and Mitsuaki Yamamoto
Tohoku University
Abstract

In developing neuromimetic engineering systems, the choice of neuron-like element is critical, because this element determines the functional ability of the system. According to recent neuroscience research, a neuron is expected to perform sophisticated information processing, making use of the complex physiological properties of its dendrites. In this chapter, the computational function of neuronal dendrites is explored based on mathematical models. Theoretical analyses show that a passive dendrite is capable of complex logic operations by means of nonlinear synaptic interactions. Computer simulations show that active dendrites integrate synaptic inputs locally and then hierarchically according to synaptic organization and dendritic geometry. Based on these results, a formal neuron is proposed which simply and sufficiently describes the actual properties of the dendrites. This model is therefore suitable for constructing a large-scale network without the loss of the active dendrites' essential responsiveness. These modeling studies provide novel perspectives on the computation in a single neuron with a complex dendritic tree.

Keywords: neuron model, synaptic integration, passive dendrite, compartmental model, dendritic spine, active dendrite, formal neuron, output function, neural network
8.1 Introduction

A brain neuron provides a venue for the dynamic interactions among excitatory and inhibitory inputs which shape the output of the neuron as a train of nerve impulses (action potentials). The process of computing the neuronal output from the synaptic inputs is called synaptic integration, or
dendritic integration, because the neuronal dendrite receives a major part of the synaptic inputs. Since the neuronal dendrite is morphologically and physiologically complex, it has attracted a great deal of attention in neuroscience research. The major characteristics of the dendrite can be summarized as follows: (1) Dendrites show a large diversity of shape and organization depending on the type and location of the neuron [1][2]. (2) Synapses are located systematically over the dendrites depending on the type and location of their originating neuron [3][4]. (3) The synaptic input to the dendrite is attenuated as it is conducted to the cell body due to the dendrite's passive cable properties [5][6]. (4) Dendrites of some types of neuron show excitable, or active, properties; e.g., dendrites can generate action potentials and propagate them like a nerve fiber [7]-[10].

These findings suggest that a single neuron could function as a collective processing element rather than as a single unit, and could also perform higher-order information processing making use of the spatiotemporal response properties of the dendrites [11]. Yet so far in artificial neural network (ANN) studies, the computational function of the dendrite has been neglected or undervalued. The neuron has been treated as a point unit, with the entire structure of the neuron assumed to be isopotential, as in the McCulloch-Pitts type formal neuron. Such simplification enables one to analyze and understand the collective behavior of neurons in neural networks. However, if the spatiotemporal dynamics of dendrites were found to play a crucial role in neural information processing, the framework of ANN studies would be altered. In this chapter, the computational significance of neuronal dendrites is investigated. First, mathematical modeling studies of neuronal dendrites are summarized from the viewpoint of the computation of synaptic inputs in the dendritic tree, focusing mainly on passive cable properties. Second, in order to clarify the computational consequence of dendrites' active properties, synaptic interactions of active dendrites are examined in a compartmental neuron model using computer simulations. Based on the results, a novel formal neuron is constructed to quantitatively approximate the input-output relationship of the compartmental neuron model.
8.2 Computational Consequence of Passive Dendrites
Due to its dendrites' dissipative properties and ramified structures, a neuron is not isopotential. For instance, a change in membrane potential caused by an input at the dendrite's distal region is attenuated and distorted as it propagates to the cell body. Thus the morphology of the dendrite and the spatial distribution of its synapses shape the input-output function of the dendritic tree. In this section, the function of the dendritic tree is explored when the dendrite is assumed to behave in a passive manner.

8.2.1 Passive Cable Theory

As shown in Fig.8.1, the morphology of a dendrite can be modeled by coupling the dendrite's branches. Hence the response characteristics of the passive dendritic tree can be described based on those of one-dimensional passive cables. It is well known that the spatiotemporal behavior of the membrane potential of a one-dimensional passive dendrite can be described by the partial differential equation called the cable equation [5]:

(1/ri) ∂²Vm/∂x² = cm ∂Vm/∂t + Vm/rm + I(x, t),   (1)

where Vm is the membrane potential of the dendrite relative to the resting potential; t is time; x is the distance along the axis of the dendritic branch; I is the stimulus current; ri is the cytoplasm (intracellular) resistance; rm is the membrane resistance; and cm is the membrane capacitance. The response of the passive dendrite to an injection current can be determined analytically, a process which has been studied in detail in the cable theory developed by Rall [5][12][14]. The response to a stimulus current I(t) applied at xin is expressed by the convolution [13]:

Vm(x, t) = ∫0t I(τ)K(x, t - τ; xin)dτ,   (2)

where xin indicates the input site, and K represents the impulse response of the dendrite. Naturally the input resistance Rin measured at the site of stimulation x = xin is given by the response to a step current Iin:

Rin = lim(t→∞) Vm(xin, t)/Iin = ∫0∞ K(xin, τ; xin)dτ.
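The cable equation can be integrated numerically on a discretized cable, i.e., the ladder network of Fig.8.1(C). A minimal sketch with forward-Euler time stepping, sealed ends, and a constant current injected at the distal compartment; all parameter values are illustrative, not physiological:

```python
def passive_cable_response(n=50, dt=0.01, steps=2000,
                           r_i=1.0, r_m=10.0, c_m=1.0, i_inj=1.0):
    """Forward-Euler integration of the discretized cable equation (1).
    A constant current is injected at the distal compartment (index n-1);
    units are arbitrary and the spatial step is folded into r_i."""
    v = [0.0] * n
    for _ in range(steps):
        v_new = v[:]
        for k in range(n):
            left = v[k - 1] if k > 0 else v[k]      # sealed end
            right = v[k + 1] if k < n - 1 else v[k]  # sealed end
            axial = (left - 2.0 * v[k] + right) / r_i
            inj = i_inj if k == n - 1 else 0.0
            v_new[k] = v[k] + dt * (axial - v[k] / r_m + inj) / c_m
        v = v_new
    return v

v = passive_cable_response()
# The potential decays from the injection site toward the far end,
# illustrating the attenuation of a distal input at the cell body.
print(v[-1], v[0])
```

The chosen `dt` keeps the explicit scheme stable (dt·(2/r_i + 1/r_m)/c_m is well below 1); an implicit scheme would be preferred for stiffer parameter sets.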
Fig. 8.1 Model of a single neuron. (A) Schematic representation of a neuron with its dendritic tree. (B) Representation of neuron model with branched dendritic cables. (C) Ladder-network representation of passive cable model of dendritic branch.
The formula corresponding to Eq.(2) in the Laplace transform space is given in product form,

Ṽm(x, p) = Ĩ(p)K̃(x, p; xin),

where ˜ indicates the Laplace transform with respect to t, and p is the transform variable. If the stimulus I(t) is a voltage-independent current source, the voltage change in the dendrite depends linearly on the stimulus. In this case, the response to inputs applied at multiple sites is obtained by superposition of the responses at the respective sites. In practice, a neuronal input is provided primarily by a change in synaptic conductances for specific ions. That is, the synaptic current Is is described as follows:

Is(t) = gs(t)(Vm(t) - Es),

where gs is the synaptic conductance, and Es is the equilibrium potential of the synaptic current. If a synapse satisfies the conditions wherein gs is much smaller than the reciprocal of the input impedance at the site where the synapse is located, and |Vrest - Es| ≫ 0 (Vrest
is the resting membrane potential), the synaptic input can be considered a current source. Actually, a depolarizing synapse (Vrest ≪ Es) plays the role of a positive current source, and a hyperpolarizing synapse (Vrest ≫ Es) that of a negative one. Then the superposition principle approximately holds. But if a synapse that does not satisfy these conditions, such as a shunting synapse (for which Vrest ≈ Es), is involved in the signal inputs, nonlinear synaptic interaction takes place. In this case, the following nonlinear equation in gs should be solved in order to determine the response:

Vm(x, t) = ∫0t gs(τ)[Es - Vm(x, τ)]K(x, t - τ)dτ.

Thus the response to multiple inputs involving shunting synapses cannot be simply obtained by superposition.

Compartmental Modeling

In order to deal with nonlinear synaptic interactions, the compartmental modeling method, which had been used in studies of metabolic systems and chemical kinetics, was introduced by Rall [12]. The method is based on a discrete approximation of a partial differential equation; in other words, it transforms a distributed-parameter transmission line into lumped-parameter circuits. In the modeling, the dendritic branches are divided into successively connected small segments called compartments, each of which is treated as an isopotential unit. The dynamical properties of each compartment are usually described by ordinary differential equations based on physiological data. For the purpose of describing nonlinear membrane properties, Hodgkin-Huxley type voltage-dependent currents [15] are introduced to the model. In the present study, the dynamical properties of the compartmental neuron model are investigated using numerical simulation. An example of the compartmental neuron model is shown in the next section.
8.2.2

As mentioned above, when synaptic inputs involve a shunting synapse, nonlinear interactions take place. The nature of such interactions has been studied by employing compartmental neuron models with passive dendrites [16][17]. According to these studies, a shunting synapse can oppose only the excitatory synaptic inputs located more distally on the dendritic tree, and has little inhibitory effect on excitatory inputs located elsewhere. In other words, for an excitatory synapse, the effective shunting synapse is one located on the path between the excitatory synapse and the cell body; this property is called the 'on the path' effect [17]. Roughly speaking, only a shunting synapse located on the path effectively leaks the current flowing from the site of excitatory input to charge up the cell body. Thus the voltage change in the cell body is most effectively reduced by 'on the path' inhibition.

From the viewpoint of single-neuron computation, the synaptic interaction involving a shunting synapse can be regarded as an analog form of an AND-NOT gate. That is, an excitatory input e is effective only when a more proximal inhibitory input i is inactive; the output is thus described logically as e AND (NOT i). Dendrites' branching structures and the spatial arrangement of synapses can yield more complex operations; an example is shown in Fig.8.2. It is worth noting that shunting inhibition can be regarded as a multiplicative rather than a subtractive operation, in contrast to hyperpolarizing inhibition, because the effect of shunting inhibition is restricted to only a part of the excitatory inputs applied to the dendrites.

Fig. 8.2 An example of logic operation in a passive dendritic tree with shunting inhibition. White triangles show excitatory synapses and black triangles show inhibitory synapses. The inhibitory synapse in the figure is assumed to be a shunting type. When the synaptic inputs are simultaneously applied to the neuron, the input-output relationship is described as a logic circuit, as shown on the right side of the figure, i.e., y = ((e1 AND NOT i1) OR e2) OR ((e3 OR i4) AND NOT i2). See text for details.
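The logic expression realized by the dendritic tree of Fig.8.2 can be checked with a small truth-table sketch; the input names follow the figure caption.

```python
from itertools import product

def dendritic_logic(e1, e2, e3, i1, i2, i4):
    """Logic function of the dendritic tree of Fig.8.2:
    y = ((e1 AND NOT i1) OR e2) OR ((e3 OR i4) AND NOT i2)."""
    return bool(((e1 and not i1) or e2) or ((e3 or i4) and not i2))

# 'On the path' inhibition i1 vetoes e1 alone; e2 bypasses it.
print(dendritic_logic(e1=1, e2=0, e3=0, i1=1, i2=0, i4=0))  # False
print(dendritic_logic(e1=1, e2=1, e3=0, i1=1, i2=0, i4=0))  # True

# Enumerate the full truth table over the six synaptic inputs.
firing = sum(dendritic_logic(*args) for args in product((0, 1), repeat=6))
print(firing)  # 49 of the 64 input patterns fire the neuron
```

The analog circuit, of course, computes a graded version of this function; the Boolean form only captures its limiting behavior.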
Fig. 8.3 Schematic representation of a dendritic spine with the steady-state equivalent circuit superimposed. gs is the synaptic conductance, Es is the equilibrium potential for the synapse, Rn is the spine neck resistance, and Rd is the dendritic input resistance.
8.2.3

Dendrites have many very fine processes called dendritic spines, which are approximately 1 μm in length and 0.5 μm in diameter at the neck. Most excitatory synapses are known to be located on the spine head, with a part of the inhibitory synapses located at the spine's neck. Due to its special morphology, the dendritic spine has been suggested to have the following special functions: (1) The spines could extend the dendritic surface area available to receive synaptic inputs. (2) A spine could function as an independent processing unit, because the electric impedance between the spine head and the dendrite stem is considerably high, the spine being very narrow at its neck. (3) Morphological changes of the spine, such as a shrinkage of the neck, could modify the efficacy of synaptic transmission. The first possibility has not been accepted because the surface area of the spine is negligibly small; for instance, in a hippocampal CA1 pyramidal neuron, the additional surface area occupied by spines is estimated to be at most 10% of the total area. The second hypothesis has been attractive from the viewpoint of neuronal computation. If this hypothesis were true, a dendritic tree would be a
huge collective processor, because there are more than 10^5 spines on each cortical pyramidal neuron. Nevertheless, theoretical studies have proved that the isolation effect is negligible when dendrites and spines are passive [18][19]. In response to this result, some researchers have pointed out that dendritic spines could have active properties. Many computer simulations have demonstrated rich repertoires of logical operations such as AND, OR, and NAND in active dendritic spine models [20]. However, thus far, it has been uncertain whether or not a dendritic spine could actually behave in this manner; experimental verification is awaited. Many researchers have been trying to assess the third hypothesis, because it could form a physiological basis of synaptic plasticity and could also be a process model for learning and memory in the brain, both of which are central topics in brain science. According to the passive cable theory [18][21], the effect of spine length on synaptic efficacy is dependent on the values of three parameters: the dendritic impedance Rd, the spine neck resistance Rn, and the synaptic conductance gs (Fig.8.3). Thus far, Rd and Rn have been estimated quantitatively in models reproducing neuronal morphology. On the other hand, gs was difficult to measure or estimate. Recently gs has been estimated for a glutamatergic excitatory synapse (the Schaffer collateral synapse) of the rat hippocampal neuron [22]. At least for this synapse, shrinkage of the spine is shown to be ineffective in facilitating synaptic transmission. Currently, the spine is suggested to function as a chemical compartment rather than as an electrical compartment [23], because the concentration of calcium ions at the spine has been shown to be regulated locally, and calcium concentration plays an essential role in synaptic plasticity. Thus the chemical compartmentalization of the spine is expected to induce the input specificity for synaptic plasticity [24].
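The steady-state circuit of Fig.8.3 reduces to a series loop: the synaptic battery Es drives current through the synaptic resistance 1/gs, the neck resistance Rn, and the dendritic input resistance Rd. A sketch with illustrative values (conductance in 1/MΩ, resistances in MΩ, potentials in mV; these numbers are assumptions, not the measured estimates of [22]):

```python
def dendritic_epsp(g_s, r_n, r_d, e_s=60.0):
    """Steady-state depolarization of the dendrite (relative to rest)
    for the spine circuit of Fig.8.3: E_s driving g_s, R_n and R_d in
    series. All parameter values here are illustrative."""
    r_syn = 1.0 / g_s                      # synaptic resistance
    current = e_s / (r_syn + r_n + r_d)    # loop current
    return current * r_d                   # voltage across the dendrite

# With a small synaptic conductance (r_syn dominating the loop), halving
# the neck resistance barely changes the EPSP: spine shrinkage is
# ineffective in this regime, as argued in the text.
print(dendritic_epsp(g_s=0.001, r_n=100.0, r_d=50.0))
print(dendritic_epsp(g_s=0.001, r_n=50.0, r_d=50.0))
```

The regime matters: if gs were large enough that 1/gs became comparable to Rn, neck geometry would control efficacy, which is why the quantitative estimate of gs was decisive.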
8.3 Functional Significance of Active Dendrites
8.3.1

In the previous section, the function of the dendritic tree was discussed in terms of the passive cable theory. However, the dendrites of a brain neuron are known to respond actively and nonlinearly. For instance, dendrites of pyramidal neurons in the neocortex [9] and hippocampus [10] are known
to propagate fast Na+ spikes initiated at the cell body, by regenerative activation of voltage-gated Na+ channels distributed over the dendrites. In addition, the dendrites of a cerebellar Purkinje cell are known to generate slow Ca2+ spikes mediated by several types of voltage-gated Ca2+ channels in response to synaptic inputs [7][8]. These active characteristics are thought to be crucial in the synaptic interaction in dendrites. Hence, the passive cable model is no longer sufficient for fully understanding neuronal information processing. To investigate the spatiotemporal dynamics of active dendrites, the compartmental neuron model involving Hodgkin-Huxley type ionic currents is widely employed.

Fig. 8.4 Configuration of the compartmental neuron model and synaptic inputs.
Compartmental Model of a Neuron with Active Dendrites

Here, synaptic interaction in active dendrites is examined in a compartmental model of a neuron with a simplified geometrical structure. The dynamics of the model are described based on physiological data obtained from rat hippocampal pyramidal neurons. The detailed model equations and parameters are provided in an appendix. Briefly, as shown in Fig.8.4, the compartmental neuron model is constructed of cylindrical compartments which compose the cell body, basal dendrite, apical dendrites, and axon initial segment (AIS). All the compartments have voltage-gated channels, and thus possess active properties. Synaptic inputs are simulated by including synaptic conductances in the dendritic compartments. The conductance rises smoothly to a peak and then falls more slowly back to zero. Examples of the compartmental neuron's responses to synaptic inputs are
shown in Fig.8.5. The compartmental neuron model reproduces the back-propagation of the action potential from the cell body to the dendrites.

Fig. 8.5 Time-course of the membrane potential of the compartmental neuron model with active dendrites in response to simultaneous excitatory input (indicated by white triangles) and inhibitory input (filled triangles). (A) Response to a subthreshold stimulus pattern; only postsynaptic potentials are observed. (B) When the excitatory input is increased by 1nS to 9nS (indicated in the box), an action potential is generated at the cell body (#3) and then propagates along the dendrites.
Nonlinear Synaptic Interaction in Active Dendrites

In order to investigate the synaptic interaction between one excitatory and one inhibitory input, the effectiveness of an inhibitory synapse is studied as a function of its location [25]. An excitatory and an inhibitory synapse are simultaneously activated, and the threshold intensity of the excitatory synapse for firing the neuron is estimated; a higher threshold intensity indicates a more effective inhibition. The synaptic input intensity is evaluated in terms of the peak synaptic conductance, and the neuron model is judged to fire when the axon initial segment (AIS) generates an action potential. Figure 8.6 shows the effect of inhibitory inputs applied at varied dendrite
sites as indicated in Fig.8.4. As shown in the graph, when the inhibitory intensity is less than about 70nS, the inhibitory synapse located at N (neighboring the excitatory synapse) is the most effective for suppressing the excitatory input. In addition, the effect of inhibition at N is proportional to its intensity (not shown here). Thus the neighboring inhibitory synapse produces a subtractive effect on the excitatory synapse. In contrast, the synaptic interactions change dramatically when the intensity of inhibition exceeds a certain level (about 70nS). When the intensity of inhibition increases to 75nS, the most effective site for inhibition is replaced by P, which is located on the path between the excitatory synapse and the cell body. If the inhibitory intensity increases slightly beyond this, it can completely block the neuronal firing caused by the excitatory input. Note that the effect of inhibition at B (on another dendritic branch) is much lower than at P, although the two sites are equally distant from the cell body. This is quite similar to the 'on the path' effect of the shunting inhibition in passive dendrites. These characteristics of the synaptic interaction are consistent when the site of the excitatory input is moved. In summary, hyperpolarizing inhibition in the active dendrite involves subtractive as well as multiplicative effects. The inhibitory inputs subtract from the neighboring excitatory input, and selectively prevent the more distally located excitatory inputs from being conducted to the cell body. In other words, the excitatory and inhibitory synaptic inputs are first integrated locally, and then integrated in a hierarchical manner according to the synaptic organization and the dendritic geometry.
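The threshold-intensity measurement described above amounts to a bisection search over the peak excitatory conductance. In the sketch below, `fires` is a hypothetical stand-in for one run of the compartmental model (reduced to an arbitrary linear criterion; the real decision comes from simulating the AIS), and the per-site blocking factors are illustrative numbers chosen so that 'on the path' inhibition (P) dominates at high inhibitory intensity.

```python
def fires(g_exc, g_inh, inh_site):
    """Hypothetical stand-in for one simulation run of the compartmental
    model: True if the AIS generates an action potential. The blocking
    factors per inhibition site (N, P, B, D) are illustrative only."""
    block = {"N": 0.5, "P": 0.8, "B": 0.1, "D": 0.2}[inh_site]
    return g_exc - block * g_inh > 8.0  # arbitrary firing criterion

def threshold_intensity(g_inh, inh_site, lo=0.0, hi=200.0, tol=1e-3):
    """Bisection for the minimal excitatory conductance (nS) that fires
    the model, given inhibitory intensity g_inh at site inh_site."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fires(mid, g_inh, inh_site):
            hi = mid
        else:
            lo = mid
    return hi

# At high inhibitory intensity, on-path inhibition (P) demands the
# largest excitatory input, cf. Fig.8.6.
for site in ("B", "D", "N", "P"):
    print(site, round(threshold_intensity(g_inh=75.0, inh_site=site), 2))
```

Swapping `fires` for a call into an actual compartmental simulator leaves the search procedure unchanged.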
8.3.2 Formal Neuron

Since the compartmental model is convenient for simulating physiological phenomena, it has been widely used to study neuronal activity involving nonlinear dendritic dynamics. However, due to the complex structure of the model, it is often difficult to obtain useful perspectives on the functional significance of active dendrites and nonlinear synaptic interactions. To overcome this problem, an approach to simplifying the compartmental neuron model is described in this section.
N. Katayama, M. Nakao & M. Yamamoto
[Figure 8.6: curves of inhibition intensity (nS) versus the threshold intensity of the excitatory input for firing (nS), for inhibition on the excitation, proximal to the excitation, distal to the excitation, and on another dendrite.]

Fig. 8.6 Location dependency of the effect of inhibitory input. The inhibitory effect is evaluated in terms of the threshold intensity of the excitatory input for the neuronal firing. An increase in the intensity of a proximally located inhibitory synapse dramatically increases its inhibitory effect, but the effects of inhibition distal to the excitation and on another dendrite saturate at lower levels.
A formal neuron describes the essential input-output relationship of a neuron in an abstract form. Here, let us define a formal neuron as
y = f[S(X)], where X = (x_1, x_2, ..., x_n)   (3)

is an input vector; y is the neuronal output; f is a nonlinear single-valued function, which is called the output function; and S is an appropriate scalar function, which in the following is called the synaptic integration function. There exist many types of formal neuron, each of which is specialized for its own application. Equation (3) uniquely expresses a wide variety of conventional neuron models. That is, any model can be represented by selecting the appropriate output function and synaptic integration function. The output function f is usually given as a monotonic function
such as the Heaviside step function,

f[u] = 1 for u >= 0, and 0 for u < 0,   (4)

or the sigmoidal function,

f[u] = 1 / (1 + exp(-u)).   (5)

In these cases, if the synaptic integration function is given as a weighted summation (e.g., S = w_0 + Σ_{i=1..n} w_i x_i, where the w_i are constants), the formal neuron is identical to the McCulloch-Pitts-type neuron model. This neuron model is the most widely used for constructing multilayer neural networks as well as Hopfield-type (interconnected) networks. If S is given as a polynomial expression as follows, it is called a higher-order neuron model:

S = a_0 + b_1 x_1 + b_2 x_2 + ... + c_11 x_1^2 + c_12 x_1 x_2 + ...   (6)
This model includes multiplicative synaptic interactions. Although multiplicative interaction is the simplest type of nonlinearity, it has strong computational power [26], and is expected to underlie various neuronal computations such as motion detection in the visual field. So far, many biophysical models of multiplicative synaptic interaction have been proposed: for instance, shunting inhibition, active spines involving NMDA receptors [27], active dendrites involving voltage-gated Ca2+ channels [28], and a leaky integrate-and-fire neuron model with independent random inputs [29]. On the other hand, a non-monotonic neuron model has also been proposed [30], which has a non-monotonic output function (e.g., f[u] = 1 - u for u > 0, and -1 - u for u < 0) and a weighted summation. By substituting non-monotonic neurons for McCulloch-Pitts neurons, the Hopfield-type network has been proven to improve its storage capacity as an autoassociative memory. Its non-monotonicity is expected to be approximated by polynomial synaptic interactions. The radial basis function network (RBFN) is constructed from neuron-like units with a non-monotonic function. The RBFN is widely used for function approximation and pattern processing [31]. This unit is defined as

f[u] = exp(-u^2),   S(X) = ||X - μ||,

where ||X - μ|| describes the distance between X and μ.
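The formal-neuron families above can be sketched in a few lines of code. The following Python fragment is an illustrative sketch, not from the chapter; all function names are ours. It implements the output functions of Eqs. (4) and (5) together with three synaptic integration functions: the weighted sum of the McCulloch-Pitts-type model, a second-order truncation of the higher-order model of Eq. (6), and the RBF unit.

```python
import math

def heaviside(u):
    # Output function of Eq. (4): fires (1) when u >= 0, else 0.
    return 1.0 if u >= 0.0 else 0.0

def sigmoid(u):
    # Output function of Eq. (5).
    return 1.0 / (1.0 + math.exp(-u))

def weighted_sum(x, w, w0):
    # McCulloch-Pitts-type synaptic integration: S = w_0 + sum_i w_i x_i.
    return w0 + sum(wi * xi for wi, xi in zip(w, x))

def higher_order(x, a0, b, c):
    # Second-order truncation of Eq. (6): linear terms plus all pairwise
    # products x_i x_j, i.e. multiplicative synaptic interactions.
    s = a0 + sum(bi * xi for bi, xi in zip(b, x))
    for i in range(len(x)):
        for j in range(i, len(x)):
            s += c[i][j] * x[i] * x[j]
    return s

def rbf_unit(x, mu):
    # RBF unit: S(X) = ||X - mu||, f[u] = exp(-u^2); peaks at 1 when X = mu.
    u = math.sqrt(sum((xi - mi) ** 2 for xi, mi in zip(x, mu)))
    return math.exp(-u ** 2)

# A McCulloch-Pitts neuron computing a simple AND of two binary inputs:
y = heaviside(weighted_sum([1, 1], [1.0, 1.0], -1.5))
```

With the multiplicative term c_12 x_1 x_2 of `higher_order`, a single unit can already separate input patterns that the weighted sum alone cannot, which is the computational power referred to in [26].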
[Figure 8.7: schematic of the compartmental neuron model with excitatory inputs e_b, e_m, e_d and inhibitory inputs i_b, i_m, i_d on the dendrites, and the axon initial segment (AIS) attached to the cell body.]

Fig. 8.7 Configuration of the compartmental neuron model and the location of the synaptic inputs.
Formal Modeling of the Compartmental Neuron Model

The following section describes how to construct the formal neuron that approximates the input-output relation of the compartmental neuron model [32]. Assume that S is a monotonic function, i.e., ∂S/∂x_i > 0 for any excitatory input x_i and ∂S/∂x_j < 0 for any inhibitory input x_j, and that f is the Heaviside step function, where y = 1 indicates the neuronal firing. In this case, it is clear from Eq. (3) that the input space is separated into two subspaces, a suprathreshold and a subthreshold space, by a decision surface S(X) = 0. Hence we can describe the formal neuron by concretely obtaining the decision surface. In order to determine the structure of S, input patterns on the surface (i.e., X such that S(X) = 0) are derived. With these input patterns, S is described by an appropriate function approximation technique. Each input pattern X is obtained by the bisection method so that the following conditions are satisfied: (1) when X is applied to the neuron, an action potential, which is the neuronal output, is generated at the axon initial segment (AIS); (2) decreasing the intensity of any excitatory component, or increasing the intensity of any inhibitory component, of X suppresses action potential generation in the AIS. Condition (1) comes from f[S(X)] = f[0] = 1, and condition (2) from the assumption regarding the monotonicity and continuity of S.

A formal neuron that approximates the input-output relationship of the compartmental neuron model with active dendrites, whose synaptic interactions were studied in the above section, has now been constructed. As shown in Fig. 8.7, the compartmental neuron model is assumed to receive three pairs of excitatory and inhibitory inputs on its dendrites. The excitatory inputs are denoted by e and the inhibitory inputs by i. Thus the input vector is described as X = (e_b, e_m, e_d, i_b, i_m, i_d). According to anatomical studies [1][3], the excitatory synaptic inputs are attributed as follows: e_m, inputs from CA3 pyramidal neurons; e_d, inputs from entorhinal cortex neurons; e_b, inputs from CA1 pyramidal neurons in the contralateral hippocampus. The inhibitory inputs come from interneurons neighboring the CA1 pyramidal neuron as follows [2][4]: i_m, inputs from the bistratified interneuron; i_d, inputs from the L-M interneuron; i_b, inputs from the O/A interneuron. According to the compartmental neuron model's simulation results regarding synaptic interaction, the synaptic inputs to the active dendrites are first integrated locally and then integrated hierarchically according to the synaptic organization and dendritic geometry. Therefore, synaptic integration in the active dendrites can be modeled as a multilayer neural network system (Fig. 8.8). We would like to call this formal model the hierarchical neuron model. The structure of the synaptic integration function is described as follows: S =
S_basal[z_b(e_b, i_b) - h_b] + S_apical[z_m(e_m, i_m) + S_distal[z_d(e_d, i_d) - h_d] - h_m] - h,   (7)

where h_b, h_m, h_d, and h are constants, and z_b, z_m, and z_d represent the local synaptic integrations in the basal dendrite, in the middle part of the apical dendrite, and in the distal part of the apical dendrite, respectively. Here, the local synaptic integration functions are described so as to fit the input patterns X as follows:

z_b = w_b (e_b - s i_b),   (8)
z_m = w_m (e_m - r i_m),   (9)
z_d = w_d (e_d + p) / (i_d + q),   (10)

where w_b, h_b, w_m, w_d, h_d, p, q, r, and s are constants. S_basal, S_apical, and S_distal denote the synaptic integration functions combined according to their hierarchical relations. In this study, S_basal, S_apical, and S_distal are assumed to be described by the sigmoidal function S_max / (1 + exp(-u)), where S_max is a constant. The constants are estimated so as to minimize the square error for S(X) = 0 by employing the Marquardt method [33].
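The hierarchical neuron model of Eqs. (7)-(10) can be sketched as follows. Note that the constants below are illustrative placeholders of our own choosing; the fitted values obtained with the Marquardt method are not listed in the text.

```python
import math

def sig(u, s_max=1.0):
    # Hierarchical combination function: S_max / (1 + exp(-u)).
    return s_max / (1.0 + math.exp(-u))

# Illustrative placeholder constants (NOT the fitted values of the chapter).
wb, wm, wd = 1.0, 1.0, 1.0
p, q, r, s = 1.0, 1.0, 1.0, 1.0
hb, hm, hd, h = 0.5, 0.5, 0.5, 1.2

def S(eb, em, ed, ib, im, id_):
    # 'id_' avoids shadowing Python's built-in id().
    zb = wb * (eb - s * ib)             # Eq. (8): basal dendrite
    zm = wm * (em - r * im)             # Eq. (9): middle apical dendrite
    zd = wd * (ed + p) / (id_ + q)      # Eq. (10): distal apical dendrite
    # Eq. (7): local integrations combined hierarchically; the distal
    # branch feeds into the apical one before reaching the cell body.
    return sig(zb - hb) + sig(zm + sig(zd - hd) - hm) - h

def fires(inputs):
    # Heaviside output: the neuron fires when S(X) >= 0.
    return S(*inputs) >= 0.0
```

Even with these toy constants, the hierarchy reproduces the on-the-path effect qualitatively: strong inhibition i_m on the apical path can veto the distal excitatory input e_d while the basal input e_b is unaffected.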
Fig. 8.8 Schematic representation of synaptic integration in the neuron model with active dendrites. For the notations, see the text.
In order to evaluate the approximation accuracy of the hierarchical neuron model, a Monte Carlo simulation is performed. Randomly generated input vectors are applied to both the compartmental and the hierarchical neuron models, and their outputs are then compared. The result demonstrates that 96% of the outputs are in accord. Considering the simplicity of the hierarchical neuron model, this agreement ratio is acceptable. Thus it is concluded that the model is appropriate for representing synaptic integration in a neuron with active dendrites.
8.4 Discussion and Conclusion
Conventional neural network research has implemented complicated functions by combining very simple neuron-like elements. This trend is founded on theoretical results suggesting that conventional neural network models are universal computational tools, able to realize any kind of logical function and continuous mapping, as well as a wide variety of optimizations. However, the conventional neural network does not always provide an optimal solution for a given application. For example, visual movement detection can be realized by combining a delay line and a multiplication function [34], which could be implemented by using the dendritic cable property and the shunting inhibition of a single retinal
ganglion cell [17]. Therefore, for the sophisticated optimal design of an information processing system, it is very important to become familiar with the properties of the individual neuron, and to understand how these properties are utilized in actual neural information processing. In this chapter, the computational ability of a single neuron has been explored with regard to the neuronal dendritic tree. On the basis of mathematical neuron models, the dendrite can perform sophisticated information processing, making use of its morphology, physiology, and synaptic organization. The cable theory of the passive dendritic tree has revealed that the shunting synapse provides a nonlinear synaptic interaction called the on-the-path effect. That is, excitatory synaptic inputs are suppressed selectively by a shunting synapse located on the path between the excitatory input and the cell body. The shunting synapse can be regarded as expressing a multiplicative rather than a subtractive action. In addition, synaptic interaction can provide AND-NOT-like operations in the passive dendritic tree. Computer simulation of a compartmental model of active dendrites revealed that hyperpolarizing inhibition involves subtractive as well as multiplicative effects. That is, excitatory and inhibitory synaptic inputs are integrated locally, and then integrated in a hierarchical manner according to the synaptic organization and dendritic structure [25]. This compartmental neuron model is considered to capture the principal synaptic integration in the active dendrites, although it does not include detailed morphological structures such as dendritic branching and spines. The model will be improved by taking the detailed dendritic morphology into account in the future. Based on the simulation results obtained with the compartmental neuron model, a novel formal neuron was constructed.
The newly developed formal neuron (the hierarchical neuron model) consists of simple processing elements connected hierarchically. It is sufficiently simplified to be suitable for network-oriented studies without loss of the essential dendritic responsiveness. The hierarchical neuron model is shown to successfully reproduce nonlinear synaptic integration in a neuron with active dendrites. The hierarchical neuron model could also provide perspectives on the biological role of active dendrites and their synaptic organization. That is, according to experimental studies, a Na+ spike generated at the cell body back-propagates into the active dendrites [9][10] and modulates the efficacy
of the synapse [35]. Thus the back-propagating Na+ spike can be regarded as a teacher signal tracing back to the distally located processing elements of the 'network' within the single neuron to modify synaptic efficacy. Recently, through the use of silicon-based technology, an artificial neuron has been developed which has a hierarchical dendritic structure [36][37] similar to that of the hierarchical neuron model. It implements a process of synaptic integration inspired by the active responsiveness of dendrites, including dendritic spike generation. As shown in this chapter, the hierarchical neuron model provides the theoretical basis for this artificial neuron. Heretofore, in research on the complexity of single-neuron dynamics, mathematical model-based studies have preceded experimental studies because of the latter's difficulty. However, experimental verification has become possible thanks to recent developments in experimental technique. On the other hand, the determination of the detailed dynamics of synaptic computation, the development of a learning rule based on the complexity of single-neuron dynamics, and the assessment of its utility for artificial computation are matters that must be explored in future studies.

Acknowledgements

This research was supported in part by the Ministry of Education, Science, Sports and Culture of Japan, Grant-in-Aid for Scientific Research (B), No. 10480080 (1998-1999), and for Scientific Research on Priority Areas, No. 10164208 (1998). The author (N.K.) thanks the JSPS Fellowships for Japanese Junior Scientists and the Ministry of Education, Science, Sports, and Culture of Japan, Grant-in-Aid for JSPS Fellowships, No. 00050620 (1993-1995) and for Encouragement of Young Scientists, No. 10750329 (1998-1999), for partly supporting the present work.
Computation in Single Neuron with Dendritic Trees
Appendix A Compartmental Model of Pyramidal Neuron
The configuration of the compartmental model of a hippocampal CA1 pyramidal neuron is shown in Figs. 8.4 and 8.7. The model is constructed of cylindrical compartments composing the cell body (one compartment), the basal dendrite (3 compartments), the apical dendrites (9 compartments), and the axon initial segment (AIS, one compartment). All the compartments have Hodgkin-Huxley-type voltage-gated channels. The dynamics of the membrane potential of the j-th compartment are described by the following equation [15]:

I_axial^(j) = c_m dV^(j)/dt + g_leak (V^(j) - E_leak) + I_vg^(j) + I_syn^(j),   (1)

where V is the membrane potential with reference to the extracellular medium; t is time; j is the compartment number; c_m is the membrane capacitance; g_leak is the leakage conductance; E_leak is the equilibrium potential for the leakage current; I_vg is the voltage-gated ion current; I_syn is the synaptic current; and I_axial is the intracellular current that flows along the dendritic axis. Since neighboring compartments are connected by a resistor, I_axial is proportional to the difference between the neighboring compartments' membrane potentials. c_m and g_c are calculated according to the radius and the length of the compartment (see Table A.1).
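As a sketch of Eq. (1), the rate of change of a compartment's membrane potential can be obtained by balancing the axial current against the membrane currents. This is an illustrative simplification: the uniform coupling conductance `g_axial` and the function name are ours, standing in for the model's geometry-dependent coupling between compartments.

```python
def dV_dt(V, j, cm, g_leak, E_leak, I_vg, I_syn, g_axial):
    # Eq. (1) rearranged for integration: the axial current flowing in
    # from the neighbouring compartments balances the membrane currents.
    # I_axial is proportional to the potential differences with neighbours.
    I_axial = 0.0
    if j > 0:
        I_axial += g_axial * (V[j - 1] - V[j])
    if j < len(V) - 1:
        I_axial += g_axial * (V[j + 1] - V[j])
    return (I_axial - g_leak * (V[j] - E_leak) - I_vg[j] - I_syn[j]) / cm
```

A chain resting at E_leak with no active or synaptic current stays put, while a depolarized compartment leaks charge back toward its neighbours, which is the behaviour the integrator in Appendix A.4 advances step by step.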
A.1 Passive Membrane Parameters

The passive membrane parameters are determined based on a recent experimental report [6] as follows: the membrane resistance R_m = 22 kΩ·cm² (at the resting membrane potential), the intracellular resistance R_i = 0.4 kΩ·cm, and the membrane capacitance C_m = 1 μF/cm². The morphological parameters are given in Table A.1. Under these parameter values, the somatic input resistance of the neuron model is calculated to be 131 MΩ, which is consistent with recently measured values. The resting membrane potential V_rest is set to -76 mV.

Table A.1 Morphological and electric parameters of the compartmental neuron model.

                  length (μm)  diameter (μm)  g_Na (mS/cm²)  g_K (mS/cm²)  g_A (mS/cm²)
Apical Dendrite       720           5.0            10.0           5.0           2.0
Basal Dendrite        240           5.0            10.0           5.0           2.0
Cell Body              40          20.0            10.0           5.0           2.0
AIS                    40           2.0            30.0          10.0           6.0

A.2 Voltage-gated Currents
The voltage-gated current consists of the transient Na+ current, the transient K+ current, and the delayed-rectifier K+ current, which are known to be fundamental for generating a Na+ spike. The kinetics and the current-voltage relationships of these currents are modeled on the basis of experimental data observed in hippocampal CA1 pyramidal neurons of rats and guinea-pigs [38]-[40]. Since the primary focus here lies on action potential initiation, slowly activating currents such as Ca2+ currents are neglected for the sake of simplicity. The current is determined as follows:

I_vg = I_Na + I_K + I_A
     = g_Na m^3 h (V - E_Na) + g_K n^4 (V - E_K) + g_A a b (V - E_K),

where g_Na, g_K, and g_A are the maximal conductances; E_Na = 55 mV and E_K = -85 mV are the equilibrium potentials; and m, n, h, a, and b are the channel gating variables, whose dynamics obey the following equation:

dx/dt = α_x (1 - x) - β_x x,   x ∈ {m, n, h, a, b},

where α_x and β_x are the state transition rates, which depend on the membrane potential. The current kinetics are described in the following.

A.2.1  I_Na = g_Na m^3 h (V - E_Na) with

α_m(V) = 0.19 (-V - 49.5) / (exp((-V - 49.5)/10) - 1)
β_m(V) = 0.14 (V + 25.5) / (exp((V + 25.5)/5) - 1)
α_h(V) = 0.032 exp((-V - 56.6)/15)
β_h(V) = 1.14 / (exp((-V - 34.5)/10) + 1).

A.2.2  I_K = g_K n^4 (V - E_K) with

α_n(V) = 0.0035 (-V - 41) / (exp((-V - 41)/10) - 1)
β_n(V) = 0.043 exp((-V - 50)/80).

A.2.3  I_A = g_A a b (V - E_K) with

α_a(V) = 0.022 (-V - 37) / (exp((-V - 37)/10) - 1)
β_a(V) = 0.018 (V + 26) / (exp((V + 26)/10) - 1)
α_b(V) = 0.03 exp((-V - 77)/7)
β_b(V) = 0.0029 (V + 329) / (exp((V + 329)/100) - 1).
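The gating kinetics translate directly into code. The sketch below (illustrative; the helper names are ours) implements the Na+ activation rates of A.2.1 and the generic gating equation, and evaluates the steady state x_inf = α/(α + β) at the resting potential of -76 mV, where the activation gate m should be mostly closed.

```python
import math

def alpha_m(V):
    # Na+ activation opening rate (A.2.1); V in mV.
    # Note the removable singularity at V = -49.5 mV.
    x = -V - 49.5
    return 0.19 * x / (math.exp(x / 10.0) - 1.0)

def beta_m(V):
    # Na+ activation closing rate (A.2.1).
    x = V + 25.5
    return 0.14 * x / (math.exp(x / 5.0) - 1.0)

def dx_dt(x, a, b):
    # Gating-variable kinetics common to m, n, h, a, and b:
    # dx/dt = alpha (1 - x) - beta x.
    return a * (1.0 - x) - b * x

def gate_steady_state(a, b):
    # Fixed point of dx/dt = 0: x_inf = alpha / (alpha + beta).
    return a / (a + b)

m_inf_rest = gate_steady_state(alpha_m(-76.0), beta_m(-76.0))
```

The other gates (n, h, a, b) follow the same pattern with their respective rate functions from A.2.1-A.2.3 substituted in.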
A.3 Synaptic Current

The synaptic input is simulated by adding a synaptic conductance to a dendritic compartment. The spatial arrangements of the synapses are shown in Figs. 8.4 and 8.7. The synaptic current I_syn consists of both excitatory (indicated by e) and inhibitory (i) components. The dynamics of the current are described as follows:

I_syn^(j) = e^(j) α(t/τ_e) (V^(j) - E_e) + i^(j) α(t/τ_i) (V^(j) - E_i),   (2)

where e and i indicate the maximal conductances of the excitatory and the inhibitory synapse, respectively; τ_e = 1 ms, τ_i = 3 ms, E_e = 0 mV, and E_i = -90 mV are the model parameters; and α(·) is called the alpha function, which is widely used to describe the time course of a synaptic input (Fig. A.1). The alpha function is defined as follows:
α(t) = t exp(1 - t)  for t ≥ 0,   and   α(t) = 0  for t < 0.   (3)
[Figure A.1: α(t/τ) versus t (ms) for the excitatory synapse (τ = 1 ms) and the inhibitory synapse (τ = 3 ms).]

Fig. A.1 Time-course of conductance changes of the excitatory and inhibitory synaptic inputs.
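The alpha function of Eq. (3), and the conductance term it scales in Eq. (2), can be sketched as follows (the function names are ours). The normalization is such that α peaks at exactly 1 when t equals the time constant.

```python
import math

def alpha(t):
    # Eq. (3): normalized synaptic time course in units of tau;
    # rises from 0, peaks at alpha(1) = 1, then decays exponentially.
    return t * math.exp(1.0 - t) if t >= 0.0 else 0.0

def g_syn(t, g_max, tau):
    # Synaptic conductance entering Eq. (2); tau = 1 ms for the
    # excitatory synapse and tau = 3 ms for the inhibitory one.
    return g_max * alpha(t / tau)
```

Multiplying `g_syn` by the driving force (V - E_e) or (V - E_i) reproduces the two terms of Eq. (2) for one compartment.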
A.4 Simulation Method

In order to numerically integrate the simultaneous nonlinear differential equations described above, the 4th-order Runge-Kutta method [33] was employed with a fixed time step of 5 μs. The threshold intensity of the excitatory input for neuronal firing, as well as X, were obtained by the bisection method with a precision of 1% or 0.1 nS. The neuron is judged to fire when the membrane potential of the AIS compartment exceeds -10 mV within 25 ms from the onset of the synaptic input.
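The bisection procedure can be sketched as below. The toy firing predicate and the threshold value of 37.5 nS are purely illustrative stand-ins for a full compartmental simulation run.

```python
def threshold_intensity(fires, lo=0.0, hi=100.0, tol=0.1):
    # Bisection on the (assumed monotonic) firing predicate: below the
    # threshold conductance the neuron stays silent, above it, it fires.
    assert not fires(lo) and fires(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fires(mid):
            hi = mid   # mid already fires: threshold is at or below mid
        else:
            lo = mid   # mid is silent: threshold is above mid
    return 0.5 * (lo + hi)

# Toy stand-in for the simulation: "fires" above 37.5 nS.
th = threshold_intensity(lambda g: g > 37.5)
```

In the chapter, `fires` corresponds to one Runge-Kutta run of the compartmental model with the given peak synaptic conductance, checking the AIS voltage against the -10 mV criterion.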
References

[1] Ramón y Cajal, S., Histologie du Système Nerveux de l'Homme et des Vertébrés, Vol. 1, Paris: Maloine, 1909.
[2] Buhl, E. H., Halasy, K., and Somogyi, P., "Diverse Sources of Hippocampal Unitary Inhibitory Postsynaptic Potentials and the Number of Synaptic Release Sites," Nature, 368, pp.823-828, 1994.
[3] Andersen, P., Bliss, T. V., and Skrede, K. K., "Lamellar Organization of Hippocampal Excitatory Pathway," Experimental Brain Research, 13, pp.222-238, 1971.
[4] Gulyás, A. I., Miles, R., Hájos, N., and Freund, T. F., "Precision and Variability in Postsynaptic Target Selection of Inhibitory Cells in the Hippocampal CA3 Region," European Journal of Neuroscience, 5, pp.1729-1751, 1993.
[5] Rall, W., "Branching Dendritic Trees and Motoneuron Membrane Resistivity," Experimental Neurology, 1, pp.491-527, 1959.
[6] Spruston, N., Jaffe, D. B., and Johnston, D., "Dendritic Attenuation of Synaptic Potentials and Currents: The Role of Passive Membrane Properties," Trends in Neurosciences, 17, pp.161-166, 1994.
[7] Ross, W. N., Lasser-Ross, N., and Werman, R., "Spatial and Temporal Analysis of Calcium-dependent Electrical Activity in Guinea Pig Purkinje Cell Dendrites," Proceedings of the Royal Society London, B240, pp.173-185, 1990.
[8] Miyakawa, H., Ross, W. N., Jaffe, D., Callaway, J. C., Lasser-Ross, N., Lisman, J. E., and Johnston, D., "Synaptically Activated Increases in Ca2+ Concentration in Hippocampal Pyramidal Cells Are Primarily Due to Voltage-gated Ca2+ Channels," Neuron, 9, pp.1163-1173, 1992.
[9] Stuart, G. J., and Sakmann, B., "Active Propagation of Somatic Action Potentials into Neocortical Pyramidal Cell Dendrites," Nature, 367, pp.69-72, 1994.
[10] Spruston, N., Schiller, Y., Stuart, G., and Sakmann, B., "Activity-dependent Action Potential Invasion and Calcium Influx into Hippocampal CA1 Dendrites," Science, 268, pp.297-300, 1995.
[11] Cohen, L., and Wu, J., "One Neuron, Many Units?," Nature, 346, p.108, 1990.
[12] Rall, W., "Theoretical Significance of Dendritic Trees for Neuronal Input-output Relations," in Neural Theory and Modeling (Ed. Reiss, R. F.), Stanford University Press, pp.73-97, 1964.
[13] Rinzel, J., and Rall, W., "Transient Response in a Dendritic Neuron Model for Current Injected at One Branch," Biophysical Journal, 14, pp.759-790, 1974.
[14] Rall, W., "Dendritic Spines, Synaptic Potency and Neuronal Plasticity," in Cellular Mechanisms Subserving Changes in Neuronal Activity, Brain Information Research Report #3 (Eds. Woody, C. D., et al.), University of California, 1974.
[15] Hodgkin, A. L., and Huxley, A. F., "A Quantitative Description of Membrane Current and its Application to Conduction and Excitation in Nerve," Journal of Physiology (London), 117, pp.500-544, 1952.
[16] Poggio, T., and Torre, V., "A Theory of Synaptic Interactions," in Theoretical Approaches in Neurobiology (Eds. Reichardt, W. E., and Poggio, T.), MIT Press, 1981.
[17] Koch, C., Poggio, T., and Torre, V., "Nonlinear Interactions in a Dendritic Tree: Localization, Timing and Role in Information Processing," Proceedings of the National Academy of Sciences USA, 80, pp.2799-2803, 1983.
[18] Koch, C., and Poggio, T., "Electrical Properties of Dendritic Spines," Trends in Neurosciences, 6, pp.80-83, 1983.
[19] Turner, D. A., and Schwartzkroin, P. A., "Electrical Characteristics of Dendrites and Dendritic Spines in Intracellularly Stained CA3 and Dentate Hippocampal Neurons," Journal of Neuroscience, 3, pp.2381-2391, 1983.
[20] Shepherd, G. M., and Brayton, R. K., "Logic Operations Are Properties of Computer-simulated Interactions Between Excitable Dendritic Spines," Neuroscience, 21, pp.151-166, 1987.
[21] Kawato, M., Hamaguchi, T., Murakami, F., and Tsukahara, N., "Quantitative Analysis of Electrical Properties of Dendritic Spines," Biological Cybernetics, 50, pp.447-454, 1984.
[22] Bekkers, J. M., and Stevens, C. F., "Presynaptic Mechanism for Long-term Potentiation in the Hippocampus," Nature, 346, pp.724-729, 1990.
[23] Koch, C., and Zador, A., "The Function of Dendritic Spines: Devices Subserving Biochemical rather than Electrical Compartmentalization," Journal of Neuroscience, 13, pp.413-422, 1993.
[24] Lynch, G. S., Dunwiddie, T., and Gribkoff, V., "Heterosynaptic Depression: A Postsynaptic Correlate of Long-term Potentiation," Nature, 266, pp.736-737, 1977.
[25] Katayama, N., Nakao, M., and Yamamoto, M., "Spatiotemporal Integration of Synaptic Inputs in Active Dendrite Neuron Model," Proceedings of the Japanese Neural Network Society Annual Meeting '95-Sendai (in Japanese), pp.137-138, 1995.
[26] Koch, C., and Poggio, T., "Multiplying with Synapses and Neurons," in Single Neuron Computation (Eds. McKenna, T., Davis, J., and Zornetzer, S. F.), Academic Press, pp.315-345, 1992.
[27] Kelso, S. R., Ganong, A. H., and Brown, T. H., "Hebbian Synapses in Hippocampus," Proceedings of the National Academy of Sciences USA, 83, pp.5326-5330, 1986.
[28] Mel, B. W., "Synaptic Integration in an Excitable Dendritic Tree," Journal of Neurophysiology, 70, pp.1086-1101, 1993.
[29] Srinivasan, M. V., and Bernard, G. D., "A Proposed Mechanism for Multiplication of Neural Signals," Biological Cybernetics, 21, pp.227-236, 1976.
[30] Morita, M., Yoshizawa, S., and Nakano, K., "Analysis and Improvement of the Dynamics of Autocorrelation Associative Memory," Transactions of IEICE (in Japanese), J73-D-II, pp.232-242, 1990.
[31] Moody, J., and Darken, C., "Fast Learning in Networks of Locally Tuned Processing Units," Neural Computation, 1, pp.281-294, 1989.
[32] Katayama, N., Nakao, M., and Yamamoto, M., "Synaptic Interactions and Their Integration in the Active Neuronal Dendrite Model," Technical Report of IEICE, NC95-72, pp.53-60, 1995.
[33] Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T., Numerical Recipes in C, Cambridge University Press, 1988.
[34] Hassenstein, B., and Reichardt, W., "Systemtheoretische Analyse der Zeit-, Reihenfolgen- und Vorzeichenauswertung bei der Bewegungsperzeption des Rüsselkäfers Chlorophanus," Z. Naturforsch., 11b, pp.513-524, 1956.
[35] Linden, D. J., "The Return of the Spike: Postsynaptic Action Potentials and the Induction of LTP and LTD," Neuron, 22, pp.661-666, 1999.
[36] Shigematsu, Y., Ichikawa, M., and Matsumoto, G., "Reconstitution Studies on Brain Computing with the Neural Network Engineering," in Perception, Memory and Emotion: Frontiers in Neuroscience (Eds. Ono, T., McNaughton, B. L., and Molotchnikoff, S.), Elsevier Science Ltd., pp.584-599, 1996.
[37] Ichikawa, M., Yamada, H., Iijima, T., and Matsumoto, G., "Modeling of Artificial Neurons with Complex Dendrite Structures," Abstract of Neuroscience Meeting, 1996.
[38] Magee, J., and Johnston, D., "Characterization of Single Voltage-gated Na+ and Ca2+ Channels in Apical Dendrites of Rat CA1 Pyramidal Neurons," Journal of Physiology (London), 487, pp.67-90, 1995.
[39] Numann, R. E., Wadman, W. J., and Wong, R. K. S., "Outward Currents of Single Hippocampal Cells Obtained from the Adult Guinea-pig," Journal of Physiology (London), 393, pp.331-353, 1987.
[40] Segal, M., and Barker, J. L., "Rat Hippocampal Neurons in Culture: Potassium Conductances," Journal of Neurophysiology, 51, pp.1409-1433, 1984.
About the Authors
Yuichiro Aihara
Department of Information and Communication Engineering, Faculty of Engineering, The University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
E-mail: [email protected]
Phone: +81-3-5841-6775   Fax: +81-3-5841-8574
URL: http://hoh.tu-tokyo.ac.jp/

Yuichiro Aihara received the B.S. degree in Information and Communication Engineering from the University of Tokyo in 1998 for his work on semiconductor chaotic devices and circuits. He is now working toward the M.S. degree in the Graduate School of Engineering, the University of Tokyo. He is engaged in research on semiconductor chaotic devices and circuits.
Yoshihito Amemiya
Department of Electrical Engineering, Faculty of Engineering, Hokkaido University
Kita 13, Nishi 8, Sapporo 060-8628, Japan
E-mail: amemiya@sapiens-ei.eng.hokudai.ac.jp
Phone: +81-11-706-6233   Fax: +81-11-706-6585
URL: http://sapiens-ei.eng.hokudai.ac.jp/

Yoshihito Amemiya received the B.E., M.E., and Dr. Eng. degrees from the Tokyo Institute of Technology in 1970, 1972, and 1975, respectively. From 1975 to 1993, he was a member of the research staff at NTT LSI
Laboratories, Atsugi, Japan. Since 1993 he has been a professor in the Faculty of Electrical Engineering at Hokkaido University, Sapporo, Japan. His research is in the fields of LSI circuits, functional CMOS circuits, neural network systems, and digital- and analog-processing elements utilizing quantum phenomena and single-electron effects.
Tetsuya Asai
Faculty of Engineering, Hokkaido University
Kita 13, Nishi 8, Kita-ku, Sapporo 060-8628, Japan
E-mail: [email protected]
Phone: +81-11-706-6080   Fax: +81-11-706-6585
URL: http://sapiens.huee.hokudai.ac.jp/

Tetsuya Asai received the Ph.D. degree in electrical engineering from Toyohashi University of Technology, Aichi, Japan, in 1999. His current research interests are neural computation and its analog circuit implementation. At present, he is a research associate of electrical engineering at Hokkaido University, Japan.
Jens Doge
Chair of Electronic Devices and Integrated Circuits, Faculty of Electrical Engineering, IEE, Dresden University of Technology
Mommsenstr. 13, D-01062 Dresden, Germany
E-mail: [email protected]
Phone: +49-351-463-3711   Fax: +49-351-463-7260
Jens Doge received a diploma degree (Dipl.-Ing.) in electrical engineering at Dresden University of Technology, Germany, in 1999. Since August 1999 he has been working as a Research Assistant toward his doctoral degree (Dr.-Ing.) at Dresden University of Technology. His research interests are in the design and application of high-speed and high-dynamic-range CMOS image sensors, neural networks, bio-inspired systems, and low-power mixed-signal VLSI. Jens Doge enjoys studying the Japanese language and culture.
Michael Eberhardt
Chair of Electronic Devices and Integrated Circuits, Faculty of Electrical Engineering, IEE, Dresden University of Technology
Mommsenstr. 13, D-01062 Dresden, Germany
E-mail: [email protected]
Phone: +49-351-463-5302   Fax: +49-351-463-7260

Michael Eberhardt began his studies of economics at Marburg University in 1993. He continued his studies in 1995 at Dresden University of Technology and obtained a diploma degree (Dipl.-Wirtsch.-Ing.) in economics and electrical engineering in 1999. From 1997 to 1999 he was with Synotec Psychoinformatik, where he contributed to psychoacoustics, sound design, and data analysis. Since July 1999 he has been a Research Assistant at Dresden University of Technology, working toward a Ph.D. in the domain of visual exploratory data analysis, optimization, and self-learning systems.
Minoru Fujishima
Department of Frontier Informatics, Faculty of Frontier Sciences, The University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
E-mail: [email protected]
Phone: +81-3-5841-7425   Fax: +81-3-5841-8574
URL: http://hoh.tu-tokyo.ac.jp/

Minoru Fujishima received the B.S., M.S., and Ph.D. degrees in Electronic Engineering, all from the University of Tokyo, in 1988, 1990, and 1993, respectively. In 1993 he joined the Faculty of Engineering, the University of Tokyo, as a member of the teaching staff, and from 1996 to 1997 he worked for the VLSI Design and Education Center, the University of Tokyo. In 1999 he became an Associate Professor in the Department of Frontier Informatics, Graduate School of Frontier Sciences, the University of Tokyo. Since 1998 he has been staying at the Catholic University of Leuven as a Visiting Professor. He is engaged in research on ULSI devices and circuits, single-electron devices, chaotic devices, and human-sensor electronics. Minoru Fujishima's hobbies are swimming and skiing. As a swimmer, he won the All Japan Student and Adult Championships in the 100-meter breaststroke.
Koichko Hoh Department of Frontier Informatics Graduate School of Frontier Sciences The University of Tokyo 7-3-1 Kongo, Bunkyo-ku, Tokyo 113-8656, Japan E-mail : [email protected] Phone : +81-3-5841-6675 Fax : +81-3-5841-6724 URL : http://hoh.tu-tokyo.ac.jp/ Koichiro Hoh received B.S. degree of Applied Physics, M.S. and Ph.D. degrees
in Electronic Engineering, all from the University of Tokyo, in 1965, 1967 and 1970, respectively. In 1970 he joined the Electrotechnical Laboratory (ETL), Agency of Industrial Science and Technology, Japan, and served as a principal research staff member. From 1976 to 1979 he temporarily belonged to the Cooperative Laboratories, VLSI Technology Research Association, Japan, as a principal research staff member and finally as the Chief of the Planning Section. After returning to ETL, he served as the Chief of the Applied Physics Section and the Chief of the Three-Dimensional Device Section, successively. In 1988 he moved to Yokohama National University, Yokohama, Kanagawa, Japan, as Professor of Electrical and Computer Engineering. In 1993 he moved to the University of Tokyo as Professor of Electronic Engineering, and in 1996 he became the Director of the VLSI Design and Education Center (VDEC), the University of Tokyo. Since 1999 he has been Professor in the Department of Frontier Informatics, Graduate School of Frontier Sciences, the University of Tokyo. He is engaged in research on semiconductor chaotic devices, ULSI devices and circuits, and single-electron devices. Koichiro Hoh likes to visit historic western architecture, and he also appreciates composite performances combining music, drama, film and literature.
Masayuki Ikebe Department of Electrical Engineering Faculty of Engineering Hokkaido University Kita 13, Nishi 8, Sapporo 060-8628, Japan E-mail : [email protected] Phone : +81-11-706-6080 Fax : +81-11-706-6585 URL : http://sapiens-ei.emg.hokudai.ac.jp/ Masayuki Ikebe received the B.E. and M.E. degrees in Electrical Engineering from Hokkaido University in 1995 and 1997, respectively. He is currently working toward the Dr. Eng. degree in the Faculty of Electrical Engineering, Hokkaido University. His main research interests lie in cellular-automata circuits, chaotic systems and image processing.
Akira Imamura Department of Information and Communication Engineering Faculty of Engineering The University of Tokyo 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan E-mail : [email protected] Phone : +81-3-5841-6775 Fax : +81-3-5841-8574 URL : http://hoh.tu-tokyo.ac.jp/ Akira Imamura received the B.S. degree in Information and Communication Engineering from the University of Tokyo in 1999 for his work on semiconductor chaotic devices and circuits. He is now working toward the M.S. degree in the Graduate School of Engineering, the University of Tokyo.
Jun-ya Irisawa Department of Information and Communication Engineering Faculty of Engineering The University of Tokyo 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan E-mail : [email protected] Phone : +81-3-5841-6775 Fax : +81-3-5841-8574 URL : http://hoh.tu-tokyo.ac.jp/ Jun-ya Irisawa received the B.S. degree in Information and Communication Engineering from the University of Tokyo in 1998 for his work on semiconductor chaotic devices and circuits. He is now working toward the M.S. degree in the Graduate School of Engineering, the University of Tokyo.
Takahiro Kita Department of Information and Communication Engineering Faculty of Engineering The University of Tokyo 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan E-mail : [email protected] Phone : +81-3-5841-6775 Fax : +81-3-5841-8574 URL : http://hoh.tu-tokyo.ac.jp/ Takahiro Kita received the B.S., M.S. and Ph.D. degrees in Information and Communication Engineering, all from the University of Tokyo, in 1993, 1995 and 1998, respectively. He has been engaged in research on semiconductor chaotic devices and circuits, and VLSI processors. Since 1998 he has been with Hitachi Ltd.
Hiroshi Ishiwara Frontier Collaborative Research Center Tokyo Institute of Technology 4259 Nagatsuta, Midori-ku, Yokohama 226-8503, Japan E-mail : [email protected] Phone : +81-45-924-5040 Fax : +81-45-924-5961 URL : http://it-www.pi.titech.ac.jp/
Hiroshi Ishiwara was born in Yamaguchi Prefecture in 1945. He received the B.S., M.S., and Ph.D. degrees in electronic engineering from Tokyo Institute of Technology, Tokyo, Japan, in 1968, 1970, and 1973, respectively. He then worked in the Faculty of Engineering of Tokyo Institute of Technology as
a Research Associate from 1973, and as an Associate Professor of the Interdisciplinary Graduate School of Science and Engineering from 1976. He has been a Professor of the Precision and Intelligence Laboratory of Tokyo Institute of Technology since 1989, and also a Professor of the Frontier Collaborative Research Center of Tokyo Institute of Technology since 1998. His research interests are in the areas of device and process technologies for integrated circuits, and at present he is particularly concerned with ferroelectric memories. He was awarded the Japan IBM Science Prize in 1990, the Inoue Prize for Science in 1994, and the Ichimura Prize in Technology (Meritorious Achievement Prize) in 1994. He is a senior member of the IEEE and a member of the Materials Research Society, the Electrochemical Society, the Japan Society of Applied Physics, and the Institute of Electrical Engineers of Japan.
Atsushi Iwata Faculty of Engineering, Hiroshima University 1-4-1 Kagamiyama, Higashi-Hiroshima, 739-8527, Japan E-mail : [email protected] Phone : +81-824-24-7656 Fax : +81-824-22-7195 URL : http://www.dsl.hiroshima-u.ac.jp/~iwa/
He received the B.E., M.S. and Ph.D. degrees in electronics engineering from Nagoya University in 1968, 1970, and 1994, respectively. From 1970 to 1993, he was with the Nippon Telegraph and Telephone Corporation. Since 1994 he has been a Professor of Electrical Engineering at Hiroshima University. His research is in the field of integrated circuit design, including circuit architecture and design techniques for A/D and D/A converters, DSPs, ultra-high-speed telecommunication ICs, and VLSI neural networks. He received an Outstanding Panelist Award at the 1990 International Solid-State Circuits Conference. Dr. Iwata is a member of the IEEE and the IEICE.
Norihiro Katayama Laboratory of Neurophysiology and Bioinformatics Graduate School of Information Sciences Tohoku University Aobayama 05, Sendai 980-8579, Japan E-mail : [email protected] Phone :+81-22-217-7179 Fax : +81-22-263-9438 URL : http://www.yamamoto.ecei.tohoku.ac.jp/
Norihiro Katayama received the B. Eng. degree in information engineering in 1991, the M. Eng. degree in electronic engineering in 1993, and the Ph.D. degree in information science in 1996, all from Tohoku University, Sendai, Japan. He was a fellow of the Japan Society for the Promotion of Science (JSPS) from 1993 to 1995. He is currently a Research Associate in the Graduate School of Information Sciences, Tohoku University. His current interests are neurophysiology, computational neuroscience, and biosignal processing. He is a member of the Japan Neuroscience Society, the Japanese Neural Network Society (JNNS), the Japan Society of Medical Electronics and Biological Engineering, and the Institute of Electronics, Information and Communication Engineers (IEICE).
Thomas Knobloch Chair of Electronic Devices and Integrated Circuits Faculty of Electrical Engineering, IEE Dresden University of Technology Mommsenstr. 13 D-01062 Dresden E-mail : [email protected] Phone :+49-351-463-5302 Fax :+49-351-463-7260
Thomas Knobloch contributed to this work during his studies of electrical engineering. His research interests were in CMOS image sensors, bio-inspired circuits and systems, and complex VLSI circuit and system design. After the completion of the reported work he received a diploma degree (Dipl.-Ing.) in electrical engineering from Dresden University of Technology in 1999.
Andreas König Chair of Electronic Devices and Integrated Circuits Faculty of Electrical Engineering, IEE Dresden University of Technology Mommsenstr. 13 D-01062 Dresden E-mail : [email protected] Phone : +49-351-463-2805 Fax : +49-351-463-7260 URL : http://www.iee.et.tu-dresden.de/~koeniga/ Andreas König received the diploma degree in electrical engineering from Darmstadt University of Technology in 1990. From 1990 to 1995 he worked as a Research Assistant at Darmstadt University of Technology in the domain of neural network implementation and application. In 1995 he obtained a Ph.D. degree from the same university. His doctoral thesis focused on neural structures
for visual surface inspection in industrial manufacturing and related neural network VLSI implementations. After completion of his Ph.D., he joined Fraunhofer Institute IITB in Karlsruhe and did research in visual inspection and aerial image evaluation. In 1996 he was appointed Assistant Prof. (Hochschuldozent) for Electronic Devices at Dresden University of Technology. His main research interest lies in design, application, and microelectronic implementation of neural networks and bio-inspired systems for vision and cognition.
Tsutomu Miki Department of Control Engineering & Science Faculty of Computer Science & Systems Engineering Kyushu Institute of Technology 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan E-mail : [email protected] Phone : +81-948-29-7700 Fax : +81-948-29-7709 URL : http://tsugenoki.ces.kyutech.ac.jp/ Tsutomu Miki received the B. Eng. degree in Electronics Engineering in 1983 and the M. Eng. degree in Information Engineering in 1985, both from Kumamoto University, Kumamoto, Japan. He received the Ph.D. degree for his studies on Soft Computing hardware systems in 1998 from Kumamoto University. From 1985 to 1990, he worked at Omron Corporation, where he was engaged as an LSI designer in the IC Promotion Center. He joined the Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology (KIT), Iizuka, Japan, in October 1990. He also joined a national foundation, the Fuzzy Logic Systems Institute (FLSI), Japan, in 1990 to promote international collaboration on Soft Computing and the spread of its research results. He is now a Research Associate of Computer Science and Systems Engineering at KIT, Iizuka, and a Research Scientist of FLSI. His main research interests lie in hardware implementation of fuzzy systems, fuzzy neural networks, and chaotic systems.
Takashi Morie Faculty of Engineering, Hiroshima University 1-4-1 Kagamiyama, Higashi-Hiroshima, 739-8527, Japan E-mail : [email protected] Phone : +81-824-24-7686 Fax : +81-824-22-7195 URL : http://www.dsl.hiroshima-u.ac.jp/~morie/
He received the B.S. and M.S. degrees in physics from Osaka University and the Dr. Eng. degree from Hokkaido University in 1979, 1981 and 1996, respectively. From 1981 to 1997, he was a member of the research staff at Nippon Telegraph and Telephone Corporation (NTT). Since 1997 he has been Associate Professor of Electrical Engineering at Hiroshima University, Higashi-Hiroshima, Japan. His main interest is in the area of VLSI implementation of neural networks, analog-digital merged circuits, and new functional devices. Dr. Morie is a member of the IEICE, the Japan Society of Applied Physics, and the Japanese Neural Network Society.
Makoto Nagata Faculty of Engineering, Hiroshima University 1-4-1 Kagamiyama, Higashi-Hiroshima, 739-8527, Japan E-mail : [email protected] Phone : +81-824-24-7658 Fax : +81-824-22-7195 URL : http://www.dsl.hiroshima-u.ac.jp/~nagata/
He received the B.S. and M.S. degrees in physics from Gakushuin University, Tokyo, Japan, in 1991 and 1993, respectively. From 1994 to 1996 he was a Research Associate at the Research Center for Integrated Systems, Hiroshima University. He is currently a Research Associate of Electrical Engineering at
Hiroshima University. His research interests include the development of analog-digital merged circuit architectures, modeling techniques for circuits and cross-talk noise in analog HDL for mixed-signal LSI design, VLSI implementation of neural networks, and the area of new functional devices. He is a member of the IEICE and the IEEE.
Mitsuyuki Nakao Laboratory of Neurophysiology and Bioinformatics Graduate School of Information Sciences Tohoku University Aobayama 05, Sendai 980-8579, Japan E-mail : [email protected] Phone : +81-22-217-7178 Fax : +81-22-217-7178 URL : http://www.yamamoto.ecei.tohoku.ac.jp/ Mitsuyuki Nakao received the Dr. Eng. degree in information engineering in 1984 from Tohoku University, Sendai, Japan. Currently, he is an Associate Professor in the Graduate School of Information Sciences, Tohoku University. His current interests are neurophysiology, physiology-based neural-network models, and modeling of circadian and cardiovascular systems. He is a member of the IEEE EMB, IT, NN and SP Societies, the International Neural Network Society, and the Japan Neuroscience Society.
Masahiro Ohtani Department of Electrical and Electronic Engineering Toyohashi University of Technology 1-1 Hibarigaoka, Tempaku-cho, Toyohashi, Aichi, 441-8580, Japan E-mail : [email protected] Phone : +81-532-44-6745
Fax : +81-532-44-6757 URL : http://www.dev.eee.tut.ac.jp/yonezulab/
Masahiro Ohtani was born in Nara, Japan, on May 27, 1974. He received the B.E. and M.E. degrees in electrical and electronic engineering from the Toyohashi University of Technology in 1997 and 1999, respectively. He is currently in the Ph.D. program at Toyohashi University of Technology. His research is on vision chips.
Naoki Ohshima Department of Advanced Materials Science and Engineering, Faculty of Engineering, Yamaguchi University 2-16-1 Tokiwadai, Ube, Yamaguchi, 755-8611, Japan E-mail : [email protected] Phone : +81-836-35-9042 Fax : +81-836-35-9965 URL : http://www.amse.yamaguchi-u.ac.jp/~smoro/
Naoki Ohshima was born on Feb. 17, 1964. He received the B.E. degree in applied physics, and the M.E. and Ph.D. degrees in materials science engineering, from Nagoya University, Japan, in 1988, 1990, and 1993, respectively. He joined the Department of Electrical and Electronic Engineering as a Research Associate in 1993. He has been a Lecturer in the Faculty of Engineering, Yamaguchi University, since 1999. His research has been concerned with semiconductor materials and devices. He is a member of the Japan Society of Applied Physics and the Surface Science Society of Japan.
Tadashi Shibata Department of Frontier Informatics School of Frontier Sciences The University of Tokyo 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
E-mail : [email protected] Phone :+81-3-5841-8567 Fax :+81-3-5841-8567 URL : http://wwwif.tu-tokyo.ac.jp/
Tadashi Shibata was born in Hyogo, Japan, on September 30, 1948. He received the B.S. degree in electronic engineering and the M.S. degree in material science, both from Osaka University, Osaka, Japan, and the Ph.D. degree from the University of Tokyo, Tokyo, Japan, in 1971, 1973 and 1984, respectively. From 1974 to 1986, he was with Toshiba Corporation, where he worked as a researcher on the R&D of ULSI device and processing technologies. He was engaged in the development of microprocessors, EEPROMs and DRAMs, primarily in the area of process integration and research on advanced processing technologies for their fabrication. From 1984 to 1986, he worked as a production engineer at one of the most advanced manufacturing lines of Toshiba. During the period of 1978 to 1980, he was a Visiting Research Associate at Stanford Electronics Laboratories, Stanford University, Stanford, CA, where he studied laser beam processing of electronic materials. From April 1986 to May 1997, he was Associate Professor at the Department of Electronic Engineering, Tohoku University, where he was engaged in research on ultra-clean technologies and their application to low-temperature processing. Since his invention of a new functional device, the Neuron MOS Transistor (vMOS), in 1989, he has been intensively exploring vMOS circuit technology and its application to intelligent electronic systems. Since May 1997, he has been Professor at the University of Tokyo. Dr. Shibata is a member of the Japan Society of Applied Physics, the Institute of Electronics, Information and Communication Engineers, and the IEEE Electron Devices Society, Circuits & Systems Society and Computer Society.
Jan Skribanowitz Chair of Electronic Devices and Integrated Circuits Faculty of Electrical Engineering, IEE Dresden University of Technology Mommsenstr. 13 D-01062 Dresden E-mail : [email protected] Phone : +49-351-463-5302 Fax : +49-351-463-7260 URL : http://www.iee.et.tu-dresden.de/~skribano/ Jan Skribanowitz received a diploma degree (Dipl.-Ing.) in electrical engineering from Dresden University of Technology, Germany, in 1998. Since July 1998 he has been working towards his doctoral degree (Dr.-Ing.) and receives a scholarship from the Deutsche Forschungsgemeinschaft (German Research Foundation) within the scope of the Graduiertenkolleg "Sensorik" (Graduate College of Sensor Technology). His research interests are in the design and application of integrated vision systems, CMOS image sensors, and bio-inspired circuits, as well as low-power, mixed-signal VLSI systems. Jan Skribanowitz likes mountain climbing and snowboarding. Besides learning Japanese, he also enjoys Ju Jutsu and horse riding.
Eisuke Tokumitsu Precision and Intelligence Laboratory Tokyo Institute of Technology 4259 Nagatsuta, Midori-ku, Yokohama 226-8503, Japan E-mail : [email protected] Phone : +81-45-924-5084 Fax : +81-45-924-5961 URL : http://it-www.pi.titech.ac.jp/
Eisuke Tokumitsu was born in Tochigi Prefecture, Japan, in 1960. He received
the B.S., M.S., and Ph.D. degrees in physical electronics from Tokyo Institute of Technology, Tokyo, Japan, in 1982, 1984, and 1987, respectively. His Ph.D. dissertation was on the epitaxial growth of GaAs and AlGaAs by metalorganic molecular beam epitaxy (MOMBE). In 1987, he joined the Department of Physical Electronics, Tokyo Institute of Technology, as a Research Associate. From 1988 to 1990, he was a postdoctoral member of the Technical Staff at AT&T Bell Laboratories, working on compound semiconductor devices. In 1992, he joined the Precision and Intelligence Laboratory of Tokyo Institute of Technology as an Associate Professor. His current research interests are semiconductor devices and process technologies for ferroelectric memories. Dr. Tokumitsu is a member of the IEEE and the Japan Society of Applied Physics.
Tatsuo Tsujita Department of Information and Communication Engineering Graduate School of Engineering The University of Tokyo 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan E-mail : [email protected] Phone : +81-3-5841-6775 Fax : +81-3-5841-8574 URL : http://hoh.tu-tokyo.ac.jp/
Tatsuo Tsujita received the B.S., M.S. and Ph.D. degrees in Information and Communication Engineering, all from the University of Tokyo, in 1994, 1996 and 1999, respectively. He has been engaged in research on semiconductor chaotic devices and circuits, and ULSI fabrication technology. Since 1999 he has been with THine Electronics, Inc.
Mitsuaki Yamamoto Laboratory of Neurophysiology and Bioinformatics Graduate School of Information Sciences Tohoku University Aobayama 05, Sendai 980-8579, Japan E-mail : [email protected] Phone : +81-22-217-7177 Fax : +81-22-217-7177 URL : http://www.yamamoto.ecei.tohoku.ac.jp/
Mitsuaki Yamamoto received the B. Eng. and Dr. Eng. degrees in electrical communication engineering from Tohoku University, Sendai, Japan, in 1963 and 1971, and the Dr. Med. Sci. degree from the Tohoku University School of Medicine in 1980. He is currently a Professor of Neurophysiology and Bioinformatics at the Graduate School of Information Sciences of Tohoku University. His areas of research are the neurophysiology of sleep and pain, and biosignal processing. He is a member of the Sleep Research Society, the International Association for the Study of Pain, and the Physiological Society of Japan.
Hiroo Yonezu Department of Electrical and Electronic Engineering Toyohashi University of Technology 1-1 Hibarigaoka, Tempaku-cho, Toyohashi, Aichi, 441-8580, Japan E-mail : [email protected] Phone : +81-532-44-6744 Fax : +81-532-44-6757 URL : http://www.dev.eee.tut.ac.jp/yonezulab/ Hiroo Yonezu received the B.E. degree in electronic engineering from Shizuoka University in 1964 and the Dr. Eng. degree in electrical engineering from Osaka
University in 1975. In 1964 he joined Nippon Electric Co. Ltd (NEC Corporation). He made contributions to the research on degradation mechanisms and the improvement of operating life of AlGaAs lasers in the Central Research Laboratories. Since 1986 he has been a professor with the Department of Electrical and Electronic Engineering, Toyohashi University of Technology. His research has been concerned with basic technologies for future OEICs including lattice-mismatched heteroepitaxy, optoelectronic devices and neuro-devices. He is a managing director of The Japan Society of Applied Physics. He received the SSDM Award from the International Conference on Solid State Devices and Materials in 1995.
Sung-Min Yoon Frontier Collaborative Research Center Tokyo Institute of Technology 4259 Nagatsuta, Midori-ku, Yokohama 226-8503, Japan E-mail : [email protected] Phone : +81-45-924-5040 Fax : +81-45-924-5961 URL : http://it-www.pi.titech.ac.jp/
Sung-Min Yoon was born in Seoul, Korea, in 1970. He received the B.S. degree in inorganic materials engineering from Seoul National University, Seoul, Korea, in 1995, and the M.S. degree in applied electronics from Tokyo Institute of Technology (TIT), Tokyo, Japan, in 1997. Currently, he is pursuing the Ph.D. degree in applied electronics at TIT. He has been engaged in research on the fabrication of an adaptive-learning neuro-chip using ferroelectric thin films. His research interests include the characterization of ferroelectric thin films, and device physics and process technologies for ferroelectric-related devices. He was awarded the Japan Society of Applied Physics Award for Research Paper Presentation in 1999, the Teshima Prize in 1999, and the SSDM Young Researcher Award in 1999. He is a member of the Japan Society of Applied Physics.
Keyword Index
A absolute value circuit 16 active dendrites 187 adaptation 99 adaptive-learning function 33 analog -vMOS circuit 158 -approach 63 -EEPROM 18 -EEPROM technology 13 -hardware 105 -integrated circuit 124 -memory 34 -multiplier 4 -VQ processor 22 analog/digital-merged, analog-digital merged -architecture 62 -decision making operation 16 AND-NOT gate 184 anisotropic smoothing 114 arbitrary nonlinear transfer function 76 assessment functions 94, 98, 99 association 13 association processor 12, 15 -architecture 2 associative memory 62 automated visual inspection 89 automotive -applications 89 -area 104, 112 -image processing 116 autozeroing 4
B
backpropagation, back-propagation 188 -learning 73 -method 55 —networks 62 Bernoulli-shift 173 bifurcation 165, 169 -diagram 80 binary digital circuits 4 bio-inspired -algorithm 90, 91, 100 —circuits 90 -systems 90 biological -evidence 90 -systems 90 biometric systems 116 bipolar weighted summation 64 bipolar weighting 66 Boltzmann machine 62, 165,173 Boolean logic 8
C cable -equation 181 -theory 181 calculation precision 63 capacitor-transistor pair 164 CDMA matched filter 25 cell -circuit 143, 144, 145 -function 137 cellular-automaton, cellular automaton 137 -circuit 152
center of mass 9 chaotic -behavior 79 -neural networks 62 -signals 75 chopper comparators 4 clocked CMOS comparator 76 CMOS 4, 171 -inverter 76, 165 -multivibrator 168 -Schmitt-trigger 34 -VLSI chips 76 compensation capacitance 71 continuous-state discrete-time dynamics 64 continuous-time continuous-state dynamics 63 correlation neural network 124 coupled operation 167 complementary unijunction transistor, CUJT 34 current source 67
D D/A conversion 76 data retention 36 dendritic integration 180 dendritic spines 185 depolarizing synapse 183 design -automation 99 -costs 91 -hierarchy 95 -methodology 91, 96, 98, 99, 100,106,108, 116, 117 Difference-of-Gaussian, DoG 105 differential-of-gaussian filtering 27 digital approach 63 digital VQ processor 19 dilation 140
dissimilarity measure 15 dyadic expansion 173 dynamic vMOS circuit 157
E edge detection 140 EEPROM 4 EPROM 4 erosion 140 Exclusive OR 7,8 eye-tracker 114, 116,117 -system 106
F feature space visualization 93, 94, 97 feedforward network 74 ferroelectric film 34 finger print identification chip 27 fixed pattern noise 116 floating gate, floating-gate 4, 115, 117 -EEPROM cell 13 frequency spectrum 172 full-custom chip 171 fuzzy processor 27
G Game-of-Life 138 Gaussian pyramid 107 graceful degradation 90
H hardware —computation 2 -friendly vector representation 28 hierarchical neuron model 193
high dynamic range 108, 112 higher-order neuron model 191 high-threshold vMOS circuit 156 hippocampal pyramidal neurons 187 Hopfield networks 62 hybrid systems 91 hyperpolarizing synapse 183
I image processing 4 input resistance 181 input vector 15 integrate-and-fire neuron model 191 intelligent -microsystems 116 -system 91 interactive —feature space visualization 97 -visualization 93 inverter 168
L learning mechanisms 99 Learning-Vector-Quantization, LVQ 95 linear image sensor 110, 112 local adaptation 105, 113, 114 local contrast resolution 113 logistic map 80 Lorenz plot 165, 169 low power dissipation 155 low-power 91
M machine vision 89, 112, 114, 115, 116 majority black 140 Manhattan distance 15 matched filter 25
matching cell 18 maximum-likelihood search 2 McCulloch-Pitts type -formal neuron 180 -neuron model 191 memory window 35 MFSFET 34 Moore's Law 1 morphological picture processing 139 motion -detection 124, 130 -picture compression 18 -vector detection 23 MPEG-2 23 multiple-level values 7 multiplicative -effects 189 -synaptic interactions 191 multivariate data projection 93, 96
N Nearest-Neighbor-techniques, kNN 95 neuMOS, vMOS 4 -association processor 15 -circuit 144, 155 -inverters 11 -multivalued ROM technology 18 -source follower 10, 15 -VQ processor 21 vMOS FET 142, 145, 150, 155 -cellular automaton 151 neural network 33 -hardware 90, 93 neurochips 93 neurocomputers 93 neuron 66 Neuron MOS Transistor 4 NMDA receptors 191 noise 165
non-idealities 63 nonlinear oscillator networks 62 nonlinear reference waveform 65 nonlinear transformation 65 nonmonotone transformation 63, 75 non-monotonic neuron model 191 nonvolatile memory 35
pulse modulation approach 63 pulse phase modulation, PPM 63 -approach 64 pulse width modulation, PWM 63 -approach 64 -signals 78
O OCR 100, 103, 116 on the path 189 -effect 184 opportunistic design 90, 117 optical flow 124, 129, 130 output function 190 overtake monitoring 104
Q QuickCog 93
P parallel -architecture 137 -connection 70 -information processing 33 -processing system 135 parasitic capacitance 70 passive dendrite 181 performance measures 92 period-adding chaos 166 phase diagram 169 pipelined A-D converter 173 power consumption 90 power dissipation 90 principal variable 7 pseudorandom noise code, PN code 25 psychological brain model 2 pulse density modulation, PDM 63 -approach 63 pulse frequency modulation, PFM 37, 63
R radial basis function network, RBFN 191 random signal 172 real-time reconfigurable logic gates 4, 9 recognition systems 91, 116 reference system 93, 94, 95, 100, 103 resistive fuses 114 return-map 165 Right Brain Computing 11 ROM-version cell 18
S sample and hold (S/H) circuit 77 saw-tooth signal 168 scaling 2 -trend 64 self-convergent vMOS WTA 21 self-learning neural networks 4 self-learning system 99 serial connection 69 shrinking rule 147 shunting synapse 183 sigmoidal function 65 SIMD architecture 18 Soft Hardware Logic 9 SOI structure 38
source follower, source-follower 73, 79 spatial smoothing 109, 112 spatio-temporal processing 90 SPICE simulation 70 surveillance 89 switched-capacitor (SC) network 109 switched-current sources, SCSs 64 synapse 66 -array 38 synaptic -current 182 -efficacy 196 -integration 179 -integration function 190 -interaction 184 -plasticity 186 -weight 34, 66 synthesis techniques 99 system validity 91
T 3D-displays 104 teacher signal 196 template 139, 147 -vector 15 temporal smoothing 109 tent map 79 thinning rule 145 threshold processing 37 thyristors 164 turnaround time 90, 91, 95, 116
V variable threshold 7 vector quantization, VQ 18 velocity sensing circuit, VSC 125, 126, 127, 130 vision chip 107, 112
visual inspection 110, 112 VLSI implementation 62 VLSI system 63 voltage follower 79
W WB-CDMA 25 weighted summations 64 weighted-sum operation 38 winner-take-all, WTA 13, 15, 20
Y Y-diagram 93
FLSI Soft Computing Series — Volume 6
Brainware: Bio-Inspired Architecture and its Hardware Implementation Editor: Tsutomu Miki (Kyushu Institute of Technology, Japan)
The human brain, the ultimate intelligent processor, can handle ambiguous and uncertain information adequately. The implementation of such human-brain architecture and function is called "brainware". Brainware is a candidate for the new tool that will realize a human-friendly computer society. As one of the LSI implementations of brainware, a "bio-inspired" hardware system is discussed in this book. Consisting of eight enriched versions of papers selected from IIZUKA '98, this volume provides wide coverage, from neuronal function devices to vision systems and chaotic systems, as well as an effective design methodology for hierarchical large-scale neural systems inspired by neuroscience. It can serve as a reference for graduate students and researchers working in the field of brainware. It is also a source of inspiration for research towards the realization of a silicon brain.
ISBN 981-02-4547-5
www.worldscientific.com