Harnessing VLSI System Design with EDA Tools
Rajanish K. Kamat • Santosh A. Shinde • Pawan K. Gaikwad • Hansraj Guhilot
Rajanish K. Kamat, Department of Electronics, VLSI Laboratory, Shivaji University, Kolhapur, India
Santosh A. Shinde, Department of Electronics, VLSI Laboratory, Shivaji University, Kolhapur, India
Pawan K. Gaikwad, Department of Electronics, VLSI Laboratory, Shivaji University, Kolhapur, India
Hansraj Guhilot, Department of Electronics, VLSI Laboratory, Shivaji University, Kolhapur, India
ISBN 978-94-007-1863-0 e-ISBN 978-94-007-1864-7 DOI 10.1007/978-94-007-1864-7 Springer Dordrecht Heidelberg London New York Library of Congress Control Number: 2011938274 © Springer Science+Business Media B.V. 2012 No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Dr. Rajanish K. Kamat was born in India in 1971. He received his B.Sc. and M.Sc. in Electronics, both with distinction, in 1991 and 1993 respectively. He then completed his M.Phil. in Electronics in 1994 and qualified the State Eligibility Test (in 1995), which is mandatory for faculty positions in India. He pursued his Ph.D. in Electronics, specializing in ‘Smart Temperature Sensors’, at Goa University and completed it in 2003. He was awarded a merit scholarship during the Masters programme. Dr. Kamat is currently an Associate Professor with the Department of Electronics, Shivaji University, Kolhapur, India. Prior to joining Shivaji University, he worked at Goa University and, on short term deputation under various faculty improvement programmes, at the Indian Institute of Science, Bangalore and IIT Kanpur. He has successfully guided five students for Ph.D. in the area of VLSI Design. His research interests include Smart Sensors, Embedded Systems, VLSI Design and Information and Communication Technology. He is a recipient of the Young Scientist Fellowship under the fast track scheme of the Department of Science and Technology, Government of India and has worked extensively on Open Source Soft IP cores. One of his research papers won 4th place in the international paper contest organized by the American Society for Information Science and Technology [ASIST, USA] for the year 2008. He has published over 35 research papers, presented over 60 papers in conferences and authored three books: Unleash the System On Chip using FPGAs and Handel C (U.K., Springer, 2009), Practical Aspects of Embedded System Design using Microcontrollers C (U.K., Springer, 2008), and Exploring C for Microcontrollers: A Hands on approach (U.K., Springer, 2007). Dr. Kamat is a Member of IEEE and also a life member of the Society of Advancement of Computing. He has been listed in the Marquis Who’s Who in the World, USA.
Dr. Santosh A. Shinde was born in India in 1981. He received his B.Sc. and M.Sc. in Electronics, both in First Class, in 2004 and 2006 respectively. He then completed his Ph.D., specializing in VLSI Design, in 2009. Currently he is working as an Assistant Professor at the Department of Electronics, Shivaji University, Kolhapur. He has published over 10 research papers, presented over 6 papers in conferences and authored two books: Unleash the System On Chip using FPGAs and Handel C (U.K., Springer, 2009) and Practical Aspects of Embedded System Design using Microcontrollers C (U.K., Springer, 2008).
Dr. Gaikwad Pawan Kumar was born in India on August 29, 1976. He received his M.Sc. from the University of Poona, Pune. He further pursued specialized programmes in the areas of VLSI design and Cyber Law. He completed his Doctorate in Electronics at Shivaji University, Kolhapur, for which he developed an FPGA based portable ECG and pulse oximeter. A significant part of his research work is reported in the present book.
Prof. Hansraj Guhilot was born in Haraji, Rajasthan, India in 1958. He received his B.E. in Electronics from University Visveswaraiah College of Engineering, Bangalore and his M.Tech in Industrial Electronics from Karnataka Regional Engineering College, Surathkal in 1981 and 1985 respectively, both in first class with distinction. Presently he is Professor and Head of the Department of Electronics and Communication at K.L.E. Society’s College of Engineering & Technology, Udyambag, Belgaum. He has to his credit 26 years of rich professional experience, including two years of consultancy for a US based company. He holds 9 international patents and 1 Indian patent. His areas of interest are Smart Sensors, Agro-Electronics, Biomedical and Mixed Mode VLSI Design. He is an expert in Power Electronics; in particular, his patented solutions for a power processor for metal halide lamps and a High Frequency Electronic Ballast for lighting have been major breakthroughs in the industry. He has published 27 research papers and was felicitated by the Ministry of Human Resource Development, Govt. of India, for owning international patents. Prof. Guhilot is a fellow of the Institute of Electronics & Telecommunication Engineers and the Indian Society for Lighting Engineers and a member of the Indian Society for Technical Education and IEEE. He has been listed in Indo American Who’s Who and Asian American Who’s Who.
Foreword
One of the most challenging disciplines in this era of increasing consumerism, shrinking design cycles and lowered project budgets without compromise on performance is undoubtedly Very Large Scale Integration, a.k.a. VLSI. To address the increasing woes of chip designers with respect to the above mentioned attributes, Electronic Design Automation, i.e. EDA, has seen phenomenal growth in the present decade. EDA, regarded as the main arm of today’s VLSI, has by now set a successful track record of four decades and still faces many challenges, as is evident from the good number of research groups working on it all over the globe. The present book “Harnessing VLSI System Design with EDA Tools” is yet another endeavor placed in front of the chip designers by a group of experienced research professionals. At a time when more and more designers are opting for the Field Programmable Gate Array (FPGA) as their realization platform, the induction of books such as the present one is very apt, and I therefore foresee that it will be well taken up by the design community. The fundamental aspect of the book, and the one I liked the most, is that it shares the research experiences of the authors with the design community. Such efforts would definitely bring laboratory technologies to market and foster the ties between industry and academia. Another appreciable facet of the book is the adoption of mixed EDA techniques pertinent to the design problems. For instance, the authors have used VHDL, Handel C and Verilog to exemplify the appropriate choice of tools fitting the complexity and nature of the systems to be designed. Instead of focusing on mere theoretical discussion, the book encourages hands on aspects and harps on realization of the systems in the FPGA paradigm.
The documentation style of the book is also research oriented, which is not a surprise as all the authors have had their stint in research and have come out with three such books prior to the present one, besides many research papers in journals of high repute. In my opinion, the book will be a value addition and may be viewed from different perspectives by the potential readers. Post Graduate Science and Engineering students might find it useful for learning the design cycle right from the problem definition, through the use of an appropriate methodology to attack the problem, to coming out with a successful project design. Likewise, research professionals such as doctorate
students will find it useful for deriving parallel case studies when defining their research topics. Networking professionals interested in designing smart appliances with internet connectivity will benefit from the know-how reported here as regards system design at the highest level of abstraction. The biomedical community may take up product design with a focus on the portability aspects. The soft IP cores reported in one of the chapters are really useful for any designer wishing to compose a system simply by a judicious marriage of pre-designed, thoroughly verified design blocks. The time-to-digital converter reported in the last chapter is really a boon for doing away with the power hungry ADCs required in almost all SoCs today. Thus, in a nutshell, the book will serve the EDA and chip design community well, and it will be a sort of lamp post for professionals nurturing their careers in this domain of knowledge, where the rate of obsolescence is really awesome.
Prof. (Dr.) A.D. Shaligram Professor and Head Department of Electronic Science University of Pune Pune
Prof. (Dr.) G.M. Naik Professor and Head Department of Electronics Goa University GOA
Prof. (Dr.) A.D. Shaligram is presently Professor and Head, Department of Electronic Science, University of Pune. His research interests are Optoelectronics, Fiber optic and Optical Waveguide sensors, PC/Microcontroller based Instrumentation, Simulation software development, Biomedical Instrumentation and sensors. He has successfully guided many Doctorate students in the area of VLSI design and completed research, development and consultancy projects in the above mentioned areas. Dr. Shaligram has been instrumental in the standardization of Electronics course material and has inspired fellow colleagues to inculcate innovations in teaching and research techniques.
Prof. (Dr.) G.M. Naik is presently Professor and Head, Department of Electronics, Goa University, Goa. He received his Doctorate from the Indian Institute of Science, Bangalore. At Goa University Prof. Naik has successfully launched the teaching and research programme in the area of VLSI Design with grant-in-aid from the University Grants Commission under its Innovative Scheme. He is an active researcher in VLSI Design and has been instrumental in nurturing quality Human Resource in this area.
Preface
The dictionary meaning of the word ‘Harness’ is ‘to bring under conditions for effective use’ or ‘gain control over for a particular end’, and in view of that the title of the book, “Harnessing VLSI System Design with EDA Tools”, is itself intuitive enough in the VLSI arena. The book aims at exploring the various dimensions of the EDA technologies for achieving different goals of VLSI system design. EDA has by now matured, with more than four decades of existence, and is constantly evolving along with its contemporary, complementing technologies such as computing architectures, algorithms and data mining techniques, not to forget the FPGAs, which have revolutionized VLSI system design through fast prototyping. The scope of EDA in a true sense is very broad and comprises diversified hardware and software tools to accomplish the different phases of VLSI system design such as design, layout, simulation, testability, prototyping or implementation; this book, however, focuses only on demystifying the code, a.k.a. firmware, development and its implementation in the FPGA paradigm. There are in fact different varieties of such languages empowering the EEs to attain their system design goals; through this book we put forth our notion of FPGA based system design through a variety of case studies selected from different engineering domains and realized through different languages. Before the readers start with this book, we would like to caution them about its very nature. Unlike our earlier ones, Unleash the System On Chip using FPGAs and Handel C (U.K., Springer, 2009), Practical Aspects of Embedded System Design using Microcontrollers C (U.K., Springer, 2008) and Exploring C for Microcontrollers: A Hands on approach (U.K., Springer, 2007), this is not a text book.
After reading our previous book on Handel C, many EEs wrote to us that we should now deliver something in greater depth, with live case studies that convey the true feel of the applications. This has really motivated us to shape the present book. The authors would like to make special mention of the good books on EDA, VLSI Design and FPGAs (listed as [20–31] in the reference section); our book in no
way attempts to be a substitute for these books. Instead it is a value addition, focusing on the research aspects of VLSI Design. Having set the background of the book, we would now like to mention its key features. The book is written by researchers for budding researchers. The different chapters present the gist of the research work which has led the authors to their doctorate degrees. Three languages of two types, viz. Hardware Description Languages (VHDL and Verilog) and a Behavioral High-Level Language (Handel C), have been used for developing the firmware. The prototyping environment used is the Xilinx FPGAs, and it is worth mentioning here that the Xilinx Starter Kit was very useful for testing and prototyping. We assume the readers of this book to be familiar with Digital Electronics, Computer Networking, Algorithms and Computational theory, and also with the basic design flow pertaining to FPGA based design projects. Covering these aspects in one text is practically impossible and would not be worthwhile. Instead we have provided footnotes pointing to references through which the readers can gather more information. The text also presents practical know-how of state-of-art design methodologies such as ‘Hardware-Software Codesign’, ‘Soft IP Cores’ and so on. It also presents the complete listing of the code so that anyone who is interested can further use these soft IP cores in their design projects. The screenshots and device utilization reports are purposely given in depth so that the readers can familiarize themselves with the design for testability and debugging aspects. The reference/bibliography section at the end of the book is quite rich and lists around 170 selected references drawn from scholarly journals, industry whitepapers, web resources and presentations of various researchers. Finally we would like to give a brief account of the organization of the book. The book is divided into five chapters.
Chapter 1 introduces the theme of the book and covers the very rationale behind proposing it. The major conclusion, drawn from the research papers of visionaries in the field like ‘Makimoto’ and ‘Tredennick’, is the emerging need of state-of-art VLSI applications for mixed mode design environments. The book then takes this further in two ways. First, it exemplifies the development of FPGA based applications with different language suites like VHDL, Verilog and System C. Second, even while building these case studies, issues such as testability, verification and power consumption have been handled using different sets of EDA tools such as ModelSim and Leonardo Spectrum. Chapter 2 starts with an interesting application, the development of an FPGA based Anti-Spam solution, motivated by the reported processing bandwidth limitations of general purpose processors, which can serve only a few hundred Mbps of throughput and thus pose a bottleneck in the overall bandwidth of the setup. A hardware based Anti-Spam solution has been projected as the potential solution to alleviate the bandwidth holdups. Amongst the hardware based solutions, the FPGAs offer the most striking advantages due to their inherent capability to reconfigure, offer more throughput and exploit parallelism. Moreover, with the development of tools such as Handel C that work at higher levels of abstraction, their programming
becomes easy. The anti-spam solution described in this chapter derives the benefits of all the constructive attributes of FPGA based system design, including hardware-software codesign, integration of the soft IP cores on the chip and prototyping of the entire functionality on a single platform, reducing the off-chip access cycles to the minimum possible. With a careful grasp of the chapter details, a design environment that allows networking experts to use FPGAs might also be explored for many similar problems. Chapter 3 presents yet another interesting application in the biomedical / health care domain. The case study described here deals with the design of a low-cost, miniature, lightweight, low-power, portable ECG system. Such wearable health monitoring systems, integrated into a telemedical system, are a promising new information technology capable of supporting prevention and early detection of abnormal conditions. The portable system developed during the present research work is capable of recording, storing and displaying the ECG in real time in a single portable device. For emergency detection, the analysis can incorporate the patient’s profile and activity information to reduce the number of false alarms. The main advantage of the development is that the cardiologist can gather data from the patient over a long period of time, during which the patient can enjoy their normal day-to-day lifestyle. The system is very useful because the ECG signals obtained from stress examinations are diagnostically important in detecting a number of heart diseases which may not be apparent when the patient is at rest. The goal of this development is to determine the ‘normal’ state of the patient in different activity modes so that each set of ECG readings may then be interpreted within the context of the patient’s current physical activity. The research work also addresses one of today’s most pressing matters in medical care, i.e. response time to patients in need.
It suggests an FPGA based solution, supported by VHDL, that would help reduce response time in emergency situations by utilizing modem based trans-receiving technology. Chapter 4 provides a comprehensive overview of developing FPGA based embedded applications and discusses how FPGAs have the potential to be used as a platform for System-on-Chip (SoC) styled designs. The firmware developed and implemented in the form of soft IP cores showcases the manner in which such cores can be combined to form semicustom ASICs for the intended applications. The soft IP cores reported here, if reused judiciously, might lead to big solutions for the development problems of the potential readers of this book. It is widely agreed that an A to D converter is unavoidable in any modern application. However, integrating such an ADC poses several typical problems due to its inherent mixed mode architecture. Time to Digital Converters, a.k.a. ‘time interval measurement’ circuits, have traditionally been popular with discrete digital components. In Chap. 5 we have developed such a high precision ADC based on the vernier Time to Digital conversion principle. Here again we rely on Verilog for the firmware part and the Spartan-3E FPGA for prototyping. Thus, in a nutshell, the book shares the integrated knowledge base of the authors, and the value addition comes from their research background. It is hoped that the integrated presentation of information, with embellishment using expounding case
studies, will be of value to many readers looking to develop their research problems. The authors hope that the text will stimulate further innovations and would really look forward to hearing from the readers regarding the usefulness of the text.
Dr. Rajanish K. Kamat Dr. Santosh A. Shinde Dr. Pawan K. Gaikwad Prof. Hansraj Guhilot
Acknowledgments
There are so many people whose support, encouragement and inspiration are essential to accomplishing major achievements in life, especially when it involves fulfilling one’s cherished dreams, such as publishing this fourth quality book through Springer. For me, this book is such an important destination, and I am indeed indebted to a lot of people for their well wishes and blessings in completing this journey. This book is in fact a compiled version of the cumulative knowledge and wisdom gained by the authors during their doctoral research work at the Department of Electronics, Shivaji University, Kolhapur, which is on the verge of entering its golden jubilee year. At the outset we extend our sincere thanks to Springer, through which we could reach a good number of Institutes of Higher Learning all over the globe. It is really gratifying to notice through the Online Catalogue of the Library of Congress and Google Book tools that our previous three books have found their appropriate place in the leading libraries of the world and are also referred to in the curricula of world-class institutes. This would not have been possible without the editorial support of the editors and the Springer staff with whom we have worked for the last three years. We would like to place on record our sincere appreciation of Charles B. Glaser, Senior Editor, Electrical Engineering, and Elizabeth Dougherty, as well as Mark de Jongh and Ms. Cindy Zitter, for their persuasion and patience throughout the project. We are highly indebted to our present Vice-Chancellor Prof. (Dr.) N.J. Pawar and the past Vice-Chancellor Prof. (Dr.) M.M. Salunkhe for motivating, guiding, and helping us to complete this project. The book would not have been possible without the encouragement, wisdom, feedback, and support from Dr. G.M. Naik, Professor and Head, Department of Electronics, Goa University and Dr. A.D.
Shaligram, Professor and Head at Pune University, who have also kindly agreed to write the foreword to this book. Thanks are also due to the Department of Science and Technology (DST), New Delhi for the use of the facilities procured through the DST Fast Track Young Scientist Project granted to Dr. R.K. Kamat.
Special thanks to our family members, who were enthusiastic about our writing. Moreover, Dr. Kamat would like to dedicate his contribution to this text to his newborn daughter “Reva”. It is well known that in the writing of any text, the person who benefits is the author himself. We think this is true in our case too. We dedicate this work to all those who have directly or indirectly helped and encouraged us.
Dr. Rajanish K. Kamat Dr. Santosh A. Shinde Dr. Pawan K. Gaikwad Prof. Hansraj Guhilot
Contents
1 Introduction
  1.1 Introduction
  1.2 Prologue
  1.3 EDA: From Methodologies, Algorithms, Tools to Integrated Circuits and Systems
  1.4 EDA from Halcyon’s Days to the Blooming Paradigm of Chip Industry
  1.5 Categories of the EDA Tools
  1.6 Quo Vadis, EDA? The Challenges and Opportunities
  1.7 Just One More Book on EDA or Value Addition to the Scholarly Literature by US?
  1.8 Designing the System as SoC Using the Soft IP Cores
  1.9 Types of IP Cores
  1.10 Design Issues Pertaining to the Soft IP Cores
  1.11 Justifying FPGA as the Prototyping Platform
  1.12 Justifying the Differing Flavors of Languages Used in This Book
2 Development of FPGA Based Network on Chip for Circumventing Spam
  2.1 Introduction
  2.2 Conception of the Spam Mail
  2.3 FPGA Based Network on Chip for Circumventing Spam
    2.3.1 Inspiration
    2.3.2 Core Concept
    2.3.3 Method
    2.3.4 Motivation
    2.3.5 Advantages of FPGA Based Antispam Appliance in Nutshell
    2.3.6 Significance of the Work
  2.4 Tools Infrastructure and Design Flow
    2.4.1 Handel C
    2.4.2 ISE Webpack 9.2
    2.4.3 EDK Version 9.2
    2.4.4 Xilinx Starter Kit
  2.5 Introducing Hardware-Software Co-design
  2.6 Hardware Software Co-design
    2.6.1 Motivation for Hw/Sw Co-design
    2.6.2 Advantages of Hw/Sw Co-design Methodology
    2.6.3 State of the Art Hw-Sw Co-design Methodologies
  2.7 Hardware-Software Codesign Framework Proposed in the Present Case Study
    2.7.1 Addressing the Issues Through Co-design
  2.8 Description of System at Higher Level
  2.9 Resolving the System a Step Down
  2.10 System Design
    2.10.1 Microblaze Processor
    2.10.2 PLB BUS
    2.10.3 XPS UART Lite
    2.10.4 Off Chip Level Converter
    2.10.5 XPS Ethernet Lite
    2.10.6 SMSC LAN83C185 High Performance Single Chip Low Power 10/100 Mbps Ethernet Physical Layer Transceiver (PHY)
    2.10.7 XPS Timer
    2.10.8 XPS Interrupt Controller
    2.10.9 Double Data Rate (DDR) Synchronous DRAM (SDRAM) Controller
    2.10.10 Off Chip DDR SDRAM MT46V32M16
  2.11 Development of Soft IP Core of Bloom Filter
    2.11.1 Justifying Bloom Filters for the Keyword Parsing
    2.11.2 Theoretical Foundations of Bloom Filter
    2.11.3 Hash Function
    2.11.4 Deciding the Size and Number of Hash Functions in Our Bloom Filter Implementation
  2.12 Presenting System Design of Purely Software Modules
  2.13 Integrating of the Hardware-Software Modules Using EDK
  2.14 Setting the POP3 Client and Describing Overall Working of the System
  2.15 Conclusion
3 Analog Front End and FPGA Based Soft IP Core for ECG Logger
  3.1 Prior Art
  3.2 The Very Rationale of the System
  3.3 Analog Front End of the Setup
    3.3.1 Leads Formation
    3.3.2 Restricting Number of Leads
    3.3.3 ECG Instrumentation Amplifier
    3.3.4 Deriving the Signal from the Augmented Leads
    3.3.5 Filtering the ECG Signal
    3.3.6 Multiplexing the Lead Signals
    3.3.7 Post-multiplexer Amplifier Stage
    3.3.8 Digitization of the ECG Signal
    3.3.9 FPGA Based Handshake Micro-logic
    3.3.10 MODEM Interface
  3.4 VHDL Implementation of the ECG Soft IP Core
    3.4.1 Driving ADC: LTC 1407 and Storing Data in 3D RAM
    3.4.2 Details of the VHDL Code
    3.4.3 VHDL Processes for Conversion and Storage in 3D Memory: (Process P_conv, P_SHIFT and P_STORE)
    3.4.4 VHDL Process for Serial Transmission of the ECG Signal (Process Serial)
  3.5 ModelSim Simulation Results
  3.6 Synthesis Results Using Mentor Graphics Tool: Leonardo Spectrum
    3.6.1 Synthesis Report
    3.6.2 RTL View
    3.6.3 Technology Schematic View
    3.6.4 Critical Path Schematic
  3.7 Monitoring the ECG Using MODEM Based Setup
    3.7.1 Tele-monitoring of the ECG Signal at the Hospital End
  3.8 ECG Signal Reconstruction Mechanism at the Hospital End
    3.8.1 DAC Interfacing Details
    3.8.2 FPGA Driving Demultiplexer and DAC: Core Algorithm
    3.8.3 Serial ECG Receiver: Flow Chart
  3.9 VHDL Listing for Driving the Analog Demultiplexer and Serial DAC from Spartan-3E FPGA
  3.10 Discussion Regarding the VHDL Implementation
    3.10.1 Process Serial_P
    3.10.2 Process S_OUT
  3.11 ModelSim Simulation Results
  3.12 Synthesis Results Using Mentor Graphics Tool: Leonardo Spectrum
    3.12.1 Synthesis Report
    3.12.2 RTL View
    3.12.3 Technology Schematic View
    3.12.4 Critical Path Schematic
  3.13 Conclusion
4 FPGA Based Multifunction Interface for Embedded Applications
  4.1 Introduction
  4.2 Universal FPGA Based Interface for High End Embedded Applications
    4.2.1 Hardware Aspects
  4.3 Soft IP Core for the LCD Interface
  4.4 Soft IP Core for the DAC Interface
  4.5 Handel C Listing of the Soft IP Core for the DAC Interface
  4.6 Soft IP Core for the Linear Tech LTC6912-1 Dual Amp Interface
  4.7 Soft IP Core for the ADC Interface
  4.8 Soft IP Core for the VGA Interface
  4.9 Soft IP Core for the Keyboard Interface
  4.10 Triangular Wave Generator Using DAC
  4.11 Conclusion
5 FPGA Based High Resolution Time to Digital Converter
  5.1 Introduction
  5.2 TDC: Prior Art
5.3 TDC Using Vernier Principle............................................................ 5.3.1 Coarse measurement........................................................... 5.3.2 FINE MEASUREMENT.................................................... 5.4 Simulation and Verilog Modules...................................................... 5.4.1 Ring Oscillator (Fast Clock) RTL Schematic..................... 5.4.2 Verilog Module for Ring Oscillator (Fast Clock)............... 5.4.3 Verilog Module for Ring Oscillator (Slow Clock).............. 5.4.4 Phase Detector.................................................................... 5.4.5 Simulation Wave Form of Phase Detector.......................... 5.4.6 Verilog Module for 8 Bit Counter....................................... 5.4.7 RTL Schematic of 8 Bit Counter........................................ 5.4.8 Simulation Results of 8 Bit Counter................................... 5.4.9 Verilog Module for 8 Bit Counter....................................... 5.4.10 RTL Schematic of Time to Digital Converter..................... 5.4.11 Schematic of Time to Digital Converter ............................ 5.4.12 Verilog Module for Time to Digital Converter................... 5.5 Applications of the TDC Implemented.............................................
90 90 90 93 93 94 94 95 100 101 106 108 113 117 122 126 127 127 128 129 130 133 137 139 139 140 142 142 142 143 143 144 144 145 145 146
References......................................................................................................... 147
List of Figures
Fig. 1.1 Typical constituents of an EDA tool indicating an underlying interdisciplinary knowledge base
Fig. 1.2 Increasing VLSI complexities forcing the evolution of the EDA
Fig. 1.3 Comparison of the types of IP cores
Fig. 2.1 Characteristics of spam emails
Fig. 2.2 Motivation for co-design methodology
Fig. 2.3 Our framework for hardware-software codesign
Fig. 2.4 Higher level schematic of the system
Fig. 2.5 Functional architecture of the system
Fig. 2.6 System design
Fig. 2.7 Base configuration of the DDR SDRAM
Fig. 2.8 Memory interface details revealed by EDK
Fig. 2.9 Round robin arbitration algorithm shown by EDK
Fig. 2.10 Hash function mapping variable length keywords to fixed length vector
Fig. 2.11 Variation of false positive rate as a function of m/n
Fig. 2.12 Variation of size of bloom filter as a function of the error rate
Fig. 2.13 Varying value of false positive as a function of number of hash functions
Fig. 2.14 Variation of false positive with number of hash functions at a given m/n ratio
Fig. 2.15 Figure revealing the role of LWIP stack in the system
Fig. 2.16 Design flow for the Xilinx EDK adopted in the present work
Fig. 3.1 The three standard leads form an equidistant triangle (Einthoven triangle)
Fig. 3.2 (a) Lead-I: (LA–RA), (b) Lead-II: (LL–RA), (c) Lead-III: (LL–LA), (d) aVL: {LA – (RA + LL)/2}, (e) aVR: {RA – (LA + LL)/2}, and (f) aVF: {LL – (RA + LA)/2}
Fig. 3.3 Wilson network and right leg drive
Fig. 3.4 Six lead ECG data acquisition system using FPGA
Fig. 3.5 LTC1407 operating sequence (Retrieved from Linear Technology Corporation: LTC1407 data sheet)
Fig. 3.6 (a) Flow diagram to drive serial ADC LTC1407 using FPGA and (b) flow diagram to drive serial ADC LTC1407 using FPGA
Fig. 3.7 Flow diagram shows how to transmit the serial data bits of a 15 bit data frame
Fig. 3.8 A three dimensional memory structure used in VHDL code
Fig. 3.9 Simulation window to show driving of serial ADC and generating serial data for transmission
Fig. 3.10 RTL view of the system ADC driver and serial transmitter
Fig. 3.11 Technology schematic view of the system ADC driver and serial transmitter
Fig. 3.12 Critical path view of the system ADC driver and serial transmitter
Fig. 3.13 ECG signal reconstruction mechanism at the hospital end
Fig. 3.14 Timing diagram of the DAC LTC1257 (Retrieved from the Linear Technology Corporation data sheet)
Fig. 3.15 A 15 bits data frame receiving serially
Fig. 3.16 Flowchart to drive analog demultiplexor and serial DAC from Spartan 2e FPGA
Fig. 3.17 Simulation results of VHDL code for ECG receiver driver
Fig. 3.18 RTL view of the ECG serial receiver
Fig. 3.19 Technology schematic view of the ECG serial receiver
Fig. 3.20 Critical path view of the ECG serial receiver
Fig. 3.21 Snapshot of the setup
Fig. 4.1 Hardware design of the universal FPGA based interface
Fig. 4.2 Top level view of the soft IP core for the LCD
Fig. 4.3 Detailed synthesis view of the soft IP core for the LCD
Fig. 4.4 Top level synthesis view of the DAC interface
Fig. 4.5 Top level synthesis view of the VGA interface
Fig. 4.6 Detailed synthesis view of the VGA interface
Fig. 5.1 Basic principle of measurement of time interval using Nutt Interpolation method
Fig. 5.2 Selecting a DCM from clocking wizard
Fig. 5.3 General setup wizard for DCM as a frequency multiplier
Fig. 5.4 Wizard shows specifying the clock buffers to be used
Fig. 5.5 DCM used as a symbol in schematic design entry
Fig. 5.6 Behavioral simulation of frequency multiplier
Fig. 5.7 Editable floor plan for ring oscillators
Fig. 5.8 Schematic and behavioral simulation of error finder
Fig. 5.9 Timing for the slow clock, fast clock
Fig. 5.10 General block diagram revealing the principle
Fig. 5.11 Schematic of double ring oscillator with start, stop control
Fig. 5.12 Editable floor plan for ring oscillators
Fig. 5.13 Ring oscillator timing simulation
Fig. 5.14 Schematic and behavioral simulation of phase detector
Fig. 5.15 Ring oscillator timing simulation
Fig. 5.16 Schematic of the ring oscillator for fast clock configuration
Fig. 5.17 Schematic of phase detector
Fig. 5.18 Behavioral simulation of phase detector
Fig. 5.19 RTL schematic of 8 bit counter
Fig. 5.20 Simulation results
Fig. 5.21 RTL schematic of TDC
Fig. 5.22 Final schematic of the TDC
List of Tables
Table 1.1 Milestones in the EDA evolution
Table 1.2 Comparison of approaches for design realization
Table 1.3 Roadmap of VLSI design
Table 2.1 Hardware/Software co-design compared with the hardware/software design process
Table 2.2 MicroBlaze usage summary
Table 2.3 Comparison of level converters for networked applications
Table 2.4 Comparison of hash functions
Table 3.1 Bipolar leads and their connections
Table 3.2 Augmented leads and their connections
Chapter 1
Introduction
1.1 Introduction
This chapter is a short introduction to the Electronic Design Automation (EDA) paradigm. The last decade has witnessed phenomenal growth in the number of R&D groups, corporate players, universities and research laboratories working in this area, which is emerging as a hub of interdisciplinary activity. Increasing design complexity owing to the “more than Moore” phenomenon, additional expected functionality, shrinking design cycles and time-to-market windows, and more software-centric designs are all crucial factors pushing EDA in more diverse directions than ever before. The chapter also acquaints the reader with the rationale of the book and its organization, and lays the foundation for the remaining chapters, which exploit different EDA tools to build live case studies of increasing complexity.
1.2 Prologue
In an era in which technology shrinks by a factor of about 0.7 per generation, with 2× more functions per generation and the cost per function declining by the same order, Electronic Design Automation (EDA) tools are at the forefront of Very Large Scale Integration (VLSI) design. EDA is one of the key enablers of the semiconductor industry [1]. No chip is designed without EDA. Conversely, semiconductors drive EDA technology [2]. EDA tools are increasingly required to address both microscopic and macroscopic design issues. The former include design concerns such as ever-increasing speed, the growing demand for reduced power supply and power dissipation, noise, crosstalk, interconnects and overall reliability. The latter comprise productivity challenges arising from the shrinking time-to-market window, different levels of abstraction
R.K. Kamat et al., Harnessing VLSI System Design with EDA Tools, DOI 10.1007/978-94-007-1864-7_1, © Springer Science+Business Media B.V. 2012
with the surge of more software-oriented algorithmic design practices, design reuse to avoid reinventing the wheel, and so on. With the scaling predicted by Moore’s law, the cramming of ever more transistors and IP cores onto a chip, and more software-centric designs, the EDA paradigm is also facing more challenges, since it always has to be a step ahead of chip technology. This is impossible unless innovative methodologies are brought into EDA tool design itself, so that the tools can address emerging phenomena and chip design issues. With more and more diverse tools appearing on the VLSI design canvas, pertinent issues such as portability are also coming to the forefront. Like the art of hiking, VLSI system design is a very practically oriented technology with very high peaks that few can scale. The VLSI industry is advancing on pillars such as progress in EDA technology, fabrication technology, designs and microarchitectures, and IP cores. Before homing in on the key theme of the present book, it is worthwhile to define ‘EDA’ and outline its scope.
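The scaling figures quoted at the start of this prologue (a linear shrink of about 0.7 per generation and roughly 2× more functions) are consistent with each other, as a few lines of arithmetic show. The following is only an illustrative sketch in Python; the function names are hypothetical, and the only number taken from the text is the 0.7 shrink factor.

```python
# Illustrative arithmetic only; 0.7 is the shrink factor quoted in the text.
SHRINK_PER_GENERATION = 0.7  # linear feature-size scale factor per generation


def area_scale(generations: int, shrink: float = SHRINK_PER_GENERATION) -> float:
    """Relative area of a fixed circuit after the given number of generations."""
    # Area scales with the square of the linear dimension.
    return (shrink ** 2) ** generations


def density_gain(generations: int, shrink: float = SHRINK_PER_GENERATION) -> float:
    """Relative number of functions that fit into a fixed die area."""
    return 1.0 / area_scale(generations, shrink)


if __name__ == "__main__":
    for g in range(1, 5):
        print(f"generation {g}: area x{area_scale(g):.3f}, "
              f"functions x{density_gain(g):.2f}")
```

A linear shrink of 0.7 gives an area factor of 0.7 × 0.7 = 0.49, so about twice as many functions fit into the same area after one generation, which is the 2× figure mentioned above.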
1.3 EDA: From Methodologies, Algorithms, Tools to Integrated Circuits and Systems
The definitions of EDA from various sources are as follows. Electronic Design Automation (EDA) is:
• an umbrella term for computer-aided engineering, computer-aided design and computer-aided manufacturing of electronics in the discipline of electrical engineering. The term electronic design automation probably originates in the IEEE Design Automation Technical Committee.
• broadly defined as software tools for the development of integrated circuits and systems [7].
• methodologies, algorithms and tools, which assist and automate the design, verification, and testing of electronic systems [8].
• using the computer to design, lay out, verify and simulate the performance of electronic circuits on a chip or printed circuit board [9].
• a general methodology for refining a high-level description down to a detailed physical implementation for designs ranging from [8]
  - integrated circuits (including system-on-chips),
  - printed circuit boards (PCBs) and
  - electronic systems.
• an activity that involves modeling, synthesis, and verification at every level of abstraction [10].
EDA is regarded as an important field in computer science and engineering, owing to its significant impact on the development of information technology by supporting the successful scaling of Moore’s Law. Some of the basic reasons attributed to its successful growth are as follows [10]:
• It has been successful in managing the exponential increase in design complexity and in supporting ever-increasing scaling, with more and more functionality crammed onto a chip.
• It is a ground-breaking technological domain that showcases the linking of theoretical concepts such as computational modeling, computational thinking, and computational discovery to an application realm, viz. electronic circuit design.
• It has fuelled interdisciplinary collaborations: with electrical engineers to derive various levels of circuit models; with physicists and chemists to derive manufacturing models; with theoretical computer scientists to conduct various kinds of complexity analysis; with applied mathematics and optimization experts to devise highly scalable simulation and synthesis algorithms; and with application domain specialists to develop intellectual property (IP) libraries, etc.
It is also worthwhile to note that the emerging conception of EDA is now moving from the IC (a.k.a. VLSI) domain to other engineering domains such as synthetic biology, cyber-physical systems, datacenter design and concurrent software development. The term VLSI, for which EDA was originally conceived, now refers to system implementation using ICs, while IC now refers mostly to a general manufacturing technique for micro/nano-scale devices, with the substrate seemingly shifting from crystalline semiconductors to newer platforms such as MEMS (Fig. 1.1).
[Figure 1.1: central block “EDA Tool Constituents” surrounded by: Derivative & Incremental Design Modules; Logic Generator & Simulator; Language Parsers & Lexical Analyzers; Parallel & Concurrent Computation; Incremental Synthesizable Library of IP Cores to Facilitate Design Reuse; Lithographic & Scaling Modules; Pattern Generation & Recognition; Textual and/or Graphical User Interface; Support for Component Database with Advanced Data Mining Algorithms]
Fig. 1.1 Typical constituents of an EDA tool indicating an underlying interdisciplinary knowledge base
In a nutshell, the EDA field is very broad and it is difficult to cover everything. The present book attempts to showcase VLSI system design with a diverse set of EDA tools.
1.4 EDA from Halcyon Days to the Blooming Paradigm of Chip Industry
The automation of the design of electronic systems and circuits [electronic design automation (EDA)] has a history of strong innovation. The EDA business has profoundly influenced the integrated circuit (IC) business and vice versa. MacMillen [1] gives an interesting review of the technologies, algorithms, and methodologies that have been used in EDA tools and of the business impact of these technologies. As the complexity of VLSI circuits increases, the crucial role of electronic design automation tools in virtually every aspect of VLSI circuit design is undeniable. Larger designs require much greater designer productivity to achieve reasonable design schedules and costs, and this dictates a greater role for EDA tools [3]. Moreover, advances in areas such as software methodology, operating systems, storage systems, and programming languages have often had an enormous impact on EDA. The explosive growth in the development of wide-area network infrastructure over the past few years indicates an opportunity for the industry and the field of computer science in general to make a leap to a new generation of capabilities [4]. The cyclical nature of the chip industry, which changes direction between standardization and customization roughly every 10 years, has had a profound impact on the EDA roadmap. This cyclical behaviour, first put forth by Makimoto [5] and later named “Makimoto’s Wave” by Electronics Weekly (UK) in January 1991, comprises the following major cycles:
• 1947–1957: Dawn of Semiconductor Age
• 1957–1967: Era of Transistor
• 1967–1977: Era of IC/LSI
• 1977–1987: Era of MPU/Memory
• 1987–1997: Era of ASIC
• 1997–2007: Era of Field Programmability
The end of each Makimoto cycle listed above has been referred to as a design crisis. In the context of the theme of the present book, i.e. EDA tools, their inception can be traced to the first design crisis, around the mid 1970s. Pioneering research work from the academic point of view was undertaken by Mead and Conway [6], and it marked the foundation of the EDA industry. Table 1.1 summarizes some of the milestones in the evolution of EDA. With this basic history of EDA in place, let us look at how EDA tools are categorized.
Table 1.1 Milestones in the EDA evolution
Year | Development | Remark
1960 | Manual design without the conception of EDA | Pioneering work by Shannon, McCluskey, and companies like Calma laid the foundation of EDA
1970 | Conception of EDA was discussed in major events like the DAC conference, which led to the development of SPICE and standardization in the form of the GDSII format | Emergence of the era of automated artwork and simulation and, to some extent, automated routing for PCB design
Mid 1970 | Development of powerful workstations and GUI | EDA developers made use of these to come out with schematic capture and intensified simulations of complex circuits, which led to the first microprocessor development
1980 | Publication of the book “Introduction to VLSI Systems” by Carver Mead and Lynn Conway | Marked the next milestones in design verification and simulation by the academic community. Metal Oxide Semiconductor Implementation Services (MOSIS) was the commercial offshoot in this era
Mid 1980 | Commercial players such as Daisy Systems, Mentor Graphics, and Valid Logic Systems, popularly known as DMV, entered the scene | EDA started to become a commercial activity
1986 | Verilog was introduced by Gateway Design Automation | Concept of RTL simulation was strengthened
1987 | VHDL was introduced by the U.S. Department of Defense |
1990 | New concepts such as analog and mixed circuit design, design for testability and built-in self test came to the forefront with increasing design complexities | New players such as Cadence, Synopsys, Mentor Graphics started delivering intelligent tools
2000 onwards | New paradigms such as System on Chip, Network on Chip, smart appliances and FPGA based fast prototyping have become the buzzwords | EDA players started delivering tools with cell libraries and soft and hard reusable IP cores, with a possibility of new design innovations such as hardware software codesign with the surfacing of ‘Hard Software’ and ‘Soft Hardware’
1.5 Categories of the EDA Tools
As discussed above, the impracticality of manually designing with millions and billions of transistors on a single chip led to the conception of design automation, and thus EDA tools came into existence. Along with checking the functionality of the design, these tools also assist with different aspects of the design such as speed, power, delay etc. Further, they enhance productivity by speeding up the design process, making IP reuse a reality, and bringing uniformity to all facets of the design process across the members of the design community. EDA tools have been categorized under several heads:
• Based on design methodology
  - Full custom design
  - Standard cell based design
  - FPGA design
  - Structured ASIC design
• Based on design flows
  - Implementation tools: logic and physical synthesis; design for test; full custom layout; floor planning; place and route
  - Verification tools: simulation; timing analysis; formal verification; power analysis; signal integrity; DRC and LVS
• Based on the type of model to be dealt with
  - Logic models
  - Performance models
  - Timing models
  - System models
All of the above categories of tools help manage design and verification complexity. Although they have around 40 years of history, they still face many challenges and opportunities, some of which are reviewed in the following section.
1.6 Quo Vadis, EDA? The Challenges and Opportunities
According to Keutzer and Newton, “The EDA Industry paradigm is switching Every 7 Years”. Computing infrastructure has continuously played a decisive role in the evolution of EDA tools. Advances in areas such as software methodology, operating systems, storage systems, and programming languages have often had an enormous impact on EDA. The explosive growth in the development of wide-area network infrastructure over the past few years indicates an opportunity for the industry and the field of computer science in general to make a leap to a new generation of capabilities [4]. Specifically, we envision the entire EDA community organized as an integrated distributed environment [11] that offers users the ability to create an evolvable, customizable and adaptable “virtual” design system that can couple tools, libraries, design, and validation services. Beyond that, the system could also provide manufacturing, consulting, component acquisition, and product distribution, encompassing the developments of companies, universities, and individuals throughout the world. This trend indicates that the EDA paradigm has matured to the extent that it can serve disciplines beyond Electronics, for which it was originally conceived. The crux is well expressed in the words of Alberto Sangiovanni-Vincentelli, from his talks in the ACM Distinguished Speaker’s program covered in [18] and [19], paraphrased briefly as follows:
“EDA has played a pivotal role in the past 25 years in making it possible to develop a new generation of electronic systems and circuits. However, innovation in design methodologies has slowed down significantly as we approach a limit in the complexity of systems we can design satisfying increasing constraints on time-to-market and correctness. There is a general trend for the electronics industry to focus on system issues, as software becomes a fundamental component and as the supply chains are destabilized because of the quest for additional value-added in presence of increasing costs and investments. At the same time, manufacturing issues are populating the nightmares of circuit engineers as they try to cope with an implementation fabric that becomes unreliable and difficult to characterize. These trends are at the same time creating severe problems to companies that are based on a business model that is showing stress and opportunities for new enterprises that come into the play with new vistas. There are many open issues to resolve to move towards the new world of technology as world-wide economic developments add complexity to the forecasts (Fig. 1.2).”
As far as the theme of this book is concerned, i.e. EDA for FPGA based design, the most widely used design specification languages are Verilog and VHDL at the register transfer level (RTL), which specifies the operations performed at each clock cycle. There is a general (although rather slow) trend toward specification at a higher level of abstraction, using general-purpose behavior description languages like C or Handel C [12], or domain-specific languages such as MATLAB [13] or Simulink [14]. Using these languages, one can specify the behavior of the design without going through a cycle-accurate detailed description of the design. A behavior synthesis tool is then used to generate the RTL specification in Verilog or VHDL, which is fed into the design flow [15].
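The difference between a behavior-level specification and a cycle-accurate RTL description can be sketched with a small software model. The example below is written in plain Python rather than an HDL, purely for illustration; the names behavioral_sum, AccumulatorRTL and run_rtl are hypothetical and do not come from the book or from any tool.

```python
from typing import Iterable, List


def behavioral_sum(samples: Iterable[int]) -> int:
    """Untimed, behavior-level description: only the final result is specified."""
    return sum(samples)


class AccumulatorRTL:
    """Cycle-accurate model: a single register updated once per clock cycle."""

    def __init__(self, width: int = 16) -> None:
        self.mask = (1 << width) - 1  # the register has a fixed bit width
        self.acc = 0                  # register contents

    def clock(self, data_in: int, reset: bool = False) -> int:
        """Advance one clock cycle and return the new register value."""
        self.acc = 0 if reset else (self.acc + data_in) & self.mask
        return self.acc


def run_rtl(samples: Iterable[int], width: int = 16) -> List[int]:
    """Apply a reset cycle, then feed one sample per clock cycle."""
    dut = AccumulatorRTL(width)
    dut.clock(0, reset=True)
    return [dut.clock(s) for s in samples]


if __name__ == "__main__":
    data = [3, 5, 7]
    print("behavioral result:", behavioral_sum(data))
    print("register value after each cycle:", run_rtl(data))
```

The behavioral function says nothing about timing or register width, whereas the cycle-accurate model fixes both. Closing that gap automatically is, loosely speaking, the job the text assigns to a behavior synthesis tool.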
[Figure 1.2: central block “VLSI Metrics Driving EDA” surrounded by: Multiple Technology Constraints for Synthesis, Routing & Placement; Handling Antenna Rules, Clock Skews & Issues Arising out of Reuse of IP Cores; Handling MTBF Issues with Electromigration; Interconnect Delays & Crosstalk with Increasing Component Density; Optimizing Leakage Power & Overall Power with Increasing Clock Frequencies; Maintaining Appropriate Time Correctness with Ever Aggressive Gate Sizing]
Fig. 1.2 Increasing VLSI complexities forcing the evolution of the EDA
1.7 Just One More Book on EDA or Value Addition to the Scholarly Literature by Us?
There are a good number of existing books on EDA [20–23], and equally many on FPGA based design [24–28]. Moreover, a few noteworthy books on the latest VLSI design techniques such as System on Chip and rapid prototyping are available (e.g. [24–31]). One of our previous books [13] even covers design at the highest level of abstraction using a ‘C’ based design methodology. A natural question, then, is what the present book adds amidst all these titles. Some of its key features are as follows:
• It covers VHDL, Verilog and Handel C in a single text.
• Such coverage enables the designer to judge the appropriateness and value of each tool for the relevant applications.
• The book omits the basic discussion of the above mentioned design platforms (which has been covered by many texts) and starts directly with the design case studies.
• The design case studies have been chosen from diverse application domains such as ‘Network on Chip’, ‘Hospital on Chip’, ‘A to D Conversion’ and ‘Embedded System Design’.
• The complete code and tool flow helps the designer community follow the design cycle for Systems on Chip of increasing complexity.
• A good backup of references chosen from journals, books and industry white papers encourages readers to read further on their own and thus stimulates development in this domain of high industry demand.
• The case studies follow standard development cycles that make use of recent concepts such as ‘Soft IP Cores’, ‘Hardware Software Codesign’ and so on.
The intended audience of this book includes hardcore electrical engineers (EEs), researchers in Electronics, Computer Science and related fields, and the growing breed of EDA practitioners who have realized the power of “Configware” rather than “Hardware” and/or “Software”. The audience is not, however, limited to the hardware VLSI domain. Many software-oriented engineers may find the book interesting and, hopefully, migrate to the FPGA platform after seeing complex problems such as Network on Chip brought into real existence. Some helpful prerequisites before going through the book are:
• Algorithms and computational theory
• Experience with programming projects
• Basic knowledge of optimization
• Some experience with digital logic
With the underlying principle of the book in place, we now put forth a few points regarding the design platform and tool infrastructure. The case studies in this book are mostly designed as soft IP cores and prototyped on Xilinx FPGAs. It is therefore worthwhile to briefly review soft IP cores.
1.8 Designing the System as SoC Using the Soft IP Cores
The IP based methodology was introduced in the VLSI paradigm to cope with very large and complex designs. It basically involves partitioning a design into smaller IP blocks with well-defined functionality that can be re-used across multiple designs [32]. Recent years have seen impressive improvements in the achievable density of integrated circuits. In order to maintain this rate of improvement, designers need new techniques to handle the increased complexity inherent in these large chips. One such emerging technique is the System-on-a-Chip (SoC) design methodology, which has been used in the present work. In this methodology, pre-designed and pre-verified blocks, often called cores or intellectual property (IP), are obtained from internal sources or third parties and combined onto a single chip. These cores may include embedded processors, memory blocks, or circuits that handle specific processing functions. The SoC designer can then combine them onto a chip to implement complex functions [33]. Of late, standard product FPGA manufacturers have been trying to address SoC level reconfigurability by creating very large SRAM-based reprogrammable FPGAs with an embedded processor attached to them. According to these companies, if users are able to overcome the challenges of SoC design using these mega-FPGAs, the devices will provide time-to-market benefits [34]. Today many manufacturers offer design methodologies built around IP cores. As an example, Xilinx has come out with Smart-IP™ technology, a combination of several features designed to deliver the highest performance, predictability, and flexibility when implementing IP with Xilinx FPGAs. Smart-IP technology ensures constant core performance regardless of the core's position in the FPGA device, maintained performance when multiple cores are integrated in the same FPGA device, and no performance degradation when migrating to larger devices. The IP is built so that it makes use of the unique features of the Spartan-II architecture, such as dedicated multiplier or multiplexor logic. The use of Smart-IP technology means that the performance of the core is independent of core placement, number of cores used, surrounding user logic, device size, and EDA tools [35].
1.9 Types of IP Cores
There are three types of IP cores, viz.
• Soft IP cores, available as synthesizable VHDL or Verilog
• Firm IP cores, available in the form of a netlist after synthesis in the target technology
• Hard IP cores, available as a layout of the block on chip (GDSII, CIF)
Soft IP cores are delivered as RTL Verilog or VHDL source code with a synthesis script (i.e. clock generation logic). Here the customers are responsible for synthesis, timing closure, and all front-end processing. Firm cores are delivered as a netlist to be included in the customer’s netlist (with a don’t-touch attribute), which makes the placement information mandatory. Hard cores are the most complex ones, as they are provided as a black box with very tight timing constraints. Here the internal views are neither alterable by nor visible to the customer. A comparison of the three types in terms of flexibility has been made by Miloš Bečvář [36] and is shown in Fig. 1.3. Owing to the flexibility of soft IP cores, we have chosen them as the main platform for realizing the case studies in this book.
Fig. 1.3 Comparison of the types of IP cores
1.10 Design Issues Pertaining to the Soft IP Cores
Dey et al. [37] have discussed in detail the design issues pertaining to soft IP cores. A hard core, consisting of hard layouts, is the most optimized, but offers little flexibility in terms of changing the hardware features of the core itself. Most general-purpose processor and digital signal processing (DSP) processor cores available and used today, like the cores from ARM, LSI Logic, Motorola, and IBM, are hard cores. On the other hand, a soft core is a functional description of an IP, and the soft IP specification can be both simulated and synthesized. A soft IP allows flexibility in retargeting the IP specification to better fit the core user’s needs. For example, a soft processor core allows the core user to reconfigure the features of the processor, such as its instruction set, caches, communication mechanisms, and interrupt mechanisms, to make the processor core more suitable for a particular SoC application. However, as opposed to a hard core user, a soft core user (the SoC integrator) must synthesize, optimize, validate, and develop tests for the soft core before integrating it in the SoC being designed. There are many design platforms for designing soft IP cores. In the present work we have designed the soft IP cores in three different flavors, viz. VHDL, Verilog and Handel C. These soft IP cores are realized on an FPGA based platform. The following section justifies the choice of the platform.
1.11 Justifying FPGA as the Prototyping Platform
Once used only for glue logic, FPGAs have progressed to a point where system-on-chip (SoC) designs can be built on a single device. The number of gates and features has increased dramatically to compete with capabilities that have traditionally been offered through ASIC devices only. Some of the advantages of FPGA design methodologies over ASICs include early time-to-market, easy transition to structured ASICs, and reduced NRE costs. With the advent of new technologies in the field of FPGAs, design houses are provided with an option other than ASICs. With mask costs approaching a one million dollar price tag, and NRE costs in the neighborhood of another million dollars, it is very difficult to justify an ASIC for a low unit volume. FPGAs, on the other hand, have improved their capacity to build systems on a chip with more than a million ASIC-equivalent gates and a few megabits of on-chip RAM. For high volumes, a structured ASIC solution combines the cost advantage of ASICs with the low-risk solution of an FPGA [38]. New low-cost FPGA devices provide the ability to re-use hardware and software like never before. With experience in Altera, Xilinx and Lattice we make full use of the latest developments in these devices to deliver new and exciting embedded systems [39]. The real benefits of the FPGA come when it is used in conjunction with soft IP cores. A time-consuming and expensive redesign of a board can often be avoided through application-specific integration of IP cores in the FPGA, an alternative for the future, especially for very specialized applications with only small or medium volumes. Moreover, FPGA technology is indispensable wherever long-term availability or harsh industrial environments are involved. IP cores per se are not threatened by discontinuation, even if an FPGA component may be replaced by a newer one after 10 years [40] (Table 1.2).
Table 1.2 Comparison of approaches for design realization (columns: design approach)
Performance metrics | ASIC | Software | FPGA
Throughput/performance | Higher due to full customization | Medium, due to inherent sequential processing | Higher due to inherent parallelism
Spatial issues | Superior | Not applicable | Massive routing grids present an area penalty
Power | Optimum | Higher | Higher due to lowest gate utilization
Prototyping | Not possible | Possible but with iterative time consuming flow | Rapid prototyping
Economics | Cost effective only for large volume | Costly | Cost effective even for low volume
Remark | Static nature limits functionality of these performance-critical features to only a fixed set of the system’s functionality | Good reprogrammability; throughput several orders of magnitude slower than the ASICs and FPGAs | Balance between performance and flexibility; parallel logic functions can be performed over the area of the device, added with the reconfiguration on the fly
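The volume argument of this section can be made concrete with a small break-even calculation. The sketch below is illustrative only: the fixed ASIC cost of two million dollars follows the mask and NRE figures quoted in the text, whereas the per-unit prices and the assumption of zero fixed cost for the FPGA are hypothetical values chosen for the example, not data from the cited sources.

```python
import math


def break_even_volume(asic_fixed_cost, asic_unit_cost,
                      fpga_unit_cost, fpga_fixed_cost=0.0):
    """Smallest unit volume at which the ASIC's total cost is no higher
    than the FPGA's total cost, or None if the ASIC never catches up.

    Total cost is modelled as fixed_cost + unit_cost * volume.
    """
    unit_saving = fpga_unit_cost - asic_unit_cost
    if unit_saving <= 0:
        # The ASIC is not cheaper per unit, so it can never recover
        # its higher fixed cost.
        return None
    extra_fixed = asic_fixed_cost - fpga_fixed_cost
    if extra_fixed <= 0:
        return 0
    return math.ceil(extra_fixed / unit_saving)


if __name__ == "__main__":
    # Hypothetical unit prices: 5 dollars per ASIC, 25 dollars per FPGA.
    volume = break_even_volume(asic_fixed_cost=2_000_000,
                               asic_unit_cost=5.0,
                               fpga_unit_cost=25.0)
    print("ASIC pays off from", volume, "units")
```

Under these assumed prices the ASIC only pays off at a six-figure volume, which is the sense in which an ASIC is hard to justify for a low unit volume.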
Table 1.3 Roadmap of VLSI design
Technology | Year | Number of transistors per chip | Gate count | Package | Major design outcomes as per Makimoto’s waves | Platform themes as per Tredennick’s paradigm shifts | Impact on EDA
SSI | 1959–1965 | 10^2 | 2–20 | DIP | Transistor–diode | 1957: fixed algorithms with fixed resources (hardwired) | Manual handcrafted centric design
MSI | 1965–1969 | 10^2–10^3 | 20–200 | DIP | TV, calculator, watch | | Migration towards design automation
LSI | 1969–1989 | 10^3–10^5 | 200–2,000 | Quad | MPU, memory | 1977: variable algorithms with fixed resources (procedural programming) | Design using VHDL/Verilog
VLSI | 1989–till date | 10^5–10^7 | 2,000+ | | 1987–97: ASIC, RISC; 1997–2007: PC chipset, emotion engine; 2007 onwards: system on chip; 2007–2017: software centric design | 1997: variable algorithms with variable resources (structural programming) | EDA with mixed trends that includes VHDL, Verilog, C based behavioral methodology and tools for full custom/semicustom design
ULSI | | 500,000–10,000,000 | 4,000+ | | | |
GSI | | >10,000,000 | 8,000+ | | | |
1.12 Justifying the Differing Flavors of Languages Used in This Book
This book differs from established texts on VLSI design in yet another respect: it uses different flavors of languages to solve the design problems. For instance, the anti-spam appliance in Chap. 2 is designed in Handel C, the ECG data logger in VHDL, the soft IP cores reported in Chap. 4 again in Handel C, and the time-to-digital converter in Chap. 5 in Verilog. A natural question is why such diverse design languages have been used. A convincing answer emerges from the industry trends shown in Table 1.3. Since the table is self-explanatory, we do not deliberate further on it. However, a helpful insight from the column ‘Impact on EDA’ is that the prevailing trend in the design community is to use such mixed EDA tools and to make the best use of whatever fits a particular design problem. Moreover, designing with different language suites showcases how to exploit the best of each of these languages to realize the design.
Chapter 2
Development of FPGA Based Network on Chip for Circumventing Spam
Abstract With the growing popularity of the Internet and the extensive use of e-mail as a communication medium, the volume of Spam mail has been growing at a phenomenal rate. The growing volume of Spam mails, as well as their mutating nature, annoys people and affects work efficiency significantly. In past decades unsolicited emails, or Spam, were thought of as just a nuisance; in the last few years, however, the annoyance has reached epidemic proportions. Spam mail has thus become a nightmare for every email user. This chapter presents an Anti-Spam solution prototyped on a Xilinx Spartan-3E FPGA and designed using Handel C. We have adopted the hardware-software co-design methodology, which is described from scratch. Two IP cores, viz. a Content Addressable Memory (CAM) and a Bloom filter, have been designed in Handel C and deployed on the Spartan-3E FPGA along with a customizable version of MicroBlaze. The main contribution is reporting the technical know-how related to the co-design aspects, which comprise a synergic mixture of the soft IP cores of the CAM and the Bloom filter, realized both in hardware and software, with the Xilinx MicroBlaze processor. The toolset used for the hardware-software co-design is the Xilinx Embedded Development Kit (EDK). The design flow comprises IP core design in Handel C, embedded in EDK and driven by the central customized MicroBlaze core.
2.1 Introduction
The last quarter of the past century and the first decade of the current century have witnessed a phenomenal evolution of the Internet. Moore’s Law, which describes the number of transistors crammed onto an integrated circuit, can also describe the growth of the Internet. Researchers have reported [41] exponential growth of the Internet, with a prediction that it doubles in size every 5.32 years. The web, one of the main arms of the Internet, is seen penetrating almost all walks of life. Moreover, it has emerged as the all-pervasive model of business, education, research, entertainment and social networking. The exponential growth of the web has had its benefits in widening the horizons of the market; in disseminating data, information, knowledge and value-added services; and in work and resource sharing, leading to a cost-effective model of new media. The growth of the Internet and the World Wide Web has passed through several stages. The most recent effort to streamline the information avalanche on this new media is seen in terms of Web 2.0, which emphasizes the social networking aspects of the web.
Even though emails represent¹ only a tiny fraction of the traffic volume going around public IP networks (a little more than 1%), they involve a gigantic number of messages: around 31 billion worldwide per day in 2002, with projections claiming more than 60 billion for 2006 [42]. Interestingly, e-mail, one of the main personalized communication arms of the Internet, commenced operation well before the Internet itself. Today it is the mainstay of business, personal, research and academic communications and a crucial instrument for information exchange. Reading e-mail is nowadays a daily habit of netizens.² Emails have gradually emerged as a proficient, swift and inexpensive means of communication, which makes them preferred in both professional and personal correspondence. Occasionally reading an e-mail from an unknown source whose content is of no interest to the user is not really a misfortune. However, it is estimated that 60% of all email is Spam [43], and often illicit; this is what one might call a nightmare. Thus, with the prominence of email for information interchange, junk mails, unsolicited mails or Spam mails also appeared on the email communication canvas, diminishing the utility of this valuable tool. As the primary focus of this research work is “Anti-Spam Techniques”, it is worthwhile to present background information regarding Spam mail.
R.K. Kamat et al., Harnessing VLSI System Design with EDA Tools, DOI 10.1007/978-94-007-1864-7_2, © Springer Science+Business Media B.V. 2012
2.2 Conception of the Spam Mail
As email emerged as a superior model of communication for sending messages, data and files quickly to one or many persons at a time, Spam email also came into sight as a result of more people using email. People agree that Spam is a serious problem, but they have difficulty agreeing on its definition. Unsolicited Bulk E-mail (UBE) is probably the most useful definition [44, 45]. A literature review reveals several definitions of Spam³ email. The Merriam-Webster dictionary gives a layman’s definition of Spam as “unsolicited usually commercial e-mail sent to a large number of addresses” [46]. The oldest conception of Spam is that of Southwick S. and Falk J. (1998), according to whom Spam is “the same article posted an unacceptably high number of times to one or more newsgroups” [47]. The term “SPAM®” was coined as a brand name of luncheon meat and is a registered trademark of Hormel Foods Corporation. On the other hand, “Spam” (all lowercase) is a term that was light-heartedly adopted by the Internet community after a famous Monty Python sketch, to label unsolicited mass postings on USENET newsgroups [48]. The focus of this research is on the latter meaning of the word. As per SpamHAUS, technically a mail is termed Spam if “the recipient’s personal identity and context are irrelevant because the message is equally applicable to many other potential recipients” AND the recipient has not verifiably granted deliberate, explicit, and still-revocable permission for it to be sent [49]. Thus, the important issues regarding the conception of the term Spam mail are:
(i) It is an unsolicited email, whatever its content may be; it lacks the recipient’s consent.
(ii) In addition to the above, the email must be sent in bulk in order to be categorized as Spam.
Another widely agreed definition of Spam mail, reported by MAPS, is: “An electronic message is ‘Spam’ IF:
1. The recipient’s personal identity and context are irrelevant because the message is equally applicable to many other potential recipients; and
2. The recipient has not verifiably granted deliberate, explicit, and still-revocable permission for it to be sent; and
3. The transmission and reception of the message appears to the recipient to give a disproportionate benefit to the sender.”
In spite of all the above definitions, there was large disagreement in defining Spam, since its meaning has to encompass all unwanted messages while excluding legitimate e-mail. Earlier definitions of Spam⁴ led to an estimated one in five commercial e-mails getting caught in filters for failing content checks or poor bounce management, even when specifically requested by the consumer. A widely agreed, highly understandable and 100% consumer-centric definition was laid down at the Federal Trade Commission’s Spam Summit (2007): “Operationally, we define Spam as anything users don’t want in their inbox” [50]. However, an important point is that some people misguidedly regard all bulk e-mail as “spam,” a derogatory term for untargeted, unsolicited bulk e-mail. But if one flips the Spam concept on its head, one has a powerful tool with which to reach a lot of people quickly and inexpensively, for business as well as personal purposes (Fig. 2.1) [51].
¹ The major part of the Internet volume (80% according to [20]) is related to peer-to-peer (P2P) file sharing, followed by Web browsing, electronic mail (email), File Transfer Protocol (FTP), remote login (telnet), Instant Messaging (IM) and media distribution (audio and video streaming).
² A netizen (a portmanteau of Internet and citizen) or cybercitizen is a person actively involved in online communities.
³ Hormel Foods Corporation, the maker of SPAM luncheon meat, does not object to the Internet use of the term “Spamming”. However, they did ask that the capitalized word “SPAM” be reserved to refer to their product and trademark. By and large, this request is obeyed in forums which discuss Spam. The same convention is also followed in the present chapter.
⁴ In addition to email Spam, the term has also been applied to similar abuses in other media: instant messaging Spam, Usenet newsgroup Spam, Web search engine Spam, Spam in blogs, wiki Spam, online classified ads Spam, mobile phone messaging Spam, Internet forum Spam, junk fax transmissions, and file sharing network Spam.
Fig. 2.1 Characteristics of Spam emails
2.3 FPGA Based Network on Chip for Circumventing Spam
This section elaborates the intent of the present case study.
2.3.1 Inspiration
FPGA based Network On Chip for Anti-Spam solution.
2.3.2 Core Concept
Investigate challenges and alternatives for implementation of different algorithms for Anti-Spam implementation over reconfigurable (RC) devices.
2.3.3 Method
Prototype hardware interface and software infrastructure, demonstrate proof of concept for benefits of network-attached RC resources.
2.3.4 Motivation
Effective anti-Spam solution deployment is becoming increasingly difficult due to the adversarial, and hence constantly evolving, nature of the problem space, in addition to the inherently limited power of signature matching in the face of false positives, polymorphism or mutation, and zero-day attacks. The main motivation behind designing hardware-based anti-Spam systems is the possibility of extensive parallelism, which is scarce in software-centric systems. Moreover, hardware-based solutions satisfy the flexibility, throughput and scalability requirements of state-of-the-art high-performance networks. At network speeds of the order of Gbps, Spam detection in software faces latency problems, and the only practical solution is an ASIC or FPGA based system that can cope with high-speed network traffic. The rationale behind building the system on an FPGA platform, as contrasted with an ASIC, is the easy reconfiguration, including configuration on the fly, required for adapting to the ever-changing and mutating Spam keywords. Although the latter (ASIC) approach offers more speed, it suffers from inflexibility and poor scalability and turns out to be a costlier affair. Besides, the FPGA platform also offers a design paradigm for hardware-software codesign that attains the benefits and synergy of both and thus realizes the system at higher semantic levels. As the present development showcases, the incorporation of context correlated across multiple connections is achieved by configuring the anti-Spam solution at the gateway and client ends. The attributes of the FPGA based development platform, such as reusability of the soft IP cores, have been extensively used in the present work for rapid prototyping of the single-chip Anti-Spam solution. The complexity of programming in the FPGA framework has been alleviated by using the Handel C based platform to design the system at a higher level of abstraction.
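The parallelism argument can be illustrated with the Bloom filter, one of the two IP cores this chapter implements. What follows is a minimal software sketch and not the Handel C core itself: the class name, the bit-array size, the number of hash functions and the use of salted SHA-256 as the hash are all assumptions made for illustration. The bit positions for a keyword are computed independently of one another, which is what would allow a hardware version to evaluate them concurrently, whereas software has to loop over them.

```python
import hashlib


class BloomFilter:
    """Set-membership filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, word):
        # Each hash is SHA-256 salted with its index (an illustrative
        # choice); the positions do not depend on each other.
        data = word.lower().encode("utf-8")
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos] = 1

    def might_contain(self, word):
        # True means "possibly a Spam keyword"; False means "definitely not".
        return all(self.bits[pos] for pos in self._positions(word))


if __name__ == "__main__":
    keywords = BloomFilter()
    for spam_word in ("lottery", "winner"):   # hypothetical keyword list
        keywords.add(spam_word)
    print(keywords.might_contain("Lottery"))
```

A match only says that a word may be a Spam keyword, so a design of this kind would normally confirm hits with an exact lookup.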
2.3.5 Advantages of FPGA Based Antispam Appliance in a Nutshell
• Inherent parallelism
• Reduced time to market
• Field upgradeability to support enhanced protocols
• Configurable scalability to meet various protocols
• Lower total cost of development (TCD)
• Meeting the performance and scalability requirements of applications
The market is flooded with software, ASIC and general-purpose CPU based anti-Spam appliances. However, due to the above-mentioned advantages, the research work implemented in the present chapter is based on FPGAs. In order to “beat the cost” out of the product, system-level designers are looking to FPGAs instead of ASICs or general CPUs as the enabling technology to help them meet performance and scalability requirements, while at the same time keeping TCD within budget [52]. Another source of motivation for the present research is the NSF CyberTrust Center proposal entitled “Center for Internet Epidemiology and Defenses”⁵ by Stefan Savage, Geoffrey M. Voelker and George Varghese from the University of California, San Diego, and Vern Paxson and Nicholas Weaver from the International Computer Science Institute (October 2004–September 2009), which focuses on developing a new science of empirical data and analyses for understanding Internet epidemics, and on applying this understanding to the engineering of strong defense systems.
2.3.6 Significance of the Work
The worth of the present case study in one sentence is “Transforming the Anti-Spam Heuristics into the Electronic System”. Some problem spaces have a limited life of their own. However, when they spread, they start a trail of growing epidemic and become more troubled and troubling. The problem of Spam mail fits perfectly into this category. Through the increased use of email, it has become the agent for propagation of malicious programs, worms and viruses, and moreover it chokes Internet bandwidth that could otherwise be used for noble causes. For example, a list of contacts or an address book incorporated into the email client can very easily be exploited to propagate a malicious program, often before the user even realizes that his or her computer has been infected. Because of the speed at which these malicious programs can propagate, a well-adapted Spam mail can quickly turn into a global outbreak. Although Spam cannot be eliminated entirely, research like the present one attempts to prevent the high probability risks by working on the low probability menace.
Another striking feature that makes the case study significant is the implementation of the Network on Chip (NoC) on the FPGA platform. NoC has been extensively studied under an ASIC cost model, whereas FPGAs have been left relatively unexplored for this mapping. This work comes out with the FPGA based ASIC framework for filtering Spam. The solution described here attempts to tackle the Spam problem by the design and implementation of “Soft Hardware”. The need for such an FPGA based soft IP core implementation is summarized in the following paragraph.
⁵ The proposal is available on the web at URL: http://www.cs.ucsd.edu/users/savage/papers/CIEDProposal.pdf
Computer networks today are looking for two major, interrelated improvements, viz. speed and security. It is a common observation that if the latter is strengthened, the former degrades. Therefore, these days, networks use a layered approach, with scanning at both the desktop and the gateway using a security appliance. The problem of Spam adds one more dimension to the above-mentioned specifications. With more Spam, the network bandwidth gets choked, in turn decreasing the speed. Introducing Spam filters and pattern matching techniques on the server or client itself, meanwhile, leads to a speed bottleneck on that server or client. A way out, demonstrated by this work, is the development of a standalone device at the gateway with specialized content processors, to which the scanning task is offloaded to remove the speed bottleneck at the server or client end. Yet another attribute that makes this work interesting is the exploitation of the dynamic reconfiguration property of FPGAs, which makes them ideal for content security with constant updating of the bad IPs or Spam content, which regularly alters its signature. Even as new content types, new attacks and new protocols become critical, embedded processors based on FPGAs can download new firmware to remain relevant. It is this ability, combined with the high performance available in the latest generation of FPGAs, that makes them the best choice for content processing. Furthermore, being able to re-flash the firmware and hardware over the Internet gives appliance vendors additional revenue streams for the same product, making for a very compelling business case.
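The idea of updating the list of bad senders without redesigning the appliance can be sketched in software. The model below assumes a content addressable memory (CAM) that stores blacklisted sender IP addresses. The class `SoftCam`, its fixed capacity and its method names are hypothetical; they only mimic the behaviour of a CAM (present a key, get back the slot that holds it) and are not the Handel C core of this chapter. A hardware CAM compares the key against all slots at once, while this sketch uses a dictionary to obtain the same answer.

```python
class SoftCam:
    """Software model of a small CAM: look up a key, get its slot index."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.slots = [None] * capacity   # slot index -> stored key
        self.index = {}                  # stored key -> slot index

    def write(self, slot, key):
        """Overwrite one slot, as a field update of the blacklist would."""
        if not 0 <= slot < self.capacity:
            raise IndexError("slot out of range")
        old = self.slots[slot]
        if old is not None:
            del self.index[old]          # forget the entry being replaced
        if key in self.index:
            # Keep every key in at most one slot.
            self.slots[self.index.pop(key)] = None
        self.slots[slot] = key
        self.index[key] = slot

    def match(self, key):
        """Return the slot holding key, or None when there is no match."""
        return self.index.get(key)


if __name__ == "__main__":
    blacklist = SoftCam(capacity=4)
    blacklist.write(0, "203.0.113.7")    # documentation-range example address
    print(blacklist.match("203.0.113.7"))
```

Rewriting a slot changes what the filter blocks while the lookup logic stays untouched, which is the property the text attributes to reconfigurable content processing.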
2.4 Tools Infrastructure and Design Flow
This section describes the EDA tools and hardware platform used in the present case study.
2.4.1 Handel C
The Handel C based design flow can best be described as “Software-Compiled System Design for Field Programmable System-on-Chip Design”. The working methodology of Handel C is to compile C directly to optimized EDIF output for prototyping on programmable logic. The main drawbacks of hardware description languages such as VHDL or Verilog in the context of software-dominated problems such as the present one are their limited applicability for application developers and a more involved development process, in which the developer always has to keep track of the intended hardware to be realized. This leads to many flaws in the design, such as poorly profiled partitions leading to sub-optimal performance, incompatible flows, difficulty in verification between HW/SW designs, and a gap between specification and hardware RTL. The ‘law of conservation of pain’ proposed by Brian Holland et al. reports the fondness of more developers for C based environments in the reconfigurable paradigm. Consequently, variants of C for the reconfigurable paradigm such as SystemC, Impulse C, Dime-C, Napa-C and Handel C have emerged for top-down design (as contrasted with the bottom-up approach of the HDLs) at higher levels of abstraction [53]. Handel C offers various advantages, such as
• Designing at various levels of abstraction from the underlying hardware.
• Simplifying hardware/software partitioning by describing both with a single, C-based language.
• Enhanced simulation and debugging performance.
• Enabling design with a software-dominated flow for the target hardware.
• Democratizing embedded system design by bridging the skills gap between hardware and software professionals.
• Allowing new potential in hardware by exploiting reconfigurable logic.
A variant of ANSI C, Handel C offers the following added advantages [53]: parallelism, timing, ready-made interfaces, provision for clocks, built-in RAM/ROM structures, shared expressions, built-in structures for communications, rich design libraries, a floating-point component library, bit manipulation, macro functions for hardware block reuse, etc. Detailed know-how of SoC and NoC design is discussed in depth by the authors in their latest book.
2.4.2 ISE Webpack 9.2⁶
ISE® WebPACK™ design software is the free, fully featured front-to-back FPGA design solution for Linux, Windows XP, and Windows Vista. It is the ideal downloadable solution for FPGA and CPLD design, offering HDL synthesis and simulation, implementation, device fitting, and JTAG programming. It delivers a complete, front-to-back design flow providing instant access to the ISE features and functionality at no cost. Xilinx has created a solution that allows convenient productivity by providing a design solution that is always up to date, with error-free downloading and single-file installation. We received a licensed copy of this tool at no cost from Xilinx Inc. along with the Embedded Development Kit (EDK). This tool was used extensively for synthesizing the design for the Spartan-3E FPGA.
2.4.3 EDK Version 9.2⁷
The Embedded Development Kit (EDK) is an integrated development environment for designing embedded processing systems. This pre-configured kit includes Xilinx Platform Studio and the Software Development Kit, as well as all the documentation and IP required for designing Xilinx Platform FPGAs with embedded PowerPC® hard processor cores and/or MicroBlaze™ soft processor cores. In the present work we used the MicroBlaze soft IP core as the central processor for the Anti-Spam solution. The other modules of this kit used in the present work are:
• Xilinx Platform Studio (XPS) Tool Suite: includes a graphical IDE and command line support for developing hardware platforms for embedded applications.
• Software Development Kit (SDK) for MicroBlaze: includes the GNU C/C++ compiler and debugger, the Xilinx Microprocessor Debug (XMD) target server, and the Data2MEM utility for bitstream loading and updating. SDK is the recommended software-centric design environment, based on the Eclipse IDE.
• Processing IP and MicroBlaze Soft Processor Core: a pre-verified IP catalog, including a wide variety of processing peripheral cores for customizing your embedded systems, as well as the flexible MicroBlaze 32-bit soft processing core.
⁶ More information available at Xilinx website: http://www.xilinx.com/tools/webpack.htm
⁷ More information available at Xilinx website: http://www.xilinx.com/tools/platform.htm
2.4.4 Xilinx Starter Kit
The system is realized on the Spartan®-3E FPGA Starter Kit. This is a complete development board solution that includes the board, power supply, evaluation software, and USB cable. The following selected features of the kit were used in our work:
• Spartan-3E FPGA (XC3S500E-4FG320C)
• Clocks: 50 MHz crystal clock oscillator
• Memory:
  – 128 Mbit Parallel Flash
  – 16 Mbit SPI Flash
  – 64 MByte DDR SDRAM
• Connectors and Interfaces:
  – Ethernet 10/100 PHY
  – JTAG USB download
  – Two 9-pin RS-232 serial ports
With the complete description of EDA tools and the hardware platform, we now present the complete system design using the hardware-software co-design methodology.
2.5 Introducing Hardware-Software Co-design
The term hardware/software (Hw/Sw) codesign surfaced in the early 1990s to describe a confluence of problems in integrated circuit (IC) design [54]. Pioneering work was done in this area by Prakash and Parker [55] of the University of Southern
24
2
Development of FPGA Based Network on Chip for Circumventing Spam
California, who developed the SOS system, which supports an arbitrary multiprocessor topology for scheduling and allocating processes onto the multiprocessor. Soon after these initial efforts, hardware/software partitioning emerged as the major design flow for complex embedded system design and was exploited successfully by Vulcan [56] from Stanford and Cosyma [57] from the Technical University of Braunschweig. Of the many prevalent working definitions of Hw/Sw codesign, the most appropriate for the present problem is the one given by Micheli and Gupta [57]: “Hardware/software co-design means meeting system-level objectives by exploiting the synergism of hardware and software through their concurrent design.” Another equally valid definition is: the meeting of system-level objectives by exploiting the trade-offs between hardware and software in a system through their concurrent design [62].
Hardware-software codesign has existed for several decades. To ensure system capability, designers had to face the realities of combining digital computing with software algorithms, and to verify the interaction between the two, prototype hardware had to be built. Since the 1990s this no longer suffices, because codesign has turned from a good idea into an economic necessity [61]. Predictions for the future point to greater embedded software content in hardware systems than ever before, so traditional software-hardware codesign has to be sped up and improved. Developments in this direction include:
• Top-down system-level codesign and co-synthesis work at universities
• Major advances made by EDA (Electronic Design Automation) companies in high-speed emulation systems
In the present work the codesign exercise is carried out for a SoC, or rather NoC, realization with a set of functions and a set of performance factors. A core for each function is selected from a set of alternative IP cores and software components, and an optimal partition is found so as to balance the performance factors evenly and ultimately reduce the overall cost, size, power consumption and runtime of the final realization.
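The selection step described above can be pictured as a small search problem. The following is only a minimal sketch under stated assumptions: the `Option` record, the equal weighting of the four factors, the area budget and the fixed sizes `N_FUNCS` and `N_ALTS` are all hypothetical and are not taken from the present work. It tries every combination of alternatives and keeps the cheapest one that fits the budget.

```c
/* Hypothetical cost record for one implementation alternative
   (an IP core or a software component) of a function. */
typedef struct {
    double cost;     /* relative cost           */
    double area;     /* relative size           */
    double power;    /* relative power          */
    double runtime;  /* relative execution time */
} Option;

#define N_FUNCS 3    /* illustrative number of functions         */
#define N_ALTS  2    /* alternatives per function: 0 = HW, 1 = SW */

/* Sum of the four performance factors (equal weights assumed). */
static double score(const Option *o)
{
    return o->cost + o->area + o->power + o->runtime;
}

/* Exhaustively try every combination of alternatives.
   Combinations whose total area exceeds area_budget are skipped.
   choice[f] receives the alternative picked for function f.
   Returns the best total score, or -1.0 if nothing fits. */
double best_partition(const Option opts[N_FUNCS][N_ALTS],
                      double area_budget, int choice[N_FUNCS])
{
    double best = -1.0;
    int total = 1;
    for (int f = 0; f < N_FUNCS; f++)
        total *= N_ALTS;

    for (int code = 0; code < total; code++) {
        int pick[N_FUNCS];
        double sum = 0.0, area = 0.0;
        int c = code;
        for (int f = 0; f < N_FUNCS; f++) {
            pick[f] = c % N_ALTS;          /* decode one digit per function */
            c /= N_ALTS;
            sum  += score(&opts[f][pick[f]]);
            area += opts[f][pick[f]].area;
        }
        if (area > area_budget)
            continue;                      /* violates the size constraint */
        if (best < 0.0 || sum < best) {
            best = sum;
            for (int f = 0; f < N_FUNCS; f++)
                choice[f] = pick[f];
        }
    }
    return best;
}
```

Exhaustive search is only practical for a handful of functions; for larger systems a heuristic partitioner would be needed.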
2.6
Hardware Software Co-design
The main rationale behind Hw/Sw co-design is the meeting of system-level objectives by utilizing the trade-offs between hardware and software in a system through their concurrent design. The conceptual base of Hw/Sw co-design encompasses the following points [59]:
• The cooperative design of hardware and software components;
• The unification of currently separate hardware and software paths;
• The movement of functionality between hardware and software;
• The meeting of system-level objectives by exploiting the synergism of hardware and software through their concurrent design.
Fig. 2.2 Motivation for co-design methodology. The figure lists the following drivers: rapid development of System on Chip using reusable soft IP cores; emergence of platform FPGA architectures offering compatibility and integration with soft IP cores; development of toolsets having a mixed-mode flow; surfacing of design at a higher level of abstraction using Handel C and similar tools; and prevalence of an optimized design methodology making use of hardware-software synergy (best described as “Hardware Becoming Soft, Software Becoming Hard”).
2.6.1
Motivation for Hw/Sw Co-design
The main motivation of the co-design methodology is the growing number of applications combining hardware and software. The design of such systems often revolves around strict performance metrics involving area, timing, power and cost constraints. Partitioning the system in terms of hardware and software enables the designer to envision the advantages at an early stage of the design phase itself. Software-heavy implementations pose certain performance bottlenecks in terms of poor timing, especially worsened latency in the Network on Chip paradigm. Their positive implications, however, are flexibility, cost effectiveness and ease of debugging. One of the major drawbacks of software-heavy systems is the time-consuming simulation when they are integrated with the hardware. The co-design methodology is fuelled by the emergence of reconfigurable FPGAs, as they offer unique advantages for evaluating and redesigning the system to meet the design goals. Prior to their appearance, only the software used to be redesigned, leaving no scope for hardware refinement (Fig. 2.2). Another equally important motivation of codesign is speeding up the design of a system that consists of hardware and software; thus any technique that helps shorten the design time can be categorized as a codesign issue. As shown in Fig. 2.1, codesign centers on the key issue of deferring the clear separation point between HW and SW as late as possible in the design process.
Table 2.1 Hardware/software co-design compared with the classic hardware/software design process
Classic hardware/software design process:
• Immediate partition of the system into hardware and software before commencement of the design defers the realization of the effects of their interaction until the last phase
• Restricted trade-off of hardware-software
• Lack of a unified hardware-software representation, which leads to difficulties in verifying the entire system, and hence to incompatibilities across the HW/SW boundary
• Late realization of the Hw-Sw interaction results in suboptimal designs, costly modifications, delay to market, schedule delays, etc.
• Instead of concentrating on system-level issues, the designer tucks into individual hw/sw issues
Hardware/software co-design:
• Hardware-software development commences on a concurrent path
• Design gains benefit by exploiting the trade-off of hardware-software
• Separation between the function, timing and communication is possible
2.6.2
Advantages of Hw/Sw Co-design Methodology
The advantages of the Hw/Sw co-design methodology are as follows (Table 2.1):
• Improves design quality, design cycle time, and cost
• Reduces integration and test time
• Supports growing complexity of embedded systems
• Takes advantage of advances in tools and technologies
• Processor cores
• High-level hardware synthesis capabilities
• ASIC development
2.6.3
State of the Art Hw-Sw Co-design Methodologies
Some of the recently developed methodologies for codesign are as follows:
POLIS Hardware/Software Codesign [63]: The POLIS system is centered on a single Finite State Machine-like representation. A Co-design Finite State Machine (CFSM), like a classical Finite State Machine, transforms a set of inputs into a set of outputs with only a finite amount of internal state. The difference between the two models is that the synchronous communication model of classical concurrent FSMs is replaced in the CFSM model by a finite, non-zero, unbounded reaction time. This model of computation can also be described as Globally Asynchronous, Locally Synchronous.
The Ptolemy Project [64]: The Ptolemy project studies modeling, simulation, and design of concurrent, real-time, embedded systems. The focus is on assembly of concurrent components. The key underlying principle in the project is the use of well-defined models of computation that govern the interaction between components. A major problem area being addressed is the use of heterogeneous mixtures of models of computation.
COMET (COdesign METhodology) Hardware-Software Codesign Methodology [65]: COMET is a hardware-software codesign methodology which uses C and VHDL as the software and hardware descriptions of an embedded system. COMET uses a rules file to bind the C and VHDL descriptions into a complete system description.
COSMOS: a codesign approach for communicating systems [66]: COSMOS is a method for modeling and synthesis of complex communicating systems. It starts from a system-level specification based on an extended finite state machine model allowing for the specification of complex protocols. System-level synthesis is composed of three tasks: partitioning systems into inter-dependent sub-systems, inter-sub-system communication synthesis and architecture generation. The output is a flexible architecture model which includes both hardware and software components. It concerns a set of parallel processors communicating through well defined protocols.
LYCOS: the Lyngby Co-Synthesis System [67]: LYCOS (LYngby CO-Synthesis) is an experimental co-synthesis environment in which the hardware/software partitioning is done by using a target architecture consisting of a single CPU and a single ASIC communicating through memory mapped I/O.
GRAPE [68]: GRAPE-II (Graphical Rapid Prototyping Environment-II) is a hardware-software codesign environment for the real-time functional emulation of synchronous DSP systems. It allows one to specify the application’s data dependency graph in a target-machine-independent way. After specifying the heterogeneous target machine’s architecture, it estimates the resources needed by each application subtask. Based on these requirements, it assigns the subtasks to specific target devices at compile-time, be they processors or FPGAs, establishes routing paths and determines a static schedule.
Evolution of Commercial Tools: In addition to the above mentioned academic and research projects, commercial tools have emerged, such as IBM PERCS [69], Synopsys’ Nimble compiler [70], Altera SOPC [71] and Xilinx EDK [72].
2.7
Hardware-Software Codesign Framework Proposed in the Present Case Study
Traditional network appliances are designed with a philosophy of configuring the hardware at the manufacturing end, while customizing the software at run time. This philosophy is followed in most state of the art routers, switches, firewalls and some of the antispam appliances based on ASICs or multi-chip boards. However, with the emergence of FPGAs the scene is changing altogether, for two main reasons. FPGAs can be configured on the fly to implement traditional software functions in a much more efficient manner, and they can be reconfigured to incorporate new hardware functionality without changing the underlying hardware. The main motivation of our framework is based on three important system design issues, viz. the use of Handel C for system design, integration of reusable pre-verified soft IP cores from third parties, and software centric implementation in ANSI C. The impetus of the hardware-software codesign methodology proposed in the present research work is principally based on the following key issues:
1. At the outset the specifications of the system are fixed. The target system is set to circumvent Spam by extracting the sender’s IP and by content analysis. This requires harvesting the email from the email server through the Ethernet interface and reporting to the email server whether the email is Spam or Ham.
2. The system specifications, when transformed into the behavioral model, give an insight into the components or subsystems required, as follows:
• Ethernet interface: for harvesting the email from the server
• Pop3 client: for facilitating the email harvesting
• Processor: preferably a 32 bit RISC type in order to comply with the latency issues of networked systems
• Memory: shared storage space for data and program storage
• UART: for run time configuration from the server end as well as to implement on chip debug
• Bad IP and good IP comparison mechanism for the inbound email
• Parsing of the inbound email to extract keywords and compare them with the Spam keywords
3. In addition to the above, an interface is required to codesign the hardware in the form of soft IP cores developed in Handel C, the software in the form of ANSI C and the pre-verified soft processor core developed in VHDL.
4. It is also required to satisfy the communication needs between the various subsystems and the 32 bit CPU implemented on the FPGA fabric.
5. The proposed framework is required to deliver a System on Chip (SoC) installable at the server end and even at the individual client end, so that it can be configured from both ends.
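The bad IP and good IP comparison listed above can be illustrated in software. The following is a minimal sketch in ANSI C style, not the Handel C core of the present work: the function names, the `IpVerdict` type and the linear list search are assumptions made for illustration only. It converts the sender's dotted-quad address to a 32-bit value and checks it against a bad list and a good list.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

typedef enum { IP_UNKNOWN, IP_GOOD, IP_BAD } IpVerdict;

/* Convert dotted-quad text such as "192.0.2.7" to a 32-bit value.
   Returns 1 on success, 0 on malformed input. */
int parse_ipv4(const char *s, uint32_t *out)
{
    unsigned a, b, c, d;
    char tail;
    /* Exactly four numbers and no trailing characters are accepted. */
    if (sscanf(s, "%u.%u.%u.%u%c", &a, &b, &c, &d, &tail) != 4)
        return 0;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return 0;
    *out = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
    return 1;
}

/* Linear search; a hardware core would compare entries in parallel. */
static int in_list(uint32_t ip, const uint32_t *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (list[i] == ip)
            return 1;
    return 0;
}

/* The bad list is consulted first, so a listed bad sender is never
   reported as good. */
IpVerdict classify_ip(uint32_t ip,
                      const uint32_t *good, size_t n_good,
                      const uint32_t *bad,  size_t n_bad)
{
    if (in_list(ip, bad, n_bad))
        return IP_BAD;
    if (in_list(ip, good, n_good))
        return IP_GOOD;
    return IP_UNKNOWN;
}
```

Holding each address as a single 32-bit word is what makes a one-step comparison possible in a 32 bit lookup engine.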
2.7.1
Addressing the Issues Through Co-design
In order to address the above mentioned issues, the first and foremost step is partitioning the system into hardware developed in the Handel C paradigm, software developed in ANSI C and the processor core developed in VHDL, taking the benefit of each to achieve higher throughput, lower latency, scalability, customization (for client specific keyword and IP settings) and a rapid development cycle. Moreover, the resulting hybrid system should be amenable to simulation and debug. The desired attributes are achieved as follows:
• Ethernet interface: The physical Ethernet interface is kept off the system, while its media access controller (MAC) is built in the FPGA in the form of a soft IP core in VHDL.
• Pop3 client: It is designed using the open source light-weight implementation of the TCP/IP protocol in ANSI C.
• Processor: The MicroBlaze, a 32 bit soft processor core designed in VHDL for Xilinx FPGAs, has been integrated on the FPGA fabric to undertake the core processing work.
• Memory: Communication between the MicroBlaze and the other subsystems is implemented through shared storage space comprising 64 MByte of DDR SDRAM.
• UART: The UART is integrated on the FPGA as a soft IP core in VHDL.
• The bad and good IP comparison mechanism for the inbound email is implemented in Handel C (described in Chap. 4) and integrated in the system as a soft IP core.
• Parsing of the inbound email to extract keywords and compare them with the Spam keywords is implemented in Handel C and integrated in the system as a soft IP core.
The FPGA chosen for the SoC integration is the Spartan-3E, while the hybrid integration interface chosen is the Xilinx Embedded Development Kit (EDK). The pure software portion, i.e. the ANSI C based modules, goes through the compile process, while the hardware in the form of soft IP cores goes through the synthesis process. Fusing these separate flows requires tool interoperability, which is provided by the Xilinx EDK tool set. The obvious advantages of this integration of soft IP cores with a software-compiled system design are ease of prototyping, better exploration of the design space, verification through co-simulation, and a common higher-level language base in the form of ANSI C and Handel C for hardware and software design. Thus the presented codesign methodology resolves the incompatibility between the hardware design methodology and the software design methodology. Moreover, it facilitates system design at a much higher level of abstraction (Fig. 2.4).
Fig. 2.3 Our framework for hardware-software codesign (blocks shown: user system-level specification, behavioral design, partition based on HW/SW, HW part, SW part, ANSI C based methodology, glue logic design, compile, synthesize, EDK based system integration, final system)
2.8
Description of System at Higher Level
The block diagram of the system at a higher level is shown in Fig. 2.4. It reveals the role of the FPGA based Anti-Spam setup in a networked environment. The figure comprises an email server 120 and clients 140, 141, 142 and so on. The device 130 is an FPGA based System on Chip for filtering Spam emails. It parses the mails passing from the email server 120 to the individual clients 140–143. The FPGA based board 130 is configured from the server end, i.e. 120, in order to filter and delete the Spam mails addressed to all the clients 140–143 or all the email accounts. However, the board can also be placed per client and configured individually so as to offer flexibility to the clients.
2.9
Resolving the System a Step Down
The detailed version of the FPGA based SoC for filtering Spam emails, 130 in Fig. 2.4, is resolved a step down in Fig. 2.5. The core processor 210 controls the entire functionality of the other subsystems, i.e. 220, 230 and 240. The processor 210 is a 32 bit RISC architecture configured through the off chip flash memory 240. The off chip memory 240 is a 64 MByte DDR SDRAM. The network interface 220 is an Ethernet 10/100 media access controller driven by the core processor 210. Block 230 is a set of three customized processors for lookup and processing of the IP, keyword and email address of the received email. These processors are respectively 32 bit for IP processing, 8 bit for keyword processing and 8 bit for email address processing. The system is designed with a ‘distributed control’ methodology: the three separate controllers parse the email concurrently to extract their respective attributes and set a flag indicating Spam mail. All the aforesaid hardware is integrated as a SoC on the Xilinx Spartan-3E FPGA (XC3S500E-4FG320C).
Fig. 2.4 Higher level schematic of the system (elements shown: Internet, 110, 120, 130 FPGA based System on Chip for Filtering Spam e-mails, 140, 141, 142, 143)
2.10
System Design
The system design is shown in Fig. 2.6. It reveals the blocks integrated on the Spartan-3E FPGA and a few off chip blocks required to facilitate communication and configuration. Of these blocks, the most important ones designed in the present work for circumventing Spam are the CAM and the Bloom filter. The CAM has been discussed elsewhere in the literature. The development of the soft IP core of the Bloom filter is discussed separately (Sect. 2.11), while the remaining soft IP cores reused in this work, and their integration to synthesize the SoC using EDK, are briefly described in the following points:
Fig. 2.5 Functional architecture of the system (blocks shown: network interface, core processor, memory, bus, customized processors with concurrent logic; labels 130, 210, 220, 230, 240, 250)
Fig. 2.6 System design
2.10.1
Microblaze Processor
There is a current design trend in the EDA industry, particularly amongst FPGA vendors, to provide pre-verified soft processor cores. Owing to the advantages of soft IP processor cores in terms of reconfigurability, customization and emulation, many manufacturers have developed such cores either for a specific FPGA family or as a third party solution. Altera has developed ARM as a hard processor core, and NIOS and NIOS II as soft IP cores [92, 93]. Similarly ATMEL and QuickLogic [41, 42] have come out with AVR and MIPS as hard processor cores respectively. Other popular examples are the PicoBlaze and MicroBlaze provided by Xilinx Inc. The main advantages of these soft processors are a higher level of design reuse to cope with the shrinking design cycle, reduced obsolescence risk with the provision of scalability just by updating the code, and increased design implementation options through design modularization, as the cores are amenable to customization. These advantages are explored in detail with different case studies in the latest book of the authors. One of the popular soft processor cores in the EDA industry is the 32-bit Reduced Instruction Set Computer (RISC) core from Xilinx known as the MicroBlaze processor core. The key features of the MicroBlaze are:
• Harvard bus architecture
• Three-stage pipeline
• 32 general-purpose registers
• Two interrupt levels and exception handling capability
• Configurable cache
• Support for optional co-processing functionality
• Optional single precision floating-point unit (FPU) (IEEE-754 compatible)
• Standardized IBM CoreConnect bus interface
It comes with several instantiable units such as a multiplier, barrel shifter, divider, FPU and cache, which can be tuned through the soft core parameters for a target application. In the present application, the MicroBlaze serves as the central processing core, handling all the inter-core communication, soft IP core management and execution of software such as the POP3 client on lwIP.
2.10.2
PLB BUS
There are many reported bus infrastructures for connecting the soft IP cores on a SoC. The importance of an on-chip bus lies not only in interconnecting the soft IP cores on the SoC but also in facilitating the protocol stack and synchronous communication between the various functional blocks. A bus protocol is used on a SoC to:
• Unambiguously identify a communication transaction through its temporal (e.g., duration and sequence of messages exchanged) and spatial (e.g., message size) characteristics.
• Determine which component may access the shared bus if multiple requests to send (or receive) data appear on the bus at the same time, i.e. the arbitration mechanism.
A detailed survey of SoC bus infrastructures may be seen in references [20–24]. Integrating the bus as a soft IP simplifies the verification problem and raises the level of abstraction at which verification is carried out. The main issues in choosing an appropriate bus architecture for a SoC, and in particular for a NoC, are shared communication along with the arbitration mechanism, low power, high throughput and scalability. From this point of view a comparison of several possible approaches was carried out, and the results are shown in a nutshell in Table 2.2. The comparison reveals that all the buses are synchronous, hierarchical and application specific, and support various transfer types as well as arbitration mechanisms. They also support handshaking, split transfer, burst transfer, pipelined transfer, configurable address and data bus widths, and a user-defined operating frequency. However, the PLB surpasses the rest for the present NoC implementation owing to its high speed (the maximum bandwidth is 256 MB/s for a 32 b PLB width, 800 MB/s for 64 b and 2.9 GB/s for 128 b) and, more importantly, its latency timer, which prevents a master from occupying the bus all the time. This enables the other soft IP cores such as the CAM and the Bloom filter to work autonomously and communicate with the MicroBlaze as and when required. The latest version of the MicroBlaze, i.e. v7.0, comes with the native PLB v4.6 bus interface, which offers a 128-bit wide data path, massively increasing the bandwidth of the system with little impact on the processor clock speed or silicon footprint.
Table 2.2 Microblaze usage summary
Modules used
• Integer multiplier (MUL)
• Additional machine status register instructions
• Pattern comparator
• I Cache 2 KB
• D Cache 2 KB
Optimization used
• Spatial optimization with lower instruction throughput
Memory management unit
• Data shadow translation look aside buffer: 4
• Instruction shadow translation look aside buffer: 2
• Access to memory management special registers: full
• Memory protection zone: 16
Debug
• Number of PC breakpoints: 1
Table 2.3 Comparison of level converters for networked applications
                  ST3232C     MAX232C
Speed             250 Kb/s    120 Kb/s
Supply current    300 mA      8 mA
Supply voltage    +3 V        +5 V
2.10.3
XPS UART Lite
The UART (Xilinx IP) core handles I/O to and from the system for configuration from the client end and debugging from the system administrator end. As shown in Fig. 2.3, it is connected to the off chip RS232 serial port through the level converter chip ST3232C. It features one transmit and one receive channel in full duplex mode, a 16-character transmit FIFO and a 16-character receive FIFO, a configurable number of data bits per character (5–8), a configurable parity bit (odd or even) and a configurable baud rate. For the present application, the UART Lite is configured for 8 bit data transfer at 57,600 baud with no parity. The UART is provided as an additional feature in this application. It adds value from the client’s viewpoint for configuring the Spam keywords and/or IPs, and it eases testing, debugging and verification for the system administrators.
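The chosen serial configuration fixes the rate at which configuration data can reach the board. The arithmetic can be sketched as follows; the function names are hypothetical, and the single stop bit is an assumption, since the stop bit count is not stated above.

```c
/* Bits on the wire per character: start + data + optional parity + stop.
   One stop bit is an assumption; the text does not state it. */
unsigned uart_frame_bits(unsigned data_bits, int parity_enabled)
{
    return 1u + data_bits + (parity_enabled ? 1u : 0u) + 1u;
}

/* Whole characters transferred per second at the given baud rate. */
unsigned uart_chars_per_second(unsigned baud, unsigned data_bits,
                               int parity_enabled)
{
    return baud / uart_frame_bits(data_bits, parity_enabled);
}
```

Under these assumptions the 8 bit, no parity, 57,600 baud setting gives 10 bits per character and therefore 5,760 characters per second.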
2.10.4
Off Chip Level Converter
The ST3232 is a 3 V powered EIA/TIA-232 and V.28/V.24 communication interface with low power requirements and high data-rate capabilities. It has a proprietary low dropout transmitter output stage providing true RS-232 performance from 3 to 5.5 V supplies. The device requires only four small 0.1 µF standard external capacitors for operation from a 3 V supply. The ST3232 has two receivers and two drivers, and is guaranteed to run at data rates of 250 Kbps while maintaining RS-232 output levels. A detailed datasheet is given in [25]. The ST3232C is chosen owing to its superior performance with respect to the MAX232C: Table 2.3 shows that it matches the high throughput and low power requirements of the intended application. It is connected to the email server through a DB9 connector.
2.10.5
XPS Ethernet Lite
Communication between the email server and the designed Spartan 3e anti-Spam boards is controlled with the help of the Ethernet Media Access Controller (EMAC)
IP core. This EMAC IP core features a PLB interface based on the PLB v4.6 specification, a memory mapped direct I/O interface to the transmit and receive dual port memory, a Media Independent Interface (MII) for connection to external 10/100 Mbps PHY transceivers, independent internal 2 KB Tx and Rx dual port memories for holding the data of one packet, and optional 4 KB ping-pong dual buffer memories for Tx and Rx. Detailed documentation is given in [26]. The core connects to an on board Ethernet physical transceiver chip, the SMSC LAN83C185. The EMAC (Xilinx IP) core is responsible for receiving all the Ethernet frames used to harvest the emails from the email server. It also communicates the TCP/IP commands from the lwIP stack.
2.10.6
SMSC LAN83C185 High Performance Single Chip Low Power 10/100 Mbps Ethernet Physical Layer Transceiver (PHY)
The essential features of this device from the current application point of view are:
• Fully compliant with IEEE 802.3/802.3u standards
• 10BASE-T and 100BASE-TX support
• Supports auto-negotiation and parallel detection
• Automatic polarity correction
• 802.3u compliant register functions
• Vendor specific register functions
• Comprehensive power management features
• General power-down mode
• Energy detect power-down mode
• Single +3.3 V supply with 5 V tolerant I/O
• Low power consumption
A detailed datasheet is given in [80].
2.10.7
XPS Timer
The XPS Timer is a 32-bit timer module that is connected to the PLB bus as a slave. With the PLB interface it can be configured in a byte-enable mode. It comprises two programmable interval timers with interrupt, event generation and event capture capabilities, a configurable counter width, and one Pulse Width Modulation (PWM) output. It also has a freeze input for halting the counters during software debug. A detailed datasheet of the XPS Timer is available in [81]. It is used in the current application to constantly update the TCP timers. The functional ‘time out’ feature required in TCP/IP is implemented using the timer, as it generates an interrupt after the count elapses. More specifically this functionality
is achieved in conjunction with the EMAC. An acknowledge frame is timed out by the timer if it is not received by the EMAC within a stipulated time after the transmit frame is sent. The non-receipt of the acknowledge frame causes the timer to raise an interrupt to the interrupt controller, whose service routine resends the frame until an acknowledge message is received. If no acknowledge message is received after 10 unsuccessful attempts, the system notifies the user of a send error corresponding to a server time out or an unreachable server.
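The resend-until-acknowledged behaviour with a limit of 10 attempts can be sketched as a simple loop. This is only an illustration: the `SimLink` structure and the `sim_send` and `sim_wait_ack` helpers are hypothetical stand-ins for the EMAC transmit path and the timer interrupt, which in the real system are driven by the interrupt service routine rather than by polling.

```c
#define MAX_ATTEMPTS 10   /* retry limit stated in the text */

/* Hypothetical simulated link: the first acks_lost acknowledgements
   never arrive, so the corresponding waits time out. */
typedef struct {
    int acks_lost;    /* acknowledgements still to be dropped */
    int frames_sent;  /* number of transmissions so far       */
} SimLink;

/* Stand-in for handing the frame to the EMAC. */
static void sim_send(SimLink *link)
{
    link->frames_sent++;
}

/* Stand-in for waiting on the timer: returns 1 if the acknowledge
   frame arrived in time, 0 if the timer expired first. */
static int sim_wait_ack(SimLink *link)
{
    if (link->acks_lost > 0) {
        link->acks_lost--;
        return 0;
    }
    return 1;
}

/* Send a frame and resend it on every timeout.
   Returns the attempt number that succeeded, or -1 after
   MAX_ATTEMPTS failures (the caller then reports a send error). */
int send_with_retry(SimLink *link)
{
    for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
        sim_send(link);
        if (sim_wait_ack(link))
            return attempt;
    }
    return -1;
}
```

A return value of -1 corresponds to the send error reported for a server time out or an unreachable server.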
2.10.8
XPS Interrupt Controller
The XPS Interrupt Controller (XPS INTC) concentrates multiple interrupt inputs from peripheral devices into a single interrupt output to the system processor. The registers for checking, enabling and acknowledging interrupts are accessed through a slave interface for the Processor Local Bus (PLB V4.6). The number of interrupts and other aspects can be tailored to the target system [82]. The XPS INTC is connected as a 32-bit slave on the 32 bit PLB V4.6 bus, and although it has 32 configurable interrupt inputs, only three are used in the present application. These interrupts are invoked on conditions pertaining to the Ethernet, the timer and the UART. The single interrupt output is given to the MicroBlaze, which processes the interrupts according to their priority, determined by vector position. As already discussed in the previous point, the timer interrupts on acknowledge timeouts, indicating an Ethernet frame error and the need for retransmission. The EMAC interrupts on incoming Ethernet frames that require processing for Spam detection. The XPS UART Lite interrupts whenever configuration from the client end or debugging from the system administrator end is required.
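Priority by vector position can be pictured as a scan over the pending-interrupt word. The sketch below is an assumption-laden illustration, not the controller's implementation: it assumes that the lowest bit position carries the highest priority, and the bit assignment of the three sources is hypothetical.

```c
#include <stdint.h>

/* Hypothetical assignment of the three sources to input positions. */
enum { IRQ_TIMER = 0, IRQ_EMAC = 1, IRQ_UART = 2 };

/* Return the position of the highest-priority pending interrupt,
   assuming bit 0 has the highest priority, or -1 if none is pending. */
int highest_priority_irq(uint32_t pending)
{
    for (int bit = 0; bit < 32; bit++)
        if (pending & (UINT32_C(1) << bit))
            return bit;
    return -1;
}
```

With this assumed ordering, a pending timer interrupt would be serviced before a simultaneous EMAC or UART interrupt.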
2.10.9
Double Data Rate (DDR) Synchronous DRAM (SDRAM) Controller
MPMC is a fully customizable memory controller that supports SDRAM/DDR/DDR2 memory. It provides access to memory for 1–8 ports, where each port can be chosen from a set of Personality Interface Modules (PIMs) that permit connectivity into the MicroBlaze processor using the PLBv4.6 bus. MPMC supports the Soft Direct Memory Access (SDMA) controller that provides full-duplex, high-bandwidth, LocalLink interfaces into memory. Additionally, MPMC supports optional Error Correcting Code (ECC) and Performance Monitoring (PM). The rationale behind the selection of the MPMC interface for the DDR SDRAM is the better SDRAM latency. The uniform cycle count offers scalability in the event of high network traffic. It also performs well with respect to arbitrating cache requests, resulting in better hit ratios and throughput for similar system conditions (Figs. 2.7–2.9).
Fig. 2.7 Base configuration of the DDR SDRAM
Fig. 2.8 Memory interface details revealed by EDK
Fig. 2.9 Round robin arbitration algorithm shown by EDK
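The round robin arbitration named in Fig. 2.9 can be sketched generically as follows. This is a textbook round robin grant function written for illustration, not the MPMC arbiter itself; the function name and the request bit mask are assumptions, while the limit of eight ports follows the port count given above.

```c
#include <stdint.h>

#define N_PORTS 8   /* MPMC serves 1-8 ports according to the text */

/* requests: bit p is set when port p wants the memory.
   last:     port granted in the previous round (-1 before the first).
   Returns the next port to grant, searching upwards from last + 1
   and wrapping around, or -1 when no port is requesting. */
int rr_grant(uint8_t requests, int last)
{
    for (int i = 1; i <= N_PORTS; i++) {
        int port = (last + i) % N_PORTS;
        if (requests & (1u << port))
            return port;
    }
    return -1;
}
```

Because the search always starts just after the last granted port, no requesting port can be starved by a busier neighbour.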
2.10.10
Off Chip DDR SDRAM MT46V32M16
The Micron (MT46V32M16) 64 Mbyte DDR SDRAM chip having 16-bit data interface is accessed through a DDR SDRAM IP Memory Controller (MPMC core discussed in the previous point) within the MicroBlaze. The above mentioned 512 Mb DDR SDRAM is a high-speed CMOS, dynamic random-access memory containing 536,870,912 bits, configured internally as a quad-bank DRAM. It uses a double data rate architecture to achieve high-speed operation. The double data rate architecture is essentially a 2n-prefetch architecture with an interface designed to transfer two data words per clock cycle at the I/O pins. A single read or write access for the 512 Mb DDR SDRAM effectively consists of a single 2n-bit wide, one-clock-cycle data transfer at the internal DRAM core and two corresponding n-bit wide, one-half-clock-cycle data transfers at the I/O pins [84].
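The capacity figures quoted above are mutually consistent, as the short calculation below shows. It assumes that the part number suffix 32M16 denotes 32 M locations of 16 bits each, which is how the 16-bit data interface and the 512 Mb total are read here.

```latex
\begin{align*}
32\,\mathrm{M} \times 16\ \text{bit} &= 2^{25} \times 2^{4}\ \text{bit} = 2^{29}\ \text{bit} = 536{,}870{,}912\ \text{bit} \\
2^{29}\ \text{bit} \div 8 &= 2^{26}\ \text{byte} = 64\ \text{MByte}
\end{align*}
```

With $n = 16$ for the 16-bit interface, the 2n-prefetch corresponds to one 32-bit access at the internal DRAM core per clock cycle and two 16-bit transfers at the I/O pins.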
2.11
Development of Soft IP Core of Bloom Filter
“Set” data structures are heavily used in most web based applications such as Web caches, spellcheckers and databases. The Bloom filter, introduced in 1970 and based on the set membership problem, is a lossy summary technique that has been found very useful in addressing such problems.
2.11.1
Justifying Bloom Filters for the Keyword Parsing
The most popular approach adopted by many researchers for the table lookup of Spam keywords is the Bayesian filter. However, such filters pose bottlenecks in terms of memory access rates, and hence the classification speed suffers. A survey in [87] of the performance of various Spam filters reveals that the widely used SpamAssassin, based on Bayesian theory, is limited to speeds of only about 100 Kb/s. This speed is clearly not acceptable in view of the resulting network latency. The Bloom filter is a memory-efficient data structure that, at the expense of a little precision, yields considerable memory and run-time savings. Bloom filters work on the principle of chunking the input strings into fixed sizes n, building an n × n template based on the hash function in hardware on the FPGA, and thereafter comparing the hashed input strings with the table lookup to produce a signature.
2.11.2
Theoretical Foundations of Bloom Filter
The basic foundation of the Bloom filter is based on the hashing functions. Pioneering work on the hash functions is done by Carter/Wegman, who defines them mathematically as: In the context of the present work the Bloom Filters is used as follows. A lexicon of Spam keywords is defined as a set K = {k1, k2, k3, …, kn}, on an universe U implemented by using an array of 256 × 8. For an inbound mail M, we want to seek answer whether a Spam keyword in M K
(2.1)
The Bloom filter answers the above query with good space and time efficiency. A naïve set implementation stores each element of K as an m-bit string; this requires O(m × n) storage space and O(m log n) time to detect a particular element of K. The Bloom filter, in exchange for its compactness, has a finite probability of error. The two kinds of error in a membership query are described below (as noted in the conclusion of this chapter, a query on the Bloom filter can return a false positive but never a false negative).
False positives, i.e. M ∉ K but the filter still reports M ∈ K
(2.2)
False negatives, i.e. M ∈ K but the filter still reports M ∉ K
(2.3)
The Bloom filter has essentially evolved from hashing theory, so it is worthwhile to review the details of hash functions. Three functions are performed by the Bloom filter:
1. Add keywords
2. Search keywords
3. Update the threshold according to the matches found
Applications of the Bloom filter in computer networking are widely discussed in the literature [90].
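The add and search functions of a Bloom filter can be illustrated with a minimal software sketch in C. This is not the Handel C core developed in this work: the vector length of 256 bits, the use of three hash functions and the seeded FNV-1a style hash are assumptions chosen only for illustration, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOOM_BITS   256u  /* assumed bit vector length m */
#define BLOOM_HASHES 3u    /* assumed number of hash functions k */

typedef struct {
    uint8_t bits[BLOOM_BITS / 8];
} bloom_t;

/* FNV-1a style hash; a different seed gives a different hash function. */
static uint32_t bloom_hash(const char *key, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    for (const unsigned char *p = (const unsigned char *)key; *p; p++) {
        h ^= *p;
        h *= 16777619u;
    }
    return h % BLOOM_BITS;
}

/* Clear the bit vector. */
void bloom_init(bloom_t *b)
{
    assert(b != NULL);
    memset(b->bits, 0, sizeof b->bits);
}

/* Add a keyword: set the k bits selected by the k hash functions. */
void bloom_add(bloom_t *b, const char *key)
{
    assert(b != NULL && key != NULL);
    for (uint32_t i = 0; i < BLOOM_HASHES; i++) {
        uint32_t pos = bloom_hash(key, i);
        b->bits[pos / 8] |= (uint8_t)(1u << (pos % 8));
    }
}

/* Search a keyword: returns 1 if all k bits are set (member, or a
 * false positive), 0 if any bit is clear (definitely not a member). */
int bloom_query(const bloom_t *b, const char *key)
{
    assert(b != NULL && key != NULL);
    for (uint32_t i = 0; i < BLOOM_HASHES; i++) {
        uint32_t pos = bloom_hash(key, i);
        if (!(b->bits[pos / 8] & (1u << (pos % 8))))
            return 0;
    }
    return 1;
}
```

A keyword that has been added is always reported as present, while a keyword that was never added may occasionally be reported as present when other keywords happen to set all of its bits. This is the false positive case of (2.2).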
2.11.3
Hash Function
A hash function converts an input from a large domain into an output in a smaller range (the hash value, often a subset of the integers) (Fig. 2.10). The choice of hash function is based on the following factors:
• The deviation in the domain of its inputs
• The range of its outputs
• The variation of patterns and similarities in the input data affecting the output data
• The number of collisions in the expected domain of values it has to deal with, or the uniqueness with which it identifies most of the input patterns
Fig. 2.10 Hash function mapping variable length keywords to fixed length vector
Table 2.4 Comparison of hash functions
Name            Size-1000   Speed       Inline (a)   Collide-1000
Additive        1009        5n+3        n+2          +806.02
Rotating        1009        6n+3        2n+2         +1.24
One-at-a-time   1024        9n+9        5n+8         −0.05
Bernstein       1024        7n+3        3n+2         +1.69
Pearson         1024        12n+5       4n+3         +1.65
CRC             1024        9n+3        5n+2         +0.07
Generalized     1024        9n+3        5n+2         −1.83
Universal       1024        52n+3       48n+2        +0.20
Zobrist         1024        10n+3       6n+2         −0.03
Paul Hsieh’s    1024        5n+17       N/A          +1.12
lookup3.c       1024        5n+20       N/A          −0.08
MD4             1024        9.5n+230    N/A          +0.73
(a) This is the speed assuming the hash is inlined in a loop that has to walk through all the characters anyway, such as a tokenizer. Such a loop doesn’t always exist, and even when it does, inlining isn’t always possible
In order to select a proper hash function we have compared the hash functions reported in the literature (Table 2.4) [89]. We have selected the CRC hash function because it offers competitive speed among the hash functions of comparable size together with a low collision figure (+0.07 in Table 2.4). The choice rests on the hypothesis that the inherent Ex-OR mechanism of the CRC will maintain this speed in the FPGA paradigm.
Our Handel C based CRC hash function works in the following steps:
• Parse the input content block
• Calculate the hash value
• Reset the internal state for a new calculation on the next content keyword
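The three steps can be mimicked in software with a bitwise CRC. The sketch below is an assumption-laden illustration in C rather than the authors' Handel C core: it uses the common CRC-8 polynomial x^8 + x^2 + x + 1 (0x07) with a zero initial state, chosen here because an 8-bit result indexes a 256-entry table directly. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internal state of the CRC-8 hash (assumed polynomial 0x07, init 0x00). */
typedef struct {
    uint8_t state;
} crc8_t;

/* Step 3: reset the internal state before the next keyword. */
void crc8_reset(crc8_t *c)
{
    assert(c != NULL);
    c->state = 0x00;
}

/* Steps 1 and 2: parse the content block byte by byte and update the
 * hash value using shift and Ex-OR operations only. */
void crc8_update(crc8_t *c, const uint8_t *data, size_t len)
{
    assert(c != NULL && (data != NULL || len == 0));
    for (size_t i = 0; i < len; i++) {
        c->state ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (c->state & 0x80)
                c->state = (uint8_t)((c->state << 1) ^ 0x07);
            else
                c->state = (uint8_t)(c->state << 1);
        }
    }
}

/* Convenience wrapper: hash one keyword to an index in 0..255. */
uint8_t crc8_hash(const char *keyword, size_t len)
{
    crc8_t c;
    crc8_reset(&c);
    crc8_update(&c, (const uint8_t *)keyword, len);
    return c.state;
}
```

The inner loop consists only of shifts and Ex-OR operations, which is the property that makes a CRC attractive for a hardware realization.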
2.11.4
Deciding the Size and Number of Hash Functions in Our Bloom Filter Implementation
There are various trade-offs in deciding the size of the Bloom filter. Here, the bit vector length and the number of keys stored in the filter decide the false-positive rate. The larger the bit vector, the lower the probability that all k bits being checked will be set to logic ‘1’ unless the key truly exists in the filter. The correlation between the number of hash functions and the false-positive rate is more delicate. Using fewer hash functions implies little discrimination between keys. However, a large number of hash functions leads to a very dense Bloom filter, which increases the probability of collisions. A rule of thumb is that there should be more than two hash functions. A detailed mathematical formulation in this regard is given widely in the literature [38]. Here we take a set of n keywords in the body of the email (E). The Bloom filter describes the membership information of E using a bit vector V of length m and k hash functions, i.e.
$$ h_1, h_2, \ldots, h_k \quad \text{with} \quad h_i : X \rightarrow \{1, \ldots, m\} $$
(2.4)
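We are now interested in the false positive, i.e. the case of keyword collision. It is calculated from the probability p0 that a particular bit of V is still ‘0’ after all n keywords have been hashed by the k hash functions:
$$ p_0 = \left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m} $$
(2.5)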
The above formula is based on the assumption of a perfect hash function. The CRC function reportedly has a collision figure of +0.07 for 1,000 entries (Table 2.4). We are taking a case of only 300 Spam and Ham keywords, so the CRC can be treated as a perfect hash function here. As derived in [38], the false-positive probability then becomes:
$$ p_{err} = (1 - p_0)^k = \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^k \approx \left(1 - e^{-kn/m}\right)^k $$
(2.6)
The parameters to be decided in the above formula are:
1. Deciding the number of keywords (n): Here we have taken a realistic number of Spam and Ham keywords as 300 each. So n becomes 300 (owing to the parallel architecture of the Ham and Spam lookup tables).
2. Deciding the bit vector length (m): The false positive rate is dependent on both the ratio m/n and the number of hash functions k. However, the dependence on k is not so direct. The formula for m/n (bits per entry) is:
$$ \frac{m}{n} = \frac{-k}{\ln\left(1 - e^{\ln(p_{err})/k}\right)} $$
(2.7)
With one parsing entry at a time and a false-positive probability of 0.003, i.e. one in 333 keywords, the mathematical formulation is best described by resorting to the ceiling function, which maps a real number to the next largest integer (mathematically, ceiling(x) is the smallest integer not less than x). The formula for computing the bit vector length for a fixed number of keywords n and false-positive probability p can be given as:
m = ceil((n * log(p)) / log(1.0 / pow(2.0, log(2.0))));
(2.8)
3. Number of hash functions (k): The formula is
$$ k = g \, \frac{m}{n} $$
(2.9)
where g = ln(2) gives the minimum false-positive rate. However, in practice computing this parameter exactly results in a large number of hash functions, which increases the computational overhead. Therefore we have used a rounding function to compute a realistic value of k as follows:
k = round(log(2.0) * m / n);
(2.10)
Based on points 1, 2 and 3, for a non-zero probability of false positives and a realistic value of the bit vector length, the architecture is formulated as a multilevel counting Bloom filter with 8-bit entries and a vector length of 256. The Bloom filter functionality in this work goes along the following lines:
For Mapping
1. Input the variable-length string.
2. Pass the string to the hash function for key generation.
3. Map the key into the hash table using the map function.
4. Repeat steps 1–3 until all the strings are mapped.
For Searching
1. Input the variable-length string.
2. Pass the string to the hash function for key generation.
3. Find the index using the map function.
4. Look into the index for the key; if a match is found, increment the threshold value.
5. Repeat steps 1–4 until the end of the mail.
The results of the Bloom filter realization are discussed at the end of the chapter (Figs. 2.11–2.14).
Fig. 2.11 Variation of false positive rate as a function of m/n (axes: false positive probability versus m/n; curves for g = 2 and g = log(2))
Fig. 2.12 Variation of size of bloom filter as a function of the error rate (axes: bits per entry versus error rate; curves for K = 2, K = 4 and K = 32)
Fig. 2.13 Varying value of false positive as a function of number of hash functions (axes: false positive rate versus number of hash functions)
Fig. 2.14 Variation of false positive with number of hash functions at a given m/n ratio (axes: false positive rate versus K; curves for m/n = 7 and m/n = 10)
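The mapping and searching steps listed earlier in this section can be sketched in software as a counting Bloom filter that scans a mail word by word and accumulates the threshold value. The sketch below is in C and is not the Handel C implementation: the 256 counters of 8 bits follow the dimensions stated in the text, but the two byte-wise hash functions (djb2 style and sdbm style), the case folding and all names are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CBF_SLOTS  256  /* vector length stated in the text */
#define CBF_HASHES 2    /* assumed number of hash functions */

typedef struct {
    uint8_t count[CBF_SLOTS];  /* one 8-bit counter per slot */
} cbf_t;

/* Two simple illustrative hashes over a case-folded word. */
static unsigned cbf_index(const char *s, size_t len, unsigned which)
{
    unsigned long h = which ? 0UL : 5381UL;
    for (size_t i = 0; i < len; i++) {
        unsigned long c = (unsigned long)tolower((unsigned char)s[i]);
        h = which ? c + (h << 6) + (h << 16) - h  /* sdbm style */
                  : (h << 5) + h + c;             /* djb2 style */
    }
    return (unsigned)(h % CBF_SLOTS);
}

/* Mapping: hash the keyword and increment its counters (saturating). */
void cbf_map(cbf_t *f, const char *word)
{
    assert(f != NULL && word != NULL);
    size_t len = strlen(word);
    for (unsigned i = 0; i < CBF_HASHES; i++) {
        uint8_t *c = &f->count[cbf_index(word, len, i)];
        if (*c < UINT8_MAX)
            (*c)++;
    }
}

/* Searching: a word matches if all of its counters are non-zero. */
int cbf_search(const cbf_t *f, const char *word, size_t len)
{
    for (unsigned i = 0; i < CBF_HASHES; i++)
        if (f->count[cbf_index(word, len, i)] == 0)
            return 0;
    return 1;
}

/* Scan the mail word by word; each match increments the threshold value. */
unsigned cbf_score_mail(const cbf_t *f, const char *mail)
{
    assert(f != NULL && mail != NULL);
    unsigned score = 0;
    const char *p = mail;
    while (*p) {
        while (*p && !isalnum((unsigned char)*p))
            p++;
        const char *start = p;
        while (*p && isalnum((unsigned char)*p))
            p++;
        if (p > start && cbf_search(f, start, (size_t)(p - start)))
            score++;
    }
    return score;
}
```

The returned score plays the role of the threshold value: the mail is treated as Spam when the score exceeds a preset limit. Because of false positives the score can be slightly higher than the true number of Spam keywords in the mail, but it is never lower.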
2.12
Presenting System Design of Purely Software Modules
The POP3 client is implemented in this work by using the light-weight Internet protocol (LWIP) stack. The LWIP stack was pioneered by Adam Dunkels of the Swedish Institute of Computer Science in the year 2001 (Fig. 2.15) [85].
Fig. 2.15 Revealing the role of LWIP stack in the system (block labels: Implemented as software in ANSI C; POP-3 Client; LWIP Stack; EDK Libraries; XPS Timer; XPS UART; Microblaze; XPS Ethernet; Soft IP Cores)
Several options were available for the on-chip TCP/IP stack. One of the popular solutions is the Treck TCP/IP stack, which can also be used with the Xilinx EDK. However, ‘Treck’ is better tuned to the Virtex FPGA. Another option is SynDEx, which can generate a sequence of generic executives for complete TCP/IP-based communication. However, it does not give storage efficiency and, being basically a kernel, most of its modules would remain unutilized in the present application. A few more reported TCP/IP stacks include the native uC/TCP-IP stack and XMK (eXtreme Minimal Kernel), which is an open-source real-time kernel. But all of them are tiny operating systems, whose functionality is not required in the present application. Another potential solution is the network library for embedded processors known as libXilNet. However, it is based on the Socket Application Programming Interface (API) functions, is no longer provided with the EDK, and has been superseded by the Light Weight IP (lwIP) library used in the current application. One of the striking features due to which the LWIP is chosen is its availability under the BSD license. The LWIP is now being developed and updated worldwide as open source software and is distributed by Leon Woestenberg. Some of the features of the LWIP from the current application viewpoint are:
• IP (Internet Protocol) including packet forwarding on multiple interfaces
• ICMP (Internet Control Message Protocol)
• TCP (Transmission Control Protocol)
• Raw API interface support for applications
• Sockets API interface support for applications
Although it supports the raw as well as the sockets API interface, the former is used in this application owing to its advantages such as simplicity and the callback approach. The raw API provided by lwIP is the lowest-level interface of the stack. It has two modes of operation: Event and Callback. The EDK port uses the raw mode in the present application since it offers higher performance and lower memory overhead as compared to the socket API mode. The portability advantage of the socket API is outweighed by its inefficiency and performance bottlenecks, which are crucial from the network throughput point of view. The raw API is callback-based, and although it does not offer portability, this is not very significant in this application. Detailed documentation of the LWIP stack is given in the Xilinx documentation note “lwIP 1.3.0 Library (v1.00.b)” released on April 15, 2009 [86].
2.13
Integrating of the Hardware-Software Modules Using EDK
Embedded Development Kit (EDK) is a proprietary EDA software suite of Xilinx Inc. that provides the framework for integration of the hardware and software on the FPGA platform. The kit has built-in support for the tools, documentation and soft IPs required for designs with embedded IBM PowerPC™ hard processor cores and/or Xilinx MicroBlaze™ soft processor cores. A generalized design flow for designing embedded systems using the EDK has been covered by Xilinx in its application note xxxx. The authors have also reported a simplified flow in their latest book [13]. The customized flow for the present application is shown in Figs. 2.15 and 2.16.
2.14
Setting the POP3 Client and Describing Overall Working of the System
The pseudo code of the ANSI C program for setting up the POP3 client, given below, exemplifies its working.
******************************************************************
Main Program starts here
  Declaration of Xilinx Header Files
  Declaration of LWIP Header files
  Defining lower 6 bits of MAC
  Global memory initialization for storing the MAC table
  Timer Initialization
  LWIP variables definitions: IP, Gateway, Subnet Mask
  Enable and initialize cache
  Enable MicroBlaze interrupts
  Set the Timer to interrupt for every 100 ms
    /* set the number of cycles the timer counts before interrupting */
    /* 100 MHz clock => 0.01 us for 1 clk tick. For 100 ms, 10000000 clk ticks need to elapse */
  Resetting and starting the timer
  System Initialization
    Clears the structure where runtime statistics are gathered
    Module for Easy Configuration setting
    Memory i.e. heap size initialization
    Reserve buffer i.e. memory pools initialization
    pbuf memory pool initialization for packet storage
  Setting MAC, gateway and IP address
  TCP Initialization
  Starting the application
  Indefinite Loop
    Capturing the packet before the count in the timer is elapsed
    Delivering the same to the application routine
    In case the packet is not received before the maximum timer setting then
      Generate an interrupt to MicroBlaze
Main Program Ends Here
******************************************************************
******************************************************************
Application program starts here
  Defining Port 110 for POP3 server
  Connect to the server
  Call back function for receiving packet; Generate error alert if failed
  Extract the data from the packet
  Checking the TCP authentication by following the sequence
    Checking +OK from server
    Sending the username by client
    Checking +OK from server
    Sending the password by client
    Checking +OK Logged in from server
  Sending the ‘STAT’ command by the client
  Getting the number of mails and their total memory utilization from server
  Repeat till the number of mails exhaust
    Retrieve the mail
    Invoking the ‘IP’ Checking through the CAM filter
    Invoking the Bloom filter for content parsing
    Generate the threshold for the mail
    Issue ‘Delete’ in case the threshold value exceeds the set limit
  Continue till the last mail
******************************************************************
Fig. 2.16 Design flow for the Xilinx EDK adopted in the present work (block labels, Embedded Hardware: selecting the target Spartan 3e using Base System Builder Wizard; adding peripherals from IP Catalogue; importing Handel C IPs using ‘Import Peripheral Wizard’; inserting ChipScope Pro for debugging; generating the bitstream; configuring the FPGA. Embedded Software: generating libraries and drivers with LibGen pertaining to selected IPs; creating the software in ANSI C and debugging with XPS; compiling the software using gcc; generating executable in elf; connect to the Spartan 3e using XMD; dumping in the DDR SDRAM. Hardware-Software Integration: Spartan 3e based system with built in functionality for circumventing SPAM)
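The POP3 dialogue in the application pseudo code (greeting, username, password, ‘STAT’, then retrieve and conditionally delete each mail) can be expressed as a small state machine that consumes one server reply at a time and produces the next client command. The sketch below is in C and deliberately leaves out the LWIP calls, so it is not the program used in this work. The state names, function names and the buffer-based interface are assumptions for illustration, while the command strings USER, PASS, STAT, RETR and DELE and the +OK status indicator are those of the POP3 protocol.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum {
    POP3_WAIT_GREETING,  /* waiting for the server's initial +OK */
    POP3_WAIT_USER_OK,   /* USER sent, waiting for +OK */
    POP3_WAIT_PASS_OK,   /* PASS sent, waiting for +OK */
    POP3_WAIT_STAT,      /* STAT sent, waiting for "+OK count size" */
    POP3_READY,          /* logged in, mail count known */
    POP3_ERROR
} pop3_state_t;

typedef struct {
    pop3_state_t state;
    int mail_count;      /* number of mails reported by STAT */
    long total_octets;   /* total mailbox size reported by STAT */
} pop3_session_t;

/* Feed one server reply line. The next command to send is written to
 * out (empty string if there is nothing to send). Returns 0 on success
 * and -1 on a negative or malformed reply. */
int pop3_step(pop3_session_t *s, const char *reply,
              const char *user, const char *pass,
              char *out, size_t out_len)
{
    assert(s != NULL && reply != NULL && out != NULL && out_len > 0);
    out[0] = '\0';
    if (strncmp(reply, "+OK", 3) != 0) {
        s->state = POP3_ERROR;
        return -1;
    }
    switch (s->state) {
    case POP3_WAIT_GREETING:
        snprintf(out, out_len, "USER %s\r\n", user);
        s->state = POP3_WAIT_USER_OK;
        return 0;
    case POP3_WAIT_USER_OK:
        snprintf(out, out_len, "PASS %s\r\n", pass);
        s->state = POP3_WAIT_PASS_OK;
        return 0;
    case POP3_WAIT_PASS_OK:
        snprintf(out, out_len, "STAT\r\n");
        s->state = POP3_WAIT_STAT;
        return 0;
    case POP3_WAIT_STAT:
        if (sscanf(reply, "+OK %d %ld",
                   &s->mail_count, &s->total_octets) != 2) {
            s->state = POP3_ERROR;
            return -1;
        }
        s->state = POP3_READY;
        return 0;
    case POP3_READY:
        return 0;
    default:
        return -1;
    }
}

/* Command to retrieve mail number msg. */
void pop3_retr(int msg, char *out, size_t out_len)
{
    snprintf(out, out_len, "RETR %d\r\n", msg);
}

/* Command to delete mail number msg, issued only if its threshold
 * value (score) exceeds the set limit; otherwise out is left empty. */
void pop3_dele_if_spam(int msg, unsigned score, unsigned limit,
                       char *out, size_t out_len)
{
    if (score > limit)
        snprintf(out, out_len, "DELE %d\r\n", msg);
    else
        out[0] = '\0';
}
```

In a callback-based design, a routine of this kind would be called from the receive callback with the text extracted from each incoming packet, and the returned command would be written back to the connection.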
2.15
Conclusion
This chapter presented a hardware-software codesign framework for realizing the anti-Spam solution using a Spartan 3e based FPGA. It exemplifies the integration of various soft IP cores for realization of the target system using the Xilinx EDK. Our modified hardware-software codesign framework was exploited for the integration of predesigned, preverified IP cores, e.g. MicroBlaze and other peripherals such as Ethernet and timer, to achieve the design goals without the pain of reinventing the wheel.
The main contribution of this chapter is the successful design of the Bloom filter. As shown in this chapter, the Bloom filter is a space-efficient probabilistic data structure used to test whether or not an element is a member of a set. It stores a set of signatures compactly by computing multiple hash functions on each member of the set. The answer to a query checking the membership of a particular string in a database of strings can be a “false positive”, but never a “false negative”. From the efficient realization point of view, there are various design trade-offs for the Bloom filter, such as the number of hash functions used (which drives the computational overhead), the size of the filter and the error (collision) rate. By varying the key design parameters, we could graphically conceptualize the efficient design of the Bloom filter for the Spam and Ham keyword matching application. With the help of various mathematical formulations and approximations, and subsequently the graphs, the trade-offs are resolved as illustrated in Figs. 2.11–2.14. Finally, we described the integration of the ANSI C based POP3 client implementation in the system. A full listing of the pseudo code illustrates the handshaking of the various modules and the anti-Spam strategy built into the system.
Chapter 3
Analog Front End and FPGA Based Soft IP Core for ECG Logger
Abstract This chapter presents an FPGA-based ECG system with a telemonitoring facility. There are several methods of recording and transmitting ECG signals: classical recording in health centers, ambulatory ECG recording, and telemetry monitoring of the patient in and around the medical center. The system presented in this chapter has a provision to transmit the ECG data through a serial modem for diagnosis by an expert cardiologist. In order to facilitate this, a communication protocol is designed and implemented in VHDL. The objective of this work was to study modem-based transmission of the ECG over telephone lines. Using this system, the ECG of the patients was successfully transmitted to expert medical professionals for immediate diagnosis. The design is implemented on a Xilinx Spartan 3E FPGA. The extent of the resources used indicates plenty of room for incorporating additional functionality, which has been integrated and described elsewhere by the authors [96]. The system implemented here, used with a control monitoring center staffed by trained medical personnel, could enable round-the-clock monitoring of a few thousand patients. With a pre-paid service, patients might get economic benefits and a more efficient way of monitoring than with classical recording of the ECG signal.
3.1
Prior Art
The integration of an electrocardiogram (ECG) device into a chip is already well known in the field of implanted devices, such as pacemakers. For noninvasive electrocardiology, this approach has not been used on a broad commercial scale [97]. However, there are many references reporting embedded setups for ECG applications. Dimopoulos et al. [98] have recently reported a highly efficient embedded system, implemented entirely in reconfigurable hardware, for the extraction of ECG measurement parameters and the recognition of the normal ECG. The entire process takes place on an FPGA and is based on the syntactic pattern recognition approach. The underlying model for this system is that of Attribute Grammars (AG), whose descriptive power allows the concurrent recognition and measurement of the input ECG. The proposed generic platform for syntactic pattern recognition applications uses the fastest parallel Context Free Grammar parsing algorithm in the literature. The main advantage of such an embedded ECG setup is the ability to diagnose and identify cardiac arrhythmias and to evaluate the effects of drugs.
R.K. Kamat et al., Harnessing VLSI System Design with EDA Tools, DOI 10.1007/978-94-007-1864-7_3, © Springer Science+Business Media B.V. 2012
A recent paper [98] presents the design of an ECG-processing System-on-Chip (SoC), which incorporates an ARM922T hard macrocell as its processor core. This SoC takes the ECG signals as inputs and detects the positions of the QRS complexes. The architecture of this SoC and the associated QRS detection algorithm are discussed in the paper. Another paper [99] reports a novel ECG biochip solution leveraging the computational horsepower of many concurrent DSP cores to process ECG data in real time. This solution paves the way for novel healthcare delivery scenarios (e.g., mobility) and for accurate diagnosis of heart-related diseases. The authors have described the design methodology for the MPSoC and explored the configuration space, looking for the most effective solution in terms of performance and energy. The same group of researchers has extended this research with a focus on multiprocessor system-on-chip (MPSoC) architectures for real-time analysis of the human ECG, as a hardware/software (HW/SW) platform offering an advance relative to state-of-the-art solutions. They have reported it as a relevant biomedical application with a good potential market, since heart diseases are responsible for the largest number of yearly deaths. Hence, it is a good target for an application-specific SoC and HW/SW co-design. They have investigated a symmetric multiprocessor architecture based on ST Microelectronics VLIW DSPs that processes 12 lead ECG signals in real time. This architecture improves upon state-of-the-art SoC designs for ECG analysis in its ability to analyze the full 12 leads in real time, even at high sampling frequencies, and in its ability to detect heart malfunction over the whole ECG signal interval. They have explored the design space by considering a number of hardware and software architectural options.
There are many developments on the ECG SoC for wireless applications. Park et al. [101] have reported the development of a new class of miniature, ultra-low-noise capacitive sensor that does not require direct contact with the skin and has performance comparable to gold-standard ECG electrodes. Their paper presents a description and evaluation of a wireless version of a system based on these ECG sensors. They have used a wearable, ultra-low-power wireless sensor node called Eco. Experimental results show that the wireless interface adds minimal size and weight to the system while providing reliable, untethered operation. Shukri et al. [102] have reported a microcontroller-based underwater acoustic telemetry system for digital transmission of the ECG. The system is designed for real-time, through-water transmission of data representing any parameter, and it was used initially for transmitting, in multiplexed format, the heart rate, breathing rate and depth of a diver using self-contained underwater breathing apparatus (SCUBA). Here, it is used to monitor cardiovascular reflexes during diving and swimming. The programmable capability of the system provides an effective solution to the problem of transmitting data in the presence of multipath interference.
An important feature of that paper is a comparative performance analysis of two encoding methods, Pulse Code Modulation (PCM) and Pulse Position Modulation (PPM). Hu et al. [102] have reported the design procedure and results of tele-healthcare in nursing homes through RFID-based wireless sensor networks. Their research in this field included medical sensor design, signal transmission, medical privacy and security, ECG data mining, and so on. The results showed the feasibility of applying wireless sensor networking to medical monitoring anytime and anywhere. Galjan et al. have reported a portable, battery-powered 3-channel ECG system. This system can be operated for up to 10 days on two standard AA batteries. It acquires three ECG signals with 16 bit resolution using a dedicated system on chip (SoC) with an embedded DSP for power-efficient data handling. The digitized data are filtered with an 80th order FIR filter and stored on a compact flash (CF) card that can be read out using a standard PC operating system. Combining high-performance analog parts for signal acquisition and a powerful DSP on a single chip opens up innovative possibilities for reducing system size and power consumption. Similar SoC-based implementations for ECG applications are presented in the literature [104–112].
3.2
The Very Rationale of the System
With the development of electronics and its application in medicine it is possible to transmit and process many vital parameters of the human body. The most important, and at this moment the most interesting, signal for monitoring and analysis is the electrocardiography (ECG) signal. For a patient suffering from cardiac disease it is very important to perform an accurate and quick diagnosis. For this purpose, continuous monitoring of the ECG signal and of the patient’s current heart activity is necessary [113]. The FPGA-based ECG setup with real-time communication facility is aimed especially at post-operative patients who may develop complications once they are discharged from the hospital. In some patients the cardiac problems are likely to recur once they resume their routine work. With the increasing population and the insufficient personnel capacity of hospitals, a system such as the one reported in this chapter plays a vital role in making the expertise of cardiac professionals available to patients. The work presented in this chapter highlights prevalent issues pertaining to portable ECG signal acquisition. The system design is based on noise filtering, multiplexing, amplifying and digitizing the analog lead signals emerging from a six-electrode ECG instrumentation amplifier. After these essential processing steps, the lead signals are sent serially to the FPGA for comparison with standard values, so as to warn the patient before he/she enters a dangerous cardiac condition. Serial data transfer between the digitized lead data and the FPGA is used to avoid any malfunction caused by an increased number of wires (or tracks) and, importantly, to save space on the PCB so as to make the system more portable.
3.3
Analog Front End of the Setup
3.3.1
Leads Formation
From the viewpoint of designing the ECG data acquisition system, we use the concept that the ECG leads are formed by negative and positive poles, as shown in Fig. 3.1. An imaginary line joining the two poles of a lead is known as the axis of the lead. The direction of this line depends upon the position of the positive and negative poles. As shown in the figure, the three standard electrodes placed on the limbs, spaced 15 cm apart from the center, are equidistant from the heart and form a triangle with the heart at its center. The system uses three electrodes: Right Arm (RA), Left Arm (LA) and Left Leg (LL). This is as per the standard ECG systems in use today in hospitals. The outputs of the electrodes are further connected to individual differential buffer amplifiers. The waveforms recorded across pairs of these electrodes are called leads. From the basic arrangement of the leads shown in Fig. 3.1, one can derive an equilateral triangle whose vertices lie at the left and right shoulders and the pubic region and whose center corresponds to the vector sum of all electric activity occurring in the heart at any given moment, allowing the electrical axis to be determined. This geometrical topology is termed Einthoven’s triangle. It is approximated by the triangle formed by the axes of the bipolar ECG limb leads I, II and III, as shown in Fig. 3.1. The center of the triangle offers a reference point for the unipolar ECG leads. The bipolar leads and their corresponding connections are shown in Table 3.1. The unipolar limb leads, identified as augmented limb leads, examine the composite potential from all three limbs at once. The augmented leads and their corresponding connections are shown in Table 3.2. In all three augmented leads, the signals from two limbs are summed in a resistor network and then applied to the amplifier’s inverting input, while the signal from the remaining limb electrode is applied to the non-inverting input.
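The lead definitions can be checked numerically. The sketch below, in C, computes the three bipolar and three augmented leads from the electrode potentials using the expressions given with Fig. 3.2 in this chapter (for example Lead I = LA − RA and aVR = RA − (LA + LL)/2). It is only a software illustration of the arithmetic performed by the analog front end. The structure and function names are hypothetical, and the tolerance in the sanity check assumes potentials in the millivolt to volt range.

```c
#include <assert.h>
#include <math.h>

typedef struct {
    double lead_I, lead_II, lead_III;  /* bipolar limb leads */
    double aVR, aVL, aVF;              /* augmented limb leads */
} ecg_leads_t;

/* Derive the six limb leads from the RA, LA and LL electrode potentials
 * (all in the same unit, e.g. millivolts). */
ecg_leads_t ecg_derive_leads(double ra, double la, double ll)
{
    ecg_leads_t out;
    out.lead_I   = la - ra;
    out.lead_II  = ll - ra;
    out.lead_III = ll - la;
    out.aVR = ra - (la + ll) / 2.0;
    out.aVL = la - (ra + ll) / 2.0;
    out.aVF = ll - (ra + la) / 2.0;
    /* Sanity check: Lead II equals Lead I plus Lead III (Einthoven's law). */
    assert(fabs(out.lead_II - (out.lead_I + out.lead_III)) < 1e-6);
    return out;
}
```

Two consequences of these definitions are that Lead II is always the sum of Lead I and Lead III, and that the three augmented leads always add up to zero. Both follow directly from the algebra and illustrate the redundancy among the six limb leads.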
3.3.2
Restricting Number of Leads
There are different trends in the number of leads used for an ECG system. On the one hand, there are reports that an 80 lead electrocardiographic body-surface mapping system significantly improves detection of acute myocardial infarction and unstable angina in the emergency department, compared with the standard 12 lead electrocardiogram [121]. On the other hand, even an ECG with two leads has been reported [122]. In the present work we have chosen a six lead system on the basis of recent studies reported in the literature. Recently, Madias [122] has reported the performance of three ECG systems (i.e. the 12 lead ECG, a 6 lead ECG comprising the limb leads, and a 2 lead ECG comprising exclusively leads 1 and 2). Diagnosis reports for 28 patients with anasarca (AN), 28 control patients, 10 patients who had undergone hemodialysis, and three patients with idiopathic dilated cardiomyopathy were collected. It is concluded that ECG systems comprising 2 or 6 leads can be substituted for the 12 lead ECG for certain clinical and research applications (pertaining to the amplitude of QRS complexes), attesting to the inherent redundancy of the information from the 12 lead ECG. The other reasons behind the obsolescence of the standard 12 lead ECG system are [122]:
• Over 90% of the heart’s electric activity can be explained with a dipole source model
• Only three orthogonal components need to be measured, which makes nine of the leads redundant
• The remaining percentage, i.e. nondipolar components, may have some clinical value. This makes eight truly independent and four redundant leads
The focus of the present work is more on QRS complexes, and therefore the six lead system has been chosen here.
Fig. 3.1 The three standard leads form an equidistant triangle (Einthoven Triangle)
Table 3.1 Bipolar leads and their connections
Leads       Connections
Lead I      LA electrode is connected to the amplifier’s non-inverting input, while RA is connected to the inverting input
Lead II     LL electrode is connected to the amplifier’s non-inverting input, while RA is connected to the inverting input (LA is shorted to RL)
Lead III    LL electrode is connected to the amplifier’s non-inverting input, while LA is connected to the inverting input (RA is shorted to RL)
Table 3.2 Augmented leads and their connections
Leads       Connections
Lead aVR    RA is connected to the non-inverting input, while LA and LL are summed at the inverting input
Lead aVL    LA is connected to the non-inverting input, while RA and LL are summed at the inverting input
Lead aVF    LL is connected to the non-inverting input, while RA and LA are summed at the inverting input
3.3.3
ECG Instrumentation Amplifier
In general, ECG amplifiers are medium-gain amplifiers. The design challenge comes from the fact that the ECG signal must be AC coupled, even though it has components as low as 0.05 Hz, to overcome the electrode offset potential arising at the electrode-skin connection. The desired frequency response is from 0.05 to 100 Hz. The other essential attributes of the amplification system are an input amplitude range of 1 to 5 mV and the ability to cope with the largest sources of measurement error, mainly motion artifacts and 50/60 Hz power line interference. Further requirements are electrical safety and isolation as well as defibrillation protection. In our design, the ECG amplifier copes with the five patient electrodes. It is based on the precision instrumentation amplifier IC AD624. The reasons behind the choice are as follows:
• High precision and low noise characteristics, suiting the low-level biopotentials generated by the ECG leads.
• Input offset voltage drift of less than 0.25 µV/°C and output offset voltage drift of less than 10 µV/°C, which improve the resolution of the signal.
• CMRR of the order of 80 dB with typical nonlinearity of the order of 0.001%, which copes with the noise sources.
• Slew rate of the order of 5 V/µs, which permits high speed.
• Pretrimmed gain of 100, used in this work, which alleviates the need for external components.
The AD624 ICs are used for leads I, II and III. For the other leads, viz. aVR, aVL and aVF, a combination of LM308A op-amps has been used, as shown in Fig. 3.2. A unity-gain amplifier buffers each electrode. Series protection resistors are used for these buffers, with an input resistance of 51 kΩ. A passive resistor array in the form of a Wilson network is used for deriving the six basic leads, as shown in Fig. 3.3. The design standards and advantages of the Wilson network are covered in reference [120].
3.3.4
Deriving the Signal from the Augmented Leads
In order to derive the augmented lead signals, the 'Wilson network' shown in Fig. 3.3 has been used. This passive resistor array derives the six basic leads: Lead-I, Lead-II, Lead-III, aVR, aVL and aVF. The tap points drive a bank of six differential-input amplifiers, each of which is set to provide a voltage gain of about 100. As seen from the previous literature, the gain has to be low to forestall the electrode-offset voltage from saturating the amplifiers. Each differential amplifier amplifies one tap, or a combination of taps, with respect to another. With the choice of operational amplifiers having a high CMRR, the 50/60 Hz noise is significantly reduced. This is further ensured by using a combination of differential amplifiers at each electrode. The Wilson Central serves as the common reference point; its inputs are the patient electrodes RA, LA and LL. Thus the Wilson network serves the purpose of lead selection. Buffers based on the LM308 op-amp have been deployed between the patient and the Wilson network for signal isolation. The resistors forming the triangle of the Wilson network have 1% tolerance and come in the 0603 package.
Fig. 3.2 (a) Lead-I: (LA–RA), (b) Lead-II: (LL–RA), (c) Lead-III: (LL–LA), (d) aVL: {LA – (RA + LL)/2}, (e) aVR: {RA – (LA + LL)/2}, and (f) aVF: {LL – (RA + LA)/2} [circuit schematics: 51 kΩ series input resistors, unity-gain buffers producing RA', LA' and LL', and 10 kΩ and 100 kΩ resistors around the output amplifier]
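The lead definitions in the caption of Fig. 3.2 can be checked numerically. The following is a minimal Python sketch, not part of the original design; the function name `derive_leads` and the electrode potentials are illustrative assumptions.

```python
def derive_leads(ra, la, ll):
    """Return the six leads of Fig. 3.2 from electrode potentials (same unit)."""
    return {
        "I": la - ra,                 # (a) Lead-I
        "II": ll - ra,                # (b) Lead-II
        "III": ll - la,               # (c) Lead-III
        "aVL": la - (ra + ll) / 2,    # (d) aVL
        "aVR": ra - (la + ll) / 2,    # (e) aVR
        "aVF": ll - (ra + la) / 2,    # (f) aVF
    }


# Hypothetical electrode potentials in mV, chosen only for illustration.
leads = derive_leads(ra=-0.2, la=0.3, ll=0.7)
for name, value in leads.items():
    print(f"{name}: {value:+.3f} mV")
```

Two identities follow directly from these definitions and serve as a sanity check on any implementation: Lead-I + Lead-III equals Lead-II, and the three augmented leads sum to zero.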
In standard ECG terminology, the geometrical arrangement that indicates the axes of the leads and the vectors of the ECG signal directed towards the leads is known as the Einthoven triangle, which is shown in Fig. 3.1. The common reference point for the Right Leg (RL) drive is obtained by inverting the Wilson Central output. The voltage gain of the RL drive is set to 39. The literature reports a gain much higher than 39, but it has been lowered here in order to keep the operational amplifier out of saturation. The 47 pF capacitor in the feedback loop limits the high-frequency gain and thus prevents oscillations.
Fig. 3.3 Wilson network and right leg drive [figure labels: electrodes RA, LA, LL and RL; network midpoints (RA+LL)/2, (RA+LA)/2 and (LA+LL)/2; all network resistors 10 kΩ; WCT output (RA+LA+LL)/3; RL drive stage with 10 kΩ, 390 kΩ and 51 kΩ resistors and a 47 pF capacitor]
3.3.5
Filtering the ECG Signal
The main purpose of the low-pass filter deployed here is to attenuate the high frequencies arising from mains noise. High-order low-pass filters with a knee just below 50 Hz are used. With the sharp cut-off that high-order filters provide, a significant portion of the signal at frequencies close to 50 Hz can still be accommodated. Theory shows that a second-order low-pass filter built from two cascaded passive RC networks has a quality factor Q of less than ½, which is a basic limitation of passive filters. Therefore there is a trend towards on-chip digital filters in ECG processing. However, in our work we used positive feedback in the filters to attain a Q high enough to guard the resolution of the signal. As shown in Fig. 3.4, the implementation comprises two second-order Sallen-Key low-pass filters, each with a roll-off of –40 dB/decade. Cascading the two Sallen-Key filters gives a roll-off of –80 dB/decade and consequently an improved signal-to-noise ratio. A single filter of this type would result in an SNR of only 1:1 and thus would not attenuate the 50 Hz noise adequately. The positive feedback, which is effective near the cut-off frequency where the impedances of both capacitors match that of the resistor combination, ensures an enhanced Q and better attenuation of the noise.
Fig. 3.4 Six lead ECG data acquisition system using FPGA [signal chain shown: instrumentation amplifier output (e.g. Lead-I); second-order Sallen-Key low-pass filter #1 (THS3001; R11, R12, C11, C12; –40 dB/dec); second-order Sallen-Key low-pass filter #2 (THS3001; R21, R22, C21, C22; –80 dB/dec); analog MUX MAX4051 (8:1) with inputs Lead-I', Lead-II', Lead-III', aVR', aVL', aVF' and select lines C, B, A (LSB); THS3001 amplifier with Rf = 100 kΩ, Ri = 10 kΩ, gain = 11; serial ADC LTC1407; Spartan 3E FPGA with signals AD_CONV, SPI_SCK, SPI_MIBO, ADDR, Reset, CLK and the output for serial transmission]
The cut-off frequency fc of a single Sallen-Key stage is given as:
$$f_c = \frac{1}{2\pi\sqrt{R_{11} R_{12} C_{11} C_{12}}} \qquad (2.1)$$
and, if we use R11 = R12 = R and C11 = C12 = C,
$$f_c = \frac{1}{2\pi R C} \qquad (2.2)$$
For the design value fc = 34 Hz, we get R = 4.68 MΩ if C = 1 nF. The Sallen-Key filters are based on the Texas Instruments op-amp IC THS3001. The THS3001 is chosen for its high-speed current-feedback architecture, which gives a very high slew rate of the order of 6,500 V/µs with a settling time as low as 40 ns, indicating excellent transient response. Additionally, it offers signal distortion as low as –96 dB.
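The design values quoted for Eqs. (2.1) and (2.2) can be verified with a few lines of Python. This is only a numerical check under the equal-component assumption stated in the text; the function names are hypothetical.

```python
import math


def sallen_key_cutoff(r1, r2, c1, c2):
    """Cut-off frequency of one Sallen-Key stage, Eq. (2.1), in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(r1 * r2 * c1 * c2))


def resistor_for_cutoff(fc, c):
    """Resistor value for equal R and equal C, from Eq. (2.2), in ohms."""
    return 1.0 / (2.0 * math.pi * fc * c)


C = 1e-9                          # 1 nF, as in the text
R = resistor_for_cutoff(34.0, C)  # design cut-off of 34 Hz
fc = sallen_key_cutoff(R, R, C, C)

print(f"R  = {R / 1e6:.2f} Mohm")
print(f"fc = {fc:.1f} Hz")
```

With C = 1 nF the computed resistance rounds to 4.68 MΩ, in agreement with the value given above.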
3.3.6
Multiplexing the Lead Signals
Each lead signal has its own significance and can be applied for a variety of diagnostic purposes. Uniting them at proper time intervals, we get the P wave, the QRS complex, and the T and U waves. Each of these waves is associated with a different function of the heart. In order to combine the waveforms, we used the analog multiplexer MAX4051 owing to its low-voltage, multi-channel nature. Out of the eight available input lines, six are used for multiplexing the signals delivered by the two-stage second-order Sallen-Key filters. The standard specifications of the MAX4051 are available in the Maxim datasheet [123].
3.3.7
Post-multiplexer Amplifier Stage
The Texas Instruments operational amplifier IC THS3001 is found to be suitable in the presented design for amplification after the multiplexer stage. It is a high-speed device with a slew rate of 6,500 V/µs. The signal processing after the multiplexer stage requires acquisition of the smallest quantum of the lead signal; otherwise it is likely to be lost in the digitization. This requirement justifies the use of an amplifier such as the THS3001, more so because its high slew rate matches the high-speed ADC employed later. It further isolates the multiplexer from the ADC. The amplifier is used in non-inverting mode with a gain of 11.
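The stated gain is consistent with the standard non-inverting amplifier relation, assuming that the resistor labels Rf = 100 kΩ and Ri = 10 kΩ in Fig. 3.4 are the feedback and ground-leg resistors of this stage:

```latex
G = 1 + \frac{R_f}{R_i} = 1 + \frac{100\,\mathrm{k\Omega}}{10\,\mathrm{k\Omega}} = 11
```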
3.3.8
Digitization of the ECG Signal
The ECG signal is digitized using the LTC1407, a 3 V micropower, 14 bit, successive approximation sampling A/D converter. Instead of implementing the Analogue to Digital (A/D) converter on the FPGA, an off-chip ADC was preferred owing to its micropower operation. The typical supply current is as low as 160 µA, and the device further offers an Auto Shutdown feature with a current consumption of 1 nA. This is an exceptionally helpful feature in a micropower device: for the majority of the time the A/D converter operates in shutdown mode and is turned on only when a conversion is needed. The output of the ADC is 14 bit serial data. This requires a customized handshake micro-logic implementation on the FPGA. Further, the data has to be posted to the modem interface stamped with an identification of the ECG waveform segment emanating from the respective electrodes. The handshake micro-logic is implemented on the FPGA in VHDL with reference to the LTC1407 operating sequence shown in Fig. 3.5.
Fig. 3.5 LTC1407 operating sequence (Retrieved from Linear Technology Corporation: LTC1407 data sheet)
3.3.9
FPGA Based Handshake Micro-logic
The ADC samples the analog input voltage when the 'AD_CONV' signal is provided as a pulse with a width of at least 4 ns, as shown in Fig. 3.5. After 'AD_CONV' falls low, the ADC requires three clock cycles before it generates the serial bits corresponding to the analog sample. The serial bits are collected in the FPGA at every negative edge of the clock signal 'SPI_SCK', which is generated by the FPGA and supplied to the ADC. To perform the next A to D conversion on another voltage sample, 'AD_CONV' is again provided by the FPGA. The ADC delivers 14 bit digital data in serial form, MSB first. The micro-logic is implemented in VHDL and explained in the later part of the chapter.
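The MSB-first serial capture described above can be illustrated with a small behavioural model. This is a Python sketch rather than the VHDL micro-logic itself; the function names and the sample value are assumptions made for illustration, and the three latency clocks are not modelled.

```python
def to_bits_msb_first(value, width=14):
    """Serialize an integer as a list of bits, MSB first (what the ADC sends)."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


def shift_in_msb_first(bits, width=14):
    """Collect serial bits into a register: new bit enters at position 0,
    earlier bits move one position to the left on every clock edge."""
    reg = 0
    mask = (1 << width) - 1
    for bit in bits:
        reg = ((reg << 1) | (bit & 1)) & mask
    return reg


sample = 0x2A5C                         # hypothetical 14 bit conversion result
serial_stream = to_bits_msb_first(sample)
captured = shift_in_msb_first(serial_stream)
print(hex(captured))
```

After 14 clock edges the register holds the original word, because the first bit received has been shifted up to the MSB position.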
3.3.10
MODEM Interface
The ECG system reported here also allows the physician to retrieve the data by means of the modem interface. The data is sent using a pair of modems, one at the transmitter end and one at the receiver end. Alternatively, the data was also redirected from the serial port to TCP/IP by using the shareware TCP-Com [124]. With the TCP-Com software and the FPGA system connected to the serial port of the PC, a complete TCP/IP client implementation was possible. Actual transmission and reception of sample ECG data through the modem server thus formed has been implemented. Fig. 3.6 shows the flow diagram used to implement the LTC1407 driver. Fig. 3.7 shows the flow used to code the process that transmits the serial bits.
Fig. 3.6 (a), (b) Flow diagram to drive serial ADC LTC1407 using FPGA [steps recoverable from part (a): start; if Reset = '0', set SPI_SCK low and clear the storage register, the counter, the address lines (page number) and the 3D RAM; otherwise wait for a negative clock edge; on each negative edge increment the delay counter and invert SPI_SCK to form the ADC clock; provide the pulse width to sample the voltage; after three clocks to the ADC, accept the serial data bit-wise in the FPGA; store the 14 bits of the storage register in the 3D RAM; increment the analog MUX channel address (page number). The text of part (b) is not recoverable]
3.4
VHDL Implementation of the ECG Soft IP Core
3.4.1
Driving ADC: LTC 1407 and Storing Data in 3D RAM
******************************************************************
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.std_logic_arith.all;

entity LTC1407A is
  port (
    Reset    : in    std_logic;
    CLK      : in    std_logic;
    SPI_SCK  : inout std_logic;                     -- Clock input to the ADC LTC1407
    ADDR     : inout integer range 0 to 6;          -- ADDR to select channel of MUX4051
    AD_CONV  : out   std_logic;                     -- CHIP SELECT for ADC LTC1407
    SPI_MIBO : in    std_logic;                     -- Serial Data coming out from ADC
    Data     : out   std_logic_vector(13 downto 0); -- Data to be stored in 3D RAM
    DIN      : out   std_logic                      -- Serial Data for transmission
  );
end LTC1407A;

architecture rtl of LTC1407A is
  signal count   : integer;
  signal Data_s  : std_logic_vector(13 downto 0);
  signal RAM_Row : integer range 0 to 15; -- RAM Length for each Analog Input (Leads)
  -- Six Leads: Lead-I, Lead-II, Lead-III, aVR, aVF, aVL
  -- THREE DIMENSIONAL RAM ARRAY IS DECLARED
  subtype bit12 is std_logic_vector(13 downto 0);
  type RAM_array is array(integer range 0 to 15, integer range 0 to 6) of bit12;
  signal RAM_data : RAM_array;
begin
  RAM_data(RAM_Row, ADDR) <= Data_s;

  -- Counter and ADC clock generation on the negative edge of CLK
  P_conv: process(Reset, CLK)
  begin
    if (Reset = '0') then
      SPI_SCK <= '0';
      count   <= 0;
    elsif (CLK'event and CLK = '0') then
      if (count > 80) then
        count <= 0;
      else
        SPI_SCK <= not SPI_SCK; -- clock input to ADC
        count   <= count + 1;
      end if;
    end if;
  end process;

  -- Conversion pulse and serial-to-parallel shift register
  P_SHIFT: process(count, SPI_SCK)
  begin
    if (Reset = '0') then
      AD_CONV <= '0';
    elsif (count > 1) and (count < 6) then
      AD_CONV <= '1';
    elsif (count >= 6) then
      AD_CONV <= '0';
    end if;
    if (SPI_SCK'event and SPI_SCK = '0') then
      if (count > 11) and (count < 40) then
        Data_s(0) <= SPI_MIBO;
        Data_s(13 downto 1) <= Data_s(12 downto 0);
      end if;
    end if;
  end process;

  -- Page (lead) and row addressing of the 3D RAM
  P_STORE: process(count)
  begin
    if (Reset = '0') then
      ADDR    <= 0;
      RAM_row <= 0;
    elsif (count = 40) then
      Data <= Data_s;
      ADDR <= ADDR + 1;
      if (ADDR = 5) then
        ADDR    <= 0;
        RAM_row <= RAM_row + 1;
        if (RAM_row = 15) then
          RAM_row <= 0;
        end if;
      end if;
    end if;
  end process;

  ---------------------------------------------------------------------------
  -- The process "Serial" converts the parallel data (obtained from the ADC
  -- LTC1407 in serial form and converted into its parallel form) back into
  -- serial form and sends it to the MODEM input for the telecommunication
  -- line
  Serial: process(count, CLK, SPI_SCK)
    variable Data_AD : std_logic_vector(14 downto 0);
  begin
    if (SPI_SCK'event and SPI_SCK = '0') then
      if (count = 40) then
        Data_AD := CONV_STD_LOGIC_VECTOR(ADDR, 3) & Data_s(11 downto 0);
      end if;
      if (count > 40) and (count < 80) then
        -- Output the current MSB, then shift left so that the next bit
        -- moves into position 14 (MSB is sent first)
        DIN <= Data_AD(14);
        Data_AD(14 downto 1) := Data_AD(13 downto 0);
      end if;
    end if;
  end process Serial;
end rtl;
******************************************************************
Fig. 3.7 Flow diagram shows how to transmit the serial data bits of a 15 bit data frame [steps shown: start; wait for a negative CLK edge; check that the proper clocks have been given to the ADC; get the 15 bit data frame stored in a variable; check the count range; shift the stored variable Data_AD (15 bit frame) one bit position left; output the serial bit DIN for serial transmission of the data frame at every negative edge of CLK]
3.4.2
Details of the VHDL Code
Certain subtle points regarding the VHDL implementation given above are described in the following paragraphs. The rest of the code is self-explanatory.
3.4.3
VHDL Processes for Conversion and Storage in 3D Memory: (Process P_conv, P_SHIFT and P_STORE)
SPI_SCK is the clock signal generated by the FPGA; the LTC1407 requires it to perform the analog to digital conversion of the sampled signal and to send the 14 bit result serially on its output line. Note that the system uses the eight-channel analog multiplexer IC MAX4051, which requires a three bit select signal to choose among its analog inputs. The implementation samples the analog signals generated by the different ECG leads, viz. Lead-I, Lead-II, Lead-III, aVR, aVL and aVF, which are the inputs to the MAX4051. The signal ADDR[2:0] is defined as a select bus from the FPGA to pick one of these signals for conversion. The six lead signals are sampled one by one. The VHDL process P_STORE uses 'ADDR', an integer signal generated at the FPGA end, to select the lead for analog to digital conversion. Upon assertion of the proper ADDR value, the respective multiplexer input is routed to the ADC. The 'AD_CONV' signal is held HIGH for a fixed duration, as shown in the timing diagram of the ADC, which the ADC requires to initiate the A/D conversion.
The handshake micro-logic works along the following lines. The serial data is accepted in the FPGA from the serial output line of the ADC, i.e. 'SPI_MIBO'. In order to collect the serial bits, the 14 bit signal 'Data_s' is used in the VHDL code. The first bit from the ADC, B13, enters Data_s[0]. To collect the remaining 13 bits, the previously stored bits are shifted one bit position to the left each time, which frees Data_s[0] for the next bit of the serial stream. In order to facilitate the data processing, we have used the concept of a three dimensional memory. After complete reception of the 14 bit information, the data is stored in this memory, whose structure is shown in Fig. 3.8. The memory is designed for storing 16 words of 12 bits. This 16 × 12 array is treated as a PAGE, and six such pages are used for effective storage of one ECG sample. In technical terms the memory may be specified as Rows × Bits × Pages, i.e. 16 × 12 × 6. At the end of the serial to parallel conversion at the FPGA end, the information is stored in the 3D memory using a VHDL code line in the concurrent part of the architecture, as given below.
RAM_data(RAM_row, ADDR) <= Data_s(11 downto 0);
Fig. 3.8 A three dimensional memory structure used in VHDL code [six pages, PAGE#0 to PAGE#5, each 16 × 12; page p holds the 16 entries RAM_data(0,p) to RAM_data(15,p), each 12 bits wide]
The PAGE number, i.e. ADDR, of the 3D memory is then incremented by one so that the analog multiplexer selects the next channel and routes its signal to the input of the ADC. The process described above is repeated and the data is stored in the next PAGE. Page-by-page storage has been chosen here in lieu of line-wise storage for ease in processing the data. The reason for moving to the next page rather than to the next memory line is that, if the system kept sampling the same lead without stepping to the next one, it would fail to notice the signals coming from the rest of the leads. If any critical situation occurs in the ECG signals, such as missed or corrupted signals, this mechanism lets the physician notice the abnormalities and thus helps prevent a false positive diagnosis. After every lead has had its turn, the recording system returns to the first lead in a round robin manner and continues storing digital data in the next memory location of the first page, followed by the next page, and so on. In other words, for a given row address of the memory only the PAGE address is incremented, and after the last page, i.e. PAGE#5, has been written, the row address is incremented. In all, 6 × 16 words of 12 bit information are sampled, corresponding to the analog signals of the six ECG leads. The remaining two channels of the MAX4051 are not used here, as the acquisition system deals with a six-lead ECG.
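The page-first, row-second addressing scheme can be summarised with a short behavioural model. The Python sketch below mirrors the addressing described in the text (page advances on every sample, row advances after PAGE#5); the class name `LeadMemory` and its method are hypothetical and are not taken from the VHDL source.

```python
ROWS, PAGES = 16, 6  # 16 rows per page, one page per lead


class LeadMemory:
    """Behavioural model of the 16 x 12 x 6 memory with round robin paging."""

    def __init__(self):
        self.ram = [[0] * PAGES for _ in range(ROWS)]
        self.row = 0    # plays the role of RAM_row
        self.page = 0   # plays the role of ADDR (MUX channel / page number)

    def store(self, sample):
        """Store one 12 bit sample, then advance page first and row second."""
        self.ram[self.row][self.page] = sample & 0xFFF
        self.page += 1
        if self.page == PAGES:           # last page written: wrap to page 0
            self.page = 0
            self.row = (self.row + 1) % ROWS


mem = LeadMemory()
for n in range(8):                       # eight consecutive conversions
    mem.store(n)

print(mem.ram[0])          # first row, one entry per lead
print(mem.row, mem.page)   # position of the next write
```

After eight samples the first row holds one entry for each of the six leads and the next write goes to row 1 of page 2, which is the round robin order described above.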
3.4.4
VHDL Process for Serial Transmission of the ECG Signal (Process Serial)
The recorded ECG digital data is also converted back into serial form and sent from the FPGA serial output line DIN in order to facilitate its transmission to the PC/MODEM. The serial data is in the form of a data frame of 15 bits. In order to facilitate lead identification for each signal and the naming conventions of the PQRST waveforms, a protocol is developed through the process 'Serial'. The assertion of the various signals and their timings for this protocol are as follows. Of the 15 bit serial data frame output by the FPGA, the first three MSBs are designated as the lead address, and the remaining 12 bits represent the magnitude of the analog voltage sample of the addressed lead. While sending this 15 bit data frame serially, the MSB is sent first, followed by the remaining bits. The digitized samples obtained from the ADC are collected through the signal 'Data_s'. Subsequently, the process commences sending the data frame bits serially together with the address of the lead. The lead address ADDR[2:0] and Data_s[11:0] are concatenated to form a complete 15 bit data frame using the VHDL statement:
Data_AD := CONV_STD_LOGIC_VECTOR(ADDR, 3) & Data_s(11 downto 0);
The frame held in the variable 'Data_AD[14:0]' is used in this process. The serial output line DIN is connected to the MSB of this variable using the VHDL line given below:
DIN <= Data_AD(14);
At the receiving end, the frame is collected in the same order in which it was transmitted, i.e. the MSB of the frame is received first and the LSB last. After the first three bits (lead/page address) are received, the page number is used to store the 12 bit information, and the output line of the demultiplexer corresponding to the lead is connected to its input line, to which the output of the serial DAC is connected.
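The framing protocol can be made concrete with a short model. The following Python sketch packs a 3 bit lead address and a 12 bit magnitude into a 15 bit frame and serializes it MSB first; the function names and the example values are assumptions for illustration only.

```python
def pack_frame(lead_addr, magnitude):
    """Concatenate a 3 bit lead address and a 12 bit magnitude (15 bit frame)."""
    if not 0 <= lead_addr <= 5:
        raise ValueError("lead address must be 0..5 (six leads)")
    if not 0 <= magnitude <= 0xFFF:
        raise ValueError("magnitude must fit in 12 bits")
    return (lead_addr << 12) | magnitude


def serialize(frame, width=15):
    """Return the frame as a list of bits in transmission order (MSB first)."""
    return [(frame >> i) & 1 for i in range(width - 1, -1, -1)]


frame = pack_frame(lead_addr=5, magnitude=0xABC)  # hypothetical sample of page 5
bits = serialize(frame)
print(hex(frame), bits)
```

The first three transmitted bits carry the lead address, so a receiver can select the demultiplexer channel before the magnitude bits arrive.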
3.5
ModelSim Simulation Results
Extensive simulation was performed in order to ascertain the working of the system. The simulation details are covered in the following paragraph. As shown in the simulation window of Fig. 3.9, on removal of 'Reset' the signal 'AD_CONV' is pulled high. On arrival of each negative CLK edge the counter value is seen to be incremented, and clock pulses of the signal 'SPI_SCK' are seen to be provided to the clock input of the serial ADC. After 'AD_CONV' is made low, three clock cycles are given to the ADC to let it convert the sampled data into its digital equivalent. After these three clock cycles, the FPGA provides 14 more clock cycles to the ADC, during which it produces the 14 bit serial data corresponding to the analog voltage.
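The counter behaviour seen in the simulation can be reproduced with a simple software model of the process P_conv and the AD_CONV window from the listing in Sect. 3.4.1. This is a Python sketch under the assumption that AD_CONV simply follows the count comparison; it does not model the ADC itself, and the function names are hypothetical.

```python
def ad_conv(count):
    """AD_CONV is high while the count lies strictly between 1 and 6."""
    return 1 if 1 < count < 6 else 0


def run_p_conv(n_edges):
    """Step the counter and SPI_SCK over n negative CLK edges after reset.

    Returns one (count, spi_sck, ad_conv) tuple per edge."""
    count, spi_sck = 0, 0
    trace = []
    for _ in range(n_edges):
        if count > 80:
            count = 0            # wrap around, SPI_SCK is not toggled
        else:
            spi_sck ^= 1         # toggle the ADC clock
            count += 1
        trace.append((count, spi_sck, ad_conv(count)))
    return trace


trace = run_p_conv(82)
high_edges = sum(t[2] for t in trace)
print(trace[:6])
print("AD_CONV high for", high_edges, "CLK edges per cycle")
```

In this model one acquisition cycle spans 82 negative CLK edges, and the conversion pulse is high for four of them.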
Fig. 3.9 Simulation window to show driving of serial ADC and generating serial data for transmission
At each negative edge of 'SPI_SCK', bits are seen emerging on the serial output line 'SPI_MIBO' of the ADC. The simulation window also shows that at every positive edge of 'SPI_SCK' the 'SPI_MIBO' bits enter the signal Data_s through the Data_s[0] line, which serves as the serial input line of the FPGA. The newly entering serial bits are collected by shifting the Data_s[11:0] bits one position to the left.
3.6
Synthesis Results Using Mentor Graphics Tool: Leonardo Spectrum
The VHDL code was synthesized using the Mentor Graphics EDA tool- Leonardo Spectrum. The results are presented in the following table.
3.6.1
Synthesis Report
******************************************************************
Cell: LTC1407A    View: rtl    Library: work

Total accumulated area:
Number of BUFG                         2
Number of BUFGP                        1
Number of Dffs or Latches             80
Number of Function Generators        104
Number of IBUF                         2
Number of MUX CARRYs                  31
Number of MUXF5                       13
Number of OBUF                        20
Number of ports                       23
Number of nets                       289
Number of instances                   28
Number of references to this view      0

Device Utilization for 2s30pq208:
Resource               Used    Avail    Utilization
IOs                      23      132       17.42%
Function Generators     104      864       12.04%
CLB Slices               52      432       12.04%
Dffs or Latches          80    1,296        6.17%

Clock Frequency Report:
Clock      Frequency
Reset      305.8 MHz
CLK        57.8 MHz
******************************************************************
Fig. 3.10 RTL view of the system ADC driver and serial transmitter
Fig. 3.11 Technology schematic view of the system ADC driver and serial transmitter
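The utilization column of the synthesis report is simply the ratio of used to available resources. A quick Python check, using only the figures printed in the report (the dictionary name is an illustrative choice), is shown below.

```python
# (used, available) pairs copied from the device utilization report
resources = {
    "IOs": (23, 132),
    "Function Generators": (104, 864),
    "CLB Slices": (52, 432),
    "Dffs or Latches": (80, 1296),
}

# Utilization in percent, rounded to two decimals as in the report
utilization = {
    name: round(100.0 * used / avail, 2)
    for name, (used, avail) in resources.items()
}

for name, percent in utilization.items():
    print(f"{name:<20} {percent:6.2f}%")
```

The computed percentages agree with the values listed in the report.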
3.6.2
RTL View
The RTL view of the design is shown in Fig. 3.10.
3.6.3
Technology Schematic View
The Technology Schematic view of the design is as shown in Fig. 3.11.
3.6.4
Critical Path Schematic
The Critical Path Schematic produced by the synthesis tool Leonardo Spectrum is shown in Fig. 3.12.
3.7
Monitoring the ECG Using MODEM Based Setup
3.7.1
Tele-monitoring of the ECG Signal at the Hospital End
One of the objectives of the present development is to facilitate monitoring of the ECG signal over the telephone network so as to bridge the gap between the increasing population and the insufficient personnel capacity of hospitals. Another objective of the scheme is to build a systematic database of patient records at the hospital end. The scheme is also beneficial to patients who are bedridden and find it difficult to visit the hospital frequently. The basic setup required at the hospital end for reception of the ECG data is shown in Fig. 3.13. The setup comprises a Spartan 3E FPGA with a serial DAC interface based on the LTC1257. The transmission-reception process is facilitated by keeping a soft IP core of the protocol in the FPGAs at both ends, in addition to the device drivers developed for the ADC and the DAC at the respective ends. The system is elaborated in the following part of the chapter.
3.8
ECG Signal Reconstruction Mechanism at the Hospital End
As shown in Fig. 3.13, at the receiving end the serially received digital data is converted back into its analog equivalent by the LTC1257 DAC from Linear Technology Corporation. The transmitter sends the 12 bit digital data corresponding to the respective leads together with a unique 3 bit address that serves the purpose of lead identification. The data therefore forms a frame of 15 bits, which is transmitted for each analog voltage sample of a lead. Before converting any received digital data into its analog equivalent, it is crucial to extract and decode the lead address embedded in the received data frame. This is accomplished by using a three-wire bus from the FPGA to the select lines of the analog demultiplexer, which routes the single analog input line to one of the output lines. The FPGA thus converts the serially received lead address into parallel form and sends it to the select lines of the demultiplexer. The serial DAC output is connected to the input of the analog demultiplexer. With this connection mechanism, the analog signal associated with the respective lead at the transmitter end is reconstructed at the hospital end.
Fig. 3.12 Critical path view of the system ADC driver and serial transmitter
Fig. 3.13 ECG signal reconstruction mechanism at the hospital end
Fig. 3.14 Timing diagram of the DAC LTC1257 (Retrieved from the Linear Technology Corporation data sheet) [signals shown: CLK, DIN (B11 MSB to B0 LSB), LOAD, DOUT (previous word followed by current word); timing parameters t1 to t8]
3.8.1
DAC Interfacing Details
This section presents the DAC interfacing details. Figure 3.14 shows the timing diagram of the DAC, which has been used to develop the driver in VHDL.
Fig. 3.15 A 15 bits data frame receiving serially
As shown in the timing diagram, the DAC accepts serial data at its input on the positive-going edge of the clock supplied by the FPGA. At the positive edge of each clock pulse, the DIN bit is stored in an internal 12 bit shift register (MSB first, LSB last). Once the shift register is completely loaded, a 12 bit latch has to be enabled to pass the data from the shift register output to the input of the DAC. This latch is enabled by pulling the control signal LOAD low, and it remains transparent as long as LOAD is low. Based on this timing, an algorithm, a flowchart and a VHDL driver have been developed and are presented below.
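The shift register and latch behaviour described above can be sketched as a small behavioural model. The Python class below is a simplification for illustration only: it copies the register into the latch at the moment LOAD is pulled low and does not model the transparent period or any analog behaviour. The class and method names are hypothetical.

```python
class SerialDacModel:
    """Simplified model of a 12 bit serial-input DAC front end."""

    def __init__(self):
        self.shift_reg = 0   # internal 12 bit shift register
        self.latch = 0       # 12 bit latch feeding the DAC core

    def clock_rising(self, din):
        """On each positive clock edge, shift DIN in (MSB arrives first)."""
        self.shift_reg = ((self.shift_reg << 1) | (din & 1)) & 0xFFF

    def load_low(self):
        """Pulling LOAD low passes the shift register contents to the latch."""
        self.latch = self.shift_reg


dac = SerialDacModel()
word_bits = [1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0]   # 0xABC, MSB first
for bit in word_bits:
    dac.clock_rising(bit)

latch_before_load = dac.latch   # still the previous word (0 after power-up)
dac.load_low()
print(hex(latch_before_load), hex(dac.latch))
```

The latch keeps the previous word until LOAD is asserted, so the analog output is not disturbed while a new word is being shifted in.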
3.8.2
FPGA Driving Demultiplexer and DAC: Core Algorithm
As shown in Fig. 3.13, the Spartan 3E FPGA drives two devices, viz. the demultiplexer and the DAC. The algorithmic steps to drive these two devices are explained below:
• At the outset, serial input data is accepted.
• The input serial data is in the form of a 15 bit data frame, the first three bits of which are the lead address, which was treated as the page address of the 3D memory at the transmitter end.
• The 3 bit address is extracted from the input frame and driven by the FPGA onto the select lines of the analog demultiplexer. According to the select line data (lead address), the output of the DAC is connected to one of the outputs of the demultiplexer, i.e. to the appropriate lead signal.
• After decoding the lead address and connecting the DAC output to one of the lead outputs, the DAC is enabled.
• Out of the 15 bits of the data frame shown in Fig. 3.15, the 12 data bits are extracted and converted into parallel form. This facilitates creation of an ECG signal record in a three dimensional memory corresponding to that at the transmitter end.
• The clock input is then provided to the DAC by asserting the clock signal from the FPGA to start the conversion of the serially received 12 bit data into its analog equivalent.
• It is ensured that the FPGA asserts the clock signal only after extraction of the lead ID from the input data frame. This also ensures that the data is captured at the appropriate instant and loaded into the shift register.
• A train of 12 clock pulses is given to the DAC to convert the 12 serially received bits into their analog equivalent.
• The DAC accepts one bit per positive edge of the clock and continues to do so until it receives a low signal on its LOAD input. On assertion of this low signal, the DAC transfers the serially captured 12 bit word into its latch. This is followed by production of the analog output corresponding to the 12 bit parallel data held in the shift register.
Fig. 3.15 further illustrates the 15 bit data frame being received and used to decode the lead address and convert the bits into their analog equivalent. From the data frame it is clear that the serially received data is first stored in the D0 position, and with each consecutive clock pulse a new bit is loaded through D0 while the previous contents are shifted left. Only at the end of 15 clock cycles does the complete 15 bit data become available for further processing. First, bits D14 to D12 are sliced from the data frame and sent to the output for lead addressing. After this, bits D11 to D0 are used to convert the 12 bit data into its analog equivalent. As the DAC is serial in nature, the data bits are output serially from the Spartan 3E FPGA. The serial stream starts with bit D11 of the data frame. With each clock cycle the data is shifted left by one bit position, and thus the D to A operation is completed in 12 clock cycles.
Fig. 3.16 Flowchart to drive analog Demultiplexor and serial DAC from Spartan 3E FPGA
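The receiver steps listed above can be traced with a short behavioural model. The Python sketch below shifts in a 15 bit frame through D0, slices D14 to D12 as the lead address and re-serializes D11 to D0 for the DAC; the function name and the example frame are assumptions, and clocking and LOAD timing are not modelled.

```python
def receive_frame(bits):
    """Return (lead_addr, magnitude, dac_bits) for a 15 bit MSB-first frame."""
    if len(bits) != 15:
        raise ValueError("a data frame has exactly 15 bits")
    reg = 0
    for bit in bits:
        # new bit enters at D0, previous contents shift one position left
        reg = ((reg << 1) | (bit & 1)) & 0x7FFF
    lead_addr = (reg >> 12) & 0x7        # D14..D12 select the demux channel
    magnitude = reg & 0xFFF              # D11..D0 go to the DAC
    dac_bits = [(magnitude >> i) & 1 for i in range(11, -1, -1)]  # D11 first
    return lead_addr, magnitude, dac_bits


# Hypothetical frame: lead address 5, magnitude 0xABC
bits_in = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0]
addr, mag, dac_bits = receive_frame(bits_in)
print(addr, hex(mag), dac_bits)
```

The 12 bits sent on to the DAC are exactly the last 12 bits of the received frame, in the same order, which is why the D to A operation needs 12 further clock cycles.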
3.8.3
Serial ECG Receiver: Flow Chart
Fig. 3.16 shows the flowchart implementing the algorithm for the serial ECG receiver.
3.9
VHDL Listing for Driving the Analog Demultiplexer and Serial DAC from Spartan-3E FPGA
******************************************************************
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.std_logic_arith.all;

entity LTC1257 is
  port(
    Reset   : in    std_logic;
    CLK     : in    std_logic;
    CLKin   : inout std_logic;            -- Clock input to the DAC LTC1257
    ADDR    : inout integer range 0 to 6; -- ADDRESS to select Output channel of DeMUX
    Load_n  : out   std_logic;            -- Active low; when '0', shift reg. data is
                                          -- passed to the internal DAC register
    DIN     : in    std_logic;            -- Serial Data from Transmitter to be
                                          -- converted in Parallel form
    DIN_DAC : out   std_logic             -- Serial Data out towards DAC serial input
                                          -- for Analog Conversion
  );
end LTC1257;

architecture rtl of LTC1257 is
  signal count   : integer range 0 to 61;
  signal Data_P  : std_logic_vector(14 downto 0);
  signal Data_PS : std_logic_vector(11 downto 0);
  signal RAM_Row : integer range 0 to 15; -- Length of the RAM for each Analog Output (Leads)
  -- Six Leads: Lead-I, Lead-II, Lead-III, aVR, aVF, aVL
  -- THREE DIMENSIONAL RAM ARRAY IS DECLARED
  -- i.e. 16(rows)x6(column:pages of 12 bits)
  subtype bit12 is std_logic_vector(14 downto 0); -- Serial Digital Data is 12 bit
  type RAM_array is array(integer range 0 to 15, integer range 0 to 6) of bit12;
  signal RAM_data : RAM_array;
begin
  RAM_data(RAM_Row, ADDR) <= Data_P;

  Serial_P: process(Reset, CLK)
  begin
    if (Reset = '0') then
      CLKin   <= '0';
      Load_n  <= '1';
      Data_P  <= (others => '0');
      count   <= 0;
      ADDR    <= 0;
      RAM_row <= 0;
    elsif (CLK'event and CLK = '0') then
      count <= count + 1;
      if (count <= 40) then
        if (count <= 14) then -- 15 bits (3 bit: ADDR + 12 bit Magnitude of Analog signal)
          Data_P(0) <= DIN;
          Data_P(14 downto 1) <= Data_P(13 downto 0); -- LEFT SHIFT THE RECEIVED BITS
        end if;
        if (count = 15) then
          ADDR <= CONV_INTEGER(Data_P(14 downto 12));
        end if;
        if (count > 15) then
          CLKin <= not CLKin;
        end if;
        if (count = 39) then
          Load_n <= '0';
        else
          Load_n <= '1';
        end if;
      else
        count   <= 0;
        RAM_row <= RAM_row + 1;
        if (RAM_row = 15) then
          RAM_row <= 0;
        end if;
      end if;
    end if;
  end process Serial_P;
  ---------------------------------------------------------------------------
  S_OUT: process(Reset, CLKin, count, Data_P)
  begin
    if (Reset = '0') then
      Data_PS <= (others => '0');
    elsif (count = 16) then
      Data_PS <= Data_P(11 downto 0);
    elsif (CLKin'event and CLKin = '0') then
      Data_PS(11 downto 1) <= Data_PS(10 downto 0); -- LEFT SHIFT
    end if;
  end process S_OUT;

  DIN_DAC <= Data_PS(11); -- SERIAL INPUT FOR DAC
end rtl;
******************************************************************
3.10
Discussion Regarding the VHDL Implementation
The VHDL code comprises two ‘process’ constructs; their implementation is discussed below.
3.10.1
Process Serial_P
As per the logic given in the algorithm in Sect. 3.8.2, the process labeled ‘Serial_P’ receives the binary data serially at the receiving end. The VHDL statements given below receive the bits at the input signal DIN and assemble them in parallel form in the internal signal Data_P[14:0]:
Data_P(0) <= DIN;
Data_P(14 downto 1) <= Data_P(13 downto 0);
A counter implemented in the code plays a vital role in this process. As shown in the data frame of Fig. 3.15, there are 15 bits to be received, of which 3 bits are the Lead address and 12 bits are the magnitude of the analog signal. Receiving the serial data bits and arranging them in the proper format is therefore performed during the first 15 count values (0–14), i.e. 15 clock cycles. When the count reaches 15, the serial receiving process stops. The three bits Data_P[14:12] of the 15-bit internal signal Data_P are then used to output the Lead address and also to provide the page address of the three dimensional memory in which the data is recorded. The VHDL implementation makes provision to record 16 such data elements in a page. After receiving the complete frame, the Spartan 3E FPGA starts issuing clock pulses to the DAC. The FPGA output signal used for this is ‘CLKin’.
3.10.2
Process S_OUT
This process loads the 12 magnitude bits of the received frame and shifts them out serially to the DAC. As explained above, the DAC requires 12 clock pulses for the conversion of one data frame. With reference to the VHDL listing, as soon as the count value reaches 16 the first bit to be converted is presented at the DAC input from the FPGA output signal. This bit is derived from the internal signal vector ‘Data_PS[11:0]’ in the code. On arrival of the first positive edge of CLKin, this bit is entered into the internal shift register of the DAC. At every negative edge of ‘CLKin’, the ‘Data_PS’ bits are shifted one position to the left to output the next less significant bit of the magnitude. Once the second most significant bit is ready at the DAC input, the DAC simply waits for the next positive edge of CLKin to enter it; the FPGA provides this edge within the same CLKin cycle, since every negative edge is followed by a positive edge. Having received 12 bits in its shift register, the DAC waits for the LOAD input. The FPGA drives the ‘Load_n’ signal low after 12 clock cycles of ‘CLKin’, which lets the DAC transfer its 12-bit shift register data into the DAC latch. With this the DAC produces the analog voltage at its output. Thus the first process provides the Lead address, while the second supplies the serial data for the DAC, whose output is routed to the respective output of the analog demultiplexer. Together they facilitate the reconstruction of the transmitted ECG waveform.
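The handshake between the S_OUT process and the serial DAC can be illustrated with a small behavioural model in Python. It is a sketch under simplifying assumptions, not a timing-accurate simulation: the names `fpga_serial_out` and `SerialDac` and the test value 0x9A5 are hypothetical, the FPGA side is assumed to shift on falling edges of CLKin, and the DAC side is assumed to sample on rising edges and to latch when LOAD is asserted.

```python
def fpga_serial_out(magnitude):
    """Yield the DIN_DAC level present at each of the 12 rising edges of CLKin."""
    data_ps = magnitude & 0xFFF              # Data_PS <= Data_P(11 downto 0)
    for _ in range(12):
        yield (data_ps >> 11) & 1            # DIN_DAC <= Data_PS(11)
        # Falling edge: Data_PS(11 downto 1) <= Data_PS(10 downto 0);
        # bit 0 keeps its value, as in the VHDL listing.
        data_ps = ((data_ps << 1) & 0xFFE) | (data_ps & 1)


class SerialDac:
    """Minimal model of a 12-bit serial-input DAC with a separate latch."""

    def __init__(self):
        self.shift_reg = 0
        self.latch = 0

    def rising_edge(self, din):
        # One bit is shifted in on every rising clock edge.
        self.shift_reg = ((self.shift_reg << 1) | (din & 1)) & 0xFFF

    def load(self):
        # Asserting LOAD (active low) copies the shift register to the latch.
        self.latch = self.shift_reg


dac = SerialDac()
sent_bits = list(fpga_serial_out(0x9A5))
for bit in sent_bits:
    dac.rising_edge(bit)
dac.load()
```

After 12 clock cycles and one LOAD pulse, the latch of the model holds the same 12-bit word that was loaded into Data_PS, with the MSB having been sent first.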
3.11
ModelSim Simulation Results
Figure 3.17 shows the simulation results obtained for the VHDL code discussed above. As shown in the simulation window, when ‘Reset’ is released from low to high the counter starts counting. During the first 15 counts the serial input data is accepted and stored in the signal ‘Data_P’. When the count reaches 15, Lead address 4 (100 in binary) is given at the output. From count value 16 onwards CLKin starts toggling to provide 12 clock cycles. At count = 16 the lower 12 bits of ‘Data_P’ are stored in ‘Data_PS’; as shown in the simulation window, the value is “111001001101”. The first bit (MSB) = ‘1’ is given out without a ‘CLKin’ negative edge. On each subsequent negative edge of ‘CLKin’ the data bits are shifted one position to the left. The serial propagation of these bits may be seen in the simulation window. The simulation window also shows the data storage in the three dimensional memory. In one of the simulation cycles the Lead address is 4, and the data is therefore stored in the 4th page. The simulation also shows ‘Data_P’ being recorded in page number 4, line number 0. This is marked by two rectangles in the simulation window: the left rectangle shows the memory address and the rightmost one shows the data being recorded in memory. These contents are obtained from the signal ‘Data_PS’ and are marked in the simulation window with ellipses.
Fig. 3.17 Simulation results of VHDL code for ECG receiver driver
At the end of the last CLKin pulse, the signal ‘Load_n’ is pulled low and remains at that level for only one cycle.
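The page and line addressing of the memory seen in the simulation can be mimicked with an idealised Python model. This is a hedged sketch: the class name `FrameLogger` is hypothetical, and the model writes each completed frame exactly once, whereas the VHDL listing uses a concurrent assignment to RAM_data. The dimensions follow the array declaration (rows 0 to 15, pages 0 to 6), and the frame is the one reported for the simulation (Lead address 100, data 111001001101).

```python
ROWS, PAGES = 16, 7   # array(0 to 15, 0 to 6) in the VHDL listing


class FrameLogger:
    """Idealised model: one completed 15-bit frame is stored per row."""

    def __init__(self):
        self.ram = [[0] * PAGES for _ in range(ROWS)]
        self.row = 0

    def log(self, frame):
        lead = (frame >> 12) & 0x7            # Data_P(14 downto 12)
        if lead >= PAGES:
            raise ValueError("Lead address outside the declared range 0 to 6")
        self.ram[self.row][lead] = frame & 0x7FFF
        stored_at = (lead, self.row)          # (page, line)
        self.row = (self.row + 1) % ROWS      # RAM_row wraps after 15
        return stored_at


logger = FrameLogger()
frame = int("100" + "111001001101", 2)        # frame from the simulation run
page, line = logger.log(frame)
```

With the simulated frame the model reports page 4, line 0, in agreement with the two rectangles described for the simulation window.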
3.12
Synthesis Results Using Mentor Graphics Tool: Leonardo Spectrum
The VHDL code was synthesized using the Mentor Graphics EDA tool Leonardo Spectrum. The results are presented in the following report.
3.12.1
Synthesis Report
Cell: LTC1257    View: rtl    Library: work

Total accumulated area:
Number of BUFG                          1
Number of BUFGP                         1
Number of Dffs or Latches              38
Number of Function Generators          43
Number of IBUF                          2
Number of MUX CARRYs                    5
Number of OBUF                          6
Number of ports                         9
Number of nets                        106
Number of instances                   103
Number of references to this view       0

Device Utilization for 2s30pq208
Resource               Used    Avail    Utilization
IOs                       9      132      6.82%
Function Generators      43      864      4.98%
CLB Slices               22      432      5.09%
Dffs or Latches          38    1,296      2.93%

Clock Frequency Report
Clock    Frequency
CLK      78.1 MHz

Cell       Library    References    Total Area
BUFG       xis2        1 x 1         1 BUFG
BUFGP      xis2        1 x 1         1 BUFGP
FDCE_1     xis2       19 x 1        19 Dffs or Latches
FDCPE_1    xis2        1 x 1         1 Dffs or Latches
FDCP_1     xis2       11 x 1        11 Dffs or Latches
FDC_1      xis2        6 x 1         6 Dffs or Latches
FDPE_1     xis2        1 x 1         1 Dffs or Latches
GND        xis2        1 x 1         1 GND
IBUF       xis2        2 x 1         2 IBUF
LUT1       xis2        3 x 1         3 Function Generators
LUT2       xis2        1 x 1         1 Function Generators
LUT2_L     xis2        6 x 1         6 Function Generators
LUT3       xis2        4 x 1         4 Function Generators
LUT4       xis2       29 x 1        29 Function Generators
MUXCY_L    xis2        5 x 1         5 MUX CARRYs
OBUF       xis2        6 x 1         6 OBUF
XORCY      xis2        6 x 1         6 XORCY

3.12.2
RTL View
The RTL view of the design is shown in Fig. 3.18.
Fig. 3.18 RTL view of the ECG serial receiver
Fig. 3.19 Technology schematic view of the ECG serial receiver
Fig. 3.20 Critical path view of the ECG serial receiver
Fig. 3.21 Snapshot of the setup
3.12.3
Technology Schematic View
The Technology Schematic view of the design is as shown in Fig. 3.19.
3.12.4
Critical Path Schematic
The critical path schematic produced by the synthesis tool Leonardo Spectrum is shown in Fig. 3.20. Figure 3.21 shows a snapshot of the setup.
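The critical path bounds the clock frequency reported in the synthesis report of Sect. 3.12.1. As a quick cross-check of that report, the short Python sketch below recomputes the utilization percentages from the used/available counts and converts the reported 78.1 MHz into a minimum clock period. The variable names are illustrative assumptions; the numbers are those of the report.

```python
# (used, available) pairs as listed in the device utilization table.
report = {
    "IOs": (9, 132),
    "Function Generators": (43, 864),
    "CLB Slices": (22, 432),
    "Dffs or Latches": (38, 1296),
}

# Utilization in percent, rounded to two decimals as in the report.
utilization = {
    name: round(100.0 * used / avail, 2)
    for name, (used, avail) in report.items()
}

# Minimum clock period implied by the reported maximum frequency.
f_max_mhz = 78.1
period_ns = 1000.0 / f_max_mhz

for name, percent in utilization.items():
    print(f"{name:20s} {percent:5.2f} %")
print(f"Minimum clock period: {period_ns:.2f} ns")
```

The recomputed percentages agree with the table, and a 78.1 MHz limit corresponds to a critical path delay of roughly 12.8 ns.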
3.13
Conclusion
The Soft IP core for the ECG logger was presented in this chapter. As described here, the presented setup is a standalone system for ECG acquisition. The system performs fairly accurate ECG acquisition and storage for a 12 h duration. Offline diagnosis is provided for by uploading the data stored in the memory through the serial interface. A transceiver setup designed using the MODEM also facilitates continuous transmission of the ECG from the patient end, to be received at the hospital end for further medical diagnosis. The FPGA based system design aspects, including the synthesis report and various parameters such as the critical path and the number of LUTs and CLBs used to realize the design, have also been presented. The simulation results obtained using ModelSim allow the behaviour to be visualized before actual prototyping. The overall energy efficiency of the system enables its use in a hand-held, battery-operated manner and makes it useful for portable applications. Displaying the R-R intervals on a liquid crystal display (LCD) aids a specialist in drawing conclusions from the morphologies. Thus the reported ECG monitoring system is small, easy to carry and cost effective, which satisfies the needs of most clinical technicians. Compared to the expensive medical monitoring systems currently on the market, this system provides comparable performance at a lower cost.
Chapter 4
FPGA Based Multifunction Interface for Embedded Applications
Abstract Embedding a soft IP core inside an FPGA has many advantages, such as customization, design reuse, acceleration of the design cycle and a narrower time to market window, thereby enhancing productivity. In view of these advantages, FPGA based systems are now penetrating the embedded arena, which has marked the take off of configware in place of the traditional embedded hardware and software. Designers now address the complexity of FPGA based SoCs in the embedded paradigm by using configware libraries that comprise soft IP cores. The design techniques pertaining to soft IP cores are now regarded as evolutionary techniques, and the novel design methodologies of embedded systems have come out to be just an analytical marriage of these soft IP cores. Pre-designed and pre-verified soft IP cores, such as the ones designed in this chapter, address pertinent issues such as time to market, performance, area and power metrics. Further, designing such soft IP cores in Handel C facilitates maximum flexibility and reconfigurability to match the requirements of a specific embedded design application. All the cores reported in this chapter have been verified/prototyped on the Xilinx Starter kit.
4.1
Introduction
Embedded systems are computing systems performing specific tasks within a framework of real-world constraints. These constraints are generally in terms of speed, power and memory. Some of the essential attributes of state of the art Embedded Systems are as follows:
* Infinite loop, a.k.a. super loop, execution
* Generally part of a larger system
* Application specific architecture
* No general purpose software
* Real time interaction with the environment wherein deployed
* Software and hardware scalability and reconfigurability
* Supposed to provide the output with well defined latency and throughput
* Power efficient
* Less footprint
* Robust and reliable
* Capability to network/built in protocol stacks
R.K. Kamat et al., Harnessing VLSI System Design with EDA Tools, DOI 10.1007/978-94-007-1864-7_4, © Springer Science+Business Media B.V. 2012
Most Embedded systems so far have been microcontroller centric. However, the intense requirement for the above mentioned attributes has forced designers to migrate to the FPGA based paradigm. As systems-on-chips (SoCs) become more widely used due to their improved performance, both in terms of speed and power consumption, reconfigurable devices and in particular field-programmable gate arrays (FPGAs) are becoming more and more popular in the development of embedded systems where issues such as short time-to-market and update capabilities after deployment are critical [121]. Many of the emerging applications in the embedded arena, such as programmable autonomous vehicles, smart domestic appliances and industrial controllers, necessitate area, power, speed and/or cost optimizations. It is very hard to satisfy the above mentioned attributes in purely microcontroller centric embedded applications. However, there is great potential to optimize all of them in the FPGA paradigm. The unremitting advances in this reconfigurable domain enable the realization of systems that not only offer the designer an invigorating environment to test the proof-of-concept but also do so rapidly by narrowing the development time window. Modern FPGA devices include useful specific resources such as memory blocks, hardware multipliers, clock management circuits and high speed interface circuits; combining these with the opportunity to integrate third party soft processor cores such as MicroBlaze, PicoBlaze and Nios II facilitates very rapid development of the intended embedded applications. Against the backdrop of these developments, the present chapter introduces a universal FPGA based general purpose interface with a host of peripheral cores so as to kick start any application in the embedded arena.
A mere customization of these soft IP cores developed in Handel C would enable designers to realize their embedded applications for any sort of application domain. At the outset the design of the board is described, followed by a detailed listing of the Handel C code in the form of the soft IP cores.
4.2
Universal FPGA Based Interface for High End Embedded Applications
4.2.1
Hardware Aspects
Figure 4.1 shows the hardware design of the board, centered on the Xilinx Spartan 3e FPGA. For testing purposes we have used the Xilinx Spartan 3e Starter kit, details of which are given in [122]. Out of the resources available on the board we have developed the soft IP cores for only those required frequently in embedded applications. These peripheral hardware resources are as follows:
* 2-line by 16-character liquid crystal display (LCD)
* Linear Tech LTC2624 Quad DAC
* Linear Tech LTC6912-1 Dual Amp
* Linear Tech LTC1407A-1 Dual A/D
* PS/2 mouse/keyboard port
* VGA display port
Fig. 4.1 Hardware design of the universal FPGA based interface
Since the hardware interfacing is standard and is widely covered in the literature [122], it is omitted here. Instead, the soft IP cores for the above mentioned hardware resources are presented in the following sections.
4.3
Soft IP Core for the LCD Interface
The liquid crystal display is gaining paramount importance in embedded systems. It offers the user high flexibility to display the required data. The LCD taken up for the present application can be any of the Samsung S6A0069X or KS0066U, Hitachi HD44780 or SMOS SED1278 types. All of them have similar interfacing requirements.
Fig. 4.2 Top level view of the soft IP core for the LCD
Fig. 4.3 Detailed synthesis view of the soft IP core for the LCD
Though the interfacing of most of the above mentioned LCDs to microcontrollers is very common, their interfacing to FPGAs requires a design of higher complexity in terms of hardware, as opposed to a microcontroller based design, which is software based [123]. In the present soft IP core, the case of an HD44780 LCD interfaced with 4 data lines and 3 control lines emanating from the Xilinx Spartan 3e FPGA is considered. In-depth details of the hardware interfacing of the HD44780 (and, for that matter, any of the above mentioned equivalents) are covered in [124]. The detailed listing of the soft IP core for the above mentioned LCD is given below. Figures 4.2 and 4.3 give the synthesis views on the Xilinx Spartan 3e FPGA.
//****************************************************************//
/* program to interface LCD with FPGA with 4 data line interface.
   Date := 05-12-2009;
   LCD data lines =
   E =
   RS =
   RW =
*/
set clock = external "C9";

void cmd_wrt (unsigned int 8);
void data_wrt (unsigned char);
void delay_l (unsigned int 32);

const unsigned int 32 delay_24_clk = 24;
const unsigned int 32 delay_4_1_ms = 300000;
const unsigned int 32 delay_100_mis = 7000;
const unsigned int 32 delay_40_mis = 3000;
const unsigned int 32 delay_1_mis = 200;
const unsigned char function_set = 0x28;
const unsigned char entry_mode_set = 0x06;
const unsigned char display_on = 0x0c;
const unsigned char clear_display = 0x01;
const unsigned char data[10] = "SHIVAJI ";

unsigned int 4 SF_D;
unsigned int 1 E = 0, RS, RW;

interface bus_out () OutBus(unsigned int 4 OutPort = SF_D) with {data = {"M15", "P17", "R16", "R15"}};
interface bus_out () OutBus1(unsigned int 1 OutPort = RS) with {data = {"L18"}};
interface bus_out () OutBus2(unsigned int 1 OutPort = RW) with {data = {"L17"}};
interface bus_out () OutBus3(unsigned int 1 OutPort = E) with {data = {"M18"}};

void main (void)
{
    unsigned int j;

    SF_D = 0x3;                 // initialise LCD   I
    RW = 0;
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_4_1_ms);

    SF_D = 0x3;                 // II
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);

    SF_D = 0x3;                 // III
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);

    SF_D = 0x2;                 // IV
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);

    // now send command to LCD to configure it as 4 data lines
    cmd_wrt (function_set);     // function_set = 0x28 to 4 data lines configuration.
    delay_l (delay_40_mis);     // delay of 40 micro seconds.
    // set entry mode (autoincrement)
    cmd_wrt (entry_mode_set);
    delay_l (delay_40_mis);
    // Now send command for Display on, Cursor off
    cmd_wrt (display_on);
    delay_l (delay_40_mis);
    // Now clear LCD
    cmd_wrt (clear_display);
    delay_l (delay_4_1_ms);

    // Now write data to LCD
    for (j = 0; j < 10; j++)
    {
        data_wrt (data[j]);
        delay_l (delay_40_mis);
    }
}

void data_wrt (unsigned char d)
{
    unsigned int 4 upper_nibble, lower_nibble;

    upper_nibble = d[7:4];
    lower_nibble = d[3:0];
    SF_D = upper_nibble;
    RS = 1;                     // RS = 1 for data register
    RW = 0;
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_1_mis);
    SF_D = lower_nibble;
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    return;
}

void cmd_wrt (unsigned int 8 cmd)
{
    unsigned int 4 upper_nibble, lower_nibble;

    upper_nibble = cmd[7:4];
    lower_nibble = cmd[3:0];
    RS = 0;
    RW = 0;
    SF_D = upper_nibble;
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_1_mis);
    SF_D = lower_nibble;
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    return;
}

void delay_l (unsigned int 32 a)
{
    unsigned int 32 i;

    for (i = a; i > 0; i--)
    {
        delay;
        if (i == 0) break;
    }
    return;
}
//****************************************************************//
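The order in which nibbles appear on the 4-bit bus SF_D in the listing above can be traced with a short Python sketch. It is an illustration only: the function names are hypothetical, the timing delays are ignored, and RS is assumed to be low during the four initialisation writes (the listing does not set RS explicitly at that point).

```python
INIT_NIBBLES = [0x3, 0x3, 0x3, 0x2]           # steps I to IV in the listing
CONFIG_COMMANDS = [0x28, 0x06, 0x0C, 0x01]    # function set, entry mode,
                                              # display on, clear display


def split_nibbles(byte):
    """Return (upper_nibble, lower_nibble), the order used by cmd_wrt/data_wrt."""
    return (byte >> 4) & 0xF, byte & 0xF


def lcd_bus_sequence(text):
    """List the (RS, SF_D) pairs strobed with E for initialisation plus text."""
    seq = [(0, n) for n in INIT_NIBBLES]
    for cmd in CONFIG_COMMANDS:
        seq.extend((0, n) for n in split_nibbles(cmd))    # RS = 0: command
    for ch in text:
        seq.extend((1, n) for n in split_nibbles(ord(ch)))  # RS = 1: data
    return seq


sequence = lcd_bus_sequence("S")
```

Each byte therefore costs two E strobes in 4-bit mode, with the upper nibble always sent first.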
4.4
Soft IP Core for the DAC Interface
With recent improvements in the density of field programmable gate array (FPGA) devices, systems with an increasingly higher level of integration are possible. Digital signal processing, for instance, is an important application area for FPGAs, and such systems often require data converters to provide analog outputs from digital domain representations and vice versa [125]. There are many application notes covering sample DAC interfacing to FPGAs [126–128]. The soft IP core developed here, however, is for the Linear Tech LTC2624 Quad DAC. The LTC®2604/LTC2614/LTC2624 are quad 16-, 14- and 12-bit, 2.5–5.5 V rail-to-rail voltage output DACs in 16-lead narrow SSOP packages. These parts have separate reference inputs for each DAC. They have built-in high performance output buffers and are guaranteed monotonic [129]. As per the datasheet in [130], the features and applications of this DAC are as follows:
Features
* Guaranteed 16-Bit Monotonic Over Temperature
* Separate Reference Inputs for each DAC
* Wide 2.5 V to 5.5 V Supply Range
* Low Power Operation: 250 µA per DAC at 3 V
* Individual DAC Power-Down to 1 µA, Max
* Ultralow Crosstalk Between DACs (<5 µV)
* High Rail-to-Rail Output Drive (±15 mA)
* Double Buffered Digital Inputs
* 16-Lead Narrow SSOP Package
Applications
* Mobile Communications
* Process Control and Industrial Automation
* Instrumentation
* Automatic Test Equipment
Fig. 4.4 Top level synthesis view of the DAC Interface
The hardware interfacing details of the Linear Tech LTC2624 for Xilinx FPGAs are covered in depth by Ken Chapman in [130]. The Handel C listing for the same is given below and can be customized for any embedded application. The synthesis view is presented in Fig. 4.4.
4.5
Handel C Listing of the Soft IP Core for the DAC Interface
//****************************************************************//
unsigned int 32 CALL_DAC (unsigned int 32);

unsigned int 1 SPI_MOSI, DAC_CS, SPI_SCK, DAC_CLR, SPI_MISO;
unsigned int 32 SEND_DATA, READ_DATA, DAC_CODE1, PREV_CODE, DAC_CODE2;
unsigned int 6 i;

const unsigned int 32 delay_24_clk = 24;
const unsigned int 32 delay_4_1_ms = 300000;
const unsigned int 32 delay_100_mis = 7000;
const unsigned int 32 delay_40_mis = 3000;
const unsigned int 32 delay_1_mis = 200;
const unsigned char function_set = 0x28;
const unsigned char entry_mode_set = 0x06;
const unsigned char display_on = 0x0c;
const unsigned char clear_display = 0x01;

unsigned char data;
unsigned int 4 SF_D;
unsigned int 1 E = 0, RS, RW;

set clock = external "c9";

void cmd_wrt (unsigned int 8);
void data_wrt (unsigned char);
void delay_l (unsigned int 32);

interface bus_out () OutBus(unsigned int 4 OutPort = SF_D) with {data = {"M15", "P17", "R16", "R15"}};
interface bus_out () OutBus1(unsigned int 1 OutPort = RS) with {data = {"L18"}};
interface bus_out () OutBus2(unsigned int 1 OutPort = RW) with {data = {"L17"}};
interface bus_out () OutBus3(unsigned int 1 OutPort = E) with {data = {"M18"}};
interface bus_out () OutBus5(unsigned int 1 OutPort = SPI_MOSI) with {data = {"T4"}};
interface bus_out () OutBus6(unsigned int 1 OutPort = DAC_CS) with {data = {"N8"}};
interface bus_out () OutBus7(unsigned int 1 OutPort = SPI_SCK) with {data = {"U16"}};
interface bus_out () OutBus8(unsigned int 1 OutPort = DAC_CLR) with {data = {"P8"}};
interface bus_in (unsigned int 1 Data1) Inbus1() with {data = {"N10"}};

void main (void)
{
    SF_D = 0x3;                 // initialise LCD   I
    RW = 0;
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_4_1_ms);

    SF_D = 0x3;                 // II
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);

    SF_D = 0x3;                 // III
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);

    SF_D = 0x2;                 // IV
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);

    // now send command to LCD to configure it as 4 data lines
    cmd_wrt (function_set);     // function_set = 0x28 to 4 data lines configuration.
    delay_l (delay_40_mis);     // delay of 40 micro seconds.
    // set entry mode (autoincrement)
    cmd_wrt (entry_mode_set);
    delay_l (delay_40_mis);
    // Now send command for Display on, Cursor off
    cmd_wrt (display_on);
    delay_l (delay_40_mis);
    // Now clear LCD
    cmd_wrt (clear_display);
    delay_l (delay_4_1_ms);

    DAC_CODE1 = 0x0030fff0;
    PREV_CODE = CALL_DAC (DAC_CODE1);
    DAC_CODE2 = 0x0030fff0;
    PREV_CODE = CALL_DAC (DAC_CODE2);
    data = PREV_CODE[7:0];
    data_wrt (data);
    data = PREV_CODE[15:8];
    data_wrt (data);
    data = PREV_CODE[23:16];
    data_wrt (data);
    data = PREV_CODE[31:24];
}

unsigned int 32 CALL_DAC (unsigned int 32 SEND_DATA)
{
    DAC_CS = 0;
    i = 0;
    SPI_MOSI = SEND_DATA[31];
    SPI_MISO = Inbus1.Data1;
    do
    {
        par
        {
            SPI_SCK = 1;
            SPI_MISO = Inbus1.Data1;
            READ_DATA = READ_DATA << 1;
            SEND_DATA = SEND_DATA << 1;
        }
        par
        {
            SPI_SCK = 0;
            SPI_MOSI = SEND_DATA[31];
            READ_DATA = READ_DATA[31:1] @ SPI_MISO;
            i++;
        }
    } while (i < 33);
    DAC_CS = 1;
    return (READ_DATA);
}

void data_wrt (unsigned char d)
{
    unsigned int 4 upper_nibble, lower_nibble;

    upper_nibble = d[7:4];
    lower_nibble = d[3:0];
    SF_D = upper_nibble;
    RS = 1;                     // RS = 1 for data register
    RW = 0;
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_1_mis);
    SF_D = lower_nibble;
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    return;
}

void cmd_wrt (unsigned int 8 cmd)
{
    unsigned int 4 upper_nibble, lower_nibble;

    upper_nibble = cmd[7:4];
    lower_nibble = cmd[3:0];
    RS = 0;
    RW = 0;
    SF_D = upper_nibble;
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_1_mis);
    SF_D = lower_nibble;
    delay;
    delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    return;
}

void delay_l (unsigned int 32 a)
{
    unsigned int 32 i;

    for (i = a; i > 0; i--)
    {
        delay;
        if (i == 0) break;
    }
    return;
}
//****************************************************************//
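The constant 0x0030fff0 passed to CALL_DAC in the listing above can be read as a 32-bit DAC command word. The Python sketch below is a hedged illustration: it assumes a field layout of 8 don't-care bits, a 4-bit command, a 4-bit address, 12 data bits and 4 don't-care bits, which should be checked against the LTC2624 datasheet [130]. The function names are hypothetical.

```python
def pack_dac_word(command, address, data12):
    """Build a 32-bit word under the assumed layout:
    [31:24] don't care, [23:20] command, [19:16] address,
    [15:4] 12-bit data, [3:0] don't care."""
    return ((command & 0xF) << 20) | ((address & 0xF) << 16) | ((data12 & 0xFFF) << 4)


def unpack_dac_word(word):
    """Split a 32-bit word back into its assumed fields."""
    return {
        "command": (word >> 20) & 0xF,
        "address": (word >> 16) & 0xF,
        "data": (word >> 4) & 0xFFF,
    }


def msb_first(word, width=32):
    """Bit order in which CALL_DAC places the word on SPI_MOSI."""
    return [(word >> i) & 1 for i in range(width - 1, -1, -1)]


word = pack_dac_word(0x3, 0x0, 0xFFF)
fields = unpack_dac_word(0x0030FFF0)
bits = msb_first(word)
```

Under this assumed layout the word used in the listing carries command 0x3, address 0x0 and the full-scale 12-bit code 0xFFF.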
4.6
Soft IP Core for the Linear Tech LTC6912-1 Dual Amp Interface
The LTC®6912 is a family of dual channel, low noise, digitally programmable gain amplifiers (PGA) that are easy to use and occupy very little PC board space. The gains for both channels are independently programmable using a 3-wire SPI interface to select voltage gains of 0, 1, 2, 5, 10, 20, 50 and 100 V/V (LTC6912-1) and 0, 1, 2, 4, 8, 16, 32 and 64 V/V (LTC6912-2) [131]. As per the data sheet in [132], the features of the device are as follows:
* 2 Channels with Independent Gain Control
  LTC6912-1: (0, 1, 2, 5, 10, 20, 50, and 100 V/V)
  LTC6912-2: (0, 1, 2, 4, 8, 16, 32, and 64 V/V)
* Offset Voltage = 2 mV Max (−40 °C to 85 °C)
* Channel-to-Channel Gain Matching of 0.1 dB Max
* 3-Wire SPI™ Interface
* Extended Gain-Bandwidth at High Gains
* Wired-OR Outputs Possible (2:1 Analog MUX Function)
* Low Power Hardware Shutdown (GN-16 Only, 2 µA Max at 2.7 V)
* Rail-to-Rail Input Range
* Rail-to-Rail Output Swing
* Single or Dual Supply: 2.7 V to 10.5 V Total
* Input Noise: 12.6 nV/√Hz
* Total System Dynamic Range to 115 dB
* 16-Pin GN (SSOP) or 12-Pin DFN Package Options
In fact these amplifiers are more often used in conjunction with the ADC and the hardware interfacing details of the same are well documented in [133].
The Handel C soft IP core developed here can be embedded on the Xilinx Spartan 3e Starter Kit, and it can be customized for different values of the gain to maintain the compatibility of the analog signal with the step size and resolution of the ADC.
//****************************************************************//
// Amplifier Interface
set clock = external "C9";

unsigned int 1 Amp_MOSI, Amp_Dout, Amp_Sck, CS, Amp_Shut;
unsigned int 8 Amp_Set (unsigned int 8);

void main (void)
{
    unsigned int 8 Data1, Data2, Data3;

    CS = 0;
    Amp_Shut = 0;
    delay;
    delay;
    Data1 = 0x11;
    Data2 = 0x22;
    Data3 = 0x00;
    Data3 = Amp_Set (Data1);
    Data3 = Amp_Set (Data2);
}

unsigned int 8 Amp_Set (unsigned int 8 Wr)
{
    unsigned int 4 i, count;
    static unsigned int 8 Rd;

    Amp_Sck = 0;
    for (i = 8; i > 0; i--)
    {
        Amp_MOSI = Wr[7];       // present the MSB first
        delay;
        delay;
        Amp_Sck = 1;
        Rd = Rd[7:1] @ Amp_Dout;
        delay;
        delay;
        delay;
        par
        {
            Amp_Sck = 0;
            Wr = Wr << 1;
            Rd = Rd << 1;
        }
    }
    return (Rd);
}
//****************************************************************//
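The gain bytes 0x11 and 0x22 written by the listing above can be decoded with the following Python sketch. It is an assumption-laden illustration: each nibble is taken to hold the gain code of one channel, codes 0 to 7 are mapped onto the LTC6912-1 gain steps quoted earlier (0, 1, 2, 5, 10, 20, 50 and 100 V/V), and codes above 7 are not modelled. Which nibble belongs to which channel is not stated in the text, so the function simply returns the gains of the upper and lower nibble.

```python
LTC6912_1_GAINS = (0, 1, 2, 5, 10, 20, 50, 100)   # V/V, from the feature list


def decode_gain_byte(byte):
    """Return (gain of upper nibble, gain of lower nibble) in V/V."""
    upper, lower = (byte >> 4) & 0xF, byte & 0xF
    for code in (upper, lower):
        if code > 7:
            raise ValueError("gain codes above 7 are not modelled in this sketch")
    return LTC6912_1_GAINS[upper], LTC6912_1_GAINS[lower]


def encode_gain_byte(upper_gain, lower_gain):
    """Inverse mapping: build the byte for two gains from the table."""
    return (LTC6912_1_GAINS.index(upper_gain) << 4) | LTC6912_1_GAINS.index(lower_gain)


gains_first = decode_gain_byte(0x11)    # first byte written in the listing
gains_second = decode_gain_byte(0x22)   # second byte written in the listing
```

Under these assumptions the listing first programs both channels for a gain of 1 V/V and then for a gain of 2 V/V; other gains are obtained by changing Data1 and Data2 accordingly.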
4.7
Soft IP Core for the ADC Interface
As the need for data bandwidth increases in end systems, data transmission rates continue to increase for Analog to Digital Converters (ADCs) and for the associated FPGA solutions that interface to the ADCs and other parts of the system. Manufacturers of ADCs and FPGAs have responded with faster, more capable devices at a lower cost [134]. As reported in [135], there are various simple techniques to accomplish the interfacing of an ADC to an FPGA. The above mentioned reference showcases that the FPGA can be used to fulfill all the interfacing requirements: it is used for all system logic functions, which allows software development, testing, implementation and modification of the logic circuitry without hardware modification. In that system the inputs to the FPGA are the outputs of the 18 discriminators. The FPGA logic produces and locks an event signal whenever two modules along allowed lines of response trigger within a preset resolving time. The FPGA also produces detector codes identifying the detectors involved in the event. These are recorded and also sent to the multiplexer to switch the position signals from the modules involved in the event to a buffered CAMAC ADC system for digitization. A “not busy” signal from the ADC is used to reset the FPGA, releasing the locked signals and allowing further event acquisition. Working on the same lines, this section reports the soft IP core for the Linear Tech LTC1407A-1 Dual A/D converter. The devices of this family are 12-bit/14-bit, 3 Msps ADCs with two 1.5 Msps simultaneously sampled differential inputs; they draw only 4.7 mA from a single 3 V supply and come in a tiny 10-lead MS package. As given in [136], the device features:
* 3Msps Sampling ADC with Two Simultaneous Differential Inputs
* 1.5Msps Throughput per Channel
* Low Power Dissipation: 14 mW (Typ)
* 3 V Single Supply Operation
* ±1.25 V Differential Input Range
* Pin Compatible 0 V to 2.5 V Input Range Version (LTC1407/LTC1407A)
* 2.5 V Internal Bandgap Reference with External Overdrive
* 3-Wire Serial Interface
* Sleep (10 µW) Shutdown Mode
* Nap (3 mW) Shutdown Mode
* 80 dB Common Mode Rejection at 100 kHz
The device is widely used in applications such as telecommunications, data acquisition systems, uninterrupted power supplies, multiphase motor control, I & Q demodulation and industrial radio. The hardware interfacing details are covered by many web resources such as the one in [137]. The soft IP core for the same is given below in Handel C.
Program for ADC interfacing
/* DATE - 19-12-2009 TIME - 2:15 AM
   program to interface the on board ADC Linear Tech LTC1407A-1 Dual A/D
   on the SPARTAN 3E STARTER KIT; the timing spec. for SPI interface as follow. */
/************************ALL_FUNCTIONS*************************/
/*****************ADC******************************/
unsigned int 34 CALL_ADC();
/*******************ADC_COMPLETE*******************************/
/********************AMP*************************/
unsigned int 8 Amp_Set (unsigned int 8);
/***************************AMP_COMPLETE************************/
/***************LCD*************************************/
void cmd_wrt (unsigned int 8);
void data_wrt (unsigned char);
void delay_l (unsigned int 32);
/******************LCD_COMPLETE**************/
unsigned int 1 AMP_DOUT;
/*************************ADC****************************/
unsigned int 1 SPI_MISO, SPI_SCK, SPI_MOSI, AD_CONV;
unsigned int 32 READ;
unsigned int 8 Send_Data;
unsigned int 34 DATA1;
/**********************************************/
/*************************AMP*******************/
unsigned int 1 AMP_CS, AMP_SHDN;
/******************************************************/
unsigned int 4 SF_D;
unsigned int 1 E = 0, RS, RW;
/********************************************/
/********************PROGRAM_VARIABLES*******************/
unsigned int 8 Data1, Data2, Data3, count = 0;
unsigned int 8 i;
unsigned int 34 DATA;
unsigned int 1 SPI_SS_B, SF_CE0, FPGA_INIT_B;
unsigned int 1 DAC_CS, DAC_CLR;
/****************************************************************/
// ALL_BUS_DEFINITIONS
const unsigned int 32 delay_24_clk = 24;
const unsigned int 32 delay_4_1_ms = 300000;
const unsigned int 32 delay_100_mis = 7000;
const unsigned int 32 delay_40_mis = 3000;
const unsigned int 32 delay_1_mis = 200;
const unsigned char function_set = 0x28;
const unsigned char entry_mode_set = 0x06;
const unsigned char display_on = 0x0c;
const unsigned char clear_display = 0x01;
const unsigned int 32 delay_1_sec = 0x2fffff0;
/********************************ADC****************************/
interface bus_out () OutBus9(unsigned int 1 OutPort=AD_CONV) with {data = {"P11"}};
interface bus_out () OutBus4(unsigned int 8 OutPort=Send_Data) with {data = {"F12","E12","E11","F11","C11","D11","E9","F9"}};
/****************************************************************/
interface bus_out () OutBus13(unsigned int 1 OutPort=SPI_SS_B) with {data = {"U3"}};
interface bus_out () OutBus14(unsigned int 1 OutPort=SF_CE0) with {data = {"D16"}};
interface bus_out () OutBus15(unsigned int 1 OutPort=FPGA_INIT_B) with {data = {"T3"}};
/*************************AMP***********************************/
interface bus_out () OutBus10(unsigned int 1 OutPort=AMP_CS) with {data = {"N7"}};
interface bus_out () OutBus11(unsigned int 1 OutPort=AMP_SHDN) with {data = {"P7"}};
interface bus_in (unsigned int 1 AMP_DOUT) Inbus1() with {data = {"E18"}};
interface bus_in (unsigned int 1 SPI_MISO) Inbus2() with {data = {"N10"}};
set clock = external "c9";
void main (void)
{
    //************************AMP*****************************
    DAC_CS = 1;
    SPI_SS_B = 1;
    AD_CONV = 0;
    SF_CE0 = 1;
    FPGA_INIT_B = 1;
    AMP_CS = 0;
    while(1)
    {
        AMP_SHDN = 0;
        delay; delay;
        Data1 = 0x11;
        Data2 = 0x11;
        Data3 = Amp_Set(Data1);
        Data3 = Amp_Set(Data2);
        AMP_CS = 1;
        // Read Data from ADC
        DATA = CALL_ADC();
        delay; delay; delay; delay; delay;
        // Data Conversion Decimal to ASCII for LCD display
        Data2 = 0b0011@DATA[33:30]; data_wrt(Data2);
        Data2 = 0b0011@DATA[29:26]; data_wrt(Data2);
        Data2 = 0b0011@DATA[25:22]; data_wrt(Data2);
        Data2 = 0b0011@DATA[21:18]; data_wrt(Data2);
        Data2 = 0b0011@DATA[17:14]; data_wrt(Data2);
        Data2 = 0b0011@DATA[13:10]; data_wrt(Data2);
        Data2 = 0b0011@DATA[9:6];   data_wrt(Data2);
        Data2 = 0b0011@DATA[5:2];   data_wrt(Data2);
        Data2 = 0b001100@DATA[1:0]; data_wrt(Data2);
        data_wrt(' ');
        delay_l(delay_1_sec);
    }
}
//*********************AMP SET********************************
unsigned int 8 Amp_Set (unsigned int 8 Wr)
{
    unsigned int 4 count;
    static unsigned int 8 Rd;
    SPI_SCK = 0;
    for(i=8; i>0; i--)
    {
        SPI_MOSI = Wr[7];
        delay; delay;
        SPI_SCK = 1;
        Rd = Rd[7:1]@Inbus1.AMP_DOUT;
        delay; delay; delay;
        par
        {
            SPI_SCK = 0;
            Wr = Wr<<1;
            Rd = Rd<<1;
        }
    }
    Rd = Rd>>1;
    return (Rd);
}
/****************************ADC********************************/
unsigned int 34 CALL_ADC()
{
    unsigned int 34 ADC_DATA;
    unsigned int 8 j;
    unsigned int 8 k;
    AD_CONV = 1;
    delay;
    AD_CONV = 0;
    delay;
    for(i=34; i>0; i--)
    {
        SPI_SCK = 1;
        delay; delay; delay;
        SPI_SCK = 0;
        ADC_DATA = ADC_DATA[33:1]@Inbus2.SPI_MISO;
        ADC_DATA = ADC_DATA<<1;
    }
    Send_Data = ADC_DATA[26:19];
    return(ADC_DATA);
}
void delay_l (unsigned int 32 a)
{
    unsigned int 32 i;
    for (i=a; i>0; i--)
    {
        delay;
        if (i==0) break;
    }
    return;
}
void data_wrt(unsigned char d)
{
    unsigned int 4 upper_nibble, lower_nibble;
    upper_nibble = d[7:4];
    lower_nibble = d[3:0];
    SF_D = upper_nibble;
    RS = 1; // RS = 1 for data register
    RW = 0;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_1_mis);
    SF_D = lower_nibble;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_40_mis);
    return;
}
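The display step in main() builds each LCD character by concatenating the bits 0b0011 with a 4-bit slice of the 34-bit result. A minimal Python sketch of that mapping is given below; the function names are hypothetical and only model the bit manipulation, not the Handel-C timing. It shows that nibble values 0 to 9 appear as the digits '0' to '9', while values 10 to 15 land on the ASCII characters ':' to '?' rather than on 'A' to 'F'.

```python
def nibble_to_lcd_char(nibble: int) -> str:
    """Model of `0b0011 @ nibble`: prefix the 4-bit value with 0x3."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("nibble must fit in 4 bits")
    return chr(0x30 | nibble)


def frame_to_lcd_text(frame: int) -> str:
    """Split a 34-bit word the way main() does: eight nibbles taken from
    bits [33:2], then bits [1:0] as a ninth character."""
    if not 0 <= frame < (1 << 34):
        raise ValueError("frame must fit in 34 bits")
    chars = []
    for low_bit in range(30, 1, -4):            # 30, 26, ..., 2
        chars.append(nibble_to_lcd_char((frame >> low_bit) & 0xF))
    chars.append(chr(0x30 | (frame & 0x3)))     # 0b001100 @ DATA[1:0]
    return "".join(chars)


if __name__ == "__main__":
    print(frame_to_lcd_text(0x9 << 30))   # first character comes from DATA[33:30]
    print(nibble_to_lcd_char(0xA))        # a value above 9 is not a hex letter
```

If hexadecimal letters are wanted on the display, one option would be a small lookup table in place of the fixed 0b0011 prefix.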
4.8
Soft IP Core for the VGA Interface
VGA (video graphics array) is a standard interface that is already widely used. There are many FPGA-based VGA controller designs; however, they still have significant shortcomings, such as low-resolution display and Chinese-character display modules that occupy large resources. Designers have therefore proposed using VHDL to describe a high-resolution VGA control module and a resource-conserving string display module, and have provided the design ideas and logic diagrams of these two main modules [138]. The VGA interface is time sensitive, as the image is controlled by two signals, viz. horizontal sync and vertical sync. The horizontal sync marks the start and finish of a line of pixels with a negative pulse in each case. The actual image data is sent in a 25.17 µs window within the 31.77 µs space between the sync pulses. (While image data is not being sent, the image is defined as blank and the screen is dark.) The vertical sync is similar to the horizontal sync except that, in this case, the negative pulse marks the start and finish of each frame as a whole; the frame (image as a whole) is displayed in a 15.25 ms window within the 16.784 ms space between pulses [139]. References [122], [138] and [139] cover the hardware interfacing details of the VGA to the Xilinx Spartan 3e FPGA. The soft IP core developed in Handel C handles the timing and other synchronization aspects (Figs. 4.5 and 4.6).
Fig. 4.5 Top level synthesis view of the VGA interface
//****************************************************************//
/* date - 11-12-09 ver-1.0 */
void fun_delay(unsigned int 20 i);
const unsigned int V_TDISPLAY = 191933; //384000;
const unsigned int H_TDISPLAY = 319;    // 640;
const unsigned int V_TPW = 799;         //1600;
const unsigned int H_TPW = 47;          //by use 48 access of2 so use 47 //96;
const unsigned int V_TFP = 3999;        //8000;
Fig. 4.6 Detailed synthesis view of the VGA interface
const unsigned int H_TFP = 7;           //16 ;
const unsigned int V_TBP = 11599;       //23200;
const unsigned int H_TBP = 23;          //48;
unsigned int 1 VGA_RED, VGA_GREEN, VGA_BLUE, H_SYNC, V_SYNC;
unsigned int 9 j;
interface bus_out () OutBus1(unsigned int 1 OutPort = VGA_RED) with {data = {"H14"}};
interface bus_out () OutBus2(unsigned int 1 OutPort = VGA_GREEN) with {data = {"H15"}};
interface bus_out () OutBus3(unsigned int 1 OutPort = VGA_BLUE) with {data = {"G15"}};
interface bus_out () OutBus4(unsigned int 1 OutPort = H_SYNC) with {data = {"F15"}};
interface bus_out () OutBus5(unsigned int 1 OutPort = V_SYNC) with {data = {"F14"}};
set clock = external "c9";
void main (void)
{
    while(1)
    {
        for (j = 0; j < 481; j++)
        {
            par
            {
                V_SYNC = 1;
                VGA_RED = 0;
                VGA_GREEN = 0;
                VGA_BLUE = 0;
                H_SYNC = 0;
            }
            fun_delay(H_TPW);
            H_SYNC = 1;
            fun_delay(H_TBP);
            par
            {
                VGA_RED = 1;
                VGA_GREEN = 1;
                VGA_BLUE = 0;
            }
            fun_delay(H_TDISPLAY);
            par
            {
                VGA_RED = 0;
                VGA_GREEN = 0;
                VGA_BLUE = 0;
            }
            fun_delay(H_TFP);
            if (j==480)
            {
                fun_delay(V_TBP);
                V_SYNC = 0;
                fun_delay(V_TPW);
                V_SYNC = 1;
                fun_delay(V_TFP);
                j = 0;
            }
        }
    }
}
void fun_delay(unsigned int 20 i)
{
    unsigned int 20 k;
    for (k = i; k > 0; k--)
    {
        delay;
    }
    return;
}
//****************************************************************//
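The timing figures quoted for this interface (a 25.17 µs active window in a 31.77 µs line, and a 15.25 ms active window in a 16.784 ms frame) can be cross-checked with a few lines of arithmetic. The Python sketch below is only such a check; the function and key names are hypothetical. It shows that the quoted frame window corresponds to about 480 visible lines, which agrees with the line counter `j` running up to 480 in the listing.

```python
# Timing values as quoted in the text of this section.
LINE_PERIOD_US = 31.77     # time between horizontal sync pulses
LINE_ACTIVE_US = 25.17     # window in which pixel data is sent
FRAME_PERIOD_MS = 16.784   # time between vertical sync pulses
FRAME_ACTIVE_MS = 15.25    # window in which visible lines are sent


def vga_timing_summary() -> dict:
    """Derive line rate, refresh rate and line counts from the quoted figures."""
    return {
        "line_rate_khz": 1e3 / LINE_PERIOD_US,
        "refresh_hz": 1e3 / FRAME_PERIOD_MS,
        "visible_lines": FRAME_ACTIVE_MS * 1e3 / LINE_PERIOD_US,
        "total_lines": FRAME_PERIOD_MS * 1e3 / LINE_PERIOD_US,
        "h_blank_us": LINE_PERIOD_US - LINE_ACTIVE_US,
    }


if __name__ == "__main__":
    for name, value in vga_timing_summary().items():
        print(f"{name:>14s} = {value:.3f}")
```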
4.9
Soft IP Core for the Keyboard Interface
The soft IP core for the keyboard interface is reported to illustrate how any PS/2 device can be interfaced to the Xilinx Spartan 3e FPGA. The PS/2 interface is a bit-serial interface with two signals, Data and Clock. Both signals are bi-directional; logic 1 is electrically represented by 5 V and logic 0 by 0 V (digital ground). Whenever the Data and Clock lines are idle, both are left floating, that is, the host and the device both set their outputs to high impedance. Externally, on the PCB, large (about 5 kΩ) pull-up resistors keep the idle lines at 5 V (logic 1). The hardware aspects of the PS/2 interface connected to an FPGA are covered in depth in [140]. Dong et al. [141] report the usefulness of such case studies in an undergraduate class environment [142]. Working along the same lines, we have developed two alternative ways to write the soft IP core for keyboard interfacing. The complete listing exemplifies the various ways in which soft IP cores can be developed.
//****************************************************************//
/* program to interface LCD with FPGA with 4 data line interface.
   Date := 05-12-2009;
   LCD data lines =
   E =
   RS =
   RW =
*/
set clock = external "C9";
void cmd_wrt (unsigned int 8);
void data_wrt (unsigned char );
void delay_l (unsigned int 32);
unsigned int 8 Amp_Set(unsigned int 8);
const unsigned int 32 delay_24_clk = 24;
const unsigned int 32 delay_4_1_ms = 300000;
const unsigned int 32 delay_100_mis = 7000;
const unsigned int 32 delay_40_mis = 3000;
const unsigned int 32 delay_1_mis = 200;
const unsigned int 32 delay_30_mis = 1550;
const unsigned char function_set = 0x28;
const unsigned char entry_mode_set = 0x06;
const unsigned char display_on = 0x0c;
const unsigned char clear_display = 0x01;
const unsigned char data[10] = "SHIVAJI = ";
unsigned int 1 Amp_MOSI, Amp_Dout, Amp_Sck, CS, Amp_Shut;
unsigned char Send_Data;
unsigned int 4 SF_D;
unsigned int 1 E = 0, RS, RW;
unsigned char Input;
//**************Keyboard****************************/
unsigned int 1 Data_Key, Clk_Key, logic;
unsigned int 11 Read_Key, count;
interface bus_out () OutBus(unsigned int 4 OutPort = SF_D) with {data = {"M15", "P17", "R16", "R15"}};
interface bus_out () OutBus1(unsigned int 1 OutPort = RS) with {data = {"L18"}};
interface bus_out () OutBus2(unsigned int 1 OutPort = RW) with {data = {"L17"}};
interface bus_out () OutBus3(unsigned int 1 OutPort = E) with {data = {"M18"}};
//
interface bus_in (unsigned int 1 Data) Inbus() with {data = {"G14"}};
interface bus_in (unsigned int 1 Data1) Inbus1() with {data = {"G13"}};
interface bus_out () OutBus4(unsigned int 8 OutPort = Send_Data) with {data = {"F12","E12","E11","F11","C11","D11","E9","F9"}};
void main (void)
{
    unsigned int j;
    unsigned int 8 Data1, Data2, Data3, C_Data;
    SF_D = 0x3; // I: initialise LCD
    RW = 0;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_4_1_ms);
    SF_D = 0x3; // II
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);
    SF_D = 0x3; // III
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);
    SF_D = 0x2; // IV
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);
    // now send command to LCD to configure it as 4 data lines
    cmd_wrt (function_set);   // function_set = 0x28 for 4 data lines configuration.
    delay_l (delay_40_mis);   // delay of 40 micro seconds.
    // set entry mode (autoincrement)
    cmd_wrt (entry_mode_set);
    delay_l (delay_40_mis);
    // Now send command for Display on, Cursor off;
    cmd_wrt (display_on);
    delay_l (delay_40_mis);
    // Now clear LCD
    cmd_wrt (clear_display);
    delay_l (delay_4_1_ms);
    //****************************************
    count = 0;
    logic = 1;
    while(1)
    {
        Clk_Key = Inbus.Data;
        switch(Clk_Key)
        {
            case 0:
                Data_Key = Inbus1.Data1;
                Read_Key = Read_Key >> 1;
                Read_Key = Data_Key@Read_Key[9:0];
                count++;
                do
                {
                    Clk_Key = Inbus.Data;
                } while(Clk_Key==0);
                if (count==11)
                {
                    count = 0;
                    Send_Data = Read_Key[8:1];
                    C_Data = 0b0011@Send_Data[7:4];
                    data_wrt(C_Data);
                    Send_Data = 0b0011@Send_Data[3:0];
                    data_wrt(Send_Data);
                }
                break;
            case 1:
                delay;
                break;
        }
    }
}
void data_wrt(unsigned char d)
{
    unsigned int 4 upper_nibble, lower_nibble;
    upper_nibble = d[7:4];
    lower_nibble = d[3:0];
    SF_D = upper_nibble;
    RS = 1; // RS = 1 for data register
    RW = 0;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_1_mis);
    SF_D = lower_nibble;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    return;
}
void cmd_wrt (unsigned int 8 cmd)
{
    unsigned int 4 upper_nibble, lower_nibble;
    upper_nibble = cmd[7:4];
    lower_nibble = cmd[3:0];
    RS = 0;
    RW = 0;
    SF_D = upper_nibble;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_1_mis);
    SF_D = lower_nibble;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    return;
}
void delay_l (unsigned int 32 a)
{
    unsigned int 32 i;
    for (i = a; i > 0; i--)
    {
        delay;
        if (i==0) break;
    }
    return;
}
//******************************************************************//
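The keyboard loop in the listing shifts one Data bit into the 11-bit register `Read_Key` on every low phase of the clock and, after 11 bits, takes `Read_Key[8:1]` as the received byte. The Python sketch below models that shift register under the usual PS/2 device-to-host framing (one start bit of 0, eight data bits sent LSB first, one odd-parity bit, one stop bit of 1). The function names are hypothetical, and the parity and framing check is an addition that the Handel-C listing does not perform.

```python
def ps2_encode(byte: int) -> list:
    """Build the 11-bit frame a PS/2 device would send for one byte."""
    data_bits = [(byte >> i) & 1 for i in range(8)]      # LSB first
    parity = 1 - (sum(data_bits) % 2)                     # odd parity
    return [0] + data_bits + [parity, 1]                  # start, data, parity, stop


def ps2_shift_in(bits) -> int:
    """Model of Read_Key: shift right, newest bit enters at bit 10."""
    reg = 0
    for b in bits:
        reg = (reg >> 1) | ((b & 1) << 10)
    return reg


def ps2_decode(bits):
    """Return (data byte, frame_ok) for 11 sampled Data bits."""
    if len(bits) != 11:
        raise ValueError("a PS/2 frame has 11 bits")
    reg = ps2_shift_in(bits)
    start = reg & 1
    data = (reg >> 1) & 0xFF          # same slice as Read_Key[8:1]
    parity = (reg >> 9) & 1
    stop = (reg >> 10) & 1
    odd_ok = (bin(data).count("1") + parity) % 2 == 1
    return data, (start == 0 and stop == 1 and odd_ok)


if __name__ == "__main__":
    frame = ps2_encode(0x1C)          # an arbitrary example byte
    print(frame, ps2_decode(frame))
```

Because the first bit received ends up at bit 0 after eleven shifts, the data byte sits in bits 8 down to 1 with its least significant bit in the right place, which is why the listing can slice it out directly.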
4.10
Triangular Wave Generator Using DAC
//****************************************************************//
unsigned int 32 CALL_DAC (unsigned int 32 );
unsigned int 1 SPI_MOSI, DAC_CS, SPI_SCK, DAC_CLR, SPI_MISO;
unsigned int 32 SEND_DATA, READ_DATA, DAC_CODE1, PREV_CODE, DAC_CODE2;
unsigned int 6 i;
const unsigned int 32 delay_24_clk = 24;
const unsigned int 32 delay_4_1_ms = 300000;
const unsigned int 32 delay_100_mis = 7000;
const unsigned int 32 delay_40_mis = 3000;
const unsigned int 32 delay_1_mis = 200;
const unsigned char function_set = 0x28;
const unsigned char entry_mode_set = 0x06;
const unsigned char display_on = 0x0c;
const unsigned char clear_display = 0x01;
unsigned char data;
unsigned int 4 SF_D;
unsigned int 1 E = 0, RS, RW;
static unsigned char data1, data2;
static unsigned int 12 count;
static unsigned int 16 count1;
set clock = external "c9";
void cmd_wrt (unsigned int 8);
void data_wrt (unsigned char );
void delay_l (unsigned int 32);
interface bus_out () OutBus(unsigned int 4 OutPort = SF_D) with {data = {"M15", "P17", "R16", "R15"}};
interface bus_out () OutBus1(unsigned int 1 OutPort = RS) with {data = {"L18"}};
interface bus_out () OutBus2(unsigned int 1 OutPort = RW) with {data = {"L17"}};
interface bus_out () OutBus3(unsigned int 1 OutPort = E) with {data = {"M18"}};
interface bus_out () OutBus5(unsigned int 1 OutPort = SPI_MOSI) with {data = {"T4"}};
interface bus_out () OutBus6(unsigned int 1 OutPort = DAC_CS) with {data = {"N8"}};
interface bus_out () OutBus7(unsigned int 1 OutPort = SPI_SCK) with {data = {"U16"}};
interface bus_out () OutBus8(unsigned int 1 OutPort = DAC_CLR) with {data = {"P8"}};
interface bus_in (unsigned int 1 Data1) Inbus1() with {data = {"N10"}};
void main (void)
{
    SF_D = 0x3; // I: initialise LCD
    RW = 0;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_4_1_ms);
    SF_D = 0x3; // II
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);
    SF_D = 0x3; // III
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);
    SF_D = 0x2; // IV
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_100_mis);
    // now send command to LCD to configure it as 4 data lines
    cmd_wrt (function_set);   // function_set = 0x28 for 4 data lines configuration.
    delay_l (delay_40_mis);   // delay of 40 micro seconds.
    // set entry mode (autoincrement)
    cmd_wrt (entry_mode_set);
    delay_l (delay_40_mis);
    // Now send command for Display on, Cursor off;
    cmd_wrt (display_on);
    delay_l (delay_40_mis);
    // Now clear LCD
    cmd_wrt (clear_display);
    delay_l (delay_4_1_ms);
    while(1)
    {
        count++;
        count1 = count[11:0]@0;
        DAC_CODE1 = 0xff30@count1[15:0];
        PREV_CODE = CALL_DAC(DAC_CODE1);
    }
}
unsigned int 32 CALL_DAC(unsigned int 32 wr)
{
    unsigned int 6 i;
    static unsigned int 32 Rd;
    SPI_SCK = 0;
    DAC_CS = 0;
    DAC_CLR = 1;
    i = 0;
    SPI_MOSI = wr[31];   // present the MSB on MOSI before the first clock
    for (i = 32; i > 0; i--)
    {
        SPI_SCK = 1;
        SPI_MISO = Inbus1.Data1;
        Rd = Rd[31:1]@SPI_MISO;
        Rd = Rd << 1;
        SPI_SCK = 0;
        wr = wr << 1;
        SPI_MOSI = wr[31];
    }
    DAC_CS = 1;
    SPI_SCK = 1;
    return(Rd);
}
void data_wrt(unsigned char d)
{
    unsigned int 4 upper_nibble, lower_nibble;
    upper_nibble = d[7:4];
    lower_nibble = d[3:0];
    SF_D = upper_nibble;
    RS = 1; // RS = 1 for data register
    RW = 0;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_1_mis);
    SF_D = lower_nibble;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_40_mis);
    return;
}
void cmd_wrt (unsigned int 8 cmd)
{
    unsigned int 4 upper_nibble, lower_nibble;
    upper_nibble = cmd[7:4];
    lower_nibble = cmd[3:0];
    RS = 0;
    RW = 0;
    SF_D = upper_nibble;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    delay_l (delay_1_mis);
    SF_D = lower_nibble;
    delay; delay;
    E = 1;
    delay_l (delay_24_clk);
    E = 0;
    return;
}
void delay_l (unsigned int 32 a)
{
    unsigned int 32 i;
    for (i = a; i > 0; i--)
    {
        delay;
        if (i==0) break;
    }
    return;
}
//*******************************************************************//
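The main loop of this listing builds each 32-bit SPI word as `0xff30 @ count1`, where `count1` is the 12-bit counter followed by four zero bits. The Python sketch below reproduces that packing and compares two code sequences; all function names are hypothetical. Reading the upper half-word 0xff30 as eight don't-care bits, a 4-bit command and a 4-bit channel address is an assumption about the DAC's serial format and should be checked against its data sheet. As written, the free-running 12-bit counter wraps from its maximum back to zero, so the sketch also shows an up/down sequence as one possible way to obtain a symmetric triangle.

```python
def pack_dac_word(code12: int) -> int:
    """Model of `0xff30 @ (count[11:0] @ 0)` from the listing."""
    if not 0 <= code12 <= 0xFFF:
        raise ValueError("DAC code must fit in 12 bits")
    count1 = (code12 << 4) & 0xFFFF          # 12-bit code followed by four zeros
    return (0xFF30 << 16) | count1


def wrapping_codes(steps: int, bits: int = 12) -> list:
    """Codes produced by a free-running counter that wraps (a ramp)."""
    return [n % (1 << bits) for n in range(steps)]


def triangle_codes(steps: int, bits: int = 12) -> list:
    """Codes that count up to full scale and back down again."""
    top = (1 << bits) - 1
    period = 2 * top
    out = []
    for n in range(steps):
        phase = n % period
        out.append(phase if phase <= top else period - phase)
    return out


if __name__ == "__main__":
    print(hex(pack_dac_word(0xFFF)))
    print(wrapping_codes(6, bits=2))     # small width so the shape is visible
    print(triangle_codes(8, bits=2))
```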
4.11
Conclusion
The main objective of this chapter was to design soft IP cores of customizable embedded peripherals at a higher level of abstraction and to report their complete code listings. Free distribution of these IP cores in this manner would not only allow their customization at the user end but also initiate a healthy open source hardware movement. Such core repositories are now gaining popularity amongst the VLSI, SoC and embedded design community owing to their striking advantages, such as an accelerated design cycle, the reliability of a pre-verified methodology and a significant reduction in time to market. Some examples of well-established repositories of soft IP cores are "Open Cores" (http://www.opencores.org) and FPGA CPU (http://www.fpgacpu.org).
Chapter 5
FPGA Based High Resolution Time to Digital Converter
Abstract In an increasingly digital domain of applications, Digital Signal Processing (DSP) has become indispensable. With the outside environment predominantly analog, its conversion into the digital domain is also necessary in order to reap the benefits of mature DSP technology. Often the analog to digital converter performance becomes the bottleneck in advanced instrumentation and DSP applications. This chapter presents a Vernier Time to Digital Converter (TDC) with a resolution of less than 30 ps, implemented on a SPARTAN III FPGA. The TDC is described in detail using the schematic editor and Verilog code for the ring oscillator, phase detector and counter.
5.1
Introduction
As technology advances, scaling is hitting the "red brick wall": further scaling down is not possible because of the thermal management of the VLSI chip and leakage current issues. As the size of the devices reduces, the supply voltage also goes on reducing. Because of the definite threshold voltages of MOSFETs, we are left with a very small voltage swing. This reduces the noise margin of the functional blocks realized using these MOSFETs. In any instrumentation the outside world is analog and the signal processing is digital. Digital signal processing is well served by the many algorithms developed for the particular job at hand. The only bottleneck in the whole instrumentation setup realized in VLSI with nanometer technology is the Analog to Digital Converter. With 120 nm technology the recommended supply voltage is approximately 1–1.2 V. After deducting the threshold voltage of the MOSFETs, which is approximately 0.3–0.5 V, we are left with hardly 0.8 V available for conversion. Even if a simple 8-bit resolution is required, the minimum voltage we need to differentiate is 0.8/2^8 V, i.e. about 3.1 mV. This puts a heavy demand on supply voltage stability and also on the temperature of operation. For a high-speed ADC, the flash architecture seems to be the solution, but the required DNL and INL cannot be achieved, as about 256 levels need to be detected within 0.8 V. So, as technology advances, the only possible solution for a stable and viable ADC is the Time to Digital Converter, as there are only two states, between Vdd and ground, that need to be known. Hence there is a great thrust on TDCs in present-day research. For our research problem of detecting chloroplast content using the latest technology, the option for the ADC becomes the TDC. The TDC can be realized in a variety of ways. Its implementation can be full ASIC or it can be implemented on an FPGA, depending on the product cost and time to market. We have realized a variety of TDCs in full custom VLSI as well as on FPGA. They are covered in depth in the doctoral thesis [167] of one of the authors of this book. The full custom TDCs are realized in CMOS at 120 and 90 nm using technology suites available from Cadence and Microwind in [167]. The fastest full custom TDC implemented by us in 120 nm technology is based on a Multipath Gated Ring Oscillator. In this chapter, however, a different flavor of the TDC has been implemented using Verilog. At the outset it is worthwhile to review the literature and prior developmental work on TDCs.
R.K. Kamat et al., Harnessing VLSI System Design with EDA Tools, DOI 10.1007/978-94-007-1864-7_5, © Springer Science+Business Media B.V. 2012
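The voltage-resolution argument above can be made concrete with a short calculation. The Python sketch below uses the 0.8 V span and 8-bit resolution quoted in the text; the function names are hypothetical, and the comparator count is the standard figure for a flash converter (one comparator per decision threshold).

```python
def lsb_volts(span_volts: float, bits: int) -> float:
    """Smallest voltage step that must be resolved: span / 2**bits."""
    return span_volts / (1 << bits)


def flash_comparators(bits: int) -> int:
    """A flash ADC needs one comparator per threshold, i.e. 2**bits - 1."""
    return (1 << bits) - 1


if __name__ == "__main__":
    step = lsb_volts(0.8, 8)
    print(f"LSB for 0.8 V span at 8 bits: {step * 1e3:.3f} mV")
    print(f"Comparators in an 8-bit flash ADC: {flash_comparators(8)}")
```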
5.2
TDC: Prior Art
A time to digital converter (abbreviated TDC) is a device for converting a signal of sporadic pulses into a digital representation of their time indices. In other words, a TDC outputs the time of arrival of each incoming pulse. Because the magnitudes of the pulses are not usually measured, a TDC is used when the important information is to be found in the timing of events [162]. High-resolution time-to-digital converters (TDCs) have applications in a number of measurement systems, e.g., time-of-flight (TOF) particle detectors, laser range finders, and logic analyzers [143]. A literature survey reveals the application of the time to digital converter technique in various domains such as space science [144, 145], high-energy physics [146–148], laser range finders [147], test instrumentation [150], delay locked loops (DLLs) [151] and so on. In opto-electronics, TDCs have been used for the measurement and digitization of time intervals with very high resolution and accuracy in applications such as Time-Correlated Single-Photon Counting [152–154], optical characterization of CMOS circuits [155] and laser ranging [156]. There are several methods of realizing a time to digital converter, and they have been implemented widely by many research groups. The methods and their implementations in the literature are as follows:
• Tapped Delay Lines (TDL) [154, 155]
• Delay Locked Loop (DLL) [156]
• Vernier Delay Line (VDL) [157]
• Multi-level TDC [158]
• Triggered Ring Oscillator [159]
The silicon implementation of the abovementioned TDCs also comes in different flavors, such as full custom ASIC, semi custom ASIC, mixed mode ASIC and FPGA based. A recent book [161] covers advanced TDC architectures in depth while addressing the challenges of signed time interval measurement, long measurement time, high resolution, high linearity, low power, variability and calibration, low mismatch among multiple measurements, and suitability for design automation. In this noteworthy book by Stephen Henzler, resolution enhancement techniques such as pulse-shrinking, Vernier delay-line, local passive interpolation, gated delay-lines, and time amplification are introduced and discussed with respect to operating principle, resolution, power, area, conversion time, susceptibility to variations, and suitability for implementation and mass production. As far as implementation on an FPGA-based platform is concerned, the pioneering FPGA-based TDC was proposed by Kalisz et al. [165] in 1995. The implementation principle was based on the difference between a latch delay and a buffer delay in a QuickLogic FPGA, and a time resolution of 100 ps was achieved. Further work was reported in 2003 by Wu et al. [168], in an ACEX 1 K FPGA from Altera. Unlike the earlier work, Wu used the cascade chains of the FPGA and reported a time resolution of 400 ps. Szymanowski et al. implemented a high-resolution TDC with two-stage interpolators in a QL12X16B from QuickLogic [169] with a timing resolution of 200 ps. Another TDC implementation on a general purpose FPGA device, using dedicated carry lines in the FPGA to perform time interpolation, has been reported by Qi An et al. in [170]. Recently Aloisio et al. [170] have reported a high-resolution FPGA-based TDC using two types of architectures in different Xilinx FPGAs. Both approaches use the classic Nutt method based on two-stage interpolation. In this chapter we present the implementation of a Vernier Time to Digital Converter on a Xilinx Spartan 3e FPGA.
5.3
TDC Using Vernier Principle
Many techniques have been proposed for FPGA-based implementation of TDCs. FPGA-based implementation was first introduced by Kalisz, who adopted the Tapped Delay Line (TDL) technique and obtained a resolution of 200 ps. The conversion time is short, but the drawback of this method is that it requires a large number of flip-flops. The interpolation method is used when both a long full-scale range and high resolution are needed. The long full-scale range is provided by a coarse counter driven by the reference clock, while the high resolution is obtained from fine interpolators; the method, however, suffers from errors such as non-linearity of the interpolators and quantization error [146]. Moreover, most FPGA-based TDC designs that use the tapped delay line or Vernier delay line concept suffer from large area consumption and unpredictable place-and-route (P&R) delay: the delay of the gates themselves adds to the unpredictable internal P&R delay. This is an inevitable FPGA hardware restriction [144]. The limitation of the architecture mentioned in [145] is its difficulty of implementation in a real device, since the frequencies of the ring oscillators are higher than what most FPGAs and their I/Os are rated for. Direct measurement of the ring oscillator period is difficult because loading of the ring oscillator by a buffer, variations induced by placement, and routing delays to the device pin all result in a mismatch between the measured and the true period [145].
Fig. 5.1 Basic principle of measurement of time interval using Nutt Interpolation method
(Figure labels: Time interval, Reference clock, Coarse, Error1, Error2; Error = (n1 − 1)Tslow − (n2 − 1)Tfast)
The Vernier method is an all-digital technique and is described in detail in [164–166]. Better known as "Digital Time Stretching", it has as its basic constituents two stable oscillators slightly differing in frequency. In the normal method of time interval measurement the reference clock drives the coarse counter, which is a binary counter. The coarse count is simply the number of clock cycles multiplied by the period of the reference clock, i.e. $nT_{ref}$, and the resolution is limited to the reference clock period $T_{ref}$. There are two possible errors, as shown in Figs. 5.1 and 5.7. In fine measurement, these errors are measured and added to or subtracted from the coarse count to obtain the exact time interval value.
$$T_{exact} = nT_{ref} \pm \text{errors} \tag{5.1}$$
With respect to Fig. 5.1, Eq. 5.1 yields
$$T_{exact} = nT_{ref} + error_1 - error_2 \tag{5.2}$$
5.3.1
Coarse measurement
For coarse measurement we used the on-board crystal clock, which produces stable oscillations. Crystal oscillators are, however, too slow, so to increase the frequency and thereby the speed of conversion we used the DCM (Digital Clock Manager) available on-chip as a core. Advanced FPGA series such as the Spartan-3 have four DCMs. There are three ways to use the DCM. The first is to instantiate the DCM as a component through HDL. The second is through the schematic, i.e. taking the DCM as a symbol, assigning ports to the required pins, and then creating a user symbol for use in the schematic. The third is through the Xilinx Clocking Wizard from IP (Coregen & Architecture Wizard). The third is the easiest way and is explained below.
Procedure to Use DCM
Step 1: Double-click Create New Source, select IP (COREGen & Architecture Wizard), and click Next. The Architecture Wizard contains several wizards, as shown in Fig. 5.2.
Step 2: The Clocking Wizard helps you define the DCM. In the main window, select the pins and specify:
• Reference source
• Clock frequency
• Phase shift
Step 3: Specify the types of clock buffers. Connect clock buffers (BUFG, BUFGMUX) to the selected output pins of the DCM.
Fig. 5.2 Selecting a DCM from clocking wizard
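When the DCM is used as a frequency multiplier, the synthesized output frequency is the input frequency scaled by an integer multiply/divide ratio. The Python sketch below searches for such a ratio. It is an illustration only: the 50 MHz input clock is a hypothetical value, the function names are hypothetical, and the multiplier range 2 to 32 and divider range 1 to 32 are assumptions about the Spartan-3 DCM frequency synthesizer that should be confirmed in the device data sheet, which also limits the permitted output frequency (not modelled here).

```python
from fractions import Fraction


def synthesized_mhz(clkin_mhz: int, multiply: int, divide: int) -> Fraction:
    """Output frequency for a given multiply/divide pair: f_in * M / D."""
    return Fraction(clkin_mhz) * multiply / divide


def best_ratio(clkin_mhz: int, target_mhz: int,
               m_range=range(2, 33), d_range=range(1, 33)):
    """Return the (M, D) pair whose output is closest to the target.
    Ties are resolved in favour of the first pair found (smallest M)."""
    best = None
    for m in m_range:
        for d in d_range:
            err = abs(synthesized_mhz(clkin_mhz, m, d) - target_mhz)
            if best is None or err < best[0]:
                best = (err, m, d)
    return best[1], best[2]


if __name__ == "__main__":
    m, d = best_ratio(50, 200)            # hypothetical 50 MHz input clock
    print(m, d, float(synthesized_mhz(50, m, d)))
```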
Fig. 5.3 General setup wizard for DCM as a frequency multiplier
Fig. 5.4 Wizard shows specifying the clock buffers to be used
The DCM will appear in the source window, ready to be used as a component. It can also be used in a schematic by making it a user symbol through the 'design utilities' option, as shown in Fig. 5.5. Fig. 5.6 shows the behavioral simulation of the DCM as a frequency multiplier.
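The role of the coarse count in Eq. 5.2 can be illustrated numerically. The Python sketch below assumes one common reading of the Nutt scheme: error1 is the time from the start of the interval to the next reference clock edge, error2 is the time from the end of the interval to the next reference clock edge, and n counts the reference periods between those two edges. This sign convention and the function names are assumptions, chosen so that the result agrees with the form of Eq. 5.2; all times are integers in picoseconds and the 10 ns reference period is a hypothetical value.

```python
def next_edge(t_ps: int, t_ref_ps: int) -> int:
    """Time of the first reference clock edge at or after t_ps."""
    return -(-t_ps // t_ref_ps) * t_ref_ps      # integer ceiling


def nutt_measure(t_start_ps: int, t_stop_ps: int, t_ref_ps: int):
    """Return (n, error1, error2) for an interval [t_start, t_stop]."""
    first = next_edge(t_start_ps, t_ref_ps)
    last = next_edge(t_stop_ps, t_ref_ps)
    n = (last - first) // t_ref_ps              # coarse count
    return n, first - t_start_ps, last - t_stop_ps


def reconstruct(n: int, error1: int, error2: int, t_ref_ps: int) -> int:
    """T_exact = n * T_ref + error1 - error2 (form of Eq. 5.2)."""
    return n * t_ref_ps + error1 - error2


if __name__ == "__main__":
    n, e1, e2 = nutt_measure(1300, 47850, 10000)
    print(n, e1, e2, reconstruct(n, e1, e2, 10000))
```

With the coarse count alone the result would be quantized to whole reference periods; the two error terms recover the fraction lost at each end of the interval.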
Fig. 5.5 DCM used as a symbol in schematic design entry
Fig. 5.6 Behavioral simulation of frequency multiplier
Fig. 5.7 Editable floor plan for ring oscillators
5.3.2
Fine measurement
To measure the errors we use two controllable oscillators with slightly different frequencies $f_{fast}$ and $f_{slow}$; the new resolution is given by
$$T_{resolution} = T_{slow} - T_{fast} \tag{5.3}$$
To measure an error, we first enable the slow oscillator at the rising edge of the error pulse and the fast oscillator at its falling edge. These oscillators are connected to counters and a phase detector. The phase detector output indicates the coincidence of the fast and slow clock edges. The measurement procedure mentioned in [144] is used. The numbers $n_1$ and $n_2$ stored in the counters are used as below.
$$Error = (n_1 - 1)T_{slow} - (n_2 - 1)T_{fast} \tag{5.4}$$
Fig. 5.8 Schematic and behavioral simulation of error finder
Fig. 5.9 Timing for the slow clock, fast clock
The interpolative oscillator shown in the figure has an odd number of NOR gates, five in this case. It is a controllable oscillator running at a period of 3 ns. This interpolative oscillator is used as a VCO in reference [143]. We modified it by using the control voltage input as an enable input: whereas the control voltage was used to change the frequency of oscillation, in the modified version the enable input decides whether or not the circuit oscillates. Such a high-frequency clock makes the error finder block inefficient and erroneous, so the 3 ns period clock is divided by 2, 4 or 8 before it is used as the coarse clock. A 24-bit binary counter is used as the coarse counter so as to enhance the range of the TDC; the range can be increased further by connecting a D flip-flop to the counter (Fig. 5.8). The error finder is used to find the errors that occur during the coarse measurement. The maximum size of the errors is directly proportional to the period of the coarse clock, and the size of the counters used in fine measurement has to be chosen accordingly. These errors are measured using the Vernier method mentioned earlier in this section, which also has an advantage in terms of resolution (Figs. 5.9 and 5.10). There are many ring oscillator circuits with control inputs; a few of them are mentioned in [2] and [3]. The schematic of the ring oscillators, drawn using the ISE schematic entry tool, is shown in Fig. 5.11. This tool also provides manual floor planning and place-and-route facilities. We did the floor planning manually so as to get different delays and hence different oscillation frequencies. Fig. 5.12 shows the editable floor plan for the xc3s400-4-pq208.
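The remark that the fine counters must be sized from the largest possible error can be turned into a small sizing calculation. The Python sketch below assumes that the largest error is about one coarse clock period and that each fine count is worth one Vernier step, as in Eq. 5.4 with equal counts. The 10 ps step used in the example is an illustrative value, and the function names are hypothetical.

```python
def fine_counter_size(coarse_period_ps: int, resolution_ps: int):
    """Return (largest count, bits needed) for the fine counters.
    A count of n represents (n - 1) Vernier steps, hence the +1."""
    steps = -(-coarse_period_ps // resolution_ps)    # integer ceiling
    max_count = steps + 1
    return max_count, max_count.bit_length()


def coarse_range_ps(counter_bits: int, coarse_period_ps: int) -> int:
    """Longest interval a free-running coarse counter can span."""
    return (1 << counter_bits) * coarse_period_ps


if __name__ == "__main__":
    base_period_ps = 3000                            # 3 ns oscillator from the text
    for divider in (2, 4, 8):
        period = base_period_ps * divider
        count, bits = fine_counter_size(period, 10)  # 10 ps step: illustrative
        span_ms = coarse_range_ps(24, period) / 1e9
        print(f"divide by {divider}: fine counter {bits} bits "
              f"(max count {count}), 24-bit coarse range {span_ms:.2f} ms")
```

A slower coarse clock lengthens the measurable range but also lengthens each fine measurement, since more Vernier steps may be needed before the edges coincide.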
Fig. 5.10 General block diagram revealing the principle
Fig. 5.11 Schematic of double ring oscillator with start, stop control
Post-route simulation results are shown in Fig. 5.13: the fast oscillator runs at a period of 1964.0 ps and the slow oscillator at 1974.0 ps. The difference in period gives a resolution of 10 ps. Behavioral simulation does not give oscillations because delay, whether due to logic or due to routing, is not considered. The phase detector schematic is given in Fig. 5.14. The output of the AND2b1 gate is high when it detects a "10" sequence. Simulation results are also shown in the figure. The output of the phase detector is used to control the oscillators and thereby the counters.
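The Vernier principle can be checked with the two periods reported above. The Python sketch below is an idealized model, not the FPGA circuit: the slow oscillator is assumed to produce its first edge at the leading edge of the error pulse, the fast oscillator at the trailing edge, and both counters stop at the first cycle in which the fast edge has caught up with the slow edge. Under these assumptions both counters hold the same number n, and Eq. 5.4 reduces to (n − 1)(Tslow − Tfast). The function name is hypothetical.

```python
def vernier_measure(error_ps: int, t_slow_ps: int = 1974, t_fast_ps: int = 1964):
    """Return (n, estimate_ps) for an error pulse of width error_ps.
    Edge k of the slow clock occurs at (k - 1) * t_slow;
    edge k of the fast clock occurs at error + (k - 1) * t_fast."""
    if error_ps < 0 or t_slow_ps <= t_fast_ps:
        raise ValueError("need error >= 0 and t_slow > t_fast")
    n = 1
    while error_ps + (n - 1) * t_fast_ps > (n - 1) * t_slow_ps:
        n += 1                                    # fast edge still lags: count on
    estimate = (n - 1) * t_slow_ps - (n - 1) * t_fast_ps   # Eq. 5.4 with n1 = n2 = n
    return n, estimate


if __name__ == "__main__":
    for width in (0, 250, 255, 1000):
        n, est = vernier_measure(width)
        print(f"error {width:5d} ps -> n = {n:4d}, estimate {est:5d} ps")
```

In this model an error that is not a multiple of the 10 ps step is rounded up to the next step, which is the quantization expected from Eq. 5.3.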
Fig. 5.12 Editable floor plan for ring oscillators
Fig. 5.13 Ring oscillator timing simulation
5.4 Verilog Modules and Simulation
`timescale 1ns / 1ps

module m(clr, clr1, pdop, Tm, op, op1);
    input  clr;
    input  clr1;
    input  pdop;
    input  Tm;
    output op;
    output op1;

    wire and_op;
    wire and2_op;
    wire start1;
    wire start2;
    wire to_vcc;
    wire to_vcc1;
    wire xor_op;
    wire xor2_op;
    wire op1_DUMMY;
    wire op_DUMMY;

    assign op  = op_DUMMY;
    assign op1 = op1_DUMMY;

    // Gating stage of each loop: the loop can run only while its start signal is high
    AND2 and2_ro1 (.I0(start1), .I1(xor2_op), .O(and_op));
    AND2 and2_ro2 (.I0(start2), .I1(xor_op), .O(and2_op));

    // dff1 sets start1 on the rising edge of Tm; pdop clears it asynchronously
    FDC dff1 (.C(Tm), .CLR(pdop), .D(to_vcc), .Q(start1));
    defparam dff1.INIT = 1'b0;

    // dff2 (FDC_1) sets start2 on the falling edge of Tm; pdop clears it asynchronously
    FDC_1 dff2 (.C(Tm), .CLR(pdop), .D(to_vcc1), .Q(start2));
    defparam dff2.INIT = 1'b0;

    VCC vcc1 (.P(to_vcc));
    VCC vcc2 (.P(to_vcc1));

    // Loop 1 (op) and loop 2 (op1): XOR stages closing each feedback path
    XOR2 xor1_ro1 (.I0(start1), .I1(and_op), .O(op_DUMMY));
    XOR2 xor1_ro2 (.I0(start2), .I1(and2_op), .O(op1_DUMMY));
    XOR2 xor2_ro1 (.I0(clr), .I1(op_DUMMY), .O(xor2_op));
    XOR2 xor2_ro2 (.I0(clr1), .I1(op1_DUMMY), .O(xor_op));
endmodule

Fig. 5.14 Schematic and behavioral simulation of phase detector
5.4.1 Ring Oscillator (Fast Clock) RTL Schematic
The schematic of the ring oscillator for the fast clock configuration is shown in Fig. 5.15. The Verilog implementation of the same is given in Sect. 5.4.2.
5.4.2 Verilog Module for Ring Oscillator (Fast Clock)
`timescale 1 ns / 1 ps

module ring3(en, clk_f);
    input  en;
    output clk_f;

    wire XLXN_1;
    wire XLXN_2;
    wire XLXN_3;
    wire clk_f_DUMMY;

    assign clk_f = clk_f_DUMMY;

    // AND2 gates the loop with en; the three inverters give an odd number of inversions
    AND2 XLXI_1 (.I0(en), .I1(clk_f_DUMMY), .O(XLXN_1));
    INV  XLXI_2 (.I(XLXN_1), .O(XLXN_2));
    INV  XLXI_3 (.I(XLXN_2), .O(XLXN_3));
    INV  XLXI_4 (.I(XLXN_3), .O(clk_f_DUMMY));
endmodule

Fig. 5.15 Ring oscillator timing simulation

Ring Oscillator (Slow Clock) RTL Schematic (Fig. 5.16)
5.4.3 Verilog Module for Ring Oscillator (Slow Clock)
`timescale 1 ns / 1 ps

module ring5(EN, clk_s);
    input  EN;
    output clk_s;

    wire XLXN_1;
    wire XLXN_2;
    wire XLXN_3;
    wire XLXN_4;
    wire XLXN_5;
    wire clk_s_DUMMY;

    assign clk_s = clk_s_DUMMY;

    // AND2 gates the loop with EN; the five inverters give an odd number of inversions
    AND2 XLXI_1 (.I0(EN), .I1(clk_s_DUMMY), .O(XLXN_1));
    INV  XLXI_2 (.I(XLXN_1), .O(XLXN_2));
    INV  XLXI_3 (.I(XLXN_2), .O(XLXN_3));
    INV  XLXI_4 (.I(XLXN_3), .O(XLXN_4));
    INV  XLXI_5 (.I(XLXN_4), .O(XLXN_5));
    INV  XLXI_6 (.I(XLXN_5), .O(clk_s_DUMMY));
endmodule
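Because the structural netlists above carry no delay information, a behavioral simulator cannot make them oscillate. The following is a minimal sketch of one way to observe the principle before place and route: a hypothetical stand-in for ring3 in which every stage is given an explicit delay. The module names ring3_beh and ring3_beh_tb and the parameter STAGE_DELAY (250 ps here) are assumptions made for illustration, not figures taken from the device or from the authors' design.

```verilog
`timescale 1ps / 1ps

// Hypothetical behavioral stand-in for ring3: the same loop (one AND stage
// plus three inverters), but with an explicit per-stage delay.
// STAGE_DELAY is an assumed value, not a measured device delay.
module ring3_beh #(parameter integer STAGE_DELAY = 250) (
    input  wire en,
    output wire clk_f
);
    wire n1, n2, n3;

    assign #(STAGE_DELAY) n1    = en & clk_f;  // gating stage
    assign #(STAGE_DELAY) n2    = ~n1;
    assign #(STAGE_DELAY) n3    = ~n2;
    assign #(STAGE_DELAY) clk_f = ~n3;         // third inversion closes the loop
endmodule

// Hypothetical testbench: holds en low first so that every node settles to a
// known value, then enables the loop and prints each change of clk_f.
module ring3_beh_tb;
    reg  en = 1'b0;
    wire clk_f;

    ring3_beh dut (.en(en), .clk_f(clk_f));

    initial begin
        $monitor("%0t ps: en=%b clk_f=%b", $time, en, clk_f);
        #2000  en = 1'b1;
        #10000 $finish;
    end
endmodule
```

With four delayed stages and an odd number of inversions, the loop inverts its own output after four stage delays, so the period of this model is eight stage delays. A ring5 counterpart would add two more inverter stages and would therefore run more slowly for the same assumed stage delay.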
Fig. 5.16 Schematic of the ring oscillator for fast clock configuration
5.4.4 Phase Detector (Fig. 5.17)
Fig. 5.17 Schematic of phase detector
5.4.5 Simulation Waveform of Phase Detector (Fig. 5.18)
Fig. 5.18 Behavioral simulation of phase detector
5.4.6 Verilog Module for Phase Detector
`timescale 1ns / 1ps

module pd(Osc_f, Osc_s, pdop);
    input  Osc_f;
    input  Osc_s;
    output pdop;

    wire DFF1_op;
    wire DFF2_op;

    // AND2B1 inverts I0: pdop = ~DFF2_op & DFF1_op, i.e. a "10" sequence
    AND2B1 and2 (.I0(DFF2_op), .I1(DFF1_op), .O(pdop));

    // DFF2 samples Osc_s on the rising edge of Osc_f; DFF1 holds the previous sample
    FD DFF1 (.C(Osc_f), .D(DFF2_op), .Q(DFF1_op));
    defparam DFF1.INIT = 1'b0;
    FD DFF2 (.C(Osc_f), .D(Osc_s), .Q(DFF2_op));
    defparam DFF2.INIT = 1'b0;
endmodule
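For readers without the Xilinx primitive library, the same "10" detection can be sketched in plain behavioral Verilog. The module name pd_beh and the register names d1 and d2 are hypothetical; the sketch mirrors the netlist above, with d2 standing for DFF2_op and d1 for DFF1_op, and is not the authors' code.

```verilog
`timescale 1ns / 1ps

// Hypothetical behavioral equivalent of the pd netlist (no vendor primitives).
module pd_beh (
    input  wire Osc_f,   // fast oscillator, used as the sampling clock
    input  wire Osc_s,   // slow oscillator, the sampled signal
    output wire pdop
);
    reg d1 = 1'b0;       // previous sample of Osc_s (DFF1_op in the netlist)
    reg d2 = 1'b0;       // latest sample of Osc_s   (DFF2_op in the netlist)

    always @(posedge Osc_f) begin
        d2 <= Osc_s;     // take a new sample
        d1 <= d2;        // shift the old sample along
    end

    // High when the previous sample was 1 and the latest sample is 0
    assign pdop = d1 & ~d2;
endmodule
```

The two non-blocking assignments form a two-stage shift register, so pdop compares two consecutive samples of the slow oscillator taken on edges of the fast one.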
5.4.7 RTL Schematic of 8 Bit Counter (Fig. 5.19)
Fig. 5.19 RTL schematic of 8 bit counter
5.4.8 Simulation Results of 8 Bit Counter (Fig. 5.20)
Fig. 5.20 Simulation results
5.4.9 Verilog Module for 8 Bit Counter
module count8(clk, cnt, hold, reset);
    input  clk, hold, reset;
    output [7:0] cnt;
    reg    [7:0] cnt;

    initial begin
        cnt = 8'b00000000;
    end

    // Synchronous reset has priority; hold freezes the count; otherwise count up
    always @(posedge clk) begin
        if (reset)
            cnt <= 8'b0;
        else if (hold)
            cnt <= cnt;
        else
            cnt <= cnt + 1'b1;
    end
endmodule
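The reset, hold and count behavior of the counter can be exercised with a small testbench. The sketch below is hypothetical: it assumes that the count8 module listed above is compiled in the same simulation, and the testbench name count8_tb, the 10 ns clock and the stimulus times are arbitrary choices made for illustration.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench; assumes the count8 module from this section is
// compiled together with it.
module count8_tb;
    reg        clk   = 1'b0;
    reg        hold  = 1'b0;
    reg        reset = 1'b1;
    wire [7:0] cnt;

    count8 dut (.clk(clk), .cnt(cnt), .hold(hold), .reset(reset));

    always #5 clk = ~clk;          // 10 ns clock period (arbitrary)

    initial begin
        $monitor("%0t: reset=%b hold=%b cnt=%0d", $time, reset, hold, cnt);
        #12 reset = 1'b0;          // release reset after the first rising edge
        #50 hold  = 1'b1;          // freeze the count
        #30 hold  = 1'b0;          // resume counting
        #40 $finish;
    end
endmodule
```

The printed trace should show the count staying at zero while reset is high, advancing on rising clock edges afterwards, and standing still for as long as hold is asserted.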
5.4.10 RTL Schematic of Time to Digital Converter (Fig. 5.21)
Fig. 5.21 RTL schematic of TDC
5.4.11 Schematic of Time to Digital Converter (Fig. 5.22)
Fig. 5.22 Final schematic of the TDC
5.4.12 Verilog Module for Time to Digital Converter

`timescale 1 ns / 1 ps

module proj2(set, time_1, XLXN_7, XLXN_8);
    input  set;
    input  time_1;
    output [7:0] XLXN_7;
    output [7:0] XLXN_8;

    wire XLXN_2;
    wire XLXN_5;
    wire XLXN_9;
    wire XLXN_10;
    wire XLXN_14;
    wire XLXN_15;
    wire XLXN_34;

    // Fast and slow ring oscillators, each with its own enable
    ring3 XLXI_1 (.en(XLXN_15), .clk_f(XLXN_5));
    ring5 XLXI_2 (.EN(XLXN_14), .clk_s(XLXN_2));

    // Phase detector: its output XLXN_34 holds the counters and clears the enables
    pdet1 XLXI_3 (.clk_f(XLXN_5), .clk_s(XLXN_2), .hold(XLXN_34));

    // One 8-bit counter per oscillator; the reset inputs are tied to ground
    count8 XLXI_4 (.clk(XLXN_5), .hold(XLXN_34), .reset(XLXN_9), .cnt(XLXN_7[7:0]));
    count8 XLXI_5 (.clk(XLXN_2), .hold(XLXN_34), .reset(XLXN_10), .cnt(XLXN_8[7:0]));
    GND XLXI_6 (.G(XLXN_9));
    GND XLXI_7 (.G(XLXN_10));

    // FDC_1 enables the fast oscillator on the falling edge of time_1
    FDC_1 XLXI_12 (.C(time_1), .CLR(XLXN_34), .D(set), .Q(XLXN_15));
    defparam XLXI_12.INIT = 1'b0;

    // FDC enables the slow oscillator on the rising edge of time_1
    FDC XLXI_13 (.C(time_1), .CLR(XLXN_34), .D(set), .Q(XLXN_14));
    defparam XLXI_13.INIT = 1'b0;
endmodule
5.5 Applications of the TDC Implemented
The TDC implemented in this chapter has the following applications:
• Being digital in nature, TDCs are well suited to the sensor, automotive, medical and military markets.
• TDCs offer a major advantage as replacements for earlier analog solutions.
• A TDC can be used to measure time of flight.
• 3D image detection can be used for robot vision in robotics.
• 3D object recognition is very important in navigation and control applications.
• 3D object detection can be used in many automotive collision warning systems.
• 3D face projects explore methods that are applicable to biometric modalities, e.g. iris and fingerprint multibiometrics.
• 3D object detection can be used in 3D scanners to construct digital three-dimensional models, which are used extensively in the entertainment industry.
References
1. MacMillen, D., Butts, M., Camposano, R., Hill, D., Williams, T.: An industrial view of electronic design automation. IEEE Trans. ICCAD 19(21), 1428–1448 (2000) 2. Camposano, R., MacMillen, D.: Design technology for systems on a chip. In: Proceedings VLSI-SoC, Montpellier, pp. 3–7, Dec 2001 3. Farrahi, A.H., Hathaway, D.J., Wang, M., Sarrafzadeh, M.: Quality of EDA CAD tools: definitions, metrics and directions, Quality Electronic Design, 2000. ISQED 2000. In: Proceedings of the IEEE 2000 First International Symposium on Quality of Electronic Design, San Jose, pp. 395–405, 20–22 Mar 2000 4. Francis, L. Chan, Mark, D. Spiller, Richard Newton, A.: WELD – an environment for webbased electronic design. In: 35th Design Automation Conference, ACM, University of California, Berkely, 1998 5. Makimoto, T.: The hot decade of field programmable technologies. Retrieved from https:// www.doc.ic.ac.uk/~wl/teachlocal/cuscomp/k01_makimoto.pdf 6. Impact of the Mead-Conway innovations in VLSI chip design and implementation methodology: an overview by Lynn Conway. Retrieved from http://ai.eecs.umich.edu/people/conway/ Impact/Impact%20of%20the%20Mead-Conway%20innovations.pdf 7. Definition of EDA: Retrieved from http://encyclopedia2.thefreedictionary.com/Electronic+ Design+Automation 8. Basu, S., Brayton, R., Cong, J.: NSF workshop electronic design automation: past, present, and future, Arlington, 8–9 July 2009 9. Electronic design automation, TechEncyclopedia. Retrieved from http://www.answers.com/ topic/electronic-design-automation 10. Dr. Ananda, H.V.: Overview of EDA Industry, VTU-VSI-ISA Confluence Meeting, Feb 2006 11. Spiller, M.D., Newton, A.R.: EDA and the network. In: Proceedings of the IEEE International Conference on Computer-Aided Design, San Jose, pp. 470–476, Nov 1997 12. http://www.mathworks.com 13. Kamat, R.K., Shinde, S.A., Shelake, V.G.: Unleash the System on Chip Using FPGAs and Handel C. Springer, New York (2009) 14. Kamat’s paper 15. 
Chen, D., Cong, J., Pan, P.: FPGA design automation: a survey. Found. Trend. Electron. Des. Autom. 1(3), 195–330 (2006) 16. Otten, R.H.J.M., Camposano, R., Groeneveld, P.R.: Design automation for deepsubmicron: present and future. In: Proceedings of the Conference on Design, Automation and Test in Europe, IEEE Computer Society, Washington, DC, 2002
R.K. Kamat et al., Harnessing VLSI System Design with EDA Tools, DOI 10.1007/978-94-007-1864-7, © Springer Science+Business Media B.V. 2012
17. Electronic Design Automation. Retrieved from http://neohumanism.org/e/el/electronic_ design_automation.html 18. Quo Vadis, EDA? Summary hosted at http://dsp.acm.org/view_lecture.cfm?lecture_id=85 19. Sangiovanni-Vincentelli, A.: Quo Vadis, SLD? Reasoning about the trends and challenges of system level design. Proc. IEEE 95(3), 467–505 (2007) 20. Laung-Terng, W., Chang, Y.-W., Cheng, K.-T.: Electronic Design Automation: Synthesis, Verification, and Test. Morgan Kaufmann/Elsevier, Amsterdam (2009). Print 21. Jansen, D.: The Electronic Design Automation Handbook. Springer, Dordrecht (2003) 22. Laung-Terng, W., Cheng-Wen, W., Cheng-Wen, W. (EE Ph.D.), Xiaoqing, W.: VLSI Test Principles and Architectures (2006) 23. Scheffer, L., Lavagno, L., Martin, G.E.: EDA for IC Implementation, Circuit Design, and Process Technology. CRC Taylor & Francis, Boca Raton (2006) 24. Rahman, A.: FPGA Based Design and Applications. Springer, London (2010) 25. Wayne, W.: FPGA-Based System Design. Prentice Hall PTR, Englewood Cliffs (2004) 26. Chu, P.P.: FPGA Prototyping by VHDL Examples. Wiley-Interscience, Hoboken (2008) 27. Coffman, K.: Real World FPGA Design with Verilog. Prentice Hall PTR, Upper Saddle River (2000) 28. DeHon, A., Hauck, S.: Reconfigurable Computing. Morgan Kaufmann, Amsterdam (2008) 29. Al-Hashimi, B., Institution of Electrical Engineers: System-on-Chip. Institution of Engineering and Technology, London (2006) 30. Dubey, R.: Introduction to Embedded System Design Using Field Programmable Gate Arrays. Springer, London (2009) 31. Sass, R.R., Schmidt, A.G.: Embedded Systems Design with Platform FPGAs. Morgan Kaufmann/Elsevier, Amsterdam (2010) 32. Chandra, R.: IP-reuse and platform base designs, design and reuse articles. Retrieved from http://www.design-reuse.com/articles/6125/ip-reuse-and-platform-base-designs.html 33. Wilton, S.J.E., Saleh, R.: Programmable logic IP cores in SoC design: opportunities and challenges. 
In: Proceedings of the IEEE Custom Integrated Circuits, University of British Columbia, Vancouver, 2001 34. Tanurhan, Y.: Soc fesign using embedded cores, electronic news, 9 Apr 2001. Retrieved from http://findarticles.com/p/articles/mi_m0EKF/is_15_47/ai_73121502/ 35. Dhir, A.: Intellectual Property (IP) cores for home networking, Xilinx Application Note WP137 (v1.1), 14 Mar 2005 36. Ing. Miloš BeÊváĖ, Testability issues in designing large SoC 37. Dey, S., Panigrahi, D., Chen, L., Taylor, C.N., Sekar, K., Sanchez, P.: Using a soft core in a SoC design: experiences with picoJava. IEEE Des. Test. Comput. 17, 60–66 (2000) 38. Chandrashekar, S.: Advantages of FPGA design methodologies, EE Times. Retrieved from http://www.eetimes.com/news/design/showArticle.jhtml?articleID=26100997 39. Advantages of FPGA devices. Retrieved from http://www.canterbury-consulting.co.uk/index. php/services/fpga-development/56-advantages-of-fpga-devices 40. Advantages of Field Programmable Gate Arrays. Retrieved from www.men.de/docs-ext/ expertise/pdf/fpga_advantages.pdf 41. Guo-Qing, Z., Guo-Qiang, Z., Qing-Feng, Y., Su-Qi, C., Tao, Z.: Evolution of the internet and its cores. New J. Phys. 10 (2008) 123027 42. Apté, C., Damerau, F., Weiss, S.M.: Automated learning of decision rules for text categorization. ACM Trans. Info. Syst. 12(3), 233–251 (1994) 43. Hecht, J.: Spam could be betrayed by hidden patterns. New Sci. 202(2707), 20 (2009) 44. Hoffman, P., Crocker, D.: Unsolicited bulk email: mechanisms for control. Internet Mail Consortium, UBE-SOL IMCR-008. http://www.imc.org/ube-sol.html. Revised 4 May 1998 45. Crocker, D.: Challenges in anti-spam efforts by Dave Crocker, brandenburg internet working. Reprinted from The Internet Protocol Journal (IPJ). 8(4) Dec 2005. IPJ is a quarterly technical journal published by Cisco Systems. See www.cisco.com/ipj
46. Marriam websters dictionary : Definition of spam. Retrieved from http://www.merriam-webster. com/dictionary/Spam. Accessed 2 Apr 2009 47. Southwick, S., Falk, J.: The NET Abuse FAQ. Retrieved from http://www.cybernothing.org/ faqs/net-abuse-faq.html#2.1 (1998). Accessed 2 Apr 2008 48. CNN.: Anti-spam plea to ’dump the junk’. Online: http://edition.cnn.com/2003/TECH/05/22/ Spam.survey/index.html. Downloaded 7 June 2006 49. SpamHAUS : The definition of spam. Retrieved from http://www.Spamhaus.org/definition. html (2008). Accessed 2 Apr 2009 50. Cohen, J.: Spam finally has a definition. Retrieved from http://www.dmnews.com/Spam-finallyhas-a-definition/article/107514/ (2008). Accessed 3 April 2009 51. Goldsborough, R.: Bulk e-mail doesn’t have to be Spam. (A Tech Perspective). (Brief Article): an article from Communit. Cox, Matthews (2002) 52. Chelluri, S.R., Mackin, B., Gamba, D.: FPGA-based solutions for storage-area networks. XCell J., pp. 45–47, 2006 53. Handel-C Language Reference Manual. Embedded Solutions Limited: Version 2.1 54. Wolf, W.: A decade of hardware/software codesign. IEEE Comput. 36, 38–43 (2003) 55. Prakash, S., Parker, A.C.: SOS: synthesis of application-specific heterogeneous multiprocessor systems. J. Parallel Distrib. Comput. 16, 338–351 (1992) 56. Gupta, R.K., De Micheli, G.: Hardware/software cosynthesis for digital systems. IEEE Des. Test Comput. 10, 29–41 (1993) 57. Ernst, R., Henkel, J., Benner, T.: Hardware/software cosynthesis for microcontrollers. IEEE Des. Test Comput. 10, 64–75 (1993) 58. Micheli, G., Gupta, R.: Hardware/software co-design. Proc. IEEE 85(3), 349–365 (1997) 59. Sudarshan, T.S.S.: Presentation on reconfigurable computing introduction to codesign. Retrieved from http://www.csis.bits-pilani.ac.in/faculty/tsbs/Main/Courses/Reconfig_09/ lecses/pdf6p/lec25.pdf 60. Nagaonkar Y.: FPGA-based experiment platform for hardware-software codesign and hardware emulation. Thesis, Brigham Young University (2006) 61. 
Hardware-Software Codesign. Retrieved from www.npd-solutions.com/swcodesign.html 62. Hardware-Software Codesign Presentation. www.cs.ccu.edu.tw/~pahsiung/…/SoC_Design_ Flow_Tools_Codesign_2005.pdf 63. Balarin, F., et al.: Hardware-Software Co-design of Embedded Systems – The POLIS Experience. Kluwer Academic, Boston (1997) 64. Eker, J., Janneck, J.W., Lee, E.A., Liu, J., Liu, X., Ludvig, J., Neuendorffer, S., Sachs, S., Xiong, Y.: Taming heterogeneity---the Ptolemy approach. Proc. IEEE 91(2) (2003) 65. Ishikawa, M., McCune, D.J., Saikalis, G., Oho, S.: CPU model-based hardware/software co-design, co-simulation and analysis technology for real-time embedded control systems. In: 13th IEEE Real Time and Embedded Technology and Applications Symposium (RTAS’07), rtas, Washington, DC, pp. 3–11, 16–19 Apr 2007 66. Madsen, J., GRODE, J., Knudsen, P.V., Petersen, M.E., Haxthausen, A.: LYCOS: the Lyngby co-synthesis system. Des. Autom. Embed. Syst. 2, 195–235 (1997). Kluwer Academic Publishers, Boston. Manufactured in The Netherlands 67. Ade, M., Lauwereins, R.J., Peperstraete, A.: Hardware-software codesign with GRAPE. In: Sixth IEEE International Workshop on Rapid System Prototyping (RSP’95), rsp, North Carolina, p. 40, 7–9 June 1995 68. Ebcioglu, K.: The IBM PERCS project: hardware-software co-design of a supercomputer for high programmer productivity, WASP 2005 Keynote 69. Li, Y., Callahan, T., Darnell, E., Harr, R., Kurkure, U., Stockwood, J.: Hardware-software co-design of embedded reconfigurable architectures, Annual ACM IEEE Design Automation Conference Archive. In: Proceedings of the 37th Conference on Design Automation, Los Angeles, pp. 507–512, ISBN:1-58113-187-9, 2000 70. Lau, D., Pritchard, O., Molson, P.: Automated generation of hardware accelerators with direct memory access from ANSI/ISO standard C functions, field-programmable custom computing
machines, FCCM ’06. In: 14th Annual IEEE Symposium, Napa, pp. 45–56, ISBN: 0-76952661-6, 24–26 Apr 2006 71. Merchant, S., Peterson, G.D., Bouldin, D.: Improving embedded systems education: laboratory enhancements using programmable systems on chip, Microelectronic Systems Education, 2005, (MSE ’05). In: Proceedings. 2005 IEEE International Conference on Publication Date: pp. 5–6, Anaheim, 12–14 June 2005 72. Lahiri, K., Raghunathan, A., Dey, S.: Design space exploration for optimizing on-chip communication architectures. IEEE Trans. CAD. ICs Syst. 23(6), 952–961 (2004) 73. Dally, W.J., Towel, B.: Principles and Practices of Interconnection Networks. Elsevier, Amsterdam (2004) 74. Shandhag, N.R.: Reliable and efficient system-on-chip design. IEEE Comput. 37(3), 42–50 (2004) 75. Ayala, J., Lopez-Vellejo, M., Bertozzi, D., Benini, L.: State-of-the-art SoC communication architectures. In: Zurawski, R. (ed.) Embedded Systems Handbook, pp. 20.1–20.22. CRC Press, Boca Roton (2009) 76. Pasricha, S., Dutt, N.: On-chip communication architectures system on chip interconnect, the Morgan Kaufmann series in systems on silicon, series, Wolf, W. (eds.), Georgia Institute of Technology 77. ST3232C Datasheet ST Microelectronics. Retrieved from http://www.datasheetcatalog.com/ datasheets_pdf/S/T/3/2/ST3232C.shtml 78. Documentation of XPS Ethernet Lite. Retrieved from http://www.xilinx.com/support/ documentation/ipcommunicationnetwork_ethernet_xps-ethernetlite.htm 79. SMSC LAN83C185 Datasheet. Retrieved from www.smsc.com/main/datasheets/83c185.pdf 80. XPS Timer Datasheet. Retrieved from http://www.xilinx.com/support/documentation/ ipembedprocess_peripheralother_xpstimcount.htm 81. XPS Interrupt Controller Datasheet. Retrieved from http://www.xilinx.com/support/ documentation/ipembedprocess_peripheralother_xpsinterruptcontrol1a.htm 82. Multi-Port Memory Controller MPMC) (v4.00.a), Product Specification DS643 January 11, 2008 83. MT46V32M16 (32M x 16) DDR SDRAM Data Sheet. 
http://download.micron.com/pdf/ datasheets/dram/ddr/512MBDDRx4x8x16.pdf 84. Dunkels, A.: Design and implementation of the lwIP TCP/IP Stack. Retrieved from www.ece. ualberta.ca 85. lwIP 1.3.0 Library (v1.00.b). Retrieved from www.xilinx.com/support/documentation/sw_ manuals/xilinx11/sa_lwip130_v1_00_b.pdf 86. Holden, S.: Spam filter evaluations. Retrieved from http://sam.holden.id.au/writings/Spam2 87. Bloom, B.: Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13(7), 422–426 (1970) 88. Comparison of hash functions in public domain. Retrieved from www.burtleburtle.net/bob/ hash/doobs.html 89. Broder, A., Mitzenmacher, M.: Network applications of bloom filters: a survey. In: Proceedings of the 40th Annual Allerton Conference on Communication, Control, and Computing, Illionis, pp. 636–646 (2002) 90. Ripeanu, M., Iamnitchi, A.: Bloom filters – short tutorial. Retrieved from http://people. cs.uchicago.edu/~matei/PAPERS/bf.doc 91. Parab, J.S., Shelake, V.G., Kamat, R.K., Naik, G.M.: Exploring C for Microcontrollers: A Hands on Approach. Springer, Dordrecht (2007) 92. Rambo, S.: Altera tweaks soft-core processor, Embedded.com, 02/25/03. http://www.embedded. com/story/OEG20030225S0022. Retrieved on 20 Oct 2007 93. QuickLogic QuickMIPS ESP is industry’s first – combines a high-speed processor with hardwired functions and field programmability, Design and Reuse Newsletter. http://www.us. design-reuse.com/news/news349.html. Retrieved on 21 Oct 2007 94. Blaze, P.: User Resources Xilinx Inc. website. http://www.xilinx.com/ipcenter/processor_ central/picoblaze/picoblaze_user_resources.htm. Retrieved on 21 Oct 2007
95. Shinde, S.A.: FPGA based programmable ASIC for circumventing SPAM. Ph.D. thesis, Shivaji University, Kolhapur (2009) 96. Gaikwad, P.K.: FPGA based ECG, pulse ocimeter and arrythmia detection. Ph.D. thesis, Shivaji University, Kolhapur 97. Abächerli, R., Braun, F., Zhou, L., Kraemer, M., Felblinger, J., Schmid, H.: Electrocardiogram on a chip: overview and first experiences of an electrocardiogram manufacturer of medium size. J. Electrocardiol. 39(4), S36–S40 (2006) 98. Chang, M.-C., Lin, Z.-X., Chang, C.-W., Chan, H.-L., Feng, W.-S.: Design of a systemon-chip for ECG signal processing, circuits and systems, 2004. In: Proceedings of the 2004 IEEE Asia-Pacific Conference, Tainan, vol. 1, pp. 441–444, Dec 2004 99. Khatib, I.A., Bertozzi, D., Poletti, F., Benini, L.: MPSoC ECG biochip: a multiprocessor system-on-chip for real-time human heart monitoring and analysis. In: Proceedings of the 3rd Conference on Computing frontiers, Ischia, pp. 21–28, ISBN:1-59593-302-6 (2006) 100. Khatib, I.A., Bertozzi, D., Poletti, F., Benini, L.: A multiprocessor system-on-chip for realtime biomedical monitoring and analysis: ECG prototype architectural design space exploration, ACM 1073-0516/01/0300-0034 (2007) 101. Park, C., Chou, P.H., Bai, Y., Matthews, R., Hibbs, A.: An ultra- wearable, wireless, low power ECG monitoring system. In: Proceedings of IEEE BioCAS, London (2006) 102. Istepanian, R.S.H., Woodward, B.: Microcontroller-based underwater acoustic ECG telemetry system. IEEE Trans. Info. Technol. Biomed. 1(2), 150 (1997) 103. Galjan, W., Naydenova, D., Tomasik, J.M., Schroeder, D., Krautschneider, W.H.: A portable SoC-based ECG-system for 24h x 7d operating time. In: Biomedical Circuits and Systems Conference, 2008, BioCAS 2008, Baltimore, pp. 85–88, IEEE Publication Date: 20–22 Nov 2008 104. Lo, B., Thiemjarus, S., King, R., Yang, G.: Body sensor network– a wireless sensor platform for pervasive healthcare monitoring. 
In: Adjunct Proceedings of the 3rd International Conference on Pervasive Computing (PERVASIVE‘05), Munich, pp. 77–80, May 2005 105. Harland, C., Clark, T., Prance, R.: High resolution ambulatory electrocardiographic monitoring using wrist-mounted electric potential sensors. Meas. Sci. Technol. 14, 923–928 (2003) 106. Chang, M., Lin, Z., Chang, C., Chan, H., Feng, W.: Design of a system-on-chip for ECG signal processing. In: The 2004 IEEE Asia-Pacific Conference on Circuits and Systems, Tainan, 6–9 Dec 2004 107. Hung, K., Zhang, Y.T., Tai, B.: Wearable medical devices for tele-home healthcare. In: Proceedings of the 26th Annual International Conference on the IEEE EMBS, San Francisco, 1–5 Sept 2004 108. Jun, D., Hong-Hai, Z.: Mobile ECG detector through GPRS/Internet. In: Proceedings of the 17th IEEE Symposium on Computer-Based Medical Systems (CBMS’04), Bethesda, 2004 109. Desel, T., Reichel, T., Rudischhauser, S., Hauer, H.: A CMOS nine channel ECG measurement IC. In: 2nd International Conference on ASIC, 1996, Shanghai, pp. 115–118, Oct 1996 110. Loghi, M., Angiolini, F., Bertozzi, D., Benini, L., Zafalon, R.: Analyzing on-chip communication in a {MPSoC} environment. In: Proceedings of Design and Test in Europe Conference (DATE), Paris, pp. 752–757, Feb 2004 111. Loghi, M., Poncino, M., Benini, L.: Cycle-accurate power analysis for multiprocessor systems-on-a-chip, GLSVLSI04. In: Great Lake Symposium on VLSI, Boston, pp. 401–406, Apr 2004 112. Bona, A., Zaccaria, V., Zafalon, R.: System level power modeling and simulation of high-end industrial network-on-chip. In: Proceedings of Design and Test in Europe Conference (DATE), Paris, pp. 318–323, Feb 2004 113. ÐAJA, N., Rejlin, I., Rejlin, B.: Telemonitoring in cardiology – ECG transmission by mobile phone. J. Ann. Acad. Studenica 4, 63–66 (2001) 114. Einthoven’s triangle. Retrieved from http://medicaldictionary.thefreedictionary.com/ Einthoven’s+triangle 115. 
Jancin, B.: 80-lead ECG system may improve diagnosis, family practice news article date: 1 July 2009. Retrieved from http://www.highbeam.com/doc/1G1-204544732.html
116. Fulford-Jones, T.R.F., Wei, G-Y., Welsh, M.: A portable, low-power, wireless two-lead EKG system. In: Proceedings of the 26th Annual International Conference of the IEEE EMBS, San Francisco, 1–5 Sept 2004 117. Rämö, T.: Biopotentials and electrophysiology measurement. Retrieved from www.faculty. ksu.edu.sa/…/Part%202%20Amplifiers%20%20Applications.ppt 118. Datasheet of MAX4051 – Low-Voltage, CMOS Analog Multiplexers/Switches – Maxim Integrated Products. Retrieved from http://www.alldatasheet.com/datasheet-pdf/pdf/73273/ MAXIM/MAX4051.html 119. About TCP-Com: Retrieved from http://pcmicro.com/TCP-Com/ 120. Carr, J.J., Brown, J.M.: Introduction to Biomedical Equipment Technology, 4th edn. Pearson Education, New York (1981). www.pearsoned.co.in 121. Pellizzoni, R., Caccamo, M.: Adaptive allocation of software and hardware real-time tasks for FPGA-based embedded systems, Real-Time and Embedded Technology and Applications Symposium. In: Proceedings of the 12th IEEE, Mordano, pp. 208–220, Print ISBN: 0-76952516-4, 4–7 Apr 2006 122. Xilinx Spartan 3e Starter kit: http://www.xilinx.com/support/documentation/boards_and_ kits/ug230.pdf 123. Mui, E.N.C.: FPGA interfacing of HD44780 based LCD using delayed finite state machine (FSM). Retrieved from http://www.xess.com/projects/LCD_HD44780.pdf 124. In-Depth FPGA Interfacing of HD44780 Based LCD. http://www.youritronics.com/in-depthfpga-interfacing-of-hd44780-based-lcd/ 125. Ludewig1, R., Soffke1, O., Zipf1, P., Glesner1, M, Pun2, K.P., Tsoi2, K.H., Lee2, K.H., Leong, P.: IP generation for an FPGA-based audio DAC sigma-delta converter. Retrieved from http://www.cse.cuhk.edu.hk/~phwl/mt/public/archives/papers/sdconv_fpl04.pdf 126. Application Note: Interfacing the MAX5195 High-Speed DAC to High-Speed FPGAs 127. Reju K, Joshi, K., Murali, M.: Multifunction card for data acquisition and embedded applications. Retrieved from http://www.vecc.gov.in/~sacet09/downloads/FINAL%20PDF/D35_ REJU_VPID_BARC.pdf 128. 
Defossez, M.: Connecting virtex-6 FPGAs to ADCs with serial LVDS interfaces and DACs with parallel LVDS interfaces, Application Note: Virtex-6 FPGAs 129. Data Sheet of LTC2604/LTC2614/LTC2624, Quad 16-Bit Rail-to-Rail DACs in 16-Lead SSOP. Retrieved from http://cds.linear.com/docs/Datasheet/2604fd.pdf 130. Chapman, K.: D/A converter control sor Spartan-3E starter kit. Retrieved from http://www. xilinx.com/products/boards/s3estarter/files/s3esk_picoblaze_dac_control.pdf 131. Data Sheet of LTC6912, dual programmable gain amplifiers with serial digital interface. Retrieved from http://public.beuthhochschule.de/~purat/lehre/esv/templates/LTC6912.pdf 132. LTC6912 - Dual programmable gain amplifiers with serial digital interface. Retrieved from http://www.linear.com/product/ltc6912 133. Chapman, K.: Amplifier and A/D converter control for spartan-3E starter kit. Retrieved from http://www.mrc.uidaho.edu/mrc/people/jff/440/handouts/PicoBlaze_Amplifier_and_ADC_ control_rev2.pdf 134. A Lattice Semiconductor White Paper, October 2007, Interfacing analog to digital converters to FPGAs. Retrieved from http://www.latticesemi.com/documents/doc26686x11.pdf?jsessio nid=f030f3c5adac9f36fc812127d4f655969264 135. Laymon, C.M., Miyaoka, R.S., Park, B.K., Lewellen, T.K.: Simplified FPGA-based data acquisition system for PET. IEEE Trans. Nucl. Sci. 50(5), 1483–1486 (2003) 136. Datasheet of TC1407-1/LTC1407A-1, Serial 12-Bit/14-Bit, 3Msps simultaneous sampling ADCs with shutdown. Retrieved from http://cds.linear.com/docs/Datasheet/14071fb.pdf 137. Spartan-3E FPGA starter kit board user guide. Retrieved from http://bears.ece.ucsb.edu/class/ ece253/papers/Spartan3e-ug1.1.pdf 138. Wang, G., Guan, Y., Zhang, Y.: Designing of VGA character string display module based on FPGA. In: International Symposium on Intelligent Ubiquitous Computing and Education, Chengdu, pp. 499–502, Print ISBN: 978-0-7695-3619-4, 15–16 May 2009
139. Wilson, P.: Design recipes for FPGAs – a simple VGA interface. Retrieved from http://www. eetimes.com/design/programmable-logic/4015149/Design-Recipes-for-FPGAs--A-SimpleVGA-Interface/ 140. Thor, J.: Interfacing a PS/2 Keyboard. Retrieved from http://www.sm.luth.se/csee/courses/ smd/098/lab31.pdf 141. Dong, L., Atashbar, M.Z.: An FPGA experience in ASIC design. In: Proceedings of the 2005 ASEE North Central Conference, Western Michigan University, Kalamazoo, 2005 142. Kamat, R.K.: Terminal report: development of FPGA based open source soft IP cores for parameterized microcontroller design. A Research Project by Department of Science and Technology, Government of India 143. Dudek, P., Szczepa ski, S., Hatfield, J.V.: A high-resolution CMOS time-to-digital converter utilizing a vernier delay line. IEEE Trans. Solid-State Circuits 35(2), 240–247 (2000) 144. Karadamoglou, K., Paschalidis, N.P., Sarris, E., et al.: An 11-bit high-resolution and adjustablerange CMOS time-to-digital converter for space science instruments. IEEE J. Solid-State Circuits 39(1), 214–222 (2004) 145. Paschalidis, N.P., Stamatopoulos, N., Karadamoglou, K., et al.: A CMOS time of flight system on a chip for spacecraft instrumentation. IEEE Trans. Nucl. Sci. 49(3), 1156–1163 (2002) 146. Dudek, P., Szczepanski, S., Hatfield, J.: A high-resolution CMOS time-to-digital converter utilizing a vernier delay line. IEEE J. Solid-State Circuits 35(2), 240–247 (2000) 147. Hwang, C.-S., Chen, P., Tsao, H.W.: A high-precision time-to-digital converter using a twolevel conversion scheme. IEEE Trans. Nucl. Sci. 51(4), 1349–1352 (2004) 148. Mota, M., Christiansen, J.: A high-resolution time interpolator based on a delay locked loop and an RC delay line. IEEE J. Solid-State Circuits 34(10), 1360–1366 (1999) 149. Raisanen-Ruotsalainen, E., Rahkonen, T., Kostamovaara, J.: A lowpower CMOS time-todigital converter. Proc. IEEE Solid-State Circuits Conf. 30, 984–990 (1995) 150. 
Mantyniemi, A., Rakhonen, T., Kostamovaara, J.: An integrated 9-channel time digitizer with 30-ps resolution. In: Proceedings of the IEEE Solid-State Circuits Conference, Florence, pp. 266–267, Feb 2002 151. Chung, C.-C., Lee, C.-Y.: A new DLL-based approach for all-digital multiphase clock generation. IEEE J. Solid-State Circuits 39(3), 469–475 (2004) 152. O’Connor, V., Phillips, D.: Time-Correlated Single Photon Counting. Academic, London (1984) 153. Louis, T.A., Ripamonti, G., Lacaita, A.: Photoluminescence lifetime microscope spectrometer based on time-correlated single-photon counting with an avalanche diode detector. Rev. Sci. Instrum. 61, 11–22 (1990) 154. Stellari, F., Zappa, F., Cova, S., Vendrame, L.: Tools for non-invasive optical characterization of CMOS circuits. In: Proceedings of International Electron Device Meeting IEDM ’99, Washington, DC, 5–8 Dec 1999 155. Maatta, K., Kostamovaara, J.: Accurate time interval measurement electronics for pulsed time of flight laser radar. In: Proceedings of the I” Topical Meeting on Optoelectronic Distance/ Displacement Measurement and Applications ODIMAP I, 1997 156. Kalisz, J., Szplet, R., Pasierbinski, J., Poniecki, A.: Field programmable gate array based time-to-digital converter with 200-ps resolution. IEEE Trans. Instrum. Meas. 46(1), 51–55 (1997) 157. QiAn, J.S., Liu, S.: A high-resolution time-to digital converter implemented in field-programmablegate-arrays. IEEE Trans. Nucl. Sci. 53(1), 236–241 (2006) 158. Santos, D.M.: A CMOS delay locked loop and sub-nanosecond time-to-digital converter chip. IEEE Trans. Nucl. Sci. 43–3, 1717–1719 (1996) 159. Dudek, P., Szezepanski, S., Hatfield, J.: A high resolution CMOS time-to-digital converter utilizing a vernier delay loop. IEEE Trans. Solid State Circuits 35, 240–247 (2000) 160. Hwang, C.-S., Chen, P., Tsao, H.-W.: A high-precision time-to-digital converter using a two-level conversion scheme. IEEE Trans. Nucl. Sci. 51(4), 1349–1352 (2004) 161. 
Chan, A.H., Roberts, G.W.: A Jitter characterization system using a component-invariant vernier delay line. IEEE Trans. VLSI Syst. 12(1), 79–95 (2004)
162. Time to digital converter. Retrieved from http://dictionary.sensagent.com/time+to+digital+converter/en-en/ 163. Henzler, S.: Time to Digital Converters. Springer, Dordrecht (2010) 164. Baron, R.G.: The vernier time-measuring technique. Proc. IRE 45, 21–30 (1957) 165. Kalisz, J.: Review of methods for time interval measurements with picosecond resolution. Metrologia 41, 17–32 (2004) 166. Porat, D.: Review of subnanosecond time-interval measurements. IEEE Trans. Nucl. Sci. 20, 35–51 (1973) 167. Guhilot, H.: Ph.D. thesis, Shivaji University, Kolhapur, 2011 168. Shi, W.Z., Wang, I.Y.: Firmware-only implementation of time-to-digital converter in field programmable gate array. Proc. IEEE Conf. Rec. NSS 1, 177–181 (2003) 169. Szymanowski, R., Kalisz, J.: Field programmable gate array time counter with two-stage interpolation. Rev. Sci. Instrum. 76, 45–104 (2005) 170. Aloisio, A., Branchini, P., Cicalese, R., Giordano, R., Izzo, V., Loffredo, S.: FPGA implementation of high-resolution time-to-digital converter. In: IEEE 2007 Nuclear Science Symposium and Medical Imaging Conference, Honolulu, Oct 27 – Nov 3 2007