Standardized Functional Verification
Alan Wiemann
San Carlos, CA, USA
ISBN 978-0-387-71732-6
e-ISBN 978-0-387-71733-3
Library of Congress Control Number: 2007929789

© 2008 Springer Science+Business Media, LLC

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper.

9 8 7 6 5 4 3 2 1

springer.com
Preface
It’s widely known that, in the development of integrated circuits, the amount of time and resources spent on verification easily exceeds that spent on design. A survey of current literature finds numerous references to this fact. A whitepaper from one major CAD company states that, “Design teams reportedly spend as much as 50 to 70 percent of their time and resources in the functional verification effort.” A brief paper from Design and Reuse observes that, “70 percent of the overall design phase is dedicated to verification,” and that, “as designs double in size, the verification effort can easily quadruple.” In spite of all this effort, another whitepaper from yet another CAD company observes that, “two or three very expensive silicon iterations are the norm today.” Couple these observations on verification effort with the fervent quest for functional closure taking place in the industry and it becomes clear that a breakthrough in functional verification is greatly needed. Standardized functional verification is this breakthrough. The title of this book suggests that standardized functional verification has already been accomplished. However, at the time of publication this standardization is but an exciting vision. Great strides have been made with the availability of commercial test generators and coverage analysis tools, but the organizing principles that enable direct comparisons of verification projects and results have been, until now, undiscovered and undefined. One leading software vendor refers to coverage of scenarios, stating that, a “test generator [can] approach any given test scenario from multiple paths.” However, there is no consistent method available for using lists of scenarios as a basis for comparing verification projects. In chapter 3 we will learn that any given scenario can be reduced to the specific values of standard variables and, more precisely, to one or more arcs in a value transition graph, arcs that connect these values. There usually is a multitude of paths to any given function point, and we must travel each and every one to exercise our target exhaustively. Moreover, current literature does not explain how to produce an exhaustive list of these scenarios, so risk assessment is based on hopeful
assumptions that the few scenarios that were listed in the verification plan were: 1) enough, and 2) the very scenarios that are likely to result in faulty behavior caused by functional bugs. One often encounters odd complaints along the lines of, “it is impossible to ‘think’ of all the possible bugs when writing the functional test plan.” Indeed, bugs frequently remain where we do not think to look for them. One vendor defines functional coverage as “explicit functional requirements derived from the device and test plan specifications.” However, they do not explain how to derive these explicit requirements or what specific form they have, once derived. Standard variables and their ranges, and the rules and guidelines that govern their relationships, provide the analytical foundation upon which this can be achieved. They continue, stating that, “Test plans consisting of lists of test descriptions were used to write a large number of directed tests,” and that their verification software enables “exhaustive functional coverage analysis.” Later on they say that, “functional coverage items are derived from a written verification plan,” and that “conditions to satisfy the verification plan are identified and coded into functional coverage objects.” However, they do not explain how to achieve this “exhaustive functional coverage.” This same vendor also remarks (accurately) that, “… one of the most difficult and critical challenges facing the designer is to establish adequate metrics to track the progress of the verification and measure the coverage of the functional test plan,” and that, “… coverage is still measured mainly by the gut feeling of the verification manager, and eventually the decision to tape out is made by management without the support of concrete qualitative data.” They conclude that, “Perhaps the most frustrating problem facing design and verification engineers is the lack of effective metrics to measure the progress of verification.” This book reveals for the first time the organizing principles defining the relationships among the many defining variables in the vast universe of digital designs and defines precise means for exploiting them in IC development and verification. A rigorous examination of the standard variables described in this book with regard to applicable concepts in linear algebra and graph theory may blaze the trail to improved techniques and tools for the functional verification of digital systems. This book also proposes a set of specific measures and views for comparing the results obtained from verifying any digital system without regard for size or functionality. It also describes how these standard results can be used in objective data-driven risk assessment for correct tape-out decisions.
Finally, the intellectual property (IP) industry needs a level playing field so that integrators can express their quality requirements clearly, and so that providers can declare unambiguously how those quality requirements are met and what level of risk is entailed in integrating their products. Standardized functional verification makes the IP market a safe place to transact business. Integrators need to be able to “peek behind the curtains” to understand the results of the IP provider’s verification effort. IP providers need to safeguard proprietary processes. Establishing standard measures and views will enable this industry to thrive in a manner similar to how interchangeable parts enabled manufacturing industries to thrive. These standards must be: 1. applicable to any and all digital systems, 2. genuine indicators of the degree of exercise of the IP, and 3. economical to produce. The specific measures and views described within the covers of this book meet these requirements. They are by no means the only possible measures or views, but they constitute a much needed beginning. They will either thrive or perish on their own merits and as their usefulness grows or diminishes. This is the natural advance of science in a field where years of accumulated data are needed to discover what works well and what does not. Additionally, as the standardized approach described in this book gains acceptance, more precisely defined industry standards will gain approval and be reflected in commercially available verification software. The experienced verification engineer will find many of the concepts familiar and wonder that they have not been organized in a standardized fashion until now. But, as John H. Lienhard explains in his recent book How Invention Begins (Oxford University Press, 2006), the relentless priority of production can shove invention to the side. “Too much urgency distracts inventors from their goal; you might say it jiggles their aim. Urgency makes it harder for inventors to find the elbow room–the freedom–that invention requires.”
What this book is about

This book is about verifying that a digital system works as intended and how to assess the risk that it does not. This is illustrated in Figure 1.
Fig. 1 What this book is about
What this book is not about

This book is not about:

• Programming: Verification engineers practice a particularly nettlesome craft, one that requires highly refined programming skills combined with a detailed and nuanced understanding of the hardware that must endure the strenuous exercising by these programs. Excellent books and training courses on programming are already in great supply.

• Programming languages: Verification engineers will most likely work with one of several commercially available programming languages, such as the e language from Cadence or the Vera language from Synopsys. There are several good books that explain how to write programs in these commercially available languages. See the references at the end of Chapter 4 for books on verification programming languages.
• Modeling and test generation: Verification languages are especially well suited for modeling digital systems. In addition, verification IP is available commercially that provides ready-to-use models for industry-standard interfaces (USB, PCI-Express, etc.). See the references at the end of Chapter 4 for books on modeling and writing testbenches.

• Test environment architecture: Modern verification languages often have an implied architecture for the test environment and how it incorporates the models for the devices being verified. In addition, there are already many good books that deal extensively with this topic, for example, Writing Testbenches: Functional Verification of HDL Models (Bergeron 2003).

• Formal verification: This emerging technology represents an orthogonal approach to dynamic verification and is not treated in this book. However, an analysis in terms of standard variables may prove to be of use in formal verification as well.
Who should read this book

The people who will benefit from reading this book and applying its concepts are:

• Verification engineers, who will learn a powerful new methodology for quickly transforming multiple specifications for an IC (or any digital system) into a clear, concise verification plan, including a foundation for architecting the testbench regardless of verification language. A working knowledge of programming, verification languages, modeling, and constrained random verification is assumed.

• Verification managers, who will learn how to plan projects that meet agreed-upon risk objectives consistent with scope, schedule, and resources. Most verification managers also have highly relevant experience as verification engineers, but a working knowledge of effective project management is assumed.

• Department-level and higher-level managers, who will learn how to manage multiple IC developments with a data-driven approach to assessing risk. Higher-level managers may or may not have a strong background in verification, and this book will give them the working knowledge they need to communicate effectively with their subordinates as well as sort through the many vexing issues and contradictory information that reach them from inside their organization as well as outside. In particular, the manager who must approve the expenses for tape-out will benefit greatly from learning how to assess risk from data rather than rosy reassurances from exhausted staff.
Scope and Organization of this book

The well-known techniques of constrained pseudo-random verification are the means by which the target is exercised thoroughly. Commercially available software for test generation, and often in-house software as well, is readily available to meet this need and is beyond the scope of this book.

This book deals only with functional bugs, i.e., bugs that manifest themselves as faulty or sub-optimal behavior. There are, however, many other kinds of bugs that are outside the scope of this book, including:

• Syntactical errors (such as those reported by VeriLint)
• Synthesis errors, such as inferred latches (such as those reported by DesignCompiler)
• Design goodness errors, such as using both positive-edge and negative-edge triggered flip-flops in a design, lack of suitable synchronization between clock domains, or the presence of dead code or redundant code
• Noncompliance to language conventions (such as those reported by LEDA, Spyglass, etc.)

Finally, this book is organized into 7 chapters as follows:

1. Introduction to Functional Verification
2. Analytical Foundation
   - standard framework
   - standard variables
3. Exploring Functional Space
   - size of the space
   - structure of the space
4. Planning and Execution
   - standard interpretation
   - verification plan
   - standard results
5. Normalizing Data
6. Analyzing Results
   - standard measures
   - standard views
7. Assessing Risk
Acknowledgements
Any book that attempts to advance scientific understanding is invariably the product of many minds, and this book is no exception. Apart from the contributions of others, this work would not have been possible. A debt of gratitude is owed to reviewers of my early drafts: Marlin Jones, Kris Gleason, Bob Pflederer, and Johan Råde. I thank Eric Hill for insightful comments on a later manuscript. I am especially grateful to Anne Stern, whose meticulous reading of a final manuscript revealed errors and ambiguities and passages greatly in need of further clarification. Without her diligent attention to detail, this book would be in want of much improvement. I also thank my editors at Springer for their unflagging support, especially Katelyn Stanne who guided this project to publication. Finally, I must acknowledge the contributions of my many colleagues over the years, both engineers and managers, who ardently embraced a culture of engineering excellence. Any remaining errors in this book are mine and mine alone.
Table of Contents
1. A Brief Overview of Functional Verification
  1.1 Costs and Risk
  1.2 Verification and Time to Market
  1.3 Verification and Development Costs
  1.4 But any Lessons Learned?
  1.5 Functional Verification in a Nutshell
  1.6 Principles of Constrained Random Verification
  1.7 Standardized Functional Verification
  1.8 Summary

2. Analytical Foundation
  2.1 A Note on Terminology
  2.2 DUTs, DUVs, and Targets
  2.3 Linear Algebra for Digital System Verification
  2.4 Standard Variables
  2.5 Ranges of Variables
  2.6 Rules and Guidelines
    2.6.1 Example – Rules and Guidelines
  2.7 Variables of Connectivity
    2.7.1 Example – External Connectivity
    2.7.2 Example – Internal Connectivity
  2.8 Variables of Activation
    2.8.1 Example – Activation
  2.9 Variables of Condition
    2.9.1 Example – Conditions
  2.10 Morphs
  2.11 Variables of Stimulus and Response
    2.11.1 Internal Stimuli and Responses
    2.11.2 Autonomous Responses
    2.11.3 Conditions and Responses
    2.11.4 Example – Stimulus and Response
  2.12 Error Imposition
    2.12.1 Example – Errors
  2.13 Generating Excitement
  2.14 Special Cases
    2.14.1 Example – Special Case
  2.15 Summary
  References

3. Exploring Functional Space
  3.1 Functional Closure
  3.2 Counting Function Points
    3.2.1 Variables of Connectivity
    3.2.2 Variables of Activation (and other Time-variant Variables)
    3.2.3 Variables of Condition
    3.2.4 Variables of Stimulus
    3.2.5 Variables of Response
    3.2.6 Variables of Error
    3.2.7 Special Cases
    3.2.8 An Approximate Upper Bound
  3.3 Condensation in the Functional Space
  3.4 Connecting the Dots
  3.5 Analyzing an 8-entry Queue
  3.6 Reset in the VTG
  3.7 Modeling Faulty Behavior
  3.8 Back to those Special Cases
  3.9 A Little Graph Theory
  3.10 Reaching Functional Closure
  3.11 Summary

4. Planning and Execution
  4.1 Managing Verification Projects
  4.2 The Goal
  4.3 Executing the Plan to Obtain Results
    4.3.1 Preparation
    4.3.2 Code Construction
    4.3.3 Code Revision
    4.3.4 Graduated Testing
    4.3.5 Bug Fixing
  4.4 Soft Prototype and Hard Prototype
  4.5 The Verification Plan
  4.6 Instances, Morphs, and Targets (§ 1)
  4.7 Clock Domain Crossings (§ 1)
  4.8 Verifying Changes to an Existing Device (§ 1)
  4.9 Interpretation of the Specification (§ 1)
  4.10 Instrumenting the Prototype (§ 2)
    4.10.1 An Ounce of Prevention (§ 2)
  4.11 Standard Results (§ 3)
  4.12 Setting Goals for Coverage and Risk (§ 4)
    4.12.1 Making Trade-offs (§ 4)
    4.12.2 Focusing Resources (§ 4)
  4.13 Architecture for Verification Software (§ 5)
    4.13.1 Flow for Soft Prototype (§ 5)
    4.13.2 Random Value Assignment (§ 5)
    4.13.3 General CRV Process (§ 5)
    4.13.4 Activation and Initialization (§ 5)
    4.13.5 Static vs. Dynamic Test Generation (§ 5)
    4.13.6 Halting Individual Tests (§ 5)
    4.13.7 Sanity Checking and Other Tests (§ 5)
    4.13.8 Gate-level Simulation (§ 5)
    4.13.9 Generating Production Test Vectors (§ 5)
  4.14 Change Management (§ 6)
  4.15 Organizing the Teams (§ 7)
    4.15.1 Failure Analysis (§ 7)
  4.16 Tracking Progress (§ 8)
  4.17 Related Documents (§ 9)
  4.18 Scope, Schedule and Resources (§ 10)
  4.19 Summary
  References

5. Normalizing Data
  5.1 Estimating Project Resources
  5.2 Power and Convergence
  5.3 Factors to Consider in using Convergence
  5.4 Complexity of a Target
  5.5 Scaling Regression using Convergence
  5.6 Normalizing Cycles Counts with Complexity
  5.7 Using Normalized Cycles in Risk Assessment
  5.8 Bug Count as a Function of Complexity
  5.9 Comparing Size and Complexity
  5.10 Summary
  References

6. Analyzing Results
  6.1 Functional Coverage
  6.2 Standard Results for Analysis
  6.3 Statistically Sampling the Function Space
  6.4 Measures of Coverage
  6.5 Code Coverage
  6.6 State Reachability in State Machines
  6.7 Arc Transversability in State Machines
  6.8 Fault Coverage
  6.9 VTG Coverage
  6.10 Strong Measures and Weak Measures
  6.11 Standard Measures of Function Space Coverage
  6.12 Specific Measures and General Measures
  6.13 Specific Measures for Quadrant I
  6.14 General Measures for Quadrants II, III, and IV
  6.15 Multiple Clock Domains
  6.16 Views of Coverage
    6.16.1 1-dimensional Views
    6.16.2 Pareto Views
    6.16.3 2-dimensional Views
    6.16.4 Time-based Views
  6.17 Standard Views of Functional Coverage
  6.18 Summary
  References

7. Assessing Risk
  7.1 Making Decisions
  7.2 Some Background on Risk Assessment
  7.3 Successful Functional Verification
  7.4 Knowledge and Risk
  7.5 Coverage and Risk
  7.6 Data-driven Risk Assessment
  7.7 VTG Arc Coverage
  7.8 Using Q to Estimate Risk of a Bug
  7.9 Bug Count as a Function of Z
  7.10 Evaluating Commercial IP
  7.11 Evaluating IP for Single Application
  7.12 Nearest Neighbor Analysis
  7.13 Summary
  References

Appendix – Functional Space of a Queue
  A.1 Basic 8-entry Queue
  A.2 Adding an Indirect Condition
  A.3 Programmable High- and Low-water Marks
  A.4 Size of the Functional Space for this Queue
  A.5 Condensation in the Functional Space
  A.6 No Other Variables?
  A.7 VTGs for 8-entry Queue with Programmable HWM & LWM

Index
Chapter 1 – A Brief Overview of Functional Verification
“What is the risk that our new integrated circuit will be unusable due to a functional bug?” “What is the risk that the silicon IP (intellectual property) we just licensed does not really work as advertised?”1 Every manager who brings a design to tape-out or who purchases IP must eventually face these questions. The ability to answer these questions based on quantitative analysis is both vital and yet elusive. In spite of the enormous technical advances made in IC development and verification software, the answers to these questions are still based largely on guesswork and hand waving. If the design and verification teams are exhausted, it must be time to tape-out. However, with a standardized framework for analysis, results, and measures, dependence on specious arguments for “functional closure” is no longer necessary. Rather than relying on well-intended assurances from hopeful verification engineers and managers, or on the cheery claims in glossy marketing literature for commercial IP, risk assessment can be based on hard data.
1 Or, as the IP provider might phrase the question, “What is the risk that the silicon IP we just released does not really work as we claim?”

1.1 Costs and Risk

The effectiveness of functional verification has a direct impact on development costs and product profitability. This is fairly well known in the IC development community, but a case study will help to highlight some key points. Consider an example of two IC development projects code-named A and B with rather different outcomes. Managers and engineers on both projects recognized the need to identify all of the functionality that their respective ICs needed to implement and created extensive and detailed lists of things to verify and corner cases to
exercise. Both projects relied on simulation of the RTL (register transfer level) design to expose bugs in their implementations. Both projects created verification plans with extensive lists of what to verify. Both project teams executed their respective verification plans and taped out. Project A employed a strategy based on the older technique of “directed testing” to verify their design. As verification engineers wrote tests based on the list of things to verify, design engineers modified their RTL to make tests pass. When the tests covered every item on the list of things to verify, “functional closure” was declared and they taped-out their IC. The first silicon prototypes, unfortunately, still had a number of bugs, and a second full-layer tape-out was needed to make the IC suitable for release as a product. Project B employed a strategy based on the more efficient technique of “random testing” to verify their design. Rather than squandering engineering time on writing hundreds and hundreds of individual tests, engineers wrote test generators that created stimuli with various randomly chosen values for data and transactions and so forth. When they determined that many thousands of these automatically generated tests had covered every item on their list of things to verify, “functional closure” was declared and they taped-out their IC. Thanks to the much greater quantity of tests that had been simulated, their first silicon prototype had many fewer bugs than project A and the few remaining bugs could be fixed with a metal-only tape-out.
1.2 Verification and Time to Market

For this simple example of two ICs consider the time from tape-out to shipping product for revenue. Table 1.1 shows a simple comparison using typical durations for an IC of moderate complexity. Actual ICs may require anywhere from 2X to 5X the durations shown in the table. Project A got its product to market 9 weeks later than project B, even though it taped out sooner than project B. If its product had a narrow market window, a seasonal consumer product for example, this delay could be financially disastrous.

Table 1.1. From tape-out to revenue shipments

Event                        Project A [weeks]   Project B [weeks]
First tape-out (a)                   0                   3
Fab time                             7                   7
Test time                            8                   8
Re-design and test (b)               6                   3
Second tape-out (c)                  6                   3
Fab time                             7                   3
Test time                            4                   2
Resulting time to market            38                  29

(a) Project B delayed due to finding and fixing more bugs
(b) Project B has fewer bugs to find and fix
(c) Project A requires a full-layer tape-out, but project B only metal layers

1.3 Verification and Development Costs

For the purposes of comparison, assume that both projects used 90 nm technology with a full mask set costing $1,000K and a metal-only mask set costing $500K (costs are approximate for 2006 in U.S. dollars). To keep it simple, assume also that both projects were staffed with 15 engineers at $110K annually per engineer. The post-tape-out costs for each project are calculated as in Table 1.2.

Table 1.2. Costs after first tape-out

Cost item          Project A [$K]   Project B [$K]
First tape-out          1,000            1,000
Second tape-out         1,000              500
Engineering             1,206              920
Total                   3,206            2,420

Not only did project B get its IC to market more quickly, it also accomplished this at about 25% lower cost than project A. It is well known that bugs discovered in silicon are orders of magnitude more expensive to diagnose and eliminate than bugs discovered prior to tape-out. In addition, a good marketing department could forecast the lost revenue of project A’s delayed time-to-market. When considering high technology, products that are first to market often enjoy substantial revenues that competing products will not. Thus, the two projects could be compared entirely on the basis of cost by transforming time (delay to market) to cost.
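The engineering line items in Table 1.2 follow directly from these staffing and schedule assumptions. As a quick check, here is a sketch in Python; the 52-week year used to prorate the annual cost is an assumption, since the book does not spell out the conversion:

    # Post-tape-out engineering cost = staff x annual cost x (weeks after tape-out / 52)
    STAFF = 15                              # engineers per project (given above)
    ANNUAL_COST_K = 110                     # $K per engineer per year (given above)
    WEEKS_TO_REVENUE = {"A": 38, "B": 29}   # from Table 1.1

    for project, weeks in WEEKS_TO_REVENUE.items():
        eng_cost_k = STAFF * ANNUAL_COST_K * weeks / 52
        print(f"Project {project}: engineering cost = ${eng_cost_k:,.0f}K")
    # Project A: about $1,206K; project B: about $920K, matching the Engineering row of Table 1.2.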
1.4 But any Lessons Learned?

The many advantages inherent in verification based on “random testing” are well understood. What is poorly understood is how to use the results
from projects A and B to assess the risk faced by a new project C. Projects differ in so many ways, from the functionality, complexity, and size of the IC being developed to the tools and software used to verify the IC. On what basis can projects A, B, and C be compared so that the risk at tapeout for project C can be assessed with high confidence? We will return to this example at the end of Chapter 7 where we discuss Risk Assessment to see just how a data-driven risk assessment can be realized. But first, a brief overview of this “random testing.”
1.5 Functional Verification in a Nutshell

The primary objective of functional verification3 is to give the prototype a thorough test flight,4 that is, to exercise the verification target thoroughly and thereby ensure correct and complete functionality. The target is usually first modeled in software (the soft prototype) as RTL (Register Transfer Level description), and eventually actual devices will be manufactured (the hard prototype).

With the advent of constrained pseudo-random testing in the early 1980s it became possible to explore the limits of a design using automatically generated tests to exercise the broad range of variability inherent in the function of a digital system. Creating this variability in a pseudo-random manner, assigning values at random to (usually most of) these variables, but subject to defined constraints, yields a highly effective means for thoroughly exercising the target, and, thereby, exposing the presence of bugs in the target by causing them to manifest themselves in the form of some faulty behavior. Careful and detailed observations of the responses of the target, including (usually) the variables whose values are determined within the target in response to stimuli, and then checking them against the rules and guidelines, will now and then reveal faulty or sub-optimal behavior.

A bug is some error of commission (something is present in the code) or omission (something is missing from the code) and often both. Functional verification does not reveal the bug itself, but it reveals the presence of the bug. The task of debugging must first consider all factors contributing to the faulty behavior before making the changes necessary to eliminate the faulty behavior completely.

Suitable coverage analysis enables the manager to reduce the risk of faulty behavior within practical limits imposed by available resources (people, compute assets, time, money). Reducing risk requires more resources or resources working more effectively.

3 This book deals with the subject of intent verification as defined in Bailey (p. 69) and focuses on the method of dynamic verification (p. 71) and directed random verification (p. 72) in particular.

4 A test pilot will take a new aircraft to the very edge of its flight dynamics, even flying it upside-down. This is a prudent step before taking on paying passengers.
1.6 Principles of Constrained Random Verification

Subjecting a target to overwhelming variability during simulation prior to tape-out is the core principle of constrained pseudo-random verification (typically referred to as CRV, dropping the “pseudo-” prefix). This technique is also referred to as “directed random” or as “focused random” verification, because the pseudo-random testing is preferentially directed or focused on functionality of interest. The CRV technique exercises the target by driving its inputs with signal values chosen pseudo-randomly but constrained such that they are meaningful to the target being exercised. This technique overcomes 3 challenges to verification:

• Extraordinary complexity: A complete verification of every single capability of an IC is an enormously time-consuming and insufficiently productive task. By “sampling” this vast functional space until no sample exhibits faulty behavior, the manager can gain sufficient confidence that the entire IC is free of bugs.

• Habits of human programmers: Programmers have habits (particular ways of writing loops or other common code sequences) that limit the novelty that their tests can present to the target being verified. A well-designed test generator will create tests with novelty unencumbered by programming habits.

• Inefficiency of human programmers: A well-designed test generator can generate hundreds of thousands of novel tests in the time that an engineer can write one single test. Lacking the coding habits of a person, a test generator that chooses pseudo-randomly what to do next will wander about the functional space far and wide, exploring it much more thoroughly than the human-written tests.

As these tests do their wandering (quite quickly considering that ample verification resources providing over 1,000 MIPS will, figuratively speaking, leap about the cliffs and rifts of the functional space), they will now and then reveal the presence of a bug somewhere. The presence of a bug is revealed by faulty behavior – some violation of a known rule or guideline.

When all revisions have been made to the target to eliminate previously detected faulty behavior, the resulting target must survive some predetermined quantity of regression testing. This predetermined quantity of regression is determined such that it indicates sufficiently low risk of functional bugs and that it can be completed on a schedule that meets business needs. By subjecting the target to overwhelming variability during simulation prior to tape-out, the manager gains confidence that the resulting IC will function correctly.
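The flow described above reduces to a simple loop: generate constrained-random stimulus, drive it into the target, and check every response against the rules and guidelines. The Python sketch below is only an illustration of that loop; the drive-and-check step is a placeholder for a real simulator and monitor, and the particular constraints are invented:

    import random

    def generate_stimulus(rng):
        # Values are chosen pseudo-randomly but constrained to be meaningful
        # to the target (hypothetical constraints for a bus-style interface).
        return {
            "command": rng.choice(["READ", "WRITE"]),
            "burst_len": rng.randint(1, 16),        # constraint: 1..16 beats
            "address": rng.randrange(0, 2**32, 4),  # constraint: word-aligned
        }

    def drive_and_check(stimulus):
        # Placeholder: a real environment would simulate the target here and
        # check its responses against all rules and guidelines.
        return True   # True means no rule or guideline was violated

    def crv_regression(num_tests, seed=1):
        rng = random.Random(seed)   # seeded so any failure can be reproduced
        failures = []
        for test_id in range(num_tests):
            stimulus = generate_stimulus(rng)
            if not drive_and_check(stimulus):
                failures.append((test_id, stimulus))
        return failures

    # A predetermined quantity of regression, as described above.
    print(len(crv_regression(100000)), "failures")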
1.7 Standardized Functional Verification

Often the target of functional verification is a single integrated circuit device. However, the standardized approach to verification accounts for digital systems comprising multiple devices and also IP (intellectual property – typically provided as RTL) from which distinctly different devices or systems can be derived. Likewise, sometimes the target is only an addition or modification to an existing device or system. This standardized approach is applicable to such projects as well.

The terms and techniques of this standardized approach to verification are applicable across all types of digital circuit – from I/O controllers to bus converters to microprocessors to fully integrated systems – and extensible to analog circuits and even software systems. It enables objective comparisons of targets, of verification results, and of engineering resources needed to execute the project. It also facilitates the movement of engineering talent among projects as the methodology (terminology, architecture, and analytical techniques) is standardized regardless of the target. It also provides a rigorous analytical foundation that facilitates the organization, collection, and analysis of coverage data.

Standardized verification consists of:

1. a comprehensive standard framework for analysis that serves the principles of constrained pseudo-random functional verification,
2. standard variables inherent in the design,
3. standard interpretation of the specification to define the variables and their ranges and the rules and guidelines that govern them,
4. the verification plan that details
5. what results to obtain, and
6. the standard measures and views of the results.
Fig. 1.1. Standardized Functional Verification
Figure 1.1 shows the relationships among these six elements. What we really want to know is, does the RTL function as expected under all possible variations of its definitional and operational parameters? For that matter, what constitutes functional closure and how close are we to that elusive goal? Finally, what is the risk of faulty behavior? These same questions must be addressed for the prototype silicon device as well. Hence, this book.
1.8 Summary

The use of the CRV technique for functional verification has proven to be extremely powerful as well as sufficiently economical to produce commercial
IC products. However, costs remain high, both direct costs such as mask charges and indirect costs such as delayed time to market. Surely there must be more that we can do to drive down these costs and risks. Standardized functional verification will be seen to be extremely beneficial on both of these fronts, not only by increased efficiencies within verification teams, but also by more sophisticated verification tools as the theory underlying functional closure is incorporated. Chapters 2 and 3 provide this necessary theory.

Of course, the prudent engineer or manager will recognize that some designs, or changes to existing designs, might be sufficiently simple that the costs associated with developing a pseudo-random test environment outweigh the benefits of such an approach. Many of the techniques described in the remaining chapters will not be useful for such simple designs.
Chapter 2 – Analytical Foundation
Solid software architecture grows from well-designed data structures, and this applies to verification software in particular. All digital systems share common, unifying characteristics, and it is these characteristics that form a single standard framework for the analysis of all digital systems. Data structures and algorithms that exploit this common framework yield architectures that are extensible and reusable. In addition, a common framework for verification greatly enables the sharing and moving of resources among projects rapidly and efficiently. And, most importantly, it provides the basis for data-driven risk assessment.

In this chapter we will learn a method of interpretation of natural-language specification documents into a standard set of variables that is applicable to all possible digital systems.
2.1 A Note on Terminology

Every effort has been made to create a comprehensive, concise, and unambiguous nomenclature without inventing new terms. Contemporary usage of terms in verification varies widely, not only between corporations but also within single organizations. This leads to frequent confusion and misunderstanding and can impede efforts to meet aggressive goals. In addition, certain terms are greatly overused within the industry, with some terms having so many different meanings that they become obstacles to clear communications rather than vehicles of technically precise meaning. Where a given term may differ from other terms used in the industry, clarification is provided in footnotes.
2.2 DUTs, DUVs, and Targets

The terms DUT (Device Under Test) or DUV (Device Under Verification) are often used in the context of functional verification. However, DUT refers more precisely to an individual integrated circuit device undergoing
test for manufacturing defects. And, the term DUV, while more precise, is used less frequently than DUT when referring to that which must be verified. Moreover, often the thing to be verified does not include all of the logic in a given IC device, such as the pins of a new device or when only a portion of an existing device is undergoing verification for certain changes to eliminate bugs or add new capabilities. This book uses the term target to refer to the precisely bounded logic that must be verified. CRV is aimed at this target specifically. A target is defined very specifically such that there is no ambiguity as to that which is to be verified, and that which is presumed to be verified already. Are the pins and their control logic included in the target or not? Is the IP integrated into the new IC included in the target or not? Does the target comprise only the enhancements to some existing IC or does the target include the existing logic as well? Does the target consist of RTL that may be used to generate many different devices, such as is the case for commercial IP (intellectual property)? The answers to these questions establish the precise boundary between the target, and everything else. In addition, as we shall see later, it is useful to regard each clock domain of a given target as a separate sub-target for the purposes of coverage analysis. The reason for this should be clear when considering that logic is evaluated for each clock cycle, and logic in a slower clock domain will not be exercised as much as the logic in a faster clock domain over the identical period of time, measured in seconds (or fractions thereof). A design running at 100 MHz with a 33 MHz PCI interface is an example of a target with two clock domains. The faster domain will experience about three times as much exercise as the slower domain for an average test. Finally, the analysis in this book applies to synchronous digital systems. Asynchronous signals are normally synchronized on entry to the device and signals crossing clock domain boundaries are synchronized into the destination domain before being used.
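For the two-domain example above, the imbalance in exercise is simple to quantify. A rough sketch in Python follows; the one-millisecond test length is an arbitrary assumption used only to make the ratio concrete:

    # Cycles of exercise seen by each clock domain over the same simulated interval.
    test_seconds = 1e-3                      # assumed length of an average test
    core_cycles = 100e6 * test_seconds       # 100 MHz domain: 100,000 clock cycles
    pci_cycles = 33e6 * test_seconds         # 33 MHz PCI domain: 33,000 clock cycles
    print(core_cycles / pci_cycles)          # ~3: the faster domain gets about 3x the exercise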
2.3 Linear Algebra for Digital System Verification

A complete analysis of any digital system is based on the following premise:

From the standpoint of functional verification, the functionality of a digital system, including both the target of verification as well as its surrounding environment, is characterized completely by the variability inherent in the definition of the system.
An exhaustive list of these variables, their individual ranges, and the relationships among the variables is essential to ensuring that the verification software can exercise the system thoroughly. In addition to exercising the digital system in a pseudo-random manner, certain special cases, which may not appear to lend themselves to exercising pseudo-randomly, must also be identified. And finally, in addition to the variables and special cases, the specification for the digital system will also define (explicitly or implicitly) rules and guidelines against which the responses of the target during its operation must be monitored.

The variables related to the functionality of the digital system may be considered to lie within four subspaces that together constitute a basis for characterizing the entire range of variability that the digital system can experience. These 4 subspaces are:

• Connectivity
• Activation
• Conditions
• Stimuli & Responses
These 4 subspaces may be considered to constitute a “loosely orthogonal” system in a linear algebraic sense, and together they constitute the standard variables for interpreting a digital system for the purposes of functional verification. The interpretation of the specifications for a verification project by different teams of engineers may indeed yield differing sets of variables. Other than differences in names of variables, the sets will be equivalent in a linear algebraic sense by way of some suitable mapping. As a familiar example, we can describe the points in a 2-dimensional space with Cartesian coordinates (where the variables are the distance from the origin along orthogonal axes x and y) or with polar coordinates (where the variables are the radius r and the angle θ). The variables spanning the space are different but equivalent: a one-to-one mapping from each set of variables to the other set exists. Similarly, interpreting the specification for a project by different teams will yield differing but equivalent sets of variables. But, if the sets of variables resulting from different teams truly do differ, then the specification is not sufficiently clear in its statement of requirements or, possibly, contains errors. In either case it needs revision.
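One way to carry this framework into verification software is to record, for every standard variable, which subspace it belongs to, what its range is, and which other variable (if any) its range depends on. The Python sketch below uses invented field names and example entries drawn loosely from this chapter; the book prescribes no particular data structure:

    from dataclasses import dataclass
    from enum import Enum
    from typing import Optional, Sequence

    class Subspace(Enum):
        CONNECTIVITY = "connectivity"
        ACTIVATION = "activation"
        CONDITION = "condition"
        STIMULUS_RESPONSE = "stimulus/response"

    @dataclass
    class StandardVariable:
        name: str
        subspace: Subspace
        values: Sequence                  # contiguous range or enumerated list of values
        depends_on: Optional[str] = None  # independent variable that constrains this range

    # A few hypothetical variables for a bus-resident target:
    variables = [
        StandardVariable("num_bus_agents", Subspace.CONNECTIVITY, range(1, 17)),
        StandardVariable("core_clock_mhz", Subspace.ACTIVATION, [33, 100]),
        StandardVariable("queue_high_water_mark", Subspace.CONDITION, range(0, 9)),
        StandardVariable("read_cycles", Subspace.STIMULUS_RESPONSE, range(1, 17)),
    ]
    assert all(isinstance(v.subspace, Subspace) for v in variables)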
Fig. 2.1. Standard Framework for Interpretation
Each unique combination of values of these variables determines a single function point within the functional space. In the next chapter we shall see how the size of this function space can be measured using these standard variables.
2.4 Standard Variables

All variability to which a verification target can be subjected can be classified among these four subspaces of functionality. The assignment of a given variable to one subspace or the other need not adhere rigidly to the definitions as given below. Instead, these subspaces are more usefully
regarded as individual “buckets” which together contain all of the variables pertinent to the target’s behavior. It is not so important that a variable be classified within one subspace or another, but rather that all variables be defined and suitably classified. Over the course of execution of a verification project, it may become more practical to reclassify one or more variables to enable implementation of better-architected verification software. These precisely defined variables will in most cases be used to determine how to subject the target to variability – variability in the information it receives (stimuli), variability in the conditions governing its operation while receiving the information, variability in the particulars of the logical operation (as a consequence of its connectivity) performed on the information, and variability in the particular sequencing (as determined by the clocking) of the logical operations. As discussed earlier, subjecting a target to overwhelming variability during simulation is the core principle of constrained random verification. The process of verification will also expose bugs in the target’s external context, and provisions must be in place to make changes as needed to achieve goals set for the target.
2.5 Ranges of Variables

The range of a variable may be contiguous (containing all possible values between some minimum and maximum value), piece-wise contiguous (containing two or more subsets of values), or enumerated (containing individual values representing some special meaning such as a command). Most variables are independent, and their ranges are subject to constrained random value selection. However, some variables have ranges that are dependent on the values of other, independent variables in that those values impose certain constraints on the range of values that may be assigned to the dependent variables. Their values can be chosen only after choosing values for the variables upon which their ranges are dependent.

Ranges of nearly all variables will have inherent value boundaries – values which are special in that they determine the beginning or end of something, or cause some particular behavior to be manifest when that value is attained. Contiguous ranges will have at least two boundary values, the largest value and the smallest value. These two values, and perhaps the next one or two neighboring values, should be identified while defining the variable. Some intermediate values, such as high-water and low-water marks in queuing systems, must also be identified as boundary values.
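Capturing the boundary values along with the range pays off later, when both test generation and coverage analysis need to visit those values preferentially. Here is a small sketch in Python; the 8-entry queue with water marks echoes the example just mentioned, but the specific marks are invented:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class ContiguousRange:
        minimum: int
        maximum: int
        interior_marks: List[int] = field(default_factory=list)  # e.g. high/low-water marks

        def boundary_values(self) -> List[int]:
            # Largest and smallest values, their immediate neighbors, and any
            # identified intermediate boundaries.
            edges = {self.minimum, self.minimum + 1, self.maximum - 1, self.maximum}
            return sorted(edges.union(self.interior_marks))

    # Occupancy of an 8-entry queue with a low-water mark of 2 and a high-water mark of 6:
    occupancy = ContiguousRange(0, 8, interior_marks=[2, 6])
    print(occupancy.boundary_values())   # [0, 1, 2, 6, 7, 8]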
Piece-wise contiguous variables will have multiple sets of boundary values, with each subrange in the overall range being regarded as a distinct range with largest and smallest values, etc. Enumerated ranges (such as types of processor instructions or bus commands) may also have boundary values, but in a different sense. As will be seen in chapter 3 in the discussion of condensation, various load and store instructions might be regarded as all lying on the single boundary value of memory access instruction for certain purposes, such as for coverage analysis.

2.6 Rules and Guidelines

Technical specifications, whether for some industry standard or for some particular device, define a number of requirements that must be met by any implementation. Each of these individual requirements constitutes a rule that must be obeyed by the target. Specifications often include recommendations for how a given implementation should behave to ensure optimal performance. Each of these recommendations constitutes a guideline that should be followed by the target.

Each individual rule and guideline defined (whether explicitly or implicitly) within the various specification documents must be monitored by the verification software to ensure that the target’s behavior always obeys all rules and adheres as closely to the guidelines as project needs dictate. Additionally, implicit in each rule or guideline will be one or more exercisable scenarios that will lead the target either to obey or to violate the rule or guideline. Exercising these scenarios thoroughly will reduce the risk of bugs in the target. As will be seen in chapter 3, these scenarios will map to individual coverage models (Lachish et al. 2002).

Performance requirements (speed, throughput, latencies, etc.) will be described in terms of rules and guidelines.

2.6.1 Example – Rules and Guidelines

As an example of how to dissect rules and guidelines from a specification, consider the following few sentences from (say, section x.y of) a specification for the behavior of agents on a shared bus:
2.6 Rules and Guidelines Technical specifications, whether for some industry standard or for some particular device, define a number of requirements that must be met by any implementation. Each of these individual requirements constitutes a rule that must be obeyed by the target. Specifications often include recommendations for how a given implementation should behave to ensure optimal performance. Each of these recommendations constitutes a guideline that should be followed by the target. Each individual rule and guideline defined (whether explicitly or implicitly) within the various specification documents must be monitored by the verification software to ensure that the target’s behavior always obeys all rules and adheres as closely to the guidelines as project needs dictate. Additionally, implicit in each rule or guideline will be one or more exercisable scenarios that will lead the target either to obey or to violate the rule or guideline. Exercising these scenarios thoroughly will reduce the risk of bugs in the target. As will be seen in chapter 3, these scenarios will map to individual coverage models (Lachish et al. 2002) Performance requirements (speed, throughput, latencies, etc.) will be described in terms of rules and guidelines. 2.6.1 Example – Rules and Guidelines As an example of how to dissect rules and guidelines from a specification, consider the following few sentences from (say, section x.y of) a specification for the behavior of agents on a shared bus: “When a read request has been asserted on READ, data must be returned to the requesting agent within at least 16 cycles. To meet the performance
objectives, it is preferable that a READ request be serviced within 8 or fewer cycles.”

From this brief paragraph we recognize both a rule and a guideline that govern a standard response variable (standard variables for stimulus and response will be discussed later in this chapter) called read_cycles. Both the rule and the guideline can be expressed formally in the e language as follows:1

    Check for rule and guideline of section x.y
    <'
    extend sys {
        setup() is also {
            // Halt simulation when a rule is violated.
            set_check("FAIL...", ERROR_AUTOMATIC);
            // Continue simulation when a guideline is violated.
            set_check("WARN...", WARNING);
        }
    }

    struct bus_example {
        event bus_clk is change('top.sysclk') @sim;
        event read is rise('top.read') @bus_clk;
        event data_ready is rise('top.ready') @bus_clk;
        event read_cycles;

        // Rule: On a read, must return data within 16 cycles.
        expect read_cycles is @read => {[1..16]; @data_ready} @bus_clk
            else dut_error("FAIL data not ready in time");

        // Guideline: On a read, return data within 8 cycles.
        expect read_cycles is @read => {[1..8]; @data_ready} @bus_clk
            else dut_error("WARN data not ready quickly");
    }
    '>

1 The examples in this chapter are intended to illustrate how the formal standard variables may be found in existing code or expressed in new code. They are not intended to endorse any particular language or promote any particular coding style.
In the example, the relevant section of the specifications document is included in the comment preceding the code. This is helpful in relating the code to the natural-language specification on which the code is based. Both the rule and the guideline have been coded in this example. Violating the rule causes the dut_error method to be invoked, printing a message and halting simulation according to the manner in which set_check has been programmed. Violating the guideline causes the dut_error method to be invoked, printing a message and then continuing the simulation. Rules and guidelines govern values of variables and we have four categories of variables. We shall consider each category of variable in turn, starting with variables of connectivity.
2.7 Variables of Connectivity

The network of gates and memory elements that constitute the target represents a specific connectivity2. Many designs are intended to be reused and, consequently, will be structured so that different designs can be generated from a single RTL database. This is especially true for vendor-supplied IP (intellectual property) intended to be used in a variety of different connectivities. Many targets will operate differently depending on the presence or absence of elements in their surroundings. That is, variables for both the internal connectivity and the external connectivity must be suitably defined. For example, the stimuli applied to a bus-resident target will vary depending on the presence or absence of other devices attached to the same bus, competing for access. The distinction between external and internal depends on where the boundary is defined for the target. Many specifications for industry-standard protocols provide for optional functionality that can be included in or excluded from an IC. The determination of whether a variable associated with such functionality is internal or external might be postponed until the definition of the target has been made final. Most variables of connectivity are static (constant valued) with respect to time. However, an important class of targets designed to provide
2 Bailey uses the term “structure” to refer to a network of interconnected logical and memory elements. However, in the course of a verification project both a behavioral model and a structural model may undergo regression testing with the same suite of tests. To avoid this ambiguity in the use of the term “structure”, the term connectivity is used instead.
“hot-plugging” functionality experience changes to connectivity during operation. Connecting a USB cable between a device and host or plugging in a PCMCIA card each causes a change in value to a variable of connectivity. A particular set of values chosen from the ranges of the variables of internal connectivity determines an instance of the target. A particular set of values chosen from the ranges of the variables of external connectivity determines the context of the instance.

2.7.1 Example – External Connectivity

As an example of how to interpret a specification and define standard variables for external connectivity, consider the following sentences from a specification for the number of agents on a shared bus: “The bus is limited to a total of 16 agents. The number of processors is limited to 8 and there may be up to 4 memory controllers. Remaining slots may be occupied by I/O controllers.”
Fig. 2.2. External connectivity
This paragraph defines three standard variables of external connectivity and also defines their ranges. This can be expressed in e as follows:

External connectivity as described on page N:
<'
unit processor {
    // Code not shown for purposes of this example.
}
unit mem_ctrl {
    // Code not shown.
}
unit io_ctrl {
    // Code not shown.
}
unit arbiter {
    // Code not shown.
}
unit bus_connectivity {
    num_proc: int;                   // This is a standard variable,
    keep num_proc in [1..8];         // and its range.
    num_mem_ctrl: int;               // A second standard variable,
    keep num_mem_ctrl in [1..4];     // and its range.
    num_io_ctrl: int;                // And a third with its range.
    keep num_io_ctrl in [1..(16 - num_proc - num_mem_ctrl)];
    proc_agents: list of processor is instance;
    keep proc_agents.size() == num_proc;
    mem_agents: list of mem_ctrl is instance;
    keep mem_agents.size() == num_mem_ctrl;
    io_agents: list of io_ctrl is instance;
    keep io_agents.size() == num_io_ctrl;
    arb_agents: list of arbiter;
    keep arb_agents.size() == 1;     // NOT a variable!
}
'>
Four different types of bus agent are defined, namely processors, memory controllers, I/O controllers, and arbiters. For the purposes of this example, all I/O controllers are considered equivalent. Real systems are not so simple. There is always exactly one arbiter, so there is no variable of external connectivity associated with arbiters. The remaining three types of bus agent can appear in varying numbers, so there are standard variables of external connectivity defined for each. Note that the ranges of two of the variables (num_proc and num_mem_ctrl) are independent and that the range of the third (num_io_ctrl) is dependent on the values chosen (generated) for the first two variables. In this simple example (and in the ones that will follow) we have merely illustrated how the concept of a standard variable of external connectivity is used in verification code. An actual implementation of a test environment would benefit from having all standard variables organized into some central location (such as include files) for inclusion in code modules developed by members of the engineering team. For the purposes
of illustrating the concepts, however, we will not attempt to create a working test environment.

2.7.2 Example – Internal Connectivity

Prior to the availability of commercial IP, verification targets had few or no variables of internal connectivity. That is, the target was RTL that would be synthesized into only one single IC device. The internal connectivity was fixed and no provisions were made to vary one or more parameters of the design. However, with increasing use of commercial IP, RTL with varying internal connectivity is becoming commonplace. Variables of internal connectivity might be used in a make file to generate specific instances of multi-instance RTL. Similarly, commercial IP is often packaged with some tools to guide the user in selecting values for internal connectivity. Users eventually obtain RTL for simulation and synthesis that is generated from a single body of Verilog or VHDL. Generation produces RTL based upon values assigned to the variables of internal connectivity corresponding to the user’s requirements. In our previous example, consider the processor as the verification target. It is being designed for licensing as IP and has variable cache sizes as well as options for memory management and floating-point. These options for customizing this processor core for use in an application constitute standard variables of internal connectivity. Consider Fig. 2.3 below.
Fig. 2.3. Internal connectivity
We could define fields in e corresponding to the standard variables of internal connectivity explicitly as follows:

Standard variables of internal connectivity for processor:
<'
unit processor is also {
    // Standard variables of internal connectivity.
    cache_size: [8K, 16K, 32K, 64K];
    FPU: [ABSENT, PRESENT];
    memory_mgt: [ABSENT, MPU, MMU, DMA];
    // Etc.
}
'>
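Once these fields are defined, a particular instance of the IP can be pinned down simply by constraining them. The following minimal sketch is not from the text; the chosen values are arbitrary.

<'
extend processor {
    // Select one specific instance of the multi-instance RTL.
    keep cache_size == 32K;
    keep FPU        == PRESENT;
    keep memory_mgt == MMU;
};
'>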
To ensure that the RTL for a database that is intended to generate multiple instances is thoroughly exercised, a carefully chosen subset (or perhaps all) of the possible instances must be sufficiently exercised with suitably
chosen combinations of conditions and stimuli (defined later in this chapter). This is especially true for IP (intellectual property) that is intended to serve a wide variety of applications by enabling the generation of many different instances from a single base of source code. Most manufactured devices will have no variables of internal connectivity. All wires and transistors are present and connected at all times during their lifetimes. However, a certain class of ICs containing fusible links may be regarded as having multiple variables of internal connectivity. For example, each individual link might be mapped to a standard variable with a binary range:

Standard variables of internal connectivity for IC containing fusible links:
<'
unit fusible_links {
    cache_enabled: [OFF, ON];
    fpu_enabled: [OFF, ON];
    // Etc.
}
'>
2.8 Variables of Activation

Choosing the values of variables associated with power, clocking, and reset determines the activation of the instance in its context. After a logic circuit is activated with the application of power, and then of clocking, and then with the release of reset, it can then be excited with stimuli to obtain responses. The variables for power determine when power is applied or removed from subsets of the target’s logic. Similarly, the variables for reset define when reset is deasserted for each clock domain or subsets of each domain relative to assertion/deassertion of reset in other domains. There may be other relationships between power, clocking, and reset, and variables of activation must be defined to describe these as well. Some targets will behave differently depending on the clock frequency at which they are operated. In this case a variable whose range of values contains each of the supported clock frequencies could be defined. A target comprising multiple clock domains would require one or more variables defining the possible combinations of clock frequencies supported by the target. From the standpoint of coverage analysis, it is more useful to consider each clock domain and its associated synchronization logic as
a single target. See chapter 7 for further discussion of coverage analysis of such targets.

2.8.1 Example – Activation

Consider the processor core of the previous examples (see Fig. 2.4 below). Let’s expand its functional specifications to incorporate power-saving circuitry, variable clock-speeds, and a built-in self-test unit. The optional floating-point unit (FPU) has an associated power-control block so that this power-hungry unit can be turned off when not in use. A control register called pwr_reg has bits that determine whether various subsets of logic in the processor should be powered or not. The processor has been designed to operate at a higher frequency than the system bus to which it is connected, and the bus interface block handles the synchronization of signals between the bus and the rest of the processor. For this example, we will assume that the bus is defined to operate at any of three different frequencies, 33 MHz, 66 MHz, and 132 MHz, and that the processor is defined to operate at 500 MHz, 800 MHz, and 1 GHz. There are three different ways to bring the processor out of reset. One is by deassertion of the external cpu_reset signal. The second is by writing to an internal control register (not shown in figure) to cause a soft reset. The third is via the built-in self-test (BIST) block, which is able to cause reset of the processor autonomously.
Fig. 2.4. Activation
These capabilities add variability that must be verified, and this variability might be expressed in terms of standard variables of activation as follows.

<'
struct activation_vars {
    power_on: [EXTERNAL_PWR_APPLIED, PWR_REG];
    bus_clk: int;
    keep bus_clk in [33, 66, 132];
    cpu_clk: int;
    keep cpu_clk in [500, 800, 1000];
    reset_deassert: [cpu_reset, BIST, SOFT_RESET];
}
'>
There are numerous familiar examples of variables associated with activation. A “magic packet” defined within the Ethernet protocol will cause a controller to wake up and power itself. A USB host controller can shut down power to logic for ports that are not connected to any device or hub.
FireWire devices can also wake up and power themselves when a cable connect event is detected. After power and clocks to a target have become stable, the target is energized but not yet activated. Assertion of reset de-activates an energized target, and de-assertion of reset activates an energized target. A target must be fully energized before it can be activated. An active target is able to perform its computations. An energized target merely drains the battery.
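The required relationships among power, clocking, and reset can themselves be expressed as checkable rules in the same style as the earlier rule-and-guideline example. The sketch below is not from the text; the signal names and the 8-to-64-cycle window are assumptions made for illustration.

<'
struct activation_check {
    event clk         is rise('top.clk') @sim;
    event pwr_good    is rise('top.pwr_good') @clk;
    event rst_release is fall('top.reset_n') @clk;
    // Assumed rule: reset may be released only after the target has been
    // energized for at least 8 stable cycles, and within 64 cycles.
    expect @pwr_good => {[8..64]; @rst_release} @clk
        else dut_error("FAIL reset released outside the allowed window");
};
'>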
2.9 Variables of Condition

Values which govern the operation of the target and which are generally constant during its operation are classified as conditions3. Most, if not all, conditions receive a default value as a consequence of reset. Many conditions are established by software or firmware during system initialization by setting or clearing various bits. Some conditions may even be established in hardware by strapping or jumpering a value (tying it to logical 1 or 0), or perhaps by means of a fusible link (but such a variable might instead be classified as one of connectivity if this is convenient). Conditions are often single-bit values that enable or disable some function. For example, enabling prefetching of instructions or data in a processor or enabling big- or little-endianness of a bus will govern the operation of the target. These conditions usually persist unchanged (but not necessarily) while the target is in operation. More complex conditions might establish the value for some threshold in a queue (such as a high- or low-water mark). For example, the link width that results from training a PCI-Express link constitutes a persistent condition that affects the subsequent behavior of the target. These conditions are called direct conditions because their values are written directly by software or by hardware (such as by strapping a bit to a logical 1 or 0). Some conditions are established indirectly in response to application of a certain sequence of stimuli. These conditions are called indirect conditions. Such conditions are usually transient, in contrast to direct conditions, which are persistent. For example, a state machine that controls the operation of a queuing mechanism might be caused to operate differently when the data reach or
3 Note that another use of the term “condition” in the context of verification is in a particular measure of code coverage, called condition coverage. See chapter 5 for a discussion of code coverage.
exceed some high-water mark so that the queue can be emptied more quickly and thereby prevent overflow. The prior example of a PCI-Express link width might also be considered to be an indirect condition. Such a condition could be classified either way as convenient for the verification software. Finally, it is practical to consider not only those conditions that govern the operation of the target but also those that govern the operation of its external context. Considering both the internal conditions and the external conditions relevant to the target is necessary to ensure that the target is thoroughly exercised. For example, a target residing on a shared bus might behave differently depending on the conditions established in the external bus arbiter that determine priorities for granting access. Similarly, as a consequence of activity on this shared bus, the arbiter itself might enter an indirect condition such that priority for access to the bus is temporarily changed to favor some other bus-resident agent that is temporarily starved for bus bandwidth. To ensure that a target responds as expected to all defined stimuli, it should be exercised with a carefully chosen subset (or perhaps all) of the combinations of defined conditions, both internal and external. Choosing the values of the internal conditional variables determines the operation of an instance within its context. The process of establishing the initial operational conditions is called initialization.

2.9.1 Example – Conditions

Consider the operation of the processor IP from our previous examples (see Fig. 2.5). It is designed to be flexible in its attachment to the system bus, having either 32 bits or 64 bits for address and data. It can also be set up to operate in big-endian or little-endian mode. An internal prefetch controller can be enabled or disabled under control of a software-writable bit. Finally, the store queue has a software-adjustable high-water mark that determines whether priority should be given to draining the queue or not.
Fig. 2.5. External and Internal Conditions
These variables of condition can be expressed in e as follows:

<'
struct ext_condition_vars {
    bus_width: int;
    keep bus_width in [32, 64];
}
struct int_condition_vars {
    prefetch_enable: [OFF, ON];
    bus_endianess: [BIG, LITTLE];
    store_queue_priority: [NORMAL, DRAIN];
}
'>
The value for the prefetch enable bit would be established typically during initialization via firmware, but might also be changed during post-silicon verification or other diagnostic testing. This is an example of a direct internal condition.
The endianness of the bus might be programmed by firmware or perhaps by sensing the value of an external pin. Similarly, the bus width might be detected automatically by the bus interface (as is done during PCI-Express link training). As mentioned earlier in this chapter, these variables might instead be considered as variables of external connectivity if external jumpers are present (or absent) to determine the values. The value for store queue priority cannot be set by software. The queue controller determines priority for this queue automatically. When the number of entries in the queue reaches the high-water mark, an indirect condition occurs that causes a shift from normal operation to one that drains the queue as quickly as possible. When the queue has been emptied (or perhaps nearly emptied), operation shifts back to normal.
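The indirect condition on the store queue can also be observed by the verification software rather than set by it, by monitoring the queue occupancy. The sketch below is not from the text; the HDL path and the high-water value of 12 are assumptions.

<'
extend int_condition_vars {
    event clk is rise('top.clk') @sim;
    // Indirect condition: the high-water mark has been reached.
    event hit_high_water is true('top.store_queue_count' >= 12) @clk;
    on hit_high_water {
        store_queue_priority = DRAIN;
    };
};
'>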
2.10 Morphs Another consideration is whether a given instance’s operation can be effectively redefined by establishing certain conditions. One typical example is when an instance is placed into scan mode for testing purposes. See Fig. 2.6 for an example of how an instance is transformed into one long scan chain by changing the value of test_mode from 0 to 1. Once set up for scanning, the instance is effectively governed by a different set of variables (except for those of internal connectivity). It is practical to regard such functionality as a metamorphosis (or “morph”) of the instance. Similarly, many other test or diagnostic modes of operation can be treated as different morphs of the instance, each with its own set of defined stimuli, conditions, and special cases. They will also be subject to an entirely separate set of rules and guidelines. The term mode is commonly used when referring to a morph, but it is also used in many other contexts, such as low-speed mode or high-speed mode in USB terminology and PCI mode or PCI-X mode in PCI terminology. To avoid this frequent ambiguity in the use of the term mode, the new term morph is introduced to identify the distinctly different metamorphosis of a target.
Fig. 2.6. Metamorphosis of a target
Nearly all designs specify at least two morphs for a given instance, one for ordinary operation and one for testing. However, some designs will also allow for a third morph that facilitates diagnosis and troubleshooting. Such a morph might autonomously, often in response to some internally observable condition, save some internal state and make it available for later external access. Similarly, a diagnostic morph might respond to a distinct set of commands that provide data essential for diagnosis of problems. An instance will usually be realized as two or more morphs. Usually, the first morph that is defined (call it morph0) implements the principal function of the device and the second morph (morph1) implements scan chains of the memory elements of the device. The connectivity of this testability morph is dependent upon the layout of the device (flip-flops are wired together into scan chains based on geometric proximity), whereas the connectivity of other morphs is usually independent of layout. Values for variables that determine the morph are almost always chosen deterministically rather than randomly. A much larger fraction of verification resources (engineering time, compute cycles, etc.) is typically
applied to the default morph with a much smaller fraction reserved for the other morph(s).
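In a test environment the morph itself can be treated as just another standard variable, with its value chosen deterministically or with a heavy bias toward the default morph. The sketch below is not from the text; the morph names and weights are assumptions.

<'
struct morph_select {
    morph: [MORPH0_MISSION, MORPH1_SCAN, MORPH2_DIAG];
    keep soft morph == select {
        90: MORPH0_MISSION;   // bulk of CRV effort on the default morph
        8:  MORPH1_SCAN;
        2:  MORPH2_DIAG;
    };
};
'>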
2.11 Variables of Stimulus and Response

A stimulus is applied directly to an input port or bi-directional port of the target, or, for many types of target, indirectly to some specially designated internal node, such as for targets which process instructions or commands (for example, microprocessors, DMA controllers, and USB port controllers). The values of individual stimuli vary over time, and the responses to the stimuli at the output ports or bi-directional ports (or at specially designated internal nodes) of the target are observed to determine whether the target is exhibiting correct or faulty behavior. Variables of stimulus and response are grouped together because many variables pertain to both a stimulus and a response, particularly for shared busses and other shared signals. For example, the number of wait states permitted on a shared READY signal on a bus applies equally whether the target drives the signal as a response or whether some other bus agent drives the signal as a stimulus. A distinguishing characteristic of variables of stimulus (and response) is that there are a great many of them. Fortunately, we can further divide this vast sub-space into smaller sub-spaces. In particular, stimulus and response variables can be classified along two dimensions, spatial and temporal, i.e. variables of composition and variables of time. Variables of composition are those which indicate what values may be assigned to the various bit positions of a multi-bit signal (whether stimulus or response, and whether applied in parallel or applied serially). For example, a simple bus transaction will contain different fields, the values of which are drawn from the individual ranges for the variables describing those fields. Similarly, verification of a superscalar processor must account for the possible instruction bundles that may be presented to the front of the processing pipeline at each instruction fetch cycle. Variables of time are those that indicate when values are produced as either a stimulus or response. Variables of time represent different levels of abstraction with regard to time and may be further distinguished as follows:

• actual time: These are durations, intervals, latencies, and so forth usually expressed in terms of the smallest unit of time relevant to the target, namely clock cycles. If timing simulation is performed, then sub-cycle
ranges for rise-time, fall-time, and set-up time may be defined as well. Functional verification may need to account for sub-cycle timing in targets which capture data on both rising and falling edges of clocks.
• relative time: These are related to the order of arrival of stimuli or responses relative to one another; i.e. sequences, or what can follow what. For example, in verifying the execution engine in pipelined processors the possible sequences of instructions must be considered in evaluating coverage of the instruction pipeline.
• inverse time: In these variables time appears in the denominator, such as throughput, bandwidth, MIPS, and so forth. Or, these may count events over some interval of time, such as retry attempts. For example, a requirement that states that average bandwidth over any interval of length T will be sustained at a level of x MB/s is defining a variable of inverse time.

There may be numerous other abstractions of variables of stimulus and response as appropriate for the target. For example, activity on a shared bus is usually described in terms of atomic transactions that are joined end to end to form a sequence. These abstractions are defined for programming convenience and are readily accommodated by commercially available verification tools.

2.11.1 Internal Stimuli and Responses

As a practical matter, certain classes of stimulus are applied not by the verification software to the input ports of the target, but rather are applied by internal logic to internal nodes. These are regarded as internal stimuli. For example, verification of a microprocessor with an embedded instruction cache benefits greatly by considering the stream of instructions received at the input of a processor’s instruction pipeline as stimuli for the execution pipeline. The movement of instructions into the internal cache usually does not correspond directly to the stream of instructions executed within the processor’s pipeline. So, to determine how thoroughly the execution pipeline is exercised, these internal stimuli (as applied by fetching instructions from the internal cache) are observed and analyzed to provide an indication of how well the pipeline is exercised. Likewise, internal responses, such as the results of executing and retiring an instruction, may need to be observed and analyzed to determine how well the execution pipeline is exercised. Of course, internal responses could be propagated to external output ports, but in simulation this is usually not conducive to efficient failure
analysis, because the observation at the output port usually lags the actual production of the response. On the other hand, in a hard prototype observing internal responses directly is usually not possible, and additional steps are needed to reveal internal responses at observable output ports. See chapter 4 for further discussion of designing observability into a target for the purposes of verification of the hard prototype. An important subset of internal response contains those that are produced as a consequence of activation. The values of bits in internal registers immediately after deassertion of reset are often required to have specific values, and these must be monitored accordingly. Similarly, these same registers might be defined to have different values as a consequence of hard reset as compared to soft reset, which often preserves the values of certain bits from before the target experienced the reset. 2.11.2 Autonomous Responses Some responses of a target will appear spontaneously and disassociated from any particular stimulus. These responses may be regarded as autonomous responses and include target-initiated activity such as the fetch of the first instruction by a microprocessor upon coming out of reset or the expiration of an interval timer due to the passage of time. Similarly, a change in internal condition (such as a queue becoming empty or full) can cause responses that are not associated with any particular stimulus. These examples show that there are varying degrees of autonomy for responses. Responses that occur independent of all stimuli and arise simply as a consequence of activation may be said to be completely autonomous. A free-running timer produces completely autonomous responses. Responses that occur after excitation of the target may be said to be partially autonomous. These include those that occur as a consequence of initialization, such as enabling an interval timer, or as a consequence of some indirect condition, such as the sudden appearance of wait states on a bus when a related queue becomes full or empty. 2.11.3 Conditions and Responses How does a variable of condition differ from a variable of response? Isn’t a condition simply a response of the target to some stimuli? Many an ordinary response is more or less discarded after it is produced. The data returned on a read command, the item removed from a queue, and
the ACK (acknowledge) of some transaction have no further influence on the behavior of the target producing the response. Conditions, on the other hand, persist and exert their influence over the behavior of the target. It may be more correct to consider variables of condition a proper subset of variables of response. From a practical standpoint, whether a particular variable is classified as one of condition or as one of response is a decision to be made by the verification team (or by the standards body defining the standard variables for some standard protocol under its care, such as USB or PCI-Express). 2.11.4 Example – Stimulus and Response Consider the format of a 32-bit instruction for a hypothetical processor (see Fig. 2.7). The 32-bit word has been defined to contain 5 distinct fields: a 3-bit field for the major opcode, three 5-bit fields for register addresses, and a 14-bit field containing a secondary opcode.
Fig. 2.7. Composition of an instruction
The values assigned to each field are variable and constitute standard variables of stimulus. Meaningful pseudo-random programs for this processor can be generated by constraining the values assigned to each field of each instruction. This can be expressed in e as follows:

<'
define LD  4;
define ST  2;
define ALU 3;
define BR  1;
define SYS 7;

struct instr {
    %major_op: uint (bits: 3);
    keep major_op in [LD, ST, ALU, BR, SYS];
    %src_regA: uint (bits: 5);
    %src_regB: uint (bits: 5);
    %dst_reg:  uint (bits: 5);
    %subop:    uint (bits: 14);
}
'>
If this processor has a 3-stage pipeline, there are implicit variables of relative time reflecting the possible instructions that can be present in the pipeline simultaneously and whose execution might interact. For example, a LD instruction whose destination register is used by an immediately following (in time) ALU instruction will exercise functionality associated with ensuring that the ALU instruction uses the newly loaded value rather than the value in the register prior to execution of the LD instruction. Notice that the rules that apply to stimuli become the constraints for generating and assigning pseudo-random values.
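Such an interaction can be provoked deliberately by constraining adjacent instructions. The sketch below builds on the instr struct above; the pairing struct and its field names are illustrative assumptions, not from the text.

<'
struct instr_pair {
    first:  instr;
    second: instr;
    // A LD whose destination register feeds the immediately following ALU
    // instruction, exercising the load-use functionality described above.
    keep first.major_op  == LD;
    keep second.major_op == ALU;
    keep second.src_regA == first.dst_reg;
};
'>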
2.12 Error Imposition

Many targets are designed to tolerate errors of various sorts, and exercising this functionality requires the imposition of errors onto the stimuli4. These classes of error vary widely and include:

• value errors, such as CRC and parity errors, encoding errors, and bit-stuffing errors
• temporal errors, such as delays in the arrival of data or packets of data (with a negative delay corresponding to early arrival), and overflow and underflow of queues
• ordering errors, such as reading a memory location before a prior write has completed.
2.12.1 Example – Errors The bulk of CRV will typically be conducted without imposing errors on any stimuli, but a significant fraction of the CRV must be devoted to error testing, constraining generation to produce occasional errors in sufficient density to reduce the risk of a bug in error-handling logic.
4 Other texts often refer to this as error injection, but because injection implies that something has been added to the stimulus, the term error imposition is used to better describe the many classes of potential error, including those that remove or reorder values or events. A short packet in an Ethernet subsystem is an example of error imposition where something has been effectively removed from the stimulus.
Consider a bus whose operation is defined such that an acknowledgement to a command issued to an addressed agent must be supplied within 16 or fewer cycles.
Fig. 2.8. Error imposition
To impose a temporal error in this circumstance the following code might be used:

<'
struct response is also {
    event bus_clk   is rise('~/top/BUS_CLK') @sim;
    event start_cmd is rise('~/top/ADR') @bus_clk;
    event ack;    // Emitted after waiting some random number of cycles.

    response_time: int;
    keep impose_error == FALSE => response_time in [1..16];
    keep impose_error == TRUE  => response_time == 17;

    // Time-consuming method (name illustrative) that produces the response.
    respond() @bus_clk is {
        wait [response_time] * @bus_clk;
        emit ack;
    };
};
'>
This simple example relies on the field impose_error that is assumed to be defined elsewhere in the testbench. If this field has been assigned the value true, then the rule governing response time will be intentionally violated (by one cycle), thereby exercising the functionality related to handling errors on this bus. Imposing a value error is also easily accomplished. For example, if a stream of data is protected by a final CRC value, this CRC value can be corrupted intentionally as follows:
<'
struct gen_crc {
    crc: int;
    // compute_crc(), data, and impose_error are assumed to be defined
    // elsewhere in the testbench.
    post_generate() is also {
        crc = compute_crc(data);
        if impose_error then {
            // Adding 1 to the computed value yields a CRC error.
            crc = crc + 1;
        };
    };
};
'>
2.13 Generating Excitement Choosing the values of the external conditional variables and the values of stimulus variables determines the excitation of an instance within its context. Pseudo-random excitation is currently the province of commercially available pseudo-random test generators.
2.14 Special Cases

The foregoing categories of variables (connectivity, activation, conditions, and stimuli and response) form the basis for pseudo-randomly exercising the target. However, certain aspects of a target’s functionality may not be readily exercised in a pseudo-random manner, but instead require special treatment. These are best defined as special cases, and are often exercised with deterministic tests, or with deterministic test sequences within a pseudo-random sequence. Most special cases will be associated with stimuli or responses, but there may also be other things to verify that just don’t seem to fit neatly into any subset. This should not impede progress in interpreting the design specifications – just write them down for later classification. Eventually, with time and increasing familiarity with the design, special cases in this latter category will be recast as variables of suitable type with their respective ranges. Treating them as special cases early in the project is a practical convenience that enables progress in spite of a few “loose ends”. Interpreting functionality related to reset often accumulates a number of special cases. For example, checking the state of the target immediately after de-assertion of reset (whether hard reset or soft reset) is more readily accomplished with a deterministic test and merits description as one or
more special cases. Similarly, checking for state preservation upon receiving a soft reset constitutes another batch of special cases. On the other hand, these are better regarded as variables of response whose values are dependent on the type of reset. If indirect conditions are defined for a target, it is important that the dynamic transition from one value to another during the application of stimuli under given conditions be recorded as special cases to be exercised and observed. Other external activity such as the connection or disconnection of a cable, entering or leaving power-save states, and entering or leaving test modes (morphing the target) can also be recorded as special cases during early stages of a project. Each of the special cases discussed so far falls into the category frequently referred to as scenarios. As will be seen later, these scenarios actually correspond to transitions among values of standard variables, but for the time being they will simply remain on the list as special cases. The second category of special cases lists properties5 of the target. These describe higher-level (or transcendental) behaviors of the target that are usually not verifiable via CRV, such as the performance (in MIPS) of a processor. Verifying performance is likely to need additional testing above and beyond CRV, which tends to focus on correctness and completeness rather than speediness. Properties often encompass behavior of multiple clock domains and are often expressed with inexact requirements. Passing a compliance test (such as for USB or PCI-Express) is also a property of the target.6 Other properties may be based on aesthetic evaluations, such as the quality of an image encoded and decoded by the target, or the sound fidelity produced by the target. Cybernetic properties such as responsiveness to human interaction (in controlling a commercial aircraft, for example) would be listed here as well.
5 Note that the term properties is also used in the context of model checking where a given property of the RTL is to be checked. Properties listed as special cases are not based on RTL, but are based only on the requirements stated in the design specifications.
6 Passing a compliance test does not mean that the risk of a functional bug is zero. Examples abound of products (including IP) that pass compliance tests but still have bugs.
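Returning to the reset-related special cases above, a deterministic post-reset check can be written directly against the required register values. The sketch below is not from the text; the register path and its expected value are assumptions.

<'
extend sys {
    event reset_released is fall('top.reset_n') @sim;
    on reset_released {
        // Deterministic check of the state immediately after reset.
        check that 'top.cpu.ctrl_reg' == 0x100 else
            dut_error("FAIL ctrl_reg has wrong value after reset");
    };
};
'>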
2.14.1 Example – Special Case

A classic example of a property as a special case comes to us from the specification for Ethernet (IEEE 802.3), which states requirements on the distribution of the back-off slot times. Ethernet uses a carrier-sense multiple-access algorithm with collision detection (CSMA/CD) on a shared medium, the wire carrying the electrical signals. When two (or more) devices happen to drive the wire simultaneously, both (or all) devices must stop driving the wire and wait some specified period of time before attempting to drive the wire again. The time each must wait is referred to as the back-off slot time, where a slot corresponds to a specific number of bit times. Testing for this requirement is achieved within a network of compliant, commercially available devices. The back-off slot times used by the device being tested are recorded and then tested for a uniform distribution. Verifying this particular requirement doesn’t lend itself to CRV testing and isn’t readily verified in simulation. Similarly, functional requirements related to liveness, fairness, or performance of a target may require treatment as a special case outside the context of CRV.
2.15 Summary The taxonomy defined in this chapter provides a consistent and comprehensive nomenclature for all aspects of functional verification regardless of the size or complexity of the digital system to be verified. The entire function space can be characterized completely and exhaustively by a well-defined set of standard variables using the linear algebra for digital systems. The premise stated at the beginning of this chapter bears repeating here: From the standpoint of functional verification, the functionality of a digital system, including both the target of verification as well as its surrounding environment, is characterized completely by the variability inherent in the definition of the system. Different teams will interpret specification documents to yield different standard variables, surely differing in name but also differing in definition,
Table 2.1. Relations among variables for a target of verification

Choose values for these variables:          To establish:
Internal connectivity                       Instance
External connectivity                       Context
Power, clocking, and reset                  Activation
Morph-related conditions and stimuli        Morph
Internal conditions                         Operation
External conditions, stimuli, and errors    Excitation
but the sets of variables will be equivalent, each forming orthogonal basis sets. Any non-equivalence indicates one or more “bugs” in a document because it is not sufficiently clear or is ambiguous in its interpretation. Table 2.1 summarizes this “loosely orthogonal system of basis variables.” The variables discussed in this chapter already exist in testbenches around the world. However, they simply have not been partitioned into the standard framework. And, without this standard partitioning, it is extremely difficult to compare different verification projects or to use these projects to assess risk for future projects. With this set of standard variables it is now possible to determine the size of the functional space in terms of its points and arcs and, therefore, determine what is required to achieve functional closure, that is, exhaustive functional coverage. This will be the topic of the next chapter.
References

Bailey B, Martin G, Anderson T (eds) (2005) Taxonomies for the Development and Verification of Digital Systems. Springer
Lachish O, Marcus E, Ur S, Ziv A (2002) Hole Analysis for Functional Coverage Data. Proceedings of the 39th Design Automation Conference, ACM Press
Verisity Design Inc. (2002) e Language Reference Manual. www.ieee1647.org
Verisity Design Inc. (2005) Spec-Based Verification, whitepaper
Chapter 3 – Exploring Functional Space
The previous chapter described a method for interpreting natural-language specification documents into a standard set of variables and their ranges that is applicable to all possible digital systems. Each unique combination of values of these variables determines a single function point within the functional space. These function points determine the functional space in a mathematically precise manner, allowing us to determine the size of this space and, consequently, determine what it means to achieve functional closure. Value transition graphs provide a technique for visualizing this space in a useful way.
3.1 Functional Closure

Much recent discussion in the industry has focused on “functional closure” but without providing a solid, substantive definition for this highly desirable outcome. Intuition tells us that functional closure must exist because, after all, the functionality can be expressed with a finite number of transistors. What does it really mean to achieve functional closure? This term is borrowed from timing domain analysis where closure is a genuine outcome. To reach timing closure is to adjust the placement of transistors and the routing of interconnecting wires and polysilicon such that each and every path propagates signals within a predetermined interval of time so that setup times at destinations are satisfied. Software tools make exhaustive lists of signal paths and then determine whether all paths have no negative slack. When a signal has positive slack, it means that the signal reaches its destination with time to spare, whether a few picoseconds or many nanoseconds. A signal with negative slack reaches its destination too late. When all paths have no negative slack, nothing arrives late. Some texts refer to “functional closure” or “coverage closure,” while also claiming that the size of the space over which this “closure” is reached cannot be determined. This is not correct, as we will see in the analysis in this chapter. A survey of current literature finds many technical papers and articles that tell us, “A good test plan should list all the interesting test cases to
verify the design.” However, there is no objective way to separate “interesting” from “uninteresting” test cases. If there were, a software program could accept as its input these “test cases” in some yet-to-be-defined form and then decide algorithmically whether each is interesting or uninteresting. Clearly, this approach does not seem promising. Similar remarks appear in the current literature regarding coverage, informing the reader to “decide on the interesting data fields/registers … important state machines, … interesting interactions between some of the above states or data.” Again, we have no objective way to make these decisions regarding interestingness or importance. Formal verification (such as theorem provers of some sort) might some day be able to make such decisions. With a complete set of variables and their ranges as described in chapter 2, it is possible to arrive at an understanding of the overall size of the functional space of a given target. Because these variables imply all of the mechanisms (elements) that use them, they may be said to represent the functional space completely. This means that it is now possible to gain a complete characterization of the functional space of the target.
3.2 Counting Function Points The previous chapter explained that each unique combination of values of standard variables determines a single function point within the functional space. In other words, a function point is represented by a tuple corresponding to the location of a point in some large multi-dimensional space, providing us with the coordinates of the point in this space. It is possible to arrive at an exact count P of these points by considering each category of standard variable in turn and how it relates to the other categories of variables. Analysis begins with connectivity. 3.2.1 Variables of Connectivity Variables of connectivity are separated into two categories, variables of internal connectivity and variables of external connectivity. Each variable has a finite range of discrete values that are countable. This is true because these ranges are present in computer programs (verification software tools). In many designs the ranges of variables of connectivity are independent such that every value of some variable of connectivity kA can be associated with each and every value of some other variable of connectivity kB. So, if
for example a design has these two variables of internal connectivity kA and kB, then the number of points determined by only these two variables is simply the product of the number of values in the range of each variable. However, for many designs the ranges of values of certain variables are dependent upon the values of one or more other variables. Some ranges are not simple finite sets of discrete values, but rather must be determined by methodically analyzing dependencies among the variables. Consider the specification for a microprocessor that includes an optional cache controller capable of handling caches of varying size. This defines two variables of internal connectivity: CACHE_PRESENT and CACHE_SIZE. The range of CACHE_PRESENT is [YES,NO] and the range of CACHE_SIZE is (for our example) [4K,8K,16K]. The number of function points is not simply 2 ⋅ 3 = 6 due to the dependency of CACHE_SIZE on CACHE_PRESENT. When CACHE_PRESENT is equal to NO, then there is no related value for CACHE_SIZE. For that matter, one might argue that the variable CACHE_SIZE does not even exist when CACHE_PRESENT is equal to NO. For notational convenience, we will use the value of ∅ (the null value) in the tuple when the value of the corresponding variable is undefined. So, in this example there are actually only 4 function points whose coordinates are:

• (NO, ∅)
• (YES, 4K)
• (YES, 8K)
• (YES, 16K)
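In e, this kind of dependent range can be expressed with a when subtype, so that CACHE_SIZE simply does not exist when the cache is absent. The sketch below is illustrative only; the enumerated size names are assumptions, since e identifiers cannot begin with a digit.

<'
struct cache_config {
    CACHE_PRESENT: [NO, YES];
    // CACHE_SIZE exists only when the cache is present (otherwise the
    // tuple carries the null value for it).
    when YES'CACHE_PRESENT cache_config {
        CACHE_SIZE: [SIZE_4K, SIZE_8K, SIZE_16K];
    };
};
'>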
One further example illustrates how the count of function points grows quickly even for a design with relatively few variables of connectivity. Consider a simple design with 3 variables of internal connectivity (ki1, ki2, and ki3) and 2 variables of external connectivity (ke1 and ke2). Assume that all 5 variables are mutually independent. Let’s examine first how many different instances of the target can be built. Let ki1 have 2 values in its range, let ki2 have 3 values in its range, and let ki3 have 4 values in its range. Because all 3 variables are mutually independent, there are at most 2 ⋅ 3 ⋅ 4 = 24 different instances that can be built. Now consider how many different external contexts can surround each of these 24 instances. Let ke1 have 5 values in its range and let ke2 have 6 values in its range. Again, because the variables are independent, there are 5 ⋅ 6 = 30 possible different contexts.
Combining our 24 possible instances with our 30 possible contexts, we find that the subspace of connectivity contains 24 ⋅ 30 = 720 distinct function points in our simple example. In other words, it is possible to build 720 distinctly different systems. Now let’s consider the general case. Consider the set X containing all of the values from the range of a variable of connectivity (or any other category of variable, for that matter) and also the null value ∅. The zero norm of X, denoted ‖X‖₀, is the number of non-zero elements of X. This zero norm is also known as the Hamming weight of X. For the purposes of analyzing functional spaces, ‖X‖₀ is defined as the number of non-null elements of X. Assume that there are M different variables in ki (variables of internal connectivity) and N different variables in ke (variables of external connectivity). To arrive at the maximum number of function points Pk in the space of connectivity, we count the number of values in the range of each variable and then take the product of all of these counts. Mathematically this is stated as:
$$P_{ki} = \text{number of instances} = \prod_{m=1}^{M} \lVert X_{i_m} \rVert_0
\quad\text{and}\quad
P_{ke} = \text{number of contexts} = \prod_{n=1}^{N} \lVert X_{e_n} \rVert_0 \,. \qquad (3.1)$$
Then, the number of function points Pk in the subspace of connectivity of our design is at most the product of these two products:
$$P_k \le P_{ki} \cdot P_{ke} \,. \qquad (3.2)$$
Any dependencies between ranges of variables will reduce the overall number of possible instance-context pairs, so Pk may usually be less than this upper bound. To illustrate how these dependencies reduce the number of function points, consider the first example in previous chapter that described the standard variables of external connectivity. In that example we had a simple central bus on which 3 different types of agent could reside: from 1 to 8 processors, from 1 to 4 memory controllers, and from 1 to 14 I/O controllers for a total of 16 agents. The simple product of the counts of values in the ranges of the respective variables
yields a total of 448 function points (448 = 8 ⋅ 4 ⋅ 14). However, this count includes systems that violate the no-more-than-16-agents requirement. Consider a system with one processor and one memory controller. This particular system can include any of 1 to 14 I/O controllers for a total of 14 distinct contexts. A system with one processor and two memory controllers can include any of 1 to 13 I/O controllers for a total of 13 distinct systems. Of course, it’s unlikely that each and every one of these systems would be interesting in a commercial sense, but the point is that such systems are theoretically allowed within the ranges defined by the requirements. Continuing in this manner (see Table 3.1) we see that 50 (50 = 14 + 13 + 12 + 11) distinct single-processor systems can be assembled. Accounting for the dependency on the range for I/O controllers for the remaining possible N-processor systems, we see that only 288 possible systems can be assembled. This 288 is the actual number of function points in the space of connectivity, significantly fewer than the 448 calculated as the simple product, but still a large number of possible contexts in which the target of verification must operate correctly. This is the challenge for the IP developer.

Table 3.1. Number of possible contexts based on example in Fig. 2.2
proc_agents   mem_agents   io_agents   Possible contexts
1             1            [1..14]     14
1             2            [1..13]     13
1             3            [1..12]     12
1             4            [1..11]     11
2             1            [1..13]     13
2             2            [1..12]     12
2             3            [1..11]     11
2             4            [1..10]     10
...           ...          ...         ...
7             1            [1..8]      8
7             2            [1..7]      7
7             3            [1..6]      6
7             4            [1..5]      5
8             1            [1..7]      7
8             2            [1..6]      6
8             3            [1..5]      5
8             4            [1..4]      4
                                       Total: 288
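The total of 288 anticipates the remark below that function points are counted algorithmically with nested for loops. A minimal sketch in e that reproduces the Table 3.1 total follows; the method name is an assumption made for illustration.

<'
extend sys {
    count_contexts(): uint is {
        var total: uint = 0;
        for np from 1 to 8 do {
            for nm from 1 to 4 do {
                var max_io: int = 16 - np - nm;
                if max_io >= 1 then {
                    total += max_io;   // io_agents may take any value in [1..max_io]
                };
            };
        };
        result = total;                // 288 for this example
    };
};
'>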
Accounting for the 4 possible instances of the processor as determined by CACHE_PRESENT and CACHE_SIZE, this yields 1,152 distinct systems, all of which must be verified to function correctly. At first glance this may appear to be a rather dire situation for the verification engineer, but, as will be seen later in this chapter, condensation in the functional space can provide significant practical advantages. In general it is not possible to rely on simple closed-form algebraic expressions (such as in Eq. 3.1 and Eq. 3.2) to determine the number of function points in the subspace of connectivity (or any other subspace, for that matter), but rather these function points must be counted algorithmically. This is accomplished in a straightforward manner with nested for loops that examine each tuple of values of the variables of internal connectivity (i.e., each instance) in turn and determine which tuples of values of variables of external connectivity (i.e., contexts) are valid, just as was done to create Table 3.1.

3.2.2 Variables of Activation (and other Time-variant Variables)

Variables of activation are categorized into three subsets, namely power, clocking, and reset. For any valid system there may be multiple ways that the system can be activated, each described not by a single tuple of values of these variables, but rather by sequences in time of changing values of these variables. These sequences will be analyzed at length later in this chapter, but some preliminary analysis is in order here. Consider the example of the multi-processor design that we have been using so far. In a system of multiple processors one of the processors is typically designated as the monarch processor with the remaining processors designated as serfs. Activation of the monarch precedes activation of the serfs, perhaps by holding each serf in its reset state until system resources such as memory are sufficiently initialized for the serfs to begin fetching and executing instructions on their own. Also, some serf processors might not even be energized until they are needed. And, a given serf might be deactivated to conserve power when it is not needed. Each of these scenarios constitutes a different sequence of values of variables of activation, each sequence composed of multiple function points. Function points corresponding to time-variant variables such as those of activation are not counted as simply as time-invariant variables of connectivity. Constructing the value transition graphs (defined later in this chapter) is necessary before the corresponding function points can be counted.
3.2.3 Variables of Condition Variables of condition are also time-variant, but because they are by definition persistent, it is useful to consider the valid tuples of values of condition to get a grasp on the multiplicity of function points corresponding to the many ways in which the system can be made to operate. Variables of condition are separated into two categories, variables of internal condition and variables of external condition. Just as with variables of connectivity, for a given tuple of values of internal condition there may be many valid tuples of external condition. Determining the number of combinations of persistent internal and external conditions can be accomplished algorithmically using nested for loops, just as is done for determining the number of valid systems. A complete analysis of the conditional subspace includes not only direct conditions established during initialization, but must also include indirect conditions that arise during the application of stimuli. 3.2.4 Variables of Stimulus Continuing to variables of stimulus we recognize that they can also be separated into two categories, variables of composition and variables of time. Values of these variables are quintessentially time-variant, and counting the function points corresponding to variables of stimulus is extremely challenging, but not impossible. This procedure will be highly dependent on the design being verified, but is generally a “two-dimensional” problem along the axes of composition and time. 3.2.5 Variables of Response The analysis of the size of the response subspace is identical to that for stimulus. Responses are values produced by the target and have values that vary in their composition as well as in their appearance in time, particularly in concert with values of variables of stimulus. Arriving at an actual count of the function points in this subspace requires creation of value transition graphs, just as for stimulus and other time-variant variables. 3.2.6 Variables of Error Variables of error are also separated into two categories, variables of error in composition of stimulus and variables of error in time of stimulus. In
fact, their similarity to variables of stimulus permits us to regard them as extensions of stimuli, and counting the function points corresponding to values of these variables proceeds in the same manner as for stimuli.

3.2.7 Special Cases

What about all those special cases? As discussed in the previous chapter, there are special cases corresponding to scenarios and there are special cases corresponding to properties. Until we develop a few additional concepts related to the functional space, we will defer our consideration of this important category. For the purposes of advancing our discussion here, we will simply assume that there is some number of function points associated with these special cases.

3.2.8 An Approximate Upper Bound

Earlier in this chapter it was stated that it is possible to arrive at an exact count of the number of function points in the overall functional space of a design. The discussion of variables of connectivity leaned in the direction of closed-form algebraic expressions (Eqs. 3.1 and 3.2), but this direction leads soon enough into a blind alley. Nevertheless, the analyses of each category of variable do provide an approximation for some upper bound on the count of function points. Actual counts of valid tuples of values of variables can be made algorithmically with nested for loops as discussed earlier. Applying this concept in a very general fashion to each category of variable can yield an approximate upper bound on the number of function points. The verification of a design can be modeled along these lines with the following pseudo-code:
for each system {                /* an instance in a context */
    for each activation {        /* and de-activation */
        for each condition {     /* internal and external */
            apply stimuli;       /* extended with errors */
            observe responses;
        }
    }
}
A greatly over-simplified computation of the number of function points yields an upper bound of:
P ≤ Pk ⋅ Pa ⋅ Pc ⋅ Ps ⋅ Pr + Py ,
(3.3)
where
• Pk = number of systems,
• Pa = number of activations and de-activations,
• Pc = number of conditions (internal with external),
• Ps = number of stimuli (extended with errors),
• Pr = number of responses, and
• Py = number of special cases.
There may be little practical usefulness in performing such a computation (and it may be utterly impractical to do so), but it does serve to illustrate just how large the functional space of a design can be. It is possible to know the exact count of functional points, however, and we shall see eventually how to determine the value of P exactly.
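As a concrete illustration of counting by nested for loops, the following minimal sketch (Python, used here purely for illustration) enumerates the valid connectivity tuples for the multi-processor example of Fig. 2.2; the loop bounds and the assumed constraint io_agents <= 16 - proc_agents - mem_agents are taken from the ranges used with Tables 3.1 and 3.2.

# A minimal sketch, not from the book: brute-force count of valid systems
# (target in context) for the Fig. 2.2 example.
valid_systems = 0
for proc_agents in range(1, 9):             # 1..8 processor agents
    for mem_agents in range(1, 5):          # 1..4 memory agents
        max_io = 16 - proc_agents - mem_agents
        for io_agents in range(1, max_io + 1):
            valid_systems += 1              # every remaining tuple is valid
print(valid_systems)                        # prints 288, matching Table 3.1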
3.3 Condensation in the Functional Space
From the standpoint of functional verification, some values of variables are just not as equal as others. Values at upper and lower boundaries enter into logical computations frequently while inter-boundary values usually do not. If we permit the inter-boundary values to condense into a single “meta-value”, we discover that the number of function points shrinks greatly. The size of the condensed function space comprising only boundary values and these meta-values will be a rather small fraction of the size of the entire function space. Focusing on this smaller condensed space is a highly effective strategy for achieving a low-risk outcome with minimal time and resources.
A simple example illustrates this principle. Consider the operation of a queue of 1024 entries. The number of function points corresponding to the number of items present in the queue is 1025 (including the value when the queue is empty). However, the associated control logic which handles filling and emptying of the queue is determined by whether the queue might be:
• empty
• nearly empty
• nearly full
• full
• every other degree of fullness
Having recognized (and applied good judgment) that the first 4 of these 5 values constitute boundaries we can condense the 1025 function points into 4 actual values (0, 1, 1023, and 1024) and one condensed value corresponding to the range [2..1022]. This results in a range of [0, 1, CONVAL, 1023, 1024]. This is, in fact, quite typical in verification code where weights for selection of a particular value are assigned both to individual values and to ranges.1 Additionally, existing testbenches typically contain code for collection of coverage data on ranges of values (as a single condensed value) as well as on individual values.
That which is traditionally referred to as a corner case is simply the intersection of one or more value boundaries.2 By adjusting the weights associated with the values in the ranges of variables to select boundary values preferentially, corner cases will be generated much more frequently, and the bugs associated with them will be exposed more readily.
Consider the example of our multi-processor system in Fig. 2.2 that was analyzed in Table 3.1. The analysis showed that a total of 288 distinct systems (target in context) could be assembled. However, engineering judgment tells us that condensing the in-between values of this subspace of connectivity will give us more bang for the verification buck. With this in mind, the range of values for proc_agents condenses from 8 possible values to the following 3 values: [1, PROC_CONVAL, 8], where the condensed value PROC_CONVAL will assume an actual value from the range [2..7] during random value assignment. Similarly, the range of values for mem_agents condenses to [1, MEM_CONVAL, 4] and the range of values for io_agents condenses to [1, IO_CONVAL, 16 - proc_agents - mem_agents]. Table 3.2 shows how many possible systems can be assembled within the condensed space of connectivity.
There are two things worth noting in Table 3.2. First, we note that the number of possible contexts has shrunk dramatically from 288 to 27, recognizing, of course, that we have judged the contexts built with all possible values for x_CONVAL (where x is PROC or MEM or IO) to be equivalent from the standpoint of verification. Second, we note that the actual value assigned for any of the condensed values x_CONVAL is dependent on previously assigned values for other condensed values. Modern verification languages with their powerful constraint solvers handle this random value assignment with ease.
1 Commercial CRV tools already make use of this concept. For example, the e language includes an eventually operator. This effectively represents a condensation in time.
2 More precisely, it is a function arc in the condensed space as will be seen in the discussion of value transition graphs later in this chapter.
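Returning to the 1024-entry queue, the weighting idea described above can be sketched as follows. This is only an illustration in Python with hypothetical weights, not the syntax of any particular verification tool: boundary values are drawn preferentially, and the condensed value CONVAL is expanded to a concrete inter-boundary value only when it happens to be selected.

import random

# A minimal sketch, assuming hypothetical weights that favor boundary values.
choices = [0, 1, "CONVAL", 1023, 1024]      # condensed range of queue depths
weights = [10, 10, 4, 10, 10]               # boundaries weighted above the range

def pick_queue_depth():
    value = random.choices(choices, weights=weights, k=1)[0]
    if value == "CONVAL":                   # condensed inter-boundary value
        value = random.randint(2, 1022)     # expand to one concrete value
    return value

print([pick_queue_depth() for _ in range(8)])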
If this notion of condensation within the functional space seems familiar, that is because the reader has already encountered this concept in Boolean algebra and especially while working with Karnaugh maps. When the value of a particular Boolean variable is “don’t care”, this is equivalent to condensing the individual values in the variable’s range (the binary values 0 and 1) into the single condensed point [0..1]. In other words, the entire range of this Boolean variable has been condensed into a single value. For notational convenience we represent the value of this condensed point as X. We will extend this convenience to our notation for condensation to represent the value at any condensed function point that comprises all values within its range. It is not just the numerically valued variables that can undergo condensation. Many enumerated variables are likewise condensable.

Table 3.2. Number of possible systems in condensed space of external connectivity based on example in Fig. 2.2

proc_agents    mem_agents    io_agents                                       Possible systems
1              1             [1, IO_CONVAL, 14]                              3
1              MEM_CONVAL    [1, IO_CONVAL, 15 - mem_agents]                 3
1              4             [1, IO_CONVAL, 11]                              3
PROC_CONVAL    1             [1, IO_CONVAL, 15 - proc_agents]                3
PROC_CONVAL    MEM_CONVAL    [1, IO_CONVAL, 16 - proc_agents - mem_agents]   3
PROC_CONVAL    4             [1, IO_CONVAL, 12 - proc_agents]                3
8              1             [1, IO_CONVAL, 7]                               3
8              MEM_CONVAL    [1, IO_CONVAL, 8 - mem_agents]                  3
8              4             [1, IO_CONVAL, 4]                               3
                                                                             Total: 27
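The dependency noted in the second observation, namely that IO_CONVAL may only expand to a value that respects the total of 16 agents, can be sketched as follows. This is a simplified Python illustration of what a constraint solver in a modern verification language does automatically, not a real solver.

import random

# A minimal sketch: assign the condensed connectivity values in order so that
# io_agents always satisfies 1 <= io_agents <= 16 - proc_agents - mem_agents.
proc_agents = random.choice([1, random.randint(2, 7), 8])     # [1, PROC_CONVAL, 8]
mem_agents  = random.choice([1, random.randint(2, 3), 4])     # [1, MEM_CONVAL, 4]
max_io      = 16 - proc_agents - mem_agents                   # at least 4 for these ranges
io_agents   = random.choice([1, random.randint(2, max_io - 1), max_io])
print(proc_agents, mem_agents, io_agents)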
Consider variables of composition of stimulus as applied to the instruction set for a processor. There may be many flavors of branch instructions (unconditional branch, conditional branch, branch and save return value somewhere, etc.), but for certain coverage measurements these might all condense to “branch” only. Similarly, memory access instructions might be condensed based on width of access (byte, half-word, word, etc.) or addressing mode (register-based vs. immediate address) and so on. General register files often have a small number of registers that perform unique functions. For example, general register zero could be defined to return a value of 0 for all read accesses and to discard the value written for all write accesses. A second register might be designed to receive the return address of a branch instruction automatically. The remaining registers might then be condensed into a single value representing truly general general registers. It is not inconceivable that future verification tools will be able to scan RTL for boundary values and reflect them back to the verification engineer.
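For enumerated variables, condensation often amounts to a many-to-one mapping that is applied when coverage is recorded. A minimal sketch with invented opcode names (placeholders for illustration only, not any real instruction set):

# A minimal sketch: many branch and memory-access flavors condense to a single
# value each for certain coverage measurements; unlisted opcodes keep their identity.
CONDENSED_CLASS = {
    "beq": "branch", "bne": "branch", "br": "branch", "brl": "branch",
    "ldb": "load", "ldh": "load", "ldw": "load",
    "stb": "store", "sth": "store", "stw": "store",
}

def coverage_bin(opcode):
    return CONDENSED_CLASS.get(opcode, opcode)

print(coverage_bin("brl"), coverage_bin("ldw"), coverage_bin("nop"))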
3.4 Connecting the Dots
A moment’s reflection reveals that the functional space is characterized not only by this large number of function points, but also by directed arcs interconnecting those whose values can vary with time, in particular the function points in the subspace of condition and of stimulus and response. Their values can vary with time, and the value state of the target advances from one functional point to the next with each successive clock cycle. A given sequence of arcs can be said to constitute a functional trajectory through a subspace of variables. The concept of an arc representing a transition in time from one value to another is probably already familiar to the reader, because they are very similar to the arcs in state transition diagrams for state machines.
The values of variables of connectivity typically do not change during the operation of a target in its context. There are notable exceptions such as, for example, the attachment or detachment of a USB cable. Variables of activation do undergo changes in value, but there are typically relatively few variables with fewer values than those of condition or of stimulus and response. In fact, CRV is usually focused on driving the variables of stimulus and, to a lesser extent, of condition from value to value to value, traversing vast numbers of arcs in its attempts to visit (cover) as many function points as possible before tape-out. Clearly, the size of the functional space is more than just the count of points. It must also account for the many arcs connecting them.
State-of-the-art verification tools are not yet capable of counting these arcs, so for now we can only say that the actual size S of the functional space is some function of the number of function points P and the number of arcs A connecting them. It will be seen shortly how the size of the functional space can be determined precisely.
We can summarize the relations among the many values of variables in the functional space as follows:
• Instantiation of a target in a context (i.e. a system) chooses a particular function point in the subspace of connectivity.
• Activation of a system is a particular functional trajectory from inactive to active through the subspace of activation, arriving at some (usually) persistent region of the subspace within which all subsequent activity takes place. Deactivation is a trajectory from active to inactive.
• Initialization of a system is a particular functional trajectory through the subspace of condition, also arriving at some (usually) persistent region of the subspace within which all subsequent activity takes place.
• Excitation drives the functional trajectory of the target’s responses.
We also observe that activation and initialization are sometimes interdependent, the dependencies being mediated by one or more variables of response. Mediation means that progress of some process, such as initialization, is suspended somehow pending the value of some variable of response indicating that the process may be resumed. This process may be activation or initialization or even excitation. For example, plugging some device into a USB port may suspend some activity in the host until the device signals that it is ready for excitation.
Mathematically, these relations can be loosely formalized by recognizing that any given response is a function of values of all standard variables:
response = f(errors, stimuli, initialization, activation, system).
(3.4)
It will be useful in later discussions to refer to the set of responses defined for a target. These responses, as well as each argument to the function f, are themselves each a function of time as measured in clock cycles n. These functions are denoted as follows:
• r(n): response
• e(n): error
• s(n): stimulus
• c(n): condition
• a(n): activation
• k(n): connectivity (systems of instantiated target and context)
We recognize that k(n) could very well be constant for all n, but hot-swap peripherals can bring time into the equation now and then. In addition, the response at clock cycle n+1 might also be dependent on the response at the previous clock cycle n. We can now state that
r(n+1) = f(r(n), e(n), s(n), c(n), a(n), k(n)),
(3.5)
which is just a restatement of Eq. 3.4 except that it accounts for feedback within the target with the argument r(n). Feedback in the target simply recognizes the fact that the response at cycle n+1 may often depend on the value of some response variable during cycle n during which the computations for values at cycle n+1 are made. Eq. 3.5 can be regarded as the transfer function that maps the space of excitation onto the space of responses, given an instantiation of a system and its subsequent activation and initialization. We now define the set of responses (where p represents a single point) as
R = {p | p = f (r(n),e(n),s(n),c(n),a(n),k(n))}, for some n ≥ 0.
(3.6)
We now know the coordinates of every function point in the functional space of the design. The set F of these function points is given by
F = {p | p ∈ (r(n),e(n),s(n),c(n),a(n),k(n))}.
(3.7)
Similarly we can define the set of arcs between members of the set of function points. An arc is represented as the ordered pair (p(n),p(n+1)). The response (tuple of values of variables of response) at clock cycle n
leads to the response at clock cycle n+1. We can now define the set A of arcs among points in the functional space as
A = {( p(n), p(n + 1)) | p(n) ∈ F, p(n + 1) ∈ F, and r(n + 1) = f (r(n),e(n),s(n),c(n),a(n),k(n))}.
(3.8)
Visualizing this complex space can be accomplished using the value transition graph (VTG). Points (tuples of values of variables) in the functional space are represented as nodes in the graph, and transitions from one point to another are represented by directed arcs (arrows). For each clock domain in each instance in each context there exists a separate VTG containing all possible functional trajectories of that clock domain. This important observation will be revisited later in this chapter in a discussion of mathematical graph theory. The example in the next section will illustrate the use of value transition graphs to analyze the functional space. This will give us insights into how a functional space is structured and how it might be condensed to facilitate more efficient functional verification.
3.5 Analyzing an 8-entry Queue
Consider a basic FIFO queue of 8 entries. Items can be added to the queue until it becomes full, and items can be removed from the queue until it becomes empty. The queue need not be filled before items are removed and it need not be empty before items can be added again. The response of this queue to excitation that adds items or removes items or does neither can be represented by a variable corresponding to the number of items present in the queue. Call this variable ITEMS_IN_QUEUE. The VTG in Fig. 3.1 illustrates the functional space of this queue in terms of the changes in value of the response variable ITEMS_IN_QUEUE. The value of this variable is shown within the ovals representing the function points of this space.
This response variable ITEMS_IN_QUEUE has a range of [0..8]. That is, it can be empty or contain anywhere between 1 and 8 items. From any value except 8 an item can be added to the queue. Once the queue has 8 items, nothing more can be added. Likewise, from any value except 0 an
item can be removed from the queue. At any point in time (that is to say, at any clock cycle) an item can be added to the queue (if it is not already full), removed from the queue (if it is not already empty), or nothing can happen (nothing is either added or removed). This queue has 9 function points. It has 8 arcs representing the addition of an item to the queue, 8 arcs for the removal of an item from the queue, and 9 hold arcs for when nothing is added or removed. Therefore, this queue has a total of 25 arcs connecting the 9 function points in its functional space. Of these 9 function points we can observe that 7 values are condensable into the condensed function point [1..7], because every adjacent point in this range has congruent arrival and departure arcs. Fig. 3.2 illustrates the condensed functional space of this 8-entry queue. It has only 3 function points and only 7 arcs connecting them. For this particular target, condensation greatly simplifies the functional space. Now, we can add a mechanism to this queue so that priority is given to draining the queue (removing items) when the number of items in the queue reaches some high-water mark. For our example, the high-water mark is set to the constant value of 6. An indirect condition is established on the following clock cycle when the number of the items in the queue reaches the high-water mark. This indirect condition persists until the queue has been emptied. Even though the priority has been given to draining the queue, we still permit items to be added to the queue. Of course, it could be designed with a much stricter drain policy that prevents the addition of items to the queue until it has been emptied. Our example is not so strict, so even though priority has shifted to draining the queue, items can still be added (until it becomes full, of course). Fig. 3.3 illustrates the uncondensed functional space of such a queue. The pair of values in each oval represents the value of the response variable ITEMS_IN_QUEUE and the indirect condition variable we shall call DRAIN_PRIORITY. Items can be added to the queue (and removed) in normal operation up until the number of items reaches 6. Then the indirect condition is established (DRAIN_PRIORITY == 1) on the next clock cycle, whether an item is added during that same clock cycle or removed or neither added nor removed. This shifted operation appears on the right-hand side of the figure in which the pair of values shows that drain priority has been established. Items can be removed (preferably) or added to the queue, but when it eventually becomes empty the indirect condition reverts back to normal on the next clock cycle.
Three function points with DRAIN_PRIORITY == 0 are condensable and three function points with DRAIN_PRIORITY == 1 are condensable. Condensation removes 4 hold arcs, 4 add arcs, and 4 remove arcs. Fig. 3.4 illustrates the condensed functional space of the queue after this addition of the high-water mark and the indirect condition that establishes drain priority.
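The VTG just described is small enough to enumerate directly. The following sketch (an illustration only, not taken from the book's appendices) builds the function points and arcs of the plain 8-entry queue and confirms the counts of 9 points and 25 arcs; extending the tuples with DRAIN_PRIORITY would model the high-water-mark variant in the same style.

# A minimal sketch: enumerate the VTG of the plain 8-entry queue.
DEPTH = 8
points = set(range(DEPTH + 1))       # ITEMS_IN_QUEUE = 0..8
arcs = set()
for n in points:
    arcs.add((n, n))                 # hold: nothing added or removed
    if n < DEPTH:
        arcs.add((n, n + 1))         # add an item
    if n > 0:
        arcs.add((n, n - 1))         # remove an item
print(len(points), len(arcs))        # 9 function points, 25 arcs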
Fig. 3.1. Functional space of an 8-entry queue
Fig. 3.2. Condensed functional space of an 8-entry queue
Fig. 3.3. Reaching HWM at 6 sets the indirect condition for drain priority
Fig. 3.4. Condensed functional space of queue with high-water mark
We can also add a low-water mark (at 2 for the purposes of our example) such that when the number of items in the queue reaches this value, the priority reverts back to normal at the next clock cycle. This is illustrated in Fig. 3.5. Note that none of the function points are condensable, because no two adjacent points have congruent arrival and departure arcs.
This simple example has function points represented as 4-tuples of the values of only 4 variables, each variable having only a few values within its respective range. Real systems will have hundreds or thousands of variables with hundreds or thousands (or more) values in the range of each variable. Fortunately for us, the invention of the computer has made it feasible to handle such vast spaces, in particular sparsely populated spaces.
Fig. 3.5. Queue with low-water mark added at 2
3.6 Reset in the VTG
Asynchronous external events have a propensity for occurring at any time, adding complexity to the target and adding arcs to the corresponding VTG. Variables of activation have effects that are usually pervasive
throughout the functional space, adding departure arcs to perhaps every function point. Reset is one such culprit. If the specification for the 8-entry queue requires that the queue be treated as empty when hard reset is asserted, then the VTG in Fig. 3.6 depicts this requirement.
Fig. 3.6. Explicit representation of reset for 8-entry queue
The function point depicted as the dark oval indicates the arrival point for hard reset. This function point is labeled R in the figure. The departure arcs from all other function points corresponding to the assertion of hard reset are shown as headed to the single arrival point R. The arrival point for hard reset is the origin of all functional trajectories through the function space.3 All functional trajectories stem from the origin. The functional space can be condensed greatly for the purposes of depicting the effects of asserting and de-asserting reset. Consider the example in Fig. 3.7. In this example the tuples in each function point are the values of four variables of response VA, VB, VC, and VD. For clarity and simplicity, the variable of activation corresponding to hard reset is not shown. Function point P1 is a condensation of the entire activated functional space, including function point P3. Function point P2 is the arrival point for hard reset, the origin for the target. Because the specification for this target requires that the values of these four response variables have defined initial values VA(0), VB(0), VC(0), and VD(0), the function arc P1→ P2 can be verified by inspection. If there are P function points in the complete uncondensed function space, then there are also P function arcs corresponding to the assertion of hard reset, including the arc that departs from and arrives back at P2 (when hard reset remains asserted for more than one clock cycle).
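Continuing the queue sketch from earlier (again only an illustration), the effect of asserting hard reset can be modeled by giving every function point a departure arc to the reset arrival point, which yields exactly P additional arcs, one per point:

# A minimal sketch: one reset arc departs from every function point, all
# arriving at the reset arrival point R (the queue treated as empty).
points = set(range(9))                    # ITEMS_IN_QUEUE = 0..8, as before
R = 0                                     # arrival point for hard reset
reset_arcs = {(p, R) for p in points}     # includes the arc from R back to R
print(len(reset_arcs))                    # P = 9 arcs for assertion of hard reset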
Fig. 3.7. Hard reset
3 Mathematical graph theory regards such a point as similar to a source, a point in the graph joined only by one or more departure arcs. In our VTG when reset is excluded, no arc arrives at the origin from any other point in the VTG.
Fig. 3.8. Soft reset
This verification by inspection is best performed by software, and synthesis software can identify any flip-flops that do not receive the required reset signal(s). Verifying this particular subset of arcs with CRV should be unnecessary. However, a check of the values in the tuple P2 is a prudent measure to guard against any change in these initial values. This verification by inspection of the resettability of all flip-flops should be listed as a special case in the verification plan (see chapter 4). To avoid clutter in the VTG for a target similar to the one in the example (all variables have defined initial values), the function arcs corresponding to assertion of reset can be regarded as implied and need not be drawn explicitly.
A soft reset can be represented as in Fig. 3.8. Soft reset is similar to hard reset in that every function point has a departure arc leading to function point P2, the arrival point for soft reset. It differs from a hard reset in that the values of certain variables must be retained at the value present when soft reset was asserted. In Fig. 3.8 variable VC has a requirement that its current value (when soft reset is asserted) be retained when the target is re-activated. Function point P2 is the arrival point for soft reset at cycle n, and the value of variable VC is retained at its most recent value VC(n-1). The values of the remaining three variables are returned to their initial values VA(0), VB(0), and VD(0). The arrival point for soft reset is a soft origin of the function space.
This point is also the origin for all functional trajectories within the function space that stem from assertion of soft reset at that particular time (clock cycle). All subsequent functional trajectories stem from this soft origin.
3.7 Modeling Faulty Behavior
A test fails when our verification software or an attentive engineer observes something that appears to be incorrect. Some value somewhere was different from what the monitoring software was designed to expect, breaking some rule. In other words, either a function point was observed that was not a member of the set of responses R or an arc was traversed that was not a member of the set of arcs A. Call this function point y. We recall that the set of responses is
R = {p | p = f(r(n), e(n), s(n), c(n), a(n), k(n))}, for some n ≥ 0.
(3.6)
Faulty behavior comprises those function points y such that
y(n) ∉ R, or (y(n −1), y(n)) ∉ A, or
(3.9)
both. If every variable is properly monitored, and this particular point y(n) was the first observation of faulty behavior, then
y(n −1) ∈ R and y(n) ∉ R.
(3.10)
With this information we can theoretically determine what is needed for 100% functional coverage when testing whatever changes are made to eliminate the faulty behavior. We can list all functional trajectories that end at the function point y(n). Assume that there are k such trajectories.
Then,
y_k(n−1) ∈ R and y_k(n) ∉ R.
(3.11)
Moreover, because these k trajectories each end at y(n), then
y_1(n) = y_2(n) = ... = y_k(n).
(3.12)
Testing the changes for the bug that led to the observed faulty behavior is accomplished with CRV focused such that these k trajectories are forced or observed with our monitors. If for all k
y_k(n−1) ∈ R and (y_k(n−1), y_k(n)) ∈ A,
(3.13)
then we can claim that the faulty behavior has been eliminated. Further CRV testing may be needed to ensure that no new bug was introduced with the changes, however. Additionally, we may choose to consider the coverage over trajectories within a condensed subspace to determine if the risk (probability) of not eliminating the faulty behavior is acceptably low. Having determined that some bug is present in the RTL and that it has manifested itself in the observed faulty behavior, at least two questions remain to be answered. How can other faulty behavior caused by this bug be determined? And, how does one establish that all faulty behavior caused by the bug has been determined? It remains to be seen whether it is possible to answer the second question, but CRV techniques such as sweep testing exist to address the first question. Chapter 4 provides an overview of these techniques in the section Failure Analysis.
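In a monitor, the two membership tests of Eq. 3.9 reduce to set lookups once R and A have been enumerated. A minimal sketch (illustrative only; the sets used in the usage line are those of the plain 8-entry queue from Sect. 3.5):

# A minimal sketch: report the first cycle at which an observed point is not
# in R or an observed transition is not in A (Eq. 3.9).
def first_faulty_cycle(observed, R, A):
    for n in range(1, len(observed)):
        prev, point = observed[n - 1], observed[n]
        if point not in R or (prev, point) not in A:
            return n                      # first observation of faulty behavior
    return None                           # no faulty behavior observed

# Usage against the 8-entry queue VTG: jumping from 3 items to 5 items is faulty.
R = set(range(9))
A = {(i, j) for i in R for j in R if abs(i - j) <= 1}
print(first_faulty_cycle([0, 1, 2, 3, 5], R, A))   # prints 4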
3.8 Back to those Special Cases
We deferred discussion of special cases because we had not yet developed the concept of VTGs as a representation for the points and arcs in the functional space. We can now undertake to count the function points contributed by this category.
As mentioned earlier in this chapter, there are special cases corresponding to scenarios and there are special cases corresponding to properties, such as performance. It should be apparent by now that every scenario described in prose text can be described more precisely as one or more VTGs, and these scenarios don’t define additional points in the functional space. They are simply natural-language descriptions of specific function points or trajectories. That doesn’t mean that we simply disregard them, however. It means they are of interest for one reason or another and merit observation. These scenarios have raised some specific doubts on the part of one or more engineers or managers, and confidence is sought that there is no bug associated with the behavior. Perhaps a special case represents some extreme of power consumption, or some new compiler idioms for a superscalar processor still under development, or perhaps an area of functionality that has been notoriously buggy in other projects. If a special case does lead us to defining a new variable not already represented in P, then it means that our initial interpretation overlooked some information in the specification, or that the specification document itself was incomplete or unclear on the subject. The higher-level special cases define properties of the target and do not contribute to the functional space as modeled by vectors of values of variables. These special cases remain in the written record (the verification plan, that is – see chapter 4), because they must also undergo testing to verify the properties described. In both sub-categories (scenarios and properties), our list of special cases does not add function points to the space, so Py = 0. So, we now have a slightly more refined estimate (compared to Eq. 3.3) of the upper bound on the number of function points as
P ≤ Pk ⋅ Pa ⋅ Pc ⋅ Ps ⋅ Pr .
(3.14)
3.9 A Little Graph Theory
The following definitions apply to graphs:
• An edge with direction is called an arc. A graph with arcs between points is called a directed graph or digraph. A graph that includes loops (an arc departing from and arriving at the same point) is called a pseudograph. A VTG is a pseudograph.
• The vertex set V(G) of the graph G contains all of the points in the graph.
• The edge set E(G) of the graph G contains all of the edges or arcs in the graph.
• A walk is an alternating sequence of points and arcs. A trail is a walk through a graph in which all arcs are distinct, but possibly with repeated points. A path is a trail but in which all points are distinct as well. A given functional trajectory through a VTG is a walk through the graph.
• The complement Ḡ of a graph G contains the same vertex set V(G) as G, but contains all of the edges or arcs not in the graph. In other words, for every edge (x,y) where x ∈ V(G) and y ∈ V(G), if (x,y) ∉ E(G), then (x,y) ∈ E(Ḡ).
• The order |V(G)| of the graph is the number of vertices in the graph.
• The size |E(G)| of the graph is the number of edges in the graph.
Algebraic analysis of VTGs can help us find bugs in the specification documents or in our definitions of variables, their ranges, and their rules. For a directed graph of n points (vertices), there is an n × n matrix M = [m_ij] with entries defined by
m_ij = 1, if p_i → p_j
     = 0, otherwise.
(3.15)
Such a matrix is called a vertex matrix. Consider the directed graph of 5 points shown in Fig. 3.9.
Fig. 3.9. A directed graph with 5 points (vertices)
The vertex matrix M corresponding to this graph is given by
    ⎡ 0 1 0 0 1 ⎤
    ⎢ 0 0 1 1 0 ⎥
M = ⎢ 1 0 0 0 0 ⎥ .
    ⎢ 1 0 0 0 0 ⎥
    ⎣ 0 0 0 1 0 ⎦
(3.16)
Linear algebraic techniques can reveal much about a graph. For example, to determine the number of 2-arc walks between points i and j, one simply computes the matrix product M²:
     ⎡ 0 0 1 2 0 ⎤
     ⎢ 2 0 0 0 0 ⎥
M² = ⎢ 0 1 0 0 1 ⎥ .
     ⎢ 0 1 0 0 1 ⎥
     ⎣ 1 0 0 0 0 ⎦
(3.17)
This result tells us, for example, that there is only one 2-arc walk from P1 to P3 (P1 → P2 → P3) but that there are two 2-arc walks from P1 to P4 (P1 → P2 → P4 and P1 → P5 → P4). The reader can confirm the other entries in M² by inspecting the graph in Fig. 3.9. Similarly, the number of 3-arc walks is given by M³. In general, the number of n-arc walks is given by Mⁿ.
A knot in a graph constitutes deadlock or livelock. If a VTG contains a knot, then there is a bug (error) somewhere: either in the VTG we have derived, or in our interpretation of some document, or in one of the specification documents itself. If some observed faulty behavior consists of an arc between existing function points, this arc is part of the complement Ḡ of the value transition graph G.
We now have the mathematical tools needed to compute the actual size of the functional space for a given clock domain of a given instance in a given context. For a given VTG G the number P of function points is given by
P = |V(G)|,
(3.18)
and the size S of the functional space is given by
S = |E(G)|.
(3.19)
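This bookkeeping is easy to mechanize. A small sketch (plain Python, no external libraries) that encodes the vertex matrix of Fig. 3.9, computes M², and reports the order and size of the graph:

# A minimal sketch: the vertex matrix of Fig. 3.9, the count of 2-arc walks,
# and the order |V(G)| and size |E(G)| of the graph.
M = [[0, 1, 0, 0, 1],
     [0, 0, 1, 1, 0],
     [1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 0, 1, 0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

M2 = matmul(M, M)
print(M2[0][3])                              # 2 two-arc walks from P1 to P4
print(len(M), sum(sum(row) for row in M))    # order 5, size 7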
3.10 Reaching Functional Closure
We have seen that the overall size of the functional space is typically very large. As RTL is simulated, it can be said to pass from function point to function point as time advances and values of variables change. There are usually many arcs leading to a given function point and until each and every arc has been traversed, true functional closure has not been achieved. Referring to mathematical graph theory we see that the size of the functional space is equal to the size of the VTG, and this size is defined as the number of arcs in the graph.
Returning to our example of an 8-entry queue will help us understand what it means to reach functional closure. The reader can confirm by reviewing the VTGs in Appendix 1 and counting up the arcs between values, that the
8-entry queue with programmable high- and low-water marks has a total of 756 arcs. There are no more and there are no fewer.4 After traversing each of these arcs between one set of defined values and another, there are no other arcs to traverse and continued simulations are redundant and unnecessary. Every possible transition of defined values has been demonstrated at least once. All possible transitions of defined stimuli have been applied to the target and it has been confirmed to produce the correct value transition every time, as have all defined responses, under constant observation by our watchful monitors. Genuine functional closure, in the same sense that we say we reach timing closure, means that each and every arc to each and every function point has been traversed. In other words, just as timing closure means that every timing path has been verified (via simulation) to propagate its signals in the required time, so then functional closure means that every function arc has been verified (via simulation) to lead to its required successor value. In the example of an 8-entry queue, functional closure requires traversing 100% of the 756 function arcs, resulting in visiting 100% of the 266 function points. For practical purposes, it is usually more useful to refer to closure over distinct morphs of a given target. Those who dare to undertake functional verification are faced with a colossal state machine with a stupendous number of arcs connecting the states, presenting the noble engineer with extraordinary complexity to be verified. Claiming functional closure after checking off every item on a list of things to do is, at best, misleading, if not outright false. Such lists, while useful, lack sufficient structure and analytical rigor to form a basis for comparing the universe of conceivable ICs from the standpoint of risk. Functional closure is not yet commercially practical, however, and as we shall see, it is also not commercially necessary provided that we are willing to accept some risk at tape-out and at other critical junctures in bringing product to market. Exhaustive functional coverage is theoretically possible, but commercially impractical. Fortunately, as we will see in chapter 7, exhaustive coverage is not necessary to achieve the risk goals established for a project.
4 To keep the example sufficiently small, no variable associated with activation of the queue has been included.
3.11 Summary
With standard variables it is now possible to determine the size of the functional space in terms of its points and arcs and, therefore, determine what is required to achieve functional closure, that is, exhaustive functional coverage. By analyzing a simple queue in detail we have learned that the function is in the action, that is, in the arcs that carry the target forward through the correct and complete functional space. Functional closure means traversing every arc in the functional space and, thereby, visiting every point.
Now that we have a technique to analyze a given functional space, we can apply this same technique to standards compliance tests, such as for USB or PCI-Express, to determine their functional coverage exactly. The outcome of such analysis may be surprising.
We have also seen how condensation of inter-boundary values in this vast functional space enables us to focus our efforts on corner cases. Exhaustive functional coverage of a condensed space is much more feasible than coverage of the entire functional space. Nevertheless, even the condensed functional space can be prohibitively large for commercial applications. We will revisit this issue in our discussion of risk assessment in chapter 7.
The alert reader might wonder at this point how we have come this far without any discussion of such familiar verification elements as bus functional models or checkers or scoreboards. These constitute models for digital components whose many behaviors are determined precisely by the standard variables and their ranges. The existence of these elements and their capabilities are implicit in the definitions of the variables. In fact, these elements will each use subsets of these variables in their implementation. For example, a BFM (bus functional model) will use variables of stimulus to determine what transactions to generate, what each will contain, and when to generate it, possibly as a result of some response from the target (acknowledge one transaction, generate another). Chapter 4 will explore the relationships between these standard variables and the verification software that relies upon them.
Chapter 4 – Planning and Execution
The previous chapter discussed exalted and lofty notions of functional spaces of function points connected by directed arcs over which functional closure can be computed. This chapter will bring us back down to earth with discussion of how useful results can be produced within the standard framework. The discussions in this chapter will be of use to all managers and project leaders. The manager tackling a verification project for the first time will benefit from the basics described in this section. The seasoned verification manager will recognize familiar concepts and principles and perhaps discover some new ones. This chapter will also define a sparse but sturdy skeleton for structuring a verification project by defining rational milestones that pertain to all verification projects.
4.1 Managing Verification Projects
Planning and executing a verification project is not merely a theoretical exercise, flexing intellectual muscles with weighty concepts. With rare academic exceptions the goal is to produce some useful device in some limited time and on some limited budget and with limited resources. The manager must make judicious and productive use of these limited resources. This chapter provides guidance for planning and executing successful projects. Much of the discussion relates to the standard framework, but other unrelated (but helpful) topics will be discussed as well.
It’s important to realize that verification – and CRV in particular – does not take place in a vacuum. The RTL is not some remote black box whose functionality must be verified at a distance. Design reviews and code inspections are essential ingredients of good verification strategy. They should not be regarded as optional.
An important assumption underlying the CRV strategy relates to the distribution of bugs in the target. Experience tells us that bugs exist despite
our most earnest efforts to create perfect RTL. Each generation of digital devices brings greater and greater complexity to the engineering team, and specifications change during the design process. The unrelenting schedule pressure will encourage engineers to take shortcuts that might otherwise not be worth taking. Haste makes waste. We naturally expect bugs to be distributed among corner cases and new clever functionality. CRV is a powerful technique for exposing these bugs. So, what is the assumption underlying this powerful technique? It is assumed that bugs are due to human errors of commission or omission and not due to deliberate intent (vandalism or sabotage). If it were possible for such a bug to escape the diligence of code inspections, it would be necessary to achieve 100% functional coverage to ensure that no bug had been deliberately added to the RTL.
4.2 The Goal
As discussed in chapter 1 the regression suite with the mechanisms to apply it is the tangible product of verification. This regression suite must:
• provide evidence of thoroughness of exercise of the target (a.k.a. functional coverage).
• provide protection against the introduction of bugs with future enhancements or other changes (e.g. to eliminate a bug).
In effect the verification software constitutes a reference machine that produces independently the results expected from the target for comparison. If the results agree, they are both presumed to be correct. If they disagree, one or the other or both has produced an incorrect result. The verification plan details the work needed to produce this regression suite.
Business imperatives will determine the acceptable level of risk for unexposed bugs at the various stages of development, from tape-out for first prototypes to ramping manufacturing for quantity shipments. No integrated circuit development, with the possible exception of those undertaken for academic purposes, is performed in the absence of competitive market pressures. But, even academic undertakings cost real money, not to mention the costs to academic reputation.
Fig. 4.1. Planning the Verification Project
4.3 Executing the Plan to Obtain Results
Verification projects may differ widely in the way they are organized and the way they make progress towards tape-out. Nevertheless, there are similarities among these many various projects that can provide useful checkpoints or milestones against which to gauge progress. Fig. 4.2 provides a conceptual skeleton – sparse but sturdy – for verification projects. Most dependencies, both those with the coding project and those within the verification project, can be tied to some key milestone between major project phases in the skeleton.
Fig. 4.2. Project Execution
We can regard any verification project as having three distinct phases: preparation, construction, and revision. Phases are defined from the point of view of the RTL so as to focus our attention on the overall objective: a profitable IC or system.
4.3.1 Preparation
Ideally interpretation of the specifications will precede project planning (and coding, for that matter), but in practice these types of work will overlap to some extent, with iteration on specification documents and other project plans before the final go-ahead is given by upper management and the project is officially started.
4.3.2 Code Construction
During the Initial Coding phase of the project functionality is added, both to the RTL and to the verification environment, and both are guaranteed to have bugs of omission (unfinished modeling of functionality). Nonetheless, as modeling within both the RTL coding team and the verification team adds brick upon brick in the testbench, each contribution is simulated with the other in order to debug the verification environment. At some point in time sufficient coding has been completed such that it becomes possible for the target RTL and the testbench to perform basic functions in co-operation with each other. This is usually an exciting time for the project because it is the very first time that the team’s created offspring shows signs of life. The target can be activated by the testbench and
participate in, perhaps, a single simple atomic write and read transaction on a bus. It is desirable to preserve this always-working model as the project progresses from this point forward. Subsequent check-ins of code, whether for the RTL or for the testbench, should ideally not cause this most basic of functionality to break (exhibit faulty behavior). Thus, the regression suite has been born, too. This is truly an important milestone:
• Initial Coding (IC) done: an instance of the target interoperates with the testbench and performs some basic function, i.e. produces correct responses to selected basic excitations.
The value of achieving this particular milestone (IC done) is that it represents a particular synchronization of the RTL team with the verification team. It’s usually the case that either the RTL or the testbench will precede the other in working functionality. When the code that lags the other catches up, a working target in a working testbench (containing a model of the context) has been brought to life.
Effective project management will facilitate cooperation between the two teams such that working functionality is continuously added to both the RTL and to the testbench, enabling the regression suite to grow along with these two elements. This all takes place during the Final Coding phase of the code construction effort. At some point in time the RTL team will declare that all specified functionality has been coded in the RTL. This is the official end of the construction effort for the RTL, and subsequent changes will be made to revise the RTL to fix bugs and to accommodate requirements for layout and timing. This is also a good time to begin tracking bugs formally. This is the second important milestone:
• Final Coding (FC) done: all defined and intended functionality has been coded.
At this point all bugs recorded in the interpretation of the specification should be resolved, and the associated interpretation should be correct and complete. Also, at this milestone all sockets for all instances of targets should be in place. That is, all I/O ports should be coded in the instances and in the components of verification software that drive ports with values. From this point forward replacing instances with updated versions should be straightforward, permitting verification to proceed on whatever instances are delivered by the RTL team for the targets.
It may be desirable to track construction of the RTL and of the testbench separately, having a separate IC done and FC done for each. This will be, of course, at management’s discretion as necessary to manage the development most effectively. There is no one right way to do it.
4.3.3 Code Revision
The construction phase takes an element (RTL or testbench) through IC up to FC, after which the element enters its revision phase. Reaching the FC done milestone is an important trigger for detailed place-and-route activities, detailed timing and power analysis, and other layout commitments, such as pad placement. There is, therefore, an inherent urgency towards achieving this level of development in the target. A contravening need to minimize post-FC changes in the target as layout progresses will result in tradeoffs, whether consciously or unconsciously, between speed and accuracy as requirements are translated into code.
It is usually worthwhile to begin collecting bug-related metrics only after the FC milestone has been reached. As long as the RTL requires additional coding to implement all functionality, faulty behavior is likely to be common during simulation. However, once code construction is complete, subsequent changes will still be needed to eliminate any observed faulty behavior, to accommodate layout and timing requirements, and (all too frequently) to incorporate changes to the specifications for the design. Regardless of the reason for the changes, it remains highly desirable to retain the always-working model of project management so that progress is ensured (regress is avoided by ensuring that the regression suite always passes with each set of changes).
This is also the phase in the project where tracking coverage metrics should begin. It’s clear that measuring code coverage during the code construction phase will not truly indicate how well the test environment is able to exercise the target. However, it’s worth giving code coverage tools a trial run or two well enough in advance so that the coverage-measuring process (allocation of simulation bandwidth, availability of tool licenses, etc.) is working sufficiently smoothly so as not to become burdensome.
4.3.4 Graduated Testing
Achieving the IC done milestone means that the design has finally shown signs of life. In fact, getting the target to respond to the simplest activation and excitation is a typical first goal for the RTL and verification teams.
Thereafter, simulations become successively more comprehensive in terms of the functionality being exercised. This graduated approach to simulation is a very common practice in the industry and has been shown to be a very productive strategy for getting the target to function correctly and completely. Here are two examples of graduated testing as described by verification teams in two different companies.
One whitepaper (Talesara and Mullinger 2006) describes three phases for testing: specification-driven testing which is effectively just getting the target to perform its atomic transactions (i.e. reaching IC done); bug-driven testing which unleashes CRV on a broader scale to flush out bugs from a relatively stable target; and coverage-driven testing where coverage is measured and evaluated to drive testing into uncovered functionality.
Another team (Francard and Posner 2006) describes three “layers” of tests: layer 1 containing directed tests for each of the atomic transactions acting singly; layer 2 containing CRV tests; and layer 3 containing actual code from software (compiler idioms, diagnostics, interrupt handlers, DMA, etc.) that would access their design.
The general path of project execution described in these sections will be an underlying assumption for the purposes of the discussions in the remainder of this chapter. However, because individual projects nearly always meander from any pre-defined path, plans will undergo modification, sometimes frequently over the course of the project. There are many ways to get from planning to tape-out.
4.3.5 Bug Fixing
At this point in our discussion a few remarks are in order on the topic of “fixing” bugs. The experienced engineer or manager will surely have a wealth of anecdotal evidence for the frequently ineffective “bug fixes,” partly as a consequence of the severe pressure to make changes rapidly (haste makes waste), but partly due to incomplete analysis of the underlying causes for the observed faulty behavior. Well-designed changes are easier to maintain than quick fixes that tend to accumulate quickly, threatening to become a tangle of fix upon fix, increasingly difficult to maintain with each successive modification. A thorough understanding of the functional trajectories that lead to faulty behavior is extremely useful in the design of changes to eliminate the faulty behavior. More on this topic will be found in the subsequent section in this chapter on Failure Analysis.
It’s worth noting that if a tape-out has already taken place and bugs have been found in the hard prototype, it will be necessary to re-enter the revision phase, repeating as necessary as more tape-outs are needed to achieve production worthiness. And, it repeats in parallel with a very similar project using a hard prototype.
4.4 Soft Prototype and Hard Prototype
Prior to tape-out, verification is conducted via simulation or emulation. Simulation via software provides the greatest flexibility in terms of control and observability but at a slower speed than emulation (using, for example, an FPGA to model the device). The instance verified via simulation or emulation can be considered to be a soft prototype, because changes can be made relatively quickly to eliminate defects or to incorporate enhancements. An FPGA-based prototype might also be called a firm prototype in the same manner that software embedded in a ROM is referred to as firmware.
After tape-out, verification continues with the actual silicon device. This can be considered to be a hard prototype because changes are relatively time-consuming (and expensive) to make. The variability within the new device’s context is usually strictly limited by the capabilities of the other components to which the device is connected. However, commercially available or custom exercisers are able to produce specified variability on demand.
The data collected during simulation and also during operation of manufactured prototypes can be subjected to detailed failure analysis to identify the changes necessary to eliminate the faulty behavior without introducing undesired side effects. It is necessary to observe the values of all variables in the soft prototype, but this is usually not feasible in the hard prototype. The design for verification must enable sufficient observability to facilitate diagnosis of problems found in the hard prototype. This chapter will focus on verification of the soft prototype. However, standardized verification applies equally to hard prototypes as well.1
1 Bailey defines these two different types of prototypes as “physical” and “virtual”. However, verification of processors and processor-based systems will usually need to deal with virtual vs. physical memory. To avoid clumsy discussions of a virtual prototype of physical memory (or a physical prototype of virtual memory, etc.) we use the terms soft and hard, as in software and hardware.
4.5 The Verification Plan
There are as many ways to organize the document containing the verification plan as there are contributors. However, a structure that includes the sections shown in Fig. 4.3 will meet the needs of most projects. The plan should also include other well-established design practices that expose bugs, such as code inspections and design reviews. The plan could even impose requirements that verification of a defined body of RTL not begin until after design reviews and code inspections have been completed. These techniques are known to be rather efficient means for discovering bugs in code, but will not yield a regression suite or good, hard data for quantitative risk assessment.
Finally, the plan should be regarded as a “living” document that will be revised as new knowledge comes to bear on the planning process. In the words of Field-Marshal Helmuth von Moltke, “No plan survives contact with the enemy,” – the enemy in our case being the complexity of the design to be verified. The assumptions that underlie the initial plan will eventually be found to be more or less accurate and the plan should be adjusted accordingly. Expect to revise the plan to account for:
• Discovery and invention: As the project is executed many discoveries will be made and inventions will be realized. Tools will work better than expected or perhaps not work as well as advertised. Leveraged designs (including commercial IP) may be buggier (or less buggy) than expected. As significant discoveries are made, the plan should be brought up to date to incorporate their effects.
• Learning: As the verification team gains familiarity with the target and the verification software, the effectiveness of the team’s work might grow sufficiently to merit revision of the plan. Similarly, increased familiarity with the specification documents may indicate that the interpretation was incomplete and that additional variables must be defined. This, in turn, may require revision of the verification plan.
• Change from outside: Industry standards undergo sometimes frequent revision, especially early in their definition. Competitive pressures often require changes to marketing requirements. Business conditions fluctuate (for better or for worse), and staffing levels and budgets adjust to reflect these changes. It should be understood that the verification plan may need revision to account for such changes.
Fig. 4.3. The Verification Plan
4.6 Instances, Morphs, and Targets (§ 1)
The verification plan2 for a target defines which specific instances (if not all of them) are to be verified and which (if any) are to be excluded from verification. In addition, the plan might define some morphs for opportunistic inclusion (if things go particularly well) or for possible exclusion (if things go the way they usually do).
The availability of resources and business conditions (such as time-to-market) will usually require the verification manager to narrow the focus of the verification plan to meet these practical constraints. The plan should identify which aspects of the design will be subjected to verification and which aspects will not. Typically, a commercial device is intended for use in many contexts, and the verification plan defines which combinations of instance and context are included (or opportunistically included or excluded, etc.). Many such combinations are possible, and the judgment of the verification manager will determine the combinations on which to focus the available resources. Additionally, some functionality might be designated as optional or opportunistic (for example, performance counters) if it is not considered as vital to the success of the product.
4.7 Clock Domain Crossings (§ 1)

Most modern ICs are designed with multiple clock domains, because they incorporate interfaces whose protocols are defined for specific and differing clock frequencies. Processors are designed to operate as fast as possible, whereas I/O controllers must interact with the physical world of cables and connectors and mechanical contraptions that operate much more slowly. This requires that signals, both data and control, be synchronized between the sending domain and the receiving domain.

Fortunately, this design issue is quite well understood and modern EDA tools are available to ensure that synchronization logic is structured correctly. Synchronizing a control signal is accomplished with a pair of special metastable-hardened flip-flops clocked in the receiving domain. Datapath signals are synchronized using a synchronizer MUX (sometimes called a D-MUX, MUX synchronizer, or sync MUX) that retains a synchronized version of the datapath signals in a “holding loop” under control of a synchronized control signal.

2 In this section and in the sections that follow, the part of the verification plan to which the section relates is given in parentheses in the section heading.
However, simply inserting a pair of metastable-hardened flip-flops is not sufficient to prevent synchronization-related bugs. The design of such logic is outside the scope of this text, but the verification engineer should be cognizant of the relevant design principles:

• avoid divergence (fanout) in the crossover path
• avoid divergence of metastable signals (between the two sync flops)
• avoid reconvergence of synchronized signals
• when synching from a fast domain into a slower domain, ensure that the value of a signal in the fast domain is held sufficiently long to be captured by the slower domain
• use gray encoding for multi-bit signals, such as a pointer into a FIFO

Modern EDA tools provide five checks to ensure correct synchronization:

1. source data stability
2. destination data stability
3. MUX enable stability
4. single-bit check (for gray encoding)
5. handshake check (which requires considerable user intervention, because automatic analysis of signals involved in a handshake protocol is not trivial)

It’s important to remember that separate clock domains within a verification target are distinct individual sub-targets. As such, they should have a carefully defined protocol governing the exchange of signals between the domains. Many commercially available tools, such as LEDA, have built-in rule checks for signals that cross clock domains, simplifying the detection of bugs in the code. There are well-defined sets of rules and design principles that yield correct synchronization of signals across domain boundaries, and CRV may not be necessary to verify correct synchronization. The Cadence Encounter Conformal CDC capability, for example, performs structural checks (sCDC) and functional checks (fCDC).

The handshaking protocol mentioned in the list of checks above merits the definition of internal stimulus and response variables with the relevant rules (and possibly also guidelines) that govern their behavior. Thoroughly exercising this handshake and measuring coverage of this small functional space is essential to ensuring proper behavior. Of course, design reviews and code inspections play a vital role as well. (A sketch of the synchronizer structures described above appears at the end of this section.)

There are also higher-level effects that will require verification. The dynamic behavior of a target with multiple clock domains will often vary greatly when clock frequencies change. Queues will fill and empty more or less quickly. Wait states on shared buses will be more or less prevalent. Interval timers in different clock domains will fire at different times relative to one another. And so forth.
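To make the structures described above concrete, here is a minimal SystemVerilog sketch of a two-flop control-signal synchronizer and a sync-MUX datapath crossing. The module and signal names (ctrl_sync, data_sync, load_src, and so on) are illustrative assumptions, not taken from the text, and a real design would use vendor-qualified metastability-hardened cells for the first stage.

module ctrl_sync (
  input  logic clk_dst,   // receiving-domain clock
  input  logic rst_n,     // receiving-domain reset
  input  logic d_src,     // control signal launched in the sending domain
  output logic d_dst      // synchronized version in the receiving domain
);
  logic meta;             // first stage; no fanout allowed from this signal
  always_ff @(posedge clk_dst or negedge rst_n)
    if (!rst_n) {meta, d_dst} <= 2'b00;
    else        {meta, d_dst} <= {d_src, meta};
endmodule

module data_sync #(parameter W = 8) (
  input  logic         clk_dst,
  input  logic         rst_n,
  input  logic         load_src,   // control signal from the sending domain
  input  logic [W-1:0] data_src,   // datapath held stable by the sender
  output logic [W-1:0] data_dst
);
  logic load_dst;
  ctrl_sync u_load_sync (.clk_dst(clk_dst), .rst_n(rst_n),
                         .d_src(load_src), .d_dst(load_dst));

  // Synchronizer MUX: the "holding loop" recirculates data_dst until the
  // synchronized load signal enables capture of the (stable) source data.
  always_ff @(posedge clk_dst or negedge rst_n)
    if (!rst_n)        data_dst <= '0;
    else if (load_dst) data_dst <= data_src;
endmodule

Note that the sender must hold data_src stable until load_dst has been observed; that requirement is exactly the kind of rule that would be captured as an internal variable with an associated assertion.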
4.8 Verifying Changes to an Existing Device (§ 1)

Another situation that merits refining the ranges of variables is when relatively small changes have been made to the target, such as to eliminate the few bugs that escaped into first silicon or to add a “minor” enhancement. If it can be determined to which variables the changes are sensitive, values from the ranges of these variables can be chosen randomly while holding others constant. Of course, human error can be rather unpredictable, so an element of engineering judgment must come into play in deciding how to verify such “minor” changes.
4.9 Interpretation of the Specification (§ 1)

This first step of the planning stage is analogous to the word problems dreaded by young students of arithmetic, but it is readily accomplished with persistent and meticulous analysis. It consists of analyzing the functional requirements of the target within the analytical framework defined in the previous section.
Fig. 4.4. Interpreting specification documents
A given target is usually defined by multiple documents, such as one or more industry-standard specification documents as well as a chip specification document. The industry-standard specifications define all possible devices that can be built to conform to the defined functionality. That is, each is interpreted via the standard framework into all standard variables and their complete ranges. The chip requirements document often constrains the ranges of the variables from the broader industry-standard specification and defines additional chip-specific variables and ranges. Any of these documents is subject to change, and the results of interpretation must be updated as each document is updated.

The first pass of interpretation will probably not yield a complete analysis, and provisions (time) should be made for subsequent discoveries and error-correction, and for accommodating mandated changes such as might come from a standards-setting body.

The result of this effort (reading the various documents that define the design and discussing them with team members) must be an exhaustive list of all variables with precise definitions, exact ranges, and clearly defined relations between dependent ranges and independent variables. The category (subspace) of a variable is determined by whether its value will be used while creating the instance, the context, the activation, the operation, or the excitation.

Likewise, this effort must also yield an exhaustive list of rules (how the design must behave) and guidelines (how the design should behave). Well-written specifications will contain keywords that indicate which are which. Words such as “must”, “will”, and “shall” indicate that a rule is being described, whether explicitly or implicitly. Words such as “should”, “recommended”, “preferred”, and “optimum” indicate that a guideline is being described.

Additionally, this step will very likely expose bugs (contradictions, ambiguities, omissions, etc.) in the specification itself. Clarification and agreement on such matters should be obtained as early as possible in the development to facilitate more rapid project execution and consequent faster time-to-market for the device.

Finally, specifications will change, whether to eliminate bugs or to incorporate enhancements. In particular, industry standards are often revised on schedules contrary to the needs of the many teams around the world who are attempting to realize devices compliant with these specifications. Consequently, it’s vital to associate all variables and their ranges, and all rules and guidelines, with a particular version of the specification. When revised specifications are published, another pass at interpretation will be needed to bring the resulting sets of variables, rules, and guidelines up to date with the revisions.

The results to be obtained from a thorough interpretation of the design specifications are as shown in Fig. 4.5:
Fig. 4.5. Outcome of interpretation of the specifications
Unresolved bugs listed in the interpretation mean that the rest of the interpretation is still subject to change and may be incomplete or contain errors. A bug in a document may be anything that is:

• incorrect,
• incomplete,
• unclear (vague, contradictory, in need of an illustration or example, etc.), or
• illegible.

Internal documents are those that are under the control of the organization developing the IC or system. Such documents are, in theory, more readily brought up to date, because the decision-makers are all members of the organization responsible for content of the documents. External documents are those that are not under control of the organization developing the IC. Such documents include industry standards and specifications for devices with which the IC must inter-operate. Such documents are usually not readily brought up to date because external organizations must make whatever changes are needed and then republish the document.

When all documents have been fully interpreted into their variables and ranges with their rules and guidelines, planning can begin (or continue) with a complete knowledge of the complexity at hand. Planning the project should consider:

• architecture for verification software,
• instrumentation of the target(s),
• processes for applying tests across available compute resources, and
• scope, schedule, and resources.
4.10 Instrumenting the Prototype (§ 2)

During regression testing it is necessary to observe and record the values of all defined variables so that the correctness and completeness of the target can be determined. Rules and guidelines governing the variables are coded as assertions. Assertions are used not only to check rules derived from design specifications, but also by RTL engineers to ensure that correct design principles have not been violated. For example, a designer of a block that receives a set of one-hot enable signals could write an assertion that, indeed, only one of these signals is logically true at any given time within a window of validity. Similarly, ensuring that there are no drive fights on shared signals can be accomplished by writing appropriate assertions. (A sketch of such assertions appears at the end of this section.) A variety of tools are available to assist in the creation of assertions, and a recent book sets forth a standard for the use of assertions in SystemVerilog (see Bergeron et al. 2004).

But there is one drawback to reliance on assertions: they are not synthesized and will not be present in the hard prototype. Consequently, internal logic to facilitate coverage analysis and failure analysis (i.e., diagnosability), as well as to collect coverage data, should be designed into the device. The design requirements for these capabilities (which might constitute a separate morph of the instance) must be defined jointly by the verification team and the device design team. The degree of instrumentation needed will depend, of course, on the functionality of the target and, when the time comes to verify the hard prototype, on how readily the values of variables can be observed at the pins of the device containing the target.

During CRV the responses of the target are monitored continuously and checked against expected values. Checking is performed in two orthogonal dimensions: value and time.3 Value checks are those that compare an observed value with an expected value, computed in some manner by the testbench. Temporal checks are those that compare the arrival time of some value with an expected arrival time, computed in some manner by the testbench.

3 The concepts of function points and function arcs as defined in chapter 3 are already present in commercially available verification software in the form of these two types of checking.

It is useful to consider the many types of checking that must be accommodated by the instrumentation for the soft prototype and for the hard prototype. These may be grouped into the following levels of abstraction:

1. Expected results: This is the most basic check: whether values generated by the target match or agree with values generated independently by the testbench, and whether they are produced at the time expected by the testbench. The testbench constitutes a reference machine (of a sort) in that it must know in advance what values to expect from the target and inform the verification engineer when the target’s results do not agree with the testbench’s results. As such, this is perhaps not so much a level of abstraction for checking, but rather a broad definition of checking relevant to all levels of abstraction.

2. Protocol: This level of abstraction refers to the particular way in which information is communicated to and from (and within) the target, more or less disregarding the information being communicated. Checks for Ethernet protocol, for example, are concerned with packet size and bit-stuffing and so forth necessary to move a payload from here to there. The payload (the information actually being communicated) is checked at the next level of abstraction. One class of protocol always requires detailed monitoring: handshake protocols at clock-domain crossings (CDCs), which are discussed in greater detail in section 4.7.

3. Transport: This level of checking is concerned with the movement of information across the target. Are bits received over the Ethernet cable properly re-assembled to represent the structure of the information sent from the other end of the cable? Does an item placed in a queue remain unchanged when it is eventually removed? Scoreboards, in which expected data are placed in a holding area (the scoreboard) so that, when the target finally produces its received copy, it can be checked against the expected data, are often used for this level of checking.

4. State coherency: Many targets have distributed state that must remain in agreement, such as the individual caches in a multi-processor system.

5. Transformation of data: Many targets transform data in a manner that transcends basic checking. A JPEG encoder performs an elaborate transformation of data, for example.

6. Properties: These are checks that usually transcend verification with a testbench and CRV. Properties (discussed in chapter 2) such as performance and arbitration fairness will typically be verified through other means and typically in the hard prototype. Properties of compliance to industry standards are certainly checked at a protocol level, but also at a property level in a hard prototype or firm prototype (FPGA). Running a compliance suite in a software testbench environment is usually impractical.

At some point in the construction phase of the project, the logic that will provide sufficient visibility into internal behavior must be defined. Designers of individual blocks might each provide a list of the signals that are most useful in diagnosing faulty behavior on the part of the block. If the morph provides some means to capture the values of N signals and if there are B blocks in the target, then each block might be allocated N/B signals for diagnostic capture.

One mechanism that is always needed is one that can impose defined errors on stimuli, so that this functionality can be verified and to facilitate testing of system responses to erroneous stimuli received by (or generated by) the verification target. Other mechanisms that are useful for verification of hard prototypes include:

• Exerting control internally on demand of software or firmware by reserving a range within a processor’s address space for verification control registers. Read/write actions on these registers can exert control over internal signals or enable observation of internal signals.
• Queue throttling, in which the length of a queue can be temporarily redefined to be very small (perhaps having a length of 1). This can be very helpful in diagnosing test failures in pipelined processors, prefetching logic, and any other subsystems dependent on proper queuing behavior.
• Additional logic for special cases that require monitoring, to facilitate collecting data related to those cases.
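As an illustration of the assertion-based checks described at the beginning of this section, the following is a hedged SystemVerilog sketch of a one-hot enable check and a drive-fight check. The module, signal, and label names are hypothetical, and the intent is only to show the flavor of such assertions; they are simulation-only constructs and, as noted above, will not be present in the hard prototype.

module block_checks (
  input logic       clk,
  input logic       rst_n,
  input logic       valid,      // window of validity for the enables
  input logic [3:0] enable,     // one-hot enable bus received by the block
  input logic [1:0] drive_en    // output enables of drivers sharing a signal
);
  // Rule: while valid is asserted, exactly one enable is logically true.
  a_onehot: assert property (
    @(posedge clk) disable iff (!rst_n) valid |-> $onehot(enable)
  ) else $error("enable bus is not one-hot within the window of validity");

  // Rule: at most one driver may be enabled at any time (no drive fight).
  a_no_drive_fight: assert property (
    @(posedge clk) disable iff (!rst_n) $onehot0(drive_en)
  ) else $error("drive fight: multiple drivers enabled on shared signal");
endmodule

Such a checker would typically be attached to the RTL with a bind statement, so the assertions ride along in simulation without becoming part of the synthesized design.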
4.10.1 An Ounce of Prevention (§ 2)

There are a few rather well-known techniques that can be applied to prevent unnecessary full-layer tape-outs to achieve functional parts.

One widely used technique is the inclusion of spare gates in the layout, with connectivity to inputs and outputs available at metal layers. They are not connected to any part of the target’s logic, but they are present in the layout. These spare gates are powered but not activated, their inputs being tied to a constant logical 0 or 1 to prevent undesired electrical side effects. Then, if a bug is exposed in silicon, it may be possible to make use of advantageously located spare gates to implement whatever changes are necessary. This avoids the expense and schedule delay associated with a full-layer tape-out (to lay down more transistors).

Another technique is similar to spare gates in that control logic is implemented using a programmable logic array (PLA). These elements can undergo re-wiring to implement logical changes without resorting to a full-layer tape-out.

Another preventive step is referred to as holding a split lot at metal. A lot (a particular quantity) of wafers is normally processed as a batch. By diverting some fraction (such as half) of the wafers from the production line and holding them “at metal” (before metal layers have been laid down), bugs in silicon can be fixed by making metal mask changes only, rewiring the transistors already implemented on these held wafers. This avoids delays associated with having to start with unprocessed wafers, saving time in getting silicon back from metal-only changes. Arrangements for holding wafers at metal must be made in advance with the processing factory.

Some remedial measures are also available, but they can be expensive. Using a focused ion beam (FIB) to remove layers of oxide and metal and deposit metal to rewire existing transistors (such as for spare gates) is occasionally used to try out a set of changes before a tape-out. The yield of this expensive process is rather poor, and it may be necessary to sacrifice several parts to get one that works.

During the design for verification it is worthwhile identifying those “tricky” features that might need to be disabled if they do not work properly. A novel instruction-reordering mechanism in an advanced superscalar processor that seemed so nifty during simulation might make silicon prototypes useless for subsequent development work, such as debugging the related operating system or compiler or optimizer. By providing a means to disable this new functionality, downstream development work might be able to continue while the IC team debugs it.

Finally, it should be acknowledged that not all bugs in a digital device must necessarily be fixed. Sometimes it is possible to work around the bug with software or firmware that avoids evoking faulty behavior from the device. If the workaround doesn’t have intolerable side effects, it might be decided to update the device’s documentation to reflect these new usage requirements.4

4 In fact, the careful reader might discover odd limitations described in product documentation that suggest the product had a bug for which a workaround was developed.

4.11 Standard Results (§ 3)

As described in the verification plan, a certain set of data must be produced so that standard measures can be made and standard views of the verification results can be provided. The particular measures are discussed in chapter 6.
Fig. 4.6. Standard Results
The standard results to be generated include the following:

• values of all variables, suitably time-stamped
• code coverage on a clock-domain basis so that the following measures can be determined:
  - statement coverage
  - branch coverage
  - condition coverage
  - convergence

That’s it – really nothing more than is already typically preserved in regression runs.
The particular formats in which data are stored will depend on the tools used for verification. In-house tools and vendor-supplied software will vary widely in the particular ways in which they store data. In fact, the raw data may come from a variety of sources and appear in multiple formats. For example, simulations of a processor might produce multiple sets of data for each test: a simulation trace of the signals appearing at the I/O ports of the target, an execution history of the program executed by the processor during the test, and a dump of final state. In addition, some data may actually come not from simulation but from an FPGA-based implementation of the target (a firm prototype). Tools will evolve to consolidate results from all verification platforms:

• simulation of the soft prototype
• emulation using FPGAs (a firm prototype)
• operation of the hard prototype

What matters is that verification results be stored in such a manner that standard results (values of standard variables) can be readily obtained for the purposes of producing standard measures and standard views (the topics of chapter 6). In other words, the standard results may consist of these raw data plus custom methods that understand the formats of the raw data such that the values of standard variables can be extracted (see Fig. 4.7). These methods provide the standard results while hiding (but not disturbing) the raw data.
Fig. 4.7. Producing standard results
Future standardization within the industry may eventually address some standards for data storage.
4.12 Setting Goals for Coverage and Risk (§ 4)

The verification plan should clearly state the coverage goals so that resources can be managed effectively to reach the goals on schedule. The related goals for risk must be stated as well, so that the verification effort is aligned with management’s expectations regarding risk. Relating coverage to risk is the topic of chapter 7, and formal statements of risk as related to project resources will be described there.

4.12.1 Making Trade-offs (§ 4)

As mentioned in the preceding discussions, the verification manager is faced with the challenge of deploying a finite set of resources to achieve a commercially viable result. Consequently, the entire space of a design will not necessarily undergo thorough verification before tape-out or ramping for production. The initial samples of a device might support only a single clock frequency (or set of frequencies), with production devices or upgrades supporting a wider range of frequencies. The manager might also choose to exclude previously verified code (“leveraged” code, IP, etc.) from the target for regression and from coverage analysis if the risk of an unexposed bug in the leveraged code is regarded as acceptably low.

4.12.2 Focusing Resources (§ 4)

As mentioned at the end of chapter 3, the function space to be verified can be quite vast, but it’s usually not necessary to exercise it exhaustively before tape-out or release to volume manufacturing. It is the nature of design bugs that they are nearly always associated with some corner case, so a regression suite that focuses on corner cases will be much more productive at exposing faulty behavior than one which attempts to exercise the entire space uniformly. By adjusting the weights (a.k.a. turning the knobs) for random selection of values for pseudo-random variables such that boundary values are chosen and used preferentially, bug discovery will probably be accelerated.
There is an implicit assumption underlying this strategy, however, that should be acknowledged when adopting it. This assumption is, simply, that the density of bugs is highest in the condensed space, owing to the complexity associated with the logical decisions and computations made at these multi-dimensional vertices. Focusing simulation resources on the boundary values concentrates test generator power on the condensed space to ensure maximum likelihood of exposing bugs. Of course, some useful fraction of verification resources should be allocated to exploring the full, uncondensed space as well. Experience and the accumulation of empirical data will guide this allocation of simulation bandwidth.

If certain functionality defined in the specifications seems to be buggier than the rest of the target, it may be advisable to focus test generation such that the VTGs derived from these specifications are more thoroughly (or perhaps even exhaustively) simulated. High coverage on known buggy areas can greatly reduce the risk of unexposed bugs in the related RTL.

Sometimes it is necessary to modify tests to avoid generating faulty behavior due to known bugs. This can usually be accomplished by adjusting weights (perhaps to zero) for sub-ranges of variables on which the behavior is dependent, thereby causing tests to be generated that exercise functionality unaffected by the known bugs. If this has been done, then before “closing out” the bug report, it is essential that any and all of these modifications be removed so that the functionality previously exhibiting faulty behavior can continue to be exercised.
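As a small, hypothetical illustration of “turning the knobs,” the following SystemVerilog constraint biases random selection of a stimulus variable toward its boundary values. The class and field names, and the assumed legal range, are invented for this sketch.

class dma_request;
  rand bit [7:0] burst_len;                 // legal range assumed to be 1..64
  constraint c_legal   { burst_len inside {[1:64]}; }
  constraint c_corners {
    burst_len dist {
      1      := 30,                         // minimum value
      64     := 30,                         // maximum value
      [2:63] :/ 40                          // interior values share the rest
    };
  }
endclass

Because the weights live in a constraint rather than in the test itself, they can be retuned per test; a sub-range on which a known bug depends can be temporarily weighted to zero and restored before the bug report is closed out, as described above.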
4.13 Architecture for Verification Software (§ 5)

The foundation of sturdy software architecture consists of well-defined data structures, and that is where design of the verification software begins.
Fig. 4.8. Architecting the test environment
The architecture for the test environment (i.e., everything other than the instance or instances to be verified) will depend on a number of factors, the discussion of which is outside the scope of this book. These include:

• the programming language or languages used to implement the software,
• the inclusion of leveraged code or commercially available verification IP, and
• the computing environment (usually a multitude of networked platforms) in which the software executes.

Nevertheless, there are a number of ingredients that are common to all architectures, and these should be taken into full consideration during the planning stage. These ingredients include:

• Sockets5 for the instances of the verification target: Usually, the instance is provided by a separate engineering team. In the case of commercial IP, instances are generated based upon values of internal connectivity, in which case the generated instance or instances must fit neatly into the architecture of the verification software (think “plug-and-play”).
• Context models: The verification software must present to the instance a complete set of bus functional models, memory models, and models for other devices with which the hard prototype must eventually cooperate.
• Activation models: The provision of clocking and reset signals to the instance is nearly always dependent upon the frequency of operation of the device as well as upon what is present in the context for the instance.
• Test generator: The many, many sequences of stimuli are commonly referred to as tests for an instance. The production of these stimuli is nearly always provided by some discrete test generator that not only provides stimuli but also provides the expected results (responses to the stimuli) for comparison against actual results from the instance, so that faulty behavior can be detected. Such tests must, of course, be able to distinguish clearly between success and failure (obvious, yet many tests suffer from a failure to make such clear distinctions).
• Deterministic tests (also known as directed tests): It is very convenient to have provisions in the verification software for the inclusion of deterministic tests or deterministic sequences of stimuli. Conditions, for example, are often established by executing some fixed preamble that has been modified for the specific values of condition variables. Within random sequences of instructions or commands there will often appear deterministic sequences such as procedure calls or returns or other language idioms. A deterministic sequence of bus transactions might be used to drive bus activity to saturation. Deterministic tests are also often used to check state immediately following activation, de-activation of some clock domain, or re-activation, whether full (via power-up) or partial (via soft reset). The ability to incorporate such deterministic sequences readily into the random test is enormously valuable.
• Transactors: Stimuli are applied to the target by verification software components commonly called transactors. These are protocol-knowledgeable generators of signal values to be applied to the individual input ports (or bidirectional ports) of the target. Verification IP provides transactors for a wide variety of industry-standard protocols, alleviating the need to develop them from scratch.
• Monitors: All rules and guidelines relevant to the morph of the instance must be checked for violation. Monitors that observe external and internal responses must be included.
• Protocol checkers: These are a special type of monitor in that they check that a defined protocol for exchanging information, usually via an industry-standard protocol such as PCI-X or USB, has been followed in every respect. Here also, verification IP provides monitors for a wide variety of such protocols.
• Expected results checkers: These are also a special type of monitor that checks that data resulting from communications via some protocol or transformation via some engine (virtual to real address translation, for example) are correct.
• Coverage models: The scenarios defined in association with the many rules and guidelines are implemented as coverage models as defined by Lachish. These coverage models check coverage of specific functional trajectories through subgraphs of the functional space corresponding to the scenarios of interest.

5 The use of the term “sockets” in this context is not to be confused with software sockets for establishing connections within networked systems.

Collectively, the foregoing elements constitute a testbench (as defined in Bailey). A sketch of one such element, a scoreboard, appears after the list below. The architecture for the software for verification of the soft prototype should anticipate the following functionality:

• Generate all variability to be handled by the target.
• Check compliance with all rules and guidelines.
• Reproduce an exact sequence of stimuli (for a particular instance, context, activation, and operation) for the purposes of failure analysis and subsequent verification of changes made to eliminate a defect.
• Be able to execute on multiple networked computing platforms and aggregate all of the results.
• Collect coverage data (values of variables and selected execution histories) on a per-revision basis sufficient to produce standard results.
• Transform a sequence of stimuli recorded on a hard prototype into an identical (or equivalent) sequence for the soft prototype to facilitate failure analysis.
• Derive manufacturing tests, typically from the morph for testability.
• Incorporate changes as the various documents (specifications) that define the target are revised.
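As one concrete (and entirely hypothetical) example of the checking components listed above, a transport-level scoreboard can be sketched in a few lines of SystemVerilog. Class, field, and method names are placeholders; a real testbench would layer this on its own transaction types.

class packet;
  rand bit [31:0] payload;
  bit      [15:0] id;
endclass

class scoreboard;
  packet expected[$];                    // holding area for expected data

  // Called by the stimulus side when a packet is driven into the target.
  function void add_expected(packet p);
    expected.push_back(p);
  endfunction

  // Called by the output monitor when the target delivers a packet.
  function void check_actual(packet p);
    packet exp;
    if (expected.size() == 0) begin
      $error("unexpected packet id=%0h", p.id);
      return;
    end
    exp = expected.pop_front();
    if (exp.payload !== p.payload)
      $error("payload mismatch: expected %0h, observed %0h",
             exp.payload, p.payload);
  endfunction
endclass

The same skeleton extends naturally to the other checker types: a protocol checker watches the signal-level conversation itself, while an expected-results checker compares transformed data (an address translation, for example) against an independently computed reference value.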
4.13.1 Flow for Soft Prototype (§ 5)

Before considering a typical flow that extends from the standardized approach to verification, it’s necessary to consider the data structures needed to retain records of simulations. The following requirements on data collection should be taken into account so that standard measures and standard views can be generated from them:

1. Values used for (visited by) all standard variables, including all internal and external responses, must be recorded.
2. Suitable time-stamping of data is implied in the requirements for the standard measures and views described in the next chapter.
3. Coverage data must be aggregated on a per-revision basis across all tests in the regression suite and across all simulation platforms contributing regression results.

A typical verification process will follow these steps:

1. Generate or sense the instance and context.
2. Activate the system (the instance in its context).
3. Establish the conditions of operation (initialize). Deterministic sequences or templates are often useful here.
4. Excite using stimuli subject to error imposition. Apply all deterministic tests first so that subsequent convergence (see chapter 5) of pseudo-randomly generated tests can be evaluated. Then apply pseudo-random stimuli and all other excitations.

Fig. 4.9 illustrates a typical CRV process. In the following paragraphs we will elaborate on these several steps. Experienced verification engineers will find these steps familiar and may also recognize how past verification projects have employed alternative steps.
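Before turning to Fig. 4.9, the four steps can be pictured as a top-level sequence in the testbench. The following SystemVerilog skeleton is purely illustrative; the task names are placeholders for whatever the environment actually provides, with trivial bodies so the sketch stands alone.

program automatic crv_top;
  int unsigned seed;

  // Placeholder tasks; a real environment implements each step in earnest.
  task generate_instance_context(); $display("step 1: instance/context"); endtask
  task activate_system();           $display("step 2: activate");         endtask
  task initialize_system();         $display("step 3: initialize");       endtask
  task run_test();                  $display("step 4: excite");           endtask

  initial begin
    seed = $urandom;
    $display("run seed=%0d", seed);  // recorded so the run can be reproduced
    generate_instance_context();     // step 1: choose or generate instance and context
    activate_system();               // step 2: power, clocks, reset deassertion
    initialize_system();             // step 3: establish the conditions of operation
    run_test();                      // step 4: deterministic tests first, then CRV
  end
endprogram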
Fig. 4.9. Generalized CRV process
4.13.2 Random Value Assignment (§ 5)

A fundamental operation within any verification process using CRV is the random selection of a value for a variable from its defined range and subject to any constraints defined for that selection. Consider the following example.

<'
type major_opcode: [ LOAD, STORE, BRANCH, ALU, SYSTEM ] (bits: 3);
    // Major opcode field in instructions.
struct instruction {
    major_op: major_opcode;       // One of several fields.
    keep soft major_op == select {
        40: LOAD;
        30: STORE;
        20: BRANCH;
        20: ALU;
        10: SYSTEM;               // Total of weights is 120.
    }
    // Remaining fields of instruction not shown.
}
'>
In this brief segment of e code the major opcode field (a standard variable of stimulus composition) for instructions has been defined. Three bits are used to encode five possible values (enumerated as LOAD, STORE, BRANCH, ALU, and SYSTEM) for this variable. Built into the e language is the notion of generating values for these fields on a pseudo-random basis, subject to the constraints provided. In this example, a relative weighting among the five possible values is provided. The most likely value to be assigned to this field is LOAD (40 out of 120) and the least likely value to be assigned is SYSTEM (10 out of 120). Apparently the test containing these constraints intends to exercise the instruction pipeline with a prevalence of memory accesses via LOAD and STORE instructions, with some BRANCH instructions mixed in to make things even more interesting. In the figures that follow, the boxes labeled as “assign random values” represent this fundamental CRV operation at work.
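For teams working in SystemVerilog rather than e, the same weighted selection is typically written as a dist constraint. The following is a rough, hypothetical equivalent of the fragment above, not a line-for-line translation.

typedef enum bit [2:0] {LOAD, STORE, BRANCH, ALU, SYSTEM} major_opcode_t;

class instruction;
  rand major_opcode_t major_op;   // one of several fields
  constraint op_weights {         // total of weights is 120
    major_op dist {LOAD := 40, STORE := 30, BRANCH := 20, ALU := 20, SYSTEM := 10};
  }
  // Remaining fields of instruction not shown.
endclass

In either language the essential point is the same: the legal range is defined once, and the weights are merely a knob that a particular test turns.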
4.13.3 General CRV Process (§ 5)

Fig. 4.9 illustrates a generalized CRV process and shows how the standard variables are consulted at various steps in the process. Consider each step in some detail.

The process begins when a new version or release of the RTL is made available for regression testing. A given body of results relates only to a specific version of the RTL. Any change in the RTL, no matter how small or “minor,” requires that a new set of results be obtained. It is the nature of bugs that little changes in RTL can produce faulty or undesirable behavior. So, whenever a new version is produced, it starts a new run of CRV.

The step labeled “generate instance/context pair” represents a number of different ways in which a system (the pair of an instance and a context) can be set up for simulation. If the project is producing a single IC, then the instance is often provided by a separate engineering team responsible for the design of the RTL for the IC. In such a case, a copy of the RTL is integrated into the testbench for simulation with a suitable context. If the IC is being designed for one and only one purpose, then the context is fixed (except when simulating the testability morph, which would most likely be simulated with a context representing the test fixture for the IC). If the IC is intended for use in a variety of systems, then the context must be generated (or chosen) randomly, using values assigned to variables of external connectivity. If the RTL is being designed as multi-instance IP, then both the instance and the context would be generated randomly by the verification team, again using values assigned to the variables of connectivity during the “assign random values” process.

Simulation begins with the “activate system” step. Power is applied, clocks begin to toggle, and reset signals are deasserted according to values of activation variables. Following activation, the “initialize system” step causes the various registers in the target and in the context to receive the values produced by the “assign random values” process based on the internal and external variables of condition.

After the system is initialized, the action continues with the “run test” step. This step may be one of the more complex steps to implement by the verification engineer, particularly if tests are generated dynamically based upon results obtained so far. There are numerous advantages to dynamically generated tests over statically generated tests, and these are discussed later in this chapter.
Keep in mind that the flow just discussed is a highly simplified generalization of CRV. Any given project is likely to have differences in how the CRV process is managed. For example, some systems may undergo numerous, multiple initializations, such as a multi-processor system waking one or more idle serf processors. As another example, consider a system incorporating USB. Cable connections and disconnections will result in multiple activations and initializations of sets of logic within the target.

One important detail not depicted in Fig. 4.9 requires consideration. Generating tests depends on values of variables of stimulus, of course, but often also on values previously assigned to variables of connectivity or condition. That is, the ranges (or even existence) of some variables of stimulus are dependent on values of variables of connectivity or condition, and also of activation in certain cases. Such dependencies are not shown in the figure but are vital in generating meaningful tests for the system. For example, consider a processor instance (see Fig. 2.4) that lacks the optional FPU. Generating a test with lots of floating-point instructions for such an instance might not be worthwhile. On the other hand, some tests containing floating-point instructions might be needed if an instance lacking the FPU must properly handle such instructions, perhaps by a trap, the handler for which emulates floating-point instructions.

During the last three steps (activate, initialize, and run test) the responses of the target are observed and checked against the defined rules and guidelines. If a rule is violated, this constitutes a failure, and failures nearly always invalidate subsequent results, so there is little value in allowing the simulation to continue for much longer. On the other hand, it is often useful to have the results from the immediately ensuing clock cycles (a few tens or hundreds of cycles) to facilitate failure analysis to determine the nature and extent of the faulty behavior, after which point the simulation would be halted.

The results of a test that passes should be added to the database of standard results for subsequent analysis. The results of a test that fails do not constitute coverage of the function points visited during the test and must not be added to the standard results. Instead, the entire execution history from activation up to the point of failure (and perhaps a bit beyond) is dumped to disk for analysis by the verification team. Failure analysis will be discussed further later in this chapter.

The CRV process repeats, often on dozens or perhaps hundreds of simulation engines, until no more testing is needed. Deciding when no more CRV is needed is based on a variety of factors. If a majority of tests are failing, then it’s likely that one or more major bugs somewhere are causing these failures and additional results may be of little value, in which case the simulation engines might be better used for other purposes (or projects). If, on the other hand, all tests are passing, then completion criteria will determine when no more CRV is needed. These criteria will be based on measures of coverage (discussed in chapter 6) and on risk analysis (discussed in chapter 7). Other factors that determine when to stop the CRV process include such things as: no more disk space available, a new RTL version available, other demands on simulation engines, and so forth. Often during working hours, engineers use simulation engines for interactive work, with evenings and weekends devoted to continuous CRV.

4.13.4 Activation and Initialization (§ 5)

There may be many paths to just a few post-activation states. Arrival times of deassertions of multiple reset signals with respect to each other and with respect to other excitation might all result in the same post-activation state. Hard resets, warm resets, and soft resets (provoked by software) may each have a single post-reset state. By saving such states for subsequent testing beginning with initialization, some simulation cycles can be saved.

Establishing conditions in the target and in its context might be accomplished by a deterministic preamble common to all or most tests. After conditions have been established but before applying subsequent stimuli, save the state of simulation. Then other sequences of stimuli can be applied without having to repeat the preamble.

Now consider Fig. 4.10. This figure depicts an automated process for testing all (or some fraction of) possible ways a given instance/context pair can be activated. If a given instance/context pair is intended to be further subjected to CRV testing, the simulator state can be saved for later reloading in a CRV process.
Fig. 4.10. De-activation and re-activation testing
However, there is one more step that typically precedes application of CRV tests, that of initialization. Consider Fig. 4.11.
Fig. 4.11. Saving initialized systems for CRV
After a system has been initialized, assigning values to all variables of condition and writing them into their respective registers in the target and in the context, excitation with stimuli generated by the test generator begins. Substantial gains in efficiency can be made by avoiding the common preamble that activates and initializes the system. Simply dump the state of the system to a location accessible to all available simulation engines (the “simulation farm”). Then, as CRV jobs are dispatched to the networked simulation engines in the simulation farm, this saved state is sent along as well.

In Fig. 4.12 the tasks within the rectangle labeled “Dispatcher” would typically be handled by a main CRV script that sends jobs to each of a large number of simulation engines in the simulation farm. The remaining tasks are those executed on the individual engines in the farm.

Of course, if, during the course of activation and initialization, some faulty behavior is observed, the resulting state will probably not be valid and should not be used for subsequent CRV tests. Instead, the system state and all other necessary diagnostic data are saved for failure analysis.

Another common practice that avoids unnecessary simulation cycles is the magical initialization of memory in the target’s context and sometimes even in the target itself. Consider a processor example. A processor’s cache control logic is not necessarily well exercised by an extremely long loop that fills the cache, so tests that are intended (weighted) to exercise the execution pipeline might simply be executed from some saved post-activation state, and then magically (in zero time) its cache is initialized with the program of interest.
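A hedged sketch of such zero-time (“magical”) initialization in SystemVerilog follows. The module, array, and file names are placeholders for whatever the context model actually provides.

module mem_model;
  logic [31:0] mem [0:4095];      // simple word-addressed memory array

  initial begin : backdoor_load
    // Load the program of interest in zero simulation time rather than
    // simulating thousands of cycles of cache-fill traffic.
    $readmemh("test_program.hex", mem);
  end
endmodule

The same backdoor technique is often paired with the saved post-activation state described above, so that a dispatched CRV job begins with clocks running, registers conditioned, and memory already holding the test program.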
Fig. 4.12. CRV using previously initialized system(s)
4.13.5 Static vs. Dynamic Test Generation (§ 5)

There are numerous advantages to using dynamically generated tests, and modern commercially available tools readily accommodate this type of generation. A statically generated test is one whose entire trajectory of stimuli is known in advance, much like a deterministic test except that the test has been generated with random value assignment driving its creation. A dynamically generated test is one whose trajectory is determined by current or recent state, with random value assignment being performed as the test is simulated. Progress of the test is made more useful by anticipating or detecting indirect conditions, yielding greater functional coverage with fewer simulation cycles. This practice is rather well established in the industry.

Certain classes of test are not readily generated dynamically, such as a program for a processor. However, activity that is independent of such a test can be generated dynamically. For example, an interrupt might be caused to occur at a particular time relative to the instruction pipeline.

4.13.6 Halting Individual Tests (§ 5)

Another advantage to using dynamic test generation is that test failures can be detected on-the-fly. This means that simulation cycles following faulty behavior are not wasted, because the target has “headed into the weeds” anyway. However, it is usually worthwhile to let simulation proceed for a few dozens or hundreds of cycles to yield possibly useful execution history for the purposes of failure analysis. Some tests may need to wait for quiescence before halting the simulation, so that all internal state has reached its final value before analyzing final state for correctness and completeness.

4.13.7 Sanity Checking and Other Tests (§ 5)

Even after a target has survived an overwhelming volume of CRV testing, it is always prudent to apply some “sanity check” tests just as a precautionary measure. For example, verification of a JPEG codec should also use actual image files, with visual confirmation and approval of the generated images. Boot-up firmware and initialization software might also be useful as alternative sources for tests.

One species of “actual software” that one encounters eventually is that of the “compliance tests.” These test suites are typically developed to demonstrate that requirements for some particular industry standard (such as USB or PCI-Express) have been met. It is not uncommon for hard prototypes (and sometimes, actual products) to pass their relevant compliance tests and yet still contain bugs. We might better refer to these tests as “noncompliance” tests because, if any test fails, the device under test is not compliant with the requirements as implemented in the test.

Not only should one apply such “sanity check” tests; one should also determine whether any of these tests increase coverage above that which has been accumulated via CRV and pre-CRV activation testing. This may reveal coverage holes in the CRV testing.

Another method for checking the sanity of an overall regression suite is to add one or more bugs deliberately to the RTL and then determine whether the regression suite is able to detect any faulty behavior as a consequence of the “seeded bug(s)”. Future verification tools may provide the capability to grade a regression suite based on its ability to expose such bugs.

4.13.8 Gate-level Simulation (§ 5)

Gate-level simulation constitutes simulation of a (usually) synthesized gate-level model with back-annotated timing.6 The objective of this step is to verify that synthesis generated the correct netlist, that scan and clock tree insertion do not change circuit functionality, that the chip will meet timing (by exercising the timing-critical paths), and that any hand edits to the netlist have not introduced a bug. This gate-level model is also used for generating manufacturing tests and for evaluating fault coverage.

Static timing analysis tools typically handle the job of ensuring that the chip will meet timing requirements, but the additional confidence gained from successful gate-level simulations provides a cross check for these static timing tools, provided that timing-critical paths can be exercised. Manufacturing tests are typically achieved through full scan testing, or BIST if implemented.

6 This is a connectivity of gates, but at a nearly meaningless level of abstraction from the point of view of the functional specifications. However, it is a very meaningful abstraction for the electrical specifications (area, power, timing).
The term “formal verification” is also used to apply to equivalence checking of the RTL with the synthesized netlist, the netlist with test logic inserted, the netlist with clock tree inserted, and any hand edits. Hand edits may be needed to accomplish a bug fix by modifying masks (they are very expensive, so we want to retain them whenever we can), and these modifications must be modeled in RTL. Equivalence checking ensures that the two representations are equivalent. Equivalence checkers find any non-correspondence between a gate-level netlist and the RTL from which it was synthesized. Gate-level simulations with a selected subset of the regression suite serve as a cross check on equivalence checking.

Fault coverage can be estimated using only a gate-level model with suitable stuck-at models. Gate-level simulations will typically run much slower than simulation of RTL because the level of abstraction is lower, resulting in many more elements to evaluate as signal changes propagate through the target.

4.13.9 Generating Production Test Vectors (§ 5)

Production test vectors typically consist largely of automatically generated vectors for the testability morph of the target. Test time is expensive, so achieving high fault coverage in as little time as possible is very important. Fortunately, ATPG (automatic test pattern generation) does the heavy lifting for us, producing vectors for the testability morph.

If the test fixture for the manufactured device is able to drive and observe all I/O ports, then some fraction (or all) of the functional regression suite can also be run on the tester. One advantage is that manufactured prototypes can be tested even though a complete set of manufacturing tests is not yet available. It’s not uncommon for generation of production test vectors to lag production of prototype devices.
4.14 Change Management (§ 6)

Change is constant, as they say. Effective project management expects change and is prepared to accommodate it. Changes come from within, as a consequence of invention and discovery, as well as from without, as a consequence of market and schedule pressures.

To facilitate a fluid and nimble response to changes from whatever source, it is worthwhile to discuss how such changes will be handled early in the planning stage. Any agreements on how to manage such changes might already be part of the management practices of the organization, in which case the verification plan need merely cite these established practices. For example, the verification manager might be allocated a budget for resources and then be empowered to spend those resources as needed to achieve the project’s goals on schedule. On the other hand, various levels of approval up the management chain might be needed before changing course.

When changes are imposed by external factors, such as a change in product requirements, it is invaluable to document these changes in writing. If specifications are changed, then these changes must be interpreted into the standard framework and propagated throughout the testbench as needed. If changes are made to schedule or resources, it’s vital that these changes be understood in terms of their consequences to the project. It may be useful to establish a “drop dead” date7 after which no further changes in requirements can be accommodated. A shorter schedule or loss of resources will usually entail assuming greater risk at tape-out than was originally planned. On the other hand, the availability of an advanced tool or the discovery of some more efficient process might permit embracing a more aggressive risk goal than was originally planned.

7 This is the time when engineering tells marketing to “drop dead” if they bring new requirements for the design.

4.15 Organizing the Teams (§ 7)

There are many ways to organize the various teams of engineers and managers that work together to produce a new product, but there are some common elements to each. A representative organization of the development effort is illustrated in Fig. 4.13.

There are typically three major efforts in the development of a new (or revised) IC, and these efforts are often undertaken by three separate teams. First, there is the need to create the RTL that performs the required functions, synthesizes into the intended die area, and meets the timing requirements. A separate RTL team is often responsible for this effort. Second, there is the need to create the artwork that is transformed into the mask set for the IC. This artwork must match the RTL in terms of logical functionality, fit into the die area, meet the timing requirements, and meet the design rules for the manufacturing process.
Fig. 4.13. Typical team organization
Third, there is the need to create the test environment and the resulting regression suite that demonstrates, to a sufficiently high level of confidence, that the synthesizable RTL functions correctly and completely according to the specifications. Additionally, a gate-level model with back-annotated timing information derived from the artwork might also be subjected to the same (or, more likely, an abbreviated) regression as the source RTL.

In effect, the RTL team and the verification team are each producing, independently and in separate computer languages, a functional implementation of the specification documents. The likelihood that each team will duplicate design errors is very small, so each is checking the work of the other, thereby assuring a quality result.8

8 The diagram in Fig. 4.13 is only a rough approximation of the countless interactions of team members, and it would be an error to restrict communications to only that shown in the diagram, confining the creative process to some graphic in a book. The map is not the territory.
During the course of the project there will invariably be some number of bugs that must be exposed and changes made so that functionality and area and timing requirements are met. The presence of these bugs is revealed in failures that must be analyzed for cause and for the changes needed to eliminate whatever faulty behavior led to the failure. The verification team, having ruled out anything in the test environment as having caused the failure, relays the data associated with the failure to the RTL team for analysis. When the changes needed to eliminate the faulty behavior are made, the revised RTL is relayed to both the verification team and the artwork team.

Similarly, if a signal path fails to meet its timing budget or if a block fails to meet its area budget, the artwork team rules out any cause within their domain (such as routing or transistor sizing) before relaying the failure data back to the RTL team for their analysis. Again, the RTL is revised as necessary and relayed back to the artwork team and to the verification team.

Of course, this is a highly simplified view of team interaction, and the reality is usually much more dynamic and not nearly so cut and dried. RTL revisions are not necessarily relayed for each and every change, but might instead be accumulated for a more orderly check-in process, perhaps daily or weekly or at some other interval. One thing is required, however: everyone must be working on the identical version of the RTL. As mentioned earlier, it’s valuable to adopt an “always working” methodology whereby any changes that fail regression are rejected. Having a “quick test” that screens for common problems can prevent a broken check-in of new code from reaching the CRV processes prematurely.

Proper check-in disciplines should be enforced at all times, usually by the curator for the overall environment. This is a member of the verification team who is in a position to remain aware of all changes made by everyone contributing to regressions, whether it’s RTL from the RTL team, testbench components from the verification team, or gate-level models from the artwork team. This engineer might also be empowered to manage the simulation farm and to ensure that optimal productivity is obtained from this scarce asset.
4.15.1 Failure Analysis (§ 7) Members of the verification team are on the front lines of failure analysis. Some tests fail due to external factors, such as a disk becoming full or someone from the nightly cleaning crew unplugging a computer. Some tests fail due to bugs in the environment control software that dispatches simulation jobs and retrieves their results. Some tests fail due to bugs in the test environment. If the foregoing causes have been ruled out, then the cause of the failure is most likely one or more bugs in the RTL. This is about the right time for a hand-off from the verification team to the RTL team. Triage – sifting through failed tests for common causes and sorting them as to likely cause – is an important role within the verification team. This function is often performed most effectively by the engineer responsible for integration of the many contributions (RTL, test environment), i.e. the curator. Finding all paths that lead to the same faulty behavior gives us what we need to know to eliminate the faulty behavior as well as to generate the test that exercises the changes (not that this will be an easy task).

One method that has proven useful for exploring the functional space in the proximity of some faulty behavior is sweep testing or functional shmoo testing. By varying the values of one variable while keeping the others constant, one may get a more comprehensive understanding of which variables contribute to the problem. Again, this is not necessarily easy but may be needed for critical functionality. We know that we can create VTGs for any subgraph in the functional space. This may be a useful technique in verifying changes to fix bugs and eliminate faulty behavior. Coverage of the VTG constructed for the affected variables can tell us how thoroughly the changes have been exercised.

For bugs requiring changes in multiple modules created by different engineers, it may be worthwhile to analyze the behavior for sensitivity to context, to activation, to operation, and, of course, to excitation. This comprehensive analysis may facilitate the design of changes to eliminate the faulty behavior or the design of workarounds (avoidance of the bug by constraining software's use of some functionality). For example, if some faulty behavior requires extensive changes to the RTL or if time does not permit designing the changes, and the behavior occurs only for one value of a variable of condition, then the behavior can be avoided by no longer using that value. Of course, this is only feasible if the other consequences of not using that value are still acceptable within the sphere of intended usage for the design. The design of a workaround is usually not a trivial exercise.
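The sweep idea reduces to a very small script. The sketch below is only an illustration of the technique, not a description of any particular tool; the run_test callback, the variable names, and the pass/fail convention are all assumptions introduced for the example.

    def sweep(run_test, fixed, sweep_var, sweep_values):
        """Functional shmoo: hold the variables in `fixed` constant and step one
        variable through its range, recording pass/fail for each value.
        `run_test` is a hypothetical callback that builds and simulates a test
        from a complete set of variable assignments and returns True on pass."""
        results = {}
        for value in sweep_values:
            settings = dict(fixed)
            settings[sweep_var] = value
            results[value] = run_test(settings)
        return results

A second sweep over a different variable, with the first held at the value that provoked the failure, then narrows down which combination of values is actually implicated.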
4.16 Tracking Progress (§ 8) The late great writer Douglas Adams was quoted as saying, as he labored over his characteristically late manuscripts, “I love deadlines. I love the whooshing sound they make as they go by.” Project managers love deadlines nearly as much as authors, and are routinely required to provide updated estimates for when various milestones will be met. The most critical milestone is the one at which the risk of a functional bug is sufficiently low so that final artwork can be prepared for tape-out. How close is our project to completion? That question will be addressed in detail in the last two chapters, but there are some metrics that provide some indications of progress towards completion. To track progress in reducing the risk of a bug in a verification target, the following metrics can provide key insights: • Code coverage: 100% statement coverage (see chapter 6) indicates that about one third of the overall work to expose bugs is completed. Fig. 4.14 illustrates how code coverage might increase as a function of time for a typical project. • Bug discovery rate: If you stop looking, you will stop finding. A nonconvergent regression suite or one that has reached the asymptote of its convergence (see next chapter) will eventually stop looking because it is not exercising previously unexercised functionality. See Fig. 4.15 for an example of how bug discovery rate might appear for a typical project. • Bug counts and severity: for risk analysis. Each organization counts bugs and assigns severity somewhat differently so comparison of such data from different organizations is often not reliable. • Bug locality: Bugs are not typically distributed uniformly within the target but are often clustered by code module or by functionality (as characterized by the variables on which faulty behavior is dependent). Tracking the locality of bugs can provide useful indicators on how to focus continued verification activity. Similarly, associating bugs with the specific sections of specifications for the target can also indicate areas for increased focus.
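As a simple illustration of how the rate and locality metrics above might be extracted from a bug database, consider the following sketch. The record format is hypothetical (the field names are assumptions, not a prescribed schema); every organization's bug tracker will differ.

    from collections import Counter

    def discovery_metrics(bug_records):
        """bug_records: list of dicts with assumed keys 'found' (e.g. an ISO week
        string), 'module', and 'severity'.  Returns bugs found per period and per
        module, the raw inputs for discovery-rate and locality tracking."""
        per_period = Counter(b["found"] for b in bug_records)
        per_module = Counter(b["module"] for b in bug_records)
        return per_period, per_module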
Fig. 4.14. Code coverage rate
Typically bugs will be discovered more rapidly on "fresh" RTL that hasn't yet been exercised. So, as new functionality in the test environment is turned on, a brief burst in bug discovery may follow because code (whether lines or expressions, etc.) is exercised in ways it had not been previously. And, as the RTL matures and bugs become fewer and fewer, this will become apparent as the discovery rate tapers off. However, there may be other reasons for such tapering off, as we will see in the next section. Comparing bug discovery rates or bug counts across different projects can be quite problematic for a number of reasons, and such comparisons should be regarded skeptically. First and foremost is that different organizations will count bugs differently. For a given observation of faulty behavior that requires changes in multiple modules to eliminate the faulty behavior, one group might regard this as a single bug whereas another group might regard this as multiple bugs (one in each affected module). Additionally, if the changes related to the bug are later determined to be incomplete (faulty behavior not eliminated completely), this may result in reporting yet another bug. Furthermore,
Fig. 4.15. Bug discovery rate
bugs are usually graded by severity, that is, how badly they affect behavior of the target. There are no uniform definitions for severity and different organizations (indeed, different engineers in the same organization) will grade bugs differently.9 Another factor that makes cross-project comparison of bug curves problematic is when the counting begins. Some projects begin counting bugs as soon as simulation of the target begins, regardless of the completeness of the RTL of the target. Other projects may delay counting of bugs until the RTL has achieved some level of behavioral stability, possibly well after the RTL is functionally complete. Yet another factor that can affect bug counts dramatically is stability of the specifications. If requirements on the target change mid-stream, the resulting sets of changes to the RTL to implement the new requirements are
9 IP customers often grade bugs much more severely than IP providers.
very likely to introduce more bugs into the target, increasing the bug count for the project. One comparison that does have some validity, however, is the number of bugs still present in the target after tape-out. If a project’s goal is to produce production-worthy silicon at first tape-out, then the number of bugs that require a re-spin of the silicon matters. It is also worth noting that achieving maximum code coverage does not mean that complete functional coverage has been achieved. A useful rule of thumb for gauging progress is that when maximum code coverage has been reached, then about one-third of the engineering effort to find the functional bugs has been spent. The first bugs are typically easy to find and the last bugs are increasingly difficult (requiring more engineering effort and time) to find.
4.17 Related Documents (§ 9) This section of the plan lists the documents pertinent to the verification target, especially including the versions of each document. This list of documents includes: • design requirements (the natural-language specifications) • requirements on properties recorded as special cases, such as performance • external standards documents, such as for PCI-Express or USB or AMBA • established internal checklists, such as those for code inspections or design reviews
4.18 Scope, Schedule and Resources (§ 10) There are many well-established practices for managing projects successfully, and a complete discussion of these practices is beyond the scope of this book. However, a few key points are worth noting in the context of functional verification. The three principal factors available for trade-off in the execution of a project are the project's scope, its schedule, and the resources available. The section of the verification plan called Schedule should address all three of these factors.
The scope of the project determines what is to be produced. It's worthwhile to list not only what is included in the project but also what is not included in the project. For example, if commercial IP is integrated into the target, and this IP is believed to be already verified, the verification plan should state explicitly that verification of the IP is not an objective of the project. On the other hand, if there is reason to suspect that the IP might contain bugs, then the plan should state explicitly that verification of the IP is an objective of the project. The schedule states who does what by when. Tasks are defined such that they can be assigned to a single individual, and estimates are made for completing each task. The tasks are linked according to their dependencies into a single schedule (as a Gantt chart) and the current version of the schedule becomes part of the plan, perhaps by reference. The resources needed for the project will include not only the verification engineers, but also the simulation engines needed for CRV, the tools that must be licensed or purchased (including verification IP), and so forth. If any of these should change, a revision to the plan is in order.
4.19 Summary We have now related the loftier concepts of functional spaces to the more earthly concepts of producing a profitable product. By executing verification projects within the standard framework, one not only enjoys the advantage of producing results that can be used for data-driven risk assessment, but also gains huge leverage across all other verification projects. Managers responsible for multiple IC designs will benefit by having a pool of verification talent that can move readily from one project to another without having to learn yet another verification scheme. Terminology is shared among all projects. Projects can be compared because they are executing to similar milestones, such as IC done and FC done. Planning is greatly simplified by having a common template for all verification plans. Before proceeding to analyzing the results that verification produces, we still need to consider how to estimate the resources needed to execute our projects. Fortunately, there are a few techniques based on objective data that can assist us. That will be the topic of our next chapter.
References

Bergeron J (2003) Writing Testbenches: Functional Verification of HDL Models, Second Edition. Kluwer Academic Publishers.
Bergeron J, Cerny E, Hunter A, Nightingale A (2005) Verification Methodology Manual for SystemVerilog. Springer.
Foster HD, Krolnik A, Lacey D (2004) Assertion-Based Design, 2nd Edition.
Francard R, Posner M (2006) Verification Methods Applied to the ST Microelectronics GreenSIDE Project. Design And Reuse.
Haque FI, Khan KA, Michelson J (2001) The Art of Verification with Vera. Verification Central.
Palnitkar S (2004) Design Verification with e. Prentice Hall Professional Technical Reference.
Synopsys (2003) Constrained-Random Test Generation and Functional Coverage with Vera.
Talesara H, Mullinger N (2006) Accelerating Functional Closure: Synopsys Verification Solutions. Synopsys, Inc.
Wile B, Goss JC, Roesner W (2005) Comprehensive Functional Verification. Elsevier/Morgan Kaufman.
Chapter 5 – Normalizing Data
In the previous chapter we considered a fairly wide variety of practical matters related to functional verification, but one question was not addressed. What will it take to verify our target in our current project? Targets of verification continue to grow and become more complex, requiring more resources to expose those bugs that experience and intuition tell us are surely there. But, how much "bigger" is one target than another? Measures based on lines of uncommented source code (for the RTL) or transistor count or die area are readily available, but these measures do not correlate strongly with the number of bugs we will expose or the number of simulation cycles we will consume over the months in our efforts to expose them. Die area and transistor counts do not correlate well, because large uniform arrays (such as SRAM cells for caches and memories) represent a significant fraction of these measures. Lines of code seem like a better candidate, but experience with code coverage tools (see chapter 6) suggests that this measure will not correlate well either. In fact, short code segments with tightly packed convoluted logical expressions cleverly implementing some functionality may be much buggier than long segments of simpler code implementing the same functionality. A better measure for complexity is certainly needed. This chapter will explore three separate indicators of complexity that can be used to normalize data and thereby forecast project resources. Once we have a useful measure for complexity, perhaps we can then determine how many cycles of CRV will be needed to achieve sufficiently high functional coverage so that the risk of unexposed functional bugs is acceptably low.
5.1 Estimating Project Resources Determining the resource requirements (simulation resources in particular) to execute a verification project successfully is often based largely on the judgment of the project manager and the verification engineers. However,
using data accumulated from past projects can enable better forecasts of project resource needs as well as enable the manager responsible for multiple verification projects to allocate resources across those projects. We will discuss three techniques:
1. Examine the power of the test generator to exercise the target
2. Estimate the complexity of the target using synthesis results
3. Estimate the complexity of the target using its VTG
5.2 Power and Convergence One frequent and vital activity that requires good resource allocation is regression. The verification manager needs to be able to determine how much time will be required for a complete regression of the target. This will determine how frequently new changes can be checked into the verification team's copy of the target RTL. If the available compute resources are able to execute a complete regression of the target in one day's time, then changes can be verified on a daily basis. If on the other hand it requires a week to regress the target fully, then changes cannot be made as frequently. For a large design, say of a microprocessor, complete regression can require a week or more, depending on the number of simulation platforms allocated for this vital task. The ability of a set of tests to explore the functional space rapidly by continuously producing novel activity is defined as its power. Methods to measure this power directly would require extensive statistical analysis of the contents of the tests, and such direct methods are not readily available. However, it is possible and practical to measure the effects of the tests using readily available tools, such as those that measure code coverage. By examining how quickly (in terms of simulated cycles) a given measure of coverage converges on its maximum value, we can gain insight into the power of the tests produced by the test generator. The higher the convergence of the tests with respect to the target, the more efficiently one achieves a thorough exercise of the target. Collecting the coverage data on which convergence is determined should be done on a code module basis, not on a code instance basis. The precise shape of the convergence curve depends, of course, on the test generator and the target against which tests are being applied. Furthermore, this is only a generalized model for convergence and different combinations of test generator and target may yield rather different curves.
Nevertheless, the general concept of converging asymptotically on maximum coverage still applies. The mathematics underlying the computations may need revision for curves that differ greatly from the model shown. The convergence of a test generator is characterized by 2 values:
• convergence gap: α = 100% − level of asymptote
• beta cycles: coverage(N_β) = (1 − α) · 2e^(−1), an indicator of how many cycles are needed to reach about 74% of the level of the asymptote, analogous to the "rise time" of an exponentially rising signal.
A test generator that rapidly reaches its highest level of code coverage (approaches its asymptote) is said to have good convergence (see Fig. 5.1). A test generator that reaches its highest level slowly is said to converge poorly. The convergence gap is the difference between 100% and the value for the asymptote of the generator. The narrower this gap, of course, the greater the power of the generator.
Fig. 5.1. Convergence of tests
There may always be certain lines of code that are only exercised via deterministic tests, but this code might not benefit from the exercising provided by the pseudo-random verification software. Moreover, code coverage—especially for multi-instance IP—can be difficult and clumsy to use. The use of code coverage as a tool will be discussed in some detail in chapter 6. Driving the convergence gap towards zero increases the efficiency of the test generator to exercise the target thoroughly.
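Both characteristic values can be estimated directly from regression data. The sketch below is a minimal illustration of the definitions above, not a vendor tool; it assumes coverage has been sampled at increasing cycle counts and that the data extend well onto the plateau, which is taken as the asymptote.

    import math

    def convergence_params(samples):
        """samples: list of (cycles, coverage) pairs, coverage as a fraction of 1.0,
        sorted by cycle count.  Returns (alpha, n_beta)."""
        asymptote = samples[-1][1]                    # final plateau taken as the asymptote
        alpha = 1.0 - asymptote                       # convergence gap
        threshold = asymptote * 2.0 * math.exp(-1.0)  # about 74% of the asymptote
        n_beta = next(cycles for cycles, cov in samples if cov >= threshold)
        return alpha, n_beta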
5.3 Factors to Consider in using Convergence Fig. 5.2 shows how, when graphing convergence using typical test order (activation tests, ordinary CRV tests, error tests), a stratification appears. This 3-level stratification (or possibly more) can make computation of N_β tricky. Stratification results from the following sets of tests that exercise specific subsets of logic:
• activation
• morph for testability
• morph for verification
• error-imposition tests
• deterministic tests
• the bulk of CRV
One way to deal with this is to choose tests at random from all types to obtain a non-stratified, smoother curve of convergence. On the other hand, the stratification may be so slight as to have a negligible effect on N_β.
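Randomizing the run order is straightforward; the sketch below assumes the tests are simply grouped by category, which is an illustrative representation rather than any particular tool's format.

    import random

    def interleave(test_sets, seed=0):
        """test_sets: hypothetical dict mapping test category (activation, error
        imposition, deterministic, CRV, ...) to its list of tests.  Shuffling the
        combined pool removes the ordering that produces a stratified curve."""
        pool = [test for tests in test_sets.values() for test in tests]
        random.Random(seed).shuffle(pool)
        return pool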
Fig. 5.2. Stratification of convergence
Another factor to consider is which RTL-based coverage measure to use in computing the test generator's power using the convergence technique. The greater the granularity of the measure (such as using expression coverage instead of line coverage), the better an indicator of the test generator's power the convergence is likely to be. Investigating the empirical results for a given project or multiple projects will provide insight into the best way to estimate power with convergence.
Fig. 5.3. Determining convergence of test generator against a target
Another effect might require some adjustments in how one calculates convergence: an initial step function in coverage from running a handful of tests. In such a case, one might consider the level achieved by the initial step function as the baseline from which the exponential growth curve rises.
5.4 Complexity of a Target Complexity Z is proportional to the amount of logic to be exercised in the target. Many different measures have been proposed and used to model the complexity of computer code (such as lines of uncommented code), but for the purposes of functional verification of an IC, it’s more useful to consider a more reasonable measure: a particular count of the number of gates in the target. A common practice in the IC design community is the division of logic into datapath logic and control logic. Datapath logic usually consists of uniform arrays of cells, such as bits in a register file, slices in an adder, and so on. The remaining logic is regarded as control logic (because it, of course, exerts control over the datapath, telling it what to do). It’s widely known that the vast majority of bugs in a design are associated with the control logic, rather than the datapath logic. It stands to reason (hence, our reasonable measure) that some function of the number of gates of control logic has something to do with the complexity (and, therefore, the potential buggy-ness) of a target. Consider a network of 2 2-input NAND gates. Each gate could connect to the input of either gate, for a total of 2 possible connection destinations for each of the two gates. A bug could dwell in any of 4 possible misconnections. With 3 such 2-input NAND gates, we have 9 possible misconnections. Reasoning along these lines (inductively speaking) we can state that from the standpoint of finding bugs, an estimate for the complexity of our target is the square of the number of 2-input NAND gates required to implement the control logic. This analysis for an estimate of complexity is analogous to that of a general machine made of N parts. Considering all possible interactions among parts, the resulting complexity is O(N2). We can refine this estimate by recognizing that within our control logic are other regular logical structures that we can choose (usefully and with sound reason) to regard as free of bugs. These regular logical structures are readily recognized as the multiplexors and adders and other MSI-like (medium scale integration) elements. Memory elements (flipflops) are also excluded from control logic because they don’t participate
in the connectivity of 2-input NAND gates (they are simply inserted between the output of a NAND gate and the inputs of the gate or gates to which it is connected). One further refinement to this estimate is to acknowledge that verifying 2 exact copies of a given body of logic is no more work than verifying the logic on its own. That is, multiple instances (code instances, not instances of connectivity) of a single module do not contribute directly to the complexity of the target. However, the control logic connected to the 2 instances does contribute to the complexity. Synthesize the target's RTL using a library of only 2-input NAND (or NOR) gates and MSI-like elements, excluding multiple instances of any module, and then tally up the number of resulting 2-input NAND gates. This number C of control gates can be used to compute a reasonable estimate for the target's complexity as
Z_C ∝ C^2 .   (5.1)
The subscript distinguishes this estimate from other estimates for complexity. This estimate Z_C is based on control gates, but other estimates are possible. One of these other estimates will be discussed later in this chapter. We can then estimate the complexity Z_G for each clock domain by using a commercial synthesizer with a particular library of components (datapath elements and 2-input NANDs only) and excluding repeated sub-networks of any module instantiated more than once. Timing is unconstrained because we only want to obtain a logical mapping of RTL to logic elements. Filter the output of this special synthesis run to exclude the "zero-risk" datapath elements and count the remaining 2-input NAND gates in each clock domain. Squaring each sum gives the estimated (and unit-less) measure for complexity of each clock domain.1
1 A whitepaper from Design and Reuse arrives at a similar conclusion, stating that "as designs double in size, the verification effort can easily quadruple."
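The tally itself is trivial to automate once the filtered synthesis report is in hand. A minimal sketch follows, assuming the report has already been reduced to a per-module count of 2-input NAND gates for one clock domain (the data structure is an assumption for illustration, not a synthesizer output format).

    def control_complexity(nand2_per_module):
        """nand2_per_module: dict mapping module name to the number of 2-input NAND
        gates synthesized for one instance of that module in a given clock domain,
        with datapath/MSI elements and flip-flops already filtered out.  Counting
        each module once excludes repeated instances.  Returns C and the unit-less
        estimate Z_C, proportional to C squared (Eq. 5.1)."""
        c = sum(nand2_per_module.values())
        return c, c * c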
Fig. 5.4. Complexity rises rapidly with gate count
The estimate of Z obtained by analyzing control logic will be found to correlate strongly with the count of arcs in the target's VTG. The size |E(G)| of the edge set of the VTG is defined as this count of arcs. More specifically, the count of arcs in the condensed functional space is likely to correlate strongly with Z_C as computed from the control logic. It is the control logic, after all, that steers the action along arcs from point to point in the functional space. Mathematically we can state that
Z_V ∝ |E(G)| .   (5.2)
The subscript V indicates that this estimate for complexity is based on the VTG rather than on control gates using the synthesis method. Because this estimate is based on knowing the values of standard variables, we can make it soon after interpreting the specifications rather than waiting until synthesis results are available from the RTL at FC done (see Fig. 5.5).
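If the condensed VTG is available in machine-readable form, the early estimate is just an arc count. The adjacency-list representation below is an assumption made for the sake of a short sketch.

    def vtg_complexity(vtg):
        """vtg: the condensed VTG as an adjacency list, a dict mapping each function
        point to the list of function points reachable from it in one clock cycle.
        Z_V is taken proportional to the arc count |E(G)| (Eq. 5.2)."""
        return sum(len(successors) for successors in vtg.values())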
Fig. 5.5. Getting an early estimate for complexity Z
This estimate Z, whether based on control gates or on VTG arcs, can be a useful tool for planning and managing the resources needed to verify an RTL database for tape-out. In particular, by normalizing other measures with respect to Z, we can begin to compare results from differing targets. This is fundamental to the risk assessment that will be discussed in chapter 6. Just as code coverage is a normalized quantity (expressed as a percentage of the lines of code, etc.), functional coverage can be normalized, too.
5.5 Scaling Regression using Convergence How "big" must a regression suite be to be regarded as thorough? And, for that matter, what does it mean to have a "random regression suite"? A regression suite that thoroughly exercises a target in a pseudo-random manner can be said to bombard the target with novelty – new sequences of stimuli applied under various operational conditions and in various contexts, each different from the others. A target that can survive overwhelming novelty is regarded as free from bugs (at a given level of risk) and is protected from the introduction of bugs during enhancement by the same overwhelming novelty. Define N_R as the number of clock cycles needed for a thorough regression of the target. In general N_R will be found to be proportional to the complexity of the target:
N_R ∝ Z .   (5.3)
This number of clock cycles is proportional not only to the complexity of the target as indicated by examining the control logic, but also to the complexity as indicated by examining the VTG corresponding to the target. After all, during a single clock cycle the target is performing a single computation – that of determining the next point that will be reached in traversing the graph. Eq. 3.5 (repeated here) expresses mathematically what happens within the target:
r(n + 1) = f(r(n), e(n), s(n), c(n), a(n), k(n))   (3.5)
It is the control logic that determines which of the many arcs departing from a function point at time n will be traversed to the function point at time n+1. It should be no surprise that the number of arcs in the condensed VTG correlates strongly with complexity as determined by analyzing the control logic. The value for N_R is determined empirically, unless one already has results from prior projects, in which case those results can be used to forecast requirements for the current project. Consider two projects: prior project X and current project Y. Project X went well, producing an IC with few bugs at first silicon. Project Y is to produce an even larger IC. How much pseudo-random testing is needed to bring the risk at tape-out to the same level as for project X? The complexity of the target in project X at its completion was Z_X, and at FC (final coding) the complexity of the target in project Y was Z_Y. The manager had observed during project X that the complexity at tape-out remained within 10% of its value at FC, so the FC value was regarded as a good estimate of the final complexity. The test generator in project X had good convergence, but at FC in project Y the convergence was not yet as good as for project X. However, project Y appeared to have the same convergence gap as project X, i.e. α_Y = α_X. A graph of the convergence for these 2 projects is illustrated in Fig. 5.6.
Fig. 5.6. Forecasting project Y from project X
To achieve the same level of risk at tape-out for project Y, the relative complexity of the 2 targets and the relative power of the test generator on each target are used to determine the number of cycles that target Y must survive without any faulty behavior. Both the complexity of a target and the power of the test generator contribute to the resulting convergence of the generator on the target. So, how must the number of cycles be scaled to achieve the same level of risk? Determining this scaling factor requires a bit of math but the result is the pleasingly simple ratio of beta cycles for each target. In general, the coverage achieved by running N cycles of tests is:
coverage(N) = (1 − α)(1 − e^(−βN)) .   (5.4)
The variable β is simply the scaling factor that determines the rise time of the curve.
After N_β cycles the coverage will be (1 − α) · 2e^(−1), so

(1 − α) · 2e^(−1) = (1 − α) · (1 − e^(−βN_β))
2e^(−1) = 1 − e^(−βN_β)
e^(−βN_β) = 1 − 2e^(−1)   (5.5)
−βN_β = log(1 − 2e^(−1))
β = −log(1 − 2e^(−1)) / N_β
We have derived this general closed-form expression for β so that we can use it below. Because the coverage achieved by a full regression must be the same for each target, we can find the number of regression cycles for Y from the value for X as follows:
coverage(N_RY) = coverage(N_RX)
(1 − α)(1 − e^(−β_Y · N_RY)) = (1 − α)(1 − e^(−β_X · N_RX))
β_Y · N_RY = β_X · N_RX   (5.6)
N_RY / N_RX = β_X / β_Y = [−log(1 − 2e^(−1)) / N_βX] / [−log(1 − 2e^(−1)) / N_βY] = N_βY / N_βX
So, the scaling of regression cycles for Y is straightforward:
N_RY = (N_βY / N_βX) · N_RX   (5.7)
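Numerically the scaling is a one-liner. The figures in the example below are invented purely to show the arithmetic and do not come from any real project.

    def scale_regression(n_beta_x, n_beta_y, n_r_x):
        """Eq. 5.7: regression cycles needed for project Y, given project X's
        regression length and the beta cycles measured for each generator/target
        pair.  Assumes both projects have the same convergence gap alpha."""
        return (n_beta_y / n_beta_x) * n_r_x

    # Hypothetical numbers: X regressed for 2e9 cycles with N_beta = 5e6 cycles;
    # Y's generator reaches its beta point only after 2e7 cycles.
    n_r_y = scale_regression(5e6, 2e7, 2e9)   # 8e9 cycles for project Y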
5.6 Normalizing Cycle Counts with Complexity By scaling the number of regression cycles by the complexity of the target we can compare cycle counts needed for regression of different targets. This normalized cycle count QR is defined as:
Q_R = N_R / Z   (5.8)
This constant Q_R based on empirical data is the quantity of pseudo-random regression that takes into account both the power of the test generator and the complexity of the target. Of course, the same estimate of complexity must be used for generating normalized values to be used together. This normalized measure of clock cycles can be used to compare diverse projects, and then to allocate resources according to the complexity of the targets and the power of the test generator with respect to the target. An organization that has accumulated data over a number of projects K can compute a mean value for Q_R as the geometric mean of the individual values of Q_R:

Q_R = ( ∏_{i=1}^{K} Q_Ri )^(1/K)   (5.9)
A design organization with an empirically determined Q_R can then forecast the amount of regression N_R for some new design by estimating its complexity Z and running sufficient pseudo-random simulations to estimate N_β. Then a suitable estimate for the number of regression cycles for this new design is simply:
N_R = Q_R · Z .   (5.10)
When managing development of multiple devices, allocating resources suitably across projects can be made more effective by understanding the value of Q for your organization. Similarly, as the organization makes changes (tools used, staffing levels, etc.) the effectiveness of these changes is reflected in a change in Q (lower is better).
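Eqs. 5.8 through 5.10 translate directly into a few lines of code. The sketch below is illustrative only and assumes that complexity estimates of the same kind (Z_G or Z_V, but not a mixture) are used throughout.

    import math

    def q_r(n_r, z):
        """Eq. 5.8: normalized regression cycles for one completed project."""
        return n_r / z

    def mean_q(q_values):
        """Eq. 5.9: geometric mean of Q_R over K past projects."""
        return math.prod(q_values) ** (1.0 / len(q_values))

    def forecast_cycles(q_mean, z_new):
        """Eq. 5.10: forecast regression cycles for a new target of complexity Z,
        using the organization's empirically determined Q_R."""
        return q_mean * z_new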
But, what if the convergence gap α is not the same from project to project? The foregoing analysis would be complicated by differing convergence gaps, but in practice this gap can be disregarded in either of 2 situations: • make the convergence gap as near to zero as possible with suitable adjustments and modifications to the test generator, or • determine that uncovered logic in the target is redundant or unreachable and simply disregard it in computing coverage.
5.7 Using Normalized Cycles in Risk Assessment Q can be a useful measure to determine the risk of a bug in the device after tape-out. Consider the example of 10 tape-outs of a number of different devices. Plotting the number of bugs exposed in the device against the Q for regression prior to tape-out can enable the manager to establish the resources needed to achieve a sufficiently low level of risk (bugs in the device).
Fig. 5.7. Post tape-out bugs as a function of Q
Here it is apparent that, using this organization’s talent and resources, a level of Q of 6000 or higher is needed for a lower-risk tape-out. However, values in the 4000 to 6000 range might also be acceptable if time-to-market considerations require taking on somewhat higher risk at tape-out. It’s also apparent that a value of Q lower than 4000 is quite risky for a tape-out because, historically, that quantity of exercise has left too many unexposed bugs.
5.8 Bug Count as a Function of Complexity One potential measure that may have practical value in estimating the number of bugs in a target is to consider the ratio of bugs to complexity. This may be found to be a simple linear relationship or perhaps some other relationship. Nevertheless, an organization that has empirical data on bug counts and complexity may be able to make more accurate predictions of expected bug counts for new verification targets and thereby plan more effectively. Considering the example provided in Fig. 4.15 and adding upper and lower bounds for expected bug count, we arrive at Fig. 5.8.
Fig. 5.8. Forecasting lower and upper bounds for bug count
In this figure, estimates for lower and upper bounds on bug count have been shown, and it appears that the tapering off might be misleading. There might actually be more bugs yet to be discovered if the improvement in RTL code quality cannot be accounted for by other means. In our example, CRV testing with error imposition hasn't yet begun, and there are more bugs associated with that functionality that remain to be discovered.
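A crude way to derive such bounds from organizational history is sketched below; it assumes the simple linear bugs-to-complexity relationship mentioned above, which is only one of the possible relationships an organization might find in its data.

    def bug_bound_forecast(history, z_new):
        """history: list of (complexity, total_bugs) pairs from completed projects.
        Returns (low, high) expected bug counts for a new target of complexity
        z_new, using the minimum and maximum observed bugs-per-unit-complexity
        ratios as rough lower and upper bounds."""
        ratios = [bugs / z for z, bugs in history]
        return min(ratios) * z_new, max(ratios) * z_new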
5.9 Comparing Size and Complexity If we can determine the size of the functional space, why bother trying to convert that precise value to some approximation of complexity, itself a “thing” that is often not well defined, even in this book? Complexity (whatever it is) still exists, but if measures are instead based on the precise size of the functional space, no information is lost. Functional closure is based on the size of the functional space. The number of bugs encountered along the way is based (partly) on the complexity and partly on many other factors. Even though we are using the size of the functional space as an estimate for complexity, it should be remembered that size and complexity are, in fact, two different concepts that provide insights into different verification problems.
5.10 Summary A measure of a target's complexity, based on analysis of its VTG early in a project or on analysis of a special synthesis run late in the project, enables normalization of data with respect to that target. As the complexity of targets grows, so does the number of computations they can perform. Consequently, more cycles of CRV are needed to explore this computational space thoroughly. In the next chapter we will see how to obtain measures of this thoroughness by examining standard variables.
Chapter 6 – Analyzing Results
Previous chapters have defined standard variables and how their values are related in value transition graphs that provide the basis for determining when functional closure has been achieved. We have also had a brisk overview of practical considerations relevant to managing verification projects to produce the standard results needed for a data-driven risk assessment. This chapter will discuss how to make use of these results to gain understanding of the level of functional coverage indicated by these results by examining standard views and standard measures. Standard views and measures are the IP analog of mechanical drawings and tolerances. The market for interchangeable parts thrives partly because of these standards, forsaking a proliferation of inches and meters and furlongs and cubits. As standards bodies define the standard variables and their ranges for popular interface standards, IP providers and integrators will have a clear concise language for communicating product capabilities and customer needs.
6.1 Functional Coverage Bailey (p. 82) defines functional coverage as "a user-defined metric that reflects the degree to which functional features have been exercised during the verification process." The accuracy of this definition is apparent in the many such metrics that have been defined by users over the years. It can be quite instructive to survey the literature for a current understanding of functional coverage. We have seen that 100% coverage for an instance in its context is when all possible unique trajectories to every function point have been traversed. The industry does not yet have tools to compute those trajectories for us, but formal verification may some day achieve such a feat. It's worth noting that some function points might be reached along only one single functional trajectory, but it's more likely that a given function point will be reachable along numerous trajectories, requiring multiple visits to such points to achieve 100% functional coverage.
One vendor’s tool has a “coverage engine [that] keeps track of the number of times the behavior occurs.” This corresponds to the traversal of a particular functional trajectory in a VTG. Any behavior can be described as a trajectory through the VTG containing the function points whose values are assigned (e.g., to stimuli and conditions, etc.) or produced (as a response) in the behavior. Because verification tools are not yet able to produce 100% exhaustive functional coverage, the phrase “thoroughly exercise” is the operative norm. How thoroughly must a target be exercised with CRV prior to tapeout (or shipments)? CRV must be sufficiently thorough to achieve the risk goals declared in the verification plan. Risk assessment is the topic of the next chapter. This chapter will consider the various methods available to evaluate the thoroughness of coverage, both with numerical measures and with visual views. It is on the basis of these measures and views that risk assessment is conducted.
6.2 Standard Results for Analysis Producing the standard results yields, in effect, a coverage database suitable for data mining (Raynaud, pp. 4-5). Modern tools already have the ability to transform raw data into certain views and measures. The move to standard results for standard variables allows comparison of differing projects and, in particular, IP offerings from different vendors. The industry is already rather close in presentation of views and measures. What remains is the partitioning of the results in a standard, universal manner. To facilitate creation of actual standards, the measures and views derived from these results must be economical to generate. Formal verification using theorem proving and model-checking are not yet economical for commercial designs. However, generating useful measures and views from standard results is already possible with commercially available verification tools.
6.3 Statistically Sampling the Function Space The functional space as defined by the variables and their ranges may be prohibitively large for complete analysis in a practical timeframe. Rather than examining all values of all variables attained over the course of regression, one may:
• examine all values of a sample of variables to identify coverage holes (unexercised values, value duples, tuples, etc.) • define randomly-chosen functional points (value tuples) in the functional space and observe whether they are visited or not, how often, etc.
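The second approach lends itself to a very small sampling script, sketched here under the assumption that the variable ranges are available as explicit value lists; the representation is illustrative rather than prescribed.

    import random

    def sample_function_points(ranges, k, seed=0):
        """ranges: dict mapping each standard variable to the list of values in its
        range.  Draws k function points (value tuples) uniformly at random; each
        sampled point can then be looked up in the regression database to see
        whether and how often it was visited."""
        rng = random.Random(seed)
        names = sorted(ranges)
        return [tuple(rng.choice(ranges[name]) for name in names) for _ in range(k)]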
6.4 Measures of Coverage A measure is simply a numerical indication of some quantity. There are formal definitions for measure in a mathematical sense that can be applied to all manner of finite and infinite sets. Fortunately, the measures that are useful for data-driven risk assessment are familiar and straightforward.
Fig. 6.1. Analyzing Results with Standard Views and Measures
Chapter 5 discussed how data can be normalized according to the complexity of the target, whether determined by analyzing the VTG for the target early in the project, or by analyzing a special form of synthesis later in the project. The first (and very familiar) measure itself incorporates a form of normalization, one based on lines of code to express the RTL.
6.5 Code Coverage The process of simulating RTL requires that lines of code be evaluated in some order to determine logical values as time (measured in clock cycles) progresses. Code coverage tools keep track of whether or not the simulator has ever evaluated a given component of code, such as a line or an expression. This is accomplished by instrumenting the code with special procedures that record whether or not the component of code was simulated. Then, when some quantity of simulation has completed (such as a regression run), the count of components simulated is divided by the total count of those components. In theory this appears to be an excellent measure for coverage, but in practice it can prove to be clumsy and awkward to use. Moreover, because it considers each code component individually, it cannot account for interactions among the many components that make up a target’s RTL, and herein lies its weakness. Nevertheless, code coverage still has great usefulness as an indicator of the incompleteness of verification. If code coverage is below 100%, then functional coverage is certainly below 100%. However, the inverse is not true. The clumsiness in using code coverage arises from the fact that most RTL in use for current designs is either designed for intentional re-use or is being re-used (leveraged) from prior designs. Consequently, this RTL intentionally contains code components that might never be used in a given target. This so-called dead code can be very difficult to exclude from code coverage computations and this leads to the clumsiness in using these tools. Many days of painstaking engineering work can be consumed in analyzing a coverage report to determine which unexercised components are truly valid for a target and which are not valid (i.e., dead code). But, there’s an even more important reason why code coverage is not a good indicator of functional coverage. Code coverage tools only examine the code given to them. These tools cannot determine whether the code completely implements the target’s
required functionality. So, if 100% code coverage is achieved but the code does not implement some requirement, the target is not verified. One large CAD firm sums up the situation well in a whitepaper, stating that "three inherent limitations of code coverage tools are: overlooking non-implemented features, the inability to measure the interaction between multiple modules and the ability to measure simultaneous events or sequences of events." Nevertheless, code coverage tools can be very helpful in understanding certain coverage issues. For example, a block designer may want to understand whether his or her block-level tests have taken some state machine through its paces completely. One vendor's tool "paints the states and transition arcs [of state machines] in colors indicating verified, unverified, and partially verified." By "verified" what is actually meant is merely that the state or arc has been visited as often as the user thought would be sufficient. Understanding which arcs have not yet been "verified" informs the block designer where additional tests are needed. This example, by the way, is one where code coverage can be very effective: when an RTL designer is writing code from scratch for use as a single instance. In this case, every line of code is part of the target's functionality and every line must be exercised. Code coverage should be evaluated separately for each clock domain of each target of verification. This helps to relate the coverage achieved to the number of clock cycles needed to achieve it, such as for computing a value for convergence of a test generator. Additionally, code coverage should be evaluated on a code module basis and not on a code instance basis if one wishes to understand the coverage of the RTL of that module. On the other hand, if one wishes to understand the coverage of the RTL using that module, code coverage would be evaluated on a code instance basis. If, for example, the RTL using the module has 100% code coverage, but the module does not, perhaps code is missing from the RTL using the module. This is one of the key problems in using code coverage. It says nothing about code which is not present. Standard measures for code coverage have been in place, more or less, for many years. Keating (pp. 174-177) describes six measures based on code coverage and states requirements for each as follows:
• Statement: 100%
• Branch: 100%
• Condition1: 100%
1 Note that the use of the term "condition" in the context of code coverage does not refer to the variables of condition in the standard framework.
• Path (considered to be secondary with less than 100% required)
• Toggle (considered to be secondary with less than 100% required)
• Trigger (considered to be secondary with less than 100% required)
Other definitions for code coverage exist as well (Bailey, p. 81). Code coverage tools do provide a mechanism for identifying code that is to be excluded from coverage computations for whatever reason. The instrumentation added to the RTL by the tool can slow simulation speed dramatically because the procedure calls made for each instrumented component do take time to execute. So, one might choose to instrument only certain modules so that coverage on those modules can be evaluated more quickly than by running code coverage on the entire RTL. Another situation that merits excluding certain fragments of code is when the code is deemed to be intentionally unreachable (or dead code). For example, RTL constituting multi-instance IP (intellectual property that is intended to generate a variety of different instantiations) might contain code which is dead for certain instantiations. To prevent this known dead code from being included in the code coverage computations, pragmas such as "cov-off" and "cov-on" are used to delineate such code segments. However, exercised code within a cov-off/cov-on pair constitutes an error of some sort. If the code has been deemed to be unreachable but has, in fact, been reached (subjected to evaluation by the simulator), then there is a bug in the code or there is an error in the understanding of the dead code that is actually quite alive. It's important to track whether any dead code has been reached because some error exists somewhere. IP actually often contains dead code, and this is quite ordinary and not generally a cause for alarm. For example, RTL that is designed as IP to be instantiated with different internal connectivities depending on the IP integrator's needs might contain one or more state machines with arcs that are traversed only for certain internal connectivities. It can be difficult to remove or disable the code corresponding to these arcs, and this will be reported as uncovered statements, branches, or expressions, etc.
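Tracking whether any pragma-excluded code was nonetheless reached can be automated with a small cross-check. The region and coverage formats below are assumptions made for the sketch, since every coverage tool reports this information differently.

    def check_dead_code(cov_off_regions, exercised_lines):
        """cov_off_regions: list of (file, first_line, last_line) tuples taken from
        cov-off/cov-on pragma pairs; exercised_lines: set of (file, line) pairs the
        simulator actually evaluated.  Any hit inside a supposedly dead region
        flags either an RTL bug or a wrong assumption about which code is dead."""
        hits = []
        for path, first, last in cov_off_regions:
            hits.extend((f, ln) for (f, ln) in exercised_lines
                        if f == path and first <= ln <= last)
        return hits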
6.6 State Reachability in State Machines Another indicator of the power of a test generator is its ability to reach all states in the target rapidly. Code coverage tools indicate the reachability of states within the individual state machines of the target, but do not indicate the reachability of composite state. A given target will
Fig. 6.2. Composite state pairs
typically contain numerous state machines, some of which operate in close cooperation and others that operate more independently of one another. Consider the example of two state machines, A and B, each of which has 4 possible states (see Fig. 6.2). The number of theoretical composite states is 16. However, in this example the actual number is smaller, because some states in each machine are mutually exclusive, i.e. they cannot both exist during the same clock cycle (in Fig. 6.2 these nonexistent pairs of states are indicated by the “X”). The reachability of such state pairs may be determinable using formal verification techniques. The number of cycles needed to reach all state pairs is also an indicator of the power of the test generator. Of course, any reachable pair that is not reached by the test generator indicates a coverage hole in which bugs might remain unexposed.
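A sketch of the bookkeeping follows, assuming the mutually exclusive pairs have already been identified (by inspection or formally) and that composite states observed during regression have been logged; the data structures are illustrative only.

    import itertools

    def composite_state_coverage(states_a, states_b, excluded, visited):
        """states_a, states_b: the states of two machines; excluded: set of mutually
        exclusive (a, b) pairs (the 'X' entries of Fig. 6.2); visited: set of
        composite states observed during regression.  Returns the reachable pairs
        and the reachable pairs never visited (coverage holes)."""
        reachable = {pair for pair in itertools.product(states_a, states_b)
                     if pair not in excluded}
        return reachable, reachable - visited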
6.7 Arc Traversability in State Machines It is sometimes useful to apply the technique for determining valid composite states to determine valid composite arc traversals. After weeding out all mutually exclusive pairs of arcs from 2 state machines, a 2-dimensional view of the pairs of arcs can indicate unexercised functionality where unexposed bugs might still be located.
6.8 Fault Coverage If fault coverage is evaluated for the regression suite, the results can be indicative of how thoroughly the regression suite has exercised the target. The fault coverage of the manufacturing test suite is not directly applicable, because that suite makes use of the testability morph to achieve sufficiently high fault coverage.
A common rule of thumb states that the functional regression suite should achieve about 80% fault coverage (using a single stuck-at model). A low value of fault coverage from the functional regression suite is an indicator of a weak regression suite. Examining the fault coverage for each module in the target can also indicate which modules are not well exercised by the verification software. Of course, fault coverage is only an indirect measure of functional coverage, and the actual relation between fault coverage and functional coverage is one of correlation. Historical organizational data can be mined to determine this correlation.
6.9 VTG Coverage We have determined that the entire functional space of an instantiated target can be characterized by the points and arcs as represented in VTGs. A coverage measure corresponding to the fraction of arcs traversed is a true indicator of the degree of exercise of a target. After all functional arcs have been traversed at least once, there is no more functionality to exercise. VTG arc coverage is the measure for functional closure. There is no stronger measure of coverage than VTG arc coverage. High VTG-arc coverage will be found to correspond to high fault coverage of a target without morphing the target for testability. This level of fault coverage is achieved by exercising the functional space of morph0 without resorting to using morph1. However, achieving this level of fault coverage using morph0 will require significantly more test time and this becomes expensive when producing parts in large volume. Consequently, manufacturing tests are applied using morph1 which is specifically designed to achieve the required level of fault coverage in minimal time.
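Given the same adjacency-list representation of the condensed VTG used for Z_V in chapter 5, the measure itself is a simple ratio. This is a sketch of the computation only; collecting the set of traversed arcs from regression data is the hard part.

    def vtg_arc_coverage(vtg, traversed):
        """vtg: the condensed VTG as an adjacency list; traversed: set of
        (from_point, to_point) arcs observed during regression.  Returns the
        fraction of functional arcs exercised, the measure of functional closure."""
        all_arcs = {(src, dst) for src, dsts in vtg.items() for dst in dsts}
        return len(all_arcs & traversed) / len(all_arcs)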
6.10 Strong Measures and Weak Measures The various measures of coverage available differ in the degree to which they indicate how thoroughly the target has been exercised. Code coverage measures are increasingly stronger as the granularity of measurement is made finer. Line coverage is a rather gross approximation of actual coverage as compared to expression coverage that scrutinizes the target’s RTL more closely. At the other extreme is VTG arc coverage, which as we learned in chapter 3 is the basis for determining functional closure. Other measures of
Fig. 6.3. Relative strength of measures of coverage
coverage fall between these two extremes as shown in Fig. 6.3. The relative positions of the various measures along the vertical axis of increasing strength are only approximate, but serve to indicate how strongly each should be considered when assessing risk at tape-out, the topic of the next chapter.
6.11 Standard Measures of Function Space Coverage In our discussion on standard measures (and later on standard views) it’s necessary to keep in mind that the measures described in this section are only the beginning of the evolution of widely-adopted standards with regard to measuring functional coverage for an IC or other digital system. The measures described in the remainder of this section might better be regarded as candidates for inclusion in a standard.
Useful and productive candidates will be embraced by the verification community and, eventually, by standards-making bodies. Those that are either not useful or not productive will fall by the wayside. Accumulation of empirical data within the industry will lead to the discovery of other more useful and productive measures that will add to the standard or replace some existing measures. Regression results are retained for the purpose of generating standard measures by way of data-mining. Ideally, complete execution histories of all tests would be saved for later analysis. However, this is not always practical due to the considerable disk space consumed by the simulation results. But, considering the economics of cheap disk storage vs. expensive mask sets, this storage can be a very worthwhile investment. The standard measures are the basic measures for how thoroughly each of the various spaces is explored and/or exercised. Many of these measures are simply the fraction of possible things exercised, what is commonly called coverage. Another term commonly used in this analysis is visit. Some commercially available verification tools allow the user to establish some minimum number of visits to a given thing (statement, state machine arc, value, etc.) before claiming that the thing has been covered or verified. Earlier in this chapter it was mentioned that some tools will paint an arc depending on whether the arc has never been traversed (visited), has been traversed at least the number of times specified by the user, or falls somewhere in between.
6.12 Specific Measures and General Measures It is necessary to account for the four different possible ways in which coverage might be measured for a given target or body of RTL. These are shown in Fig. 6.4. The subspace of connectivity can be usefully regarded as containing 4 non-intersecting quadrants. Quadrant I is without variability. That is, the values of variables of connectivity are all constants within this quadrant. This corresponds to designing an IC for use in one and only one system. Many conventional designs fall into this quadrant, and verification coverage of these designs is where most industry attention has been focused.
Fig. 6.4. Four quadrants of subspace of connectivity
Quadrant II corresponds to designs in which the target is intended to be used in a variety of systems. Many general-purpose ICs fall into this quadrant, such as a microprocessor designed to work with multiple memory subsystems and I/O subsystems. Quadrants III and IV correspond to the design space for commercial IP. RTL can be generated with differing values for variables of internal connectivity per the needs of the IP integrator. Quadrant III may be of little interest but quadrant IV contains most commercial IP available currently. Only one of these four quadrants (quadrant I) can be characterized by specific measures of functional coverage. Code coverage, for example, is only meaningful for a single instance being simulated in a single context. The other three quadrants (II, III, and IV) can only be characterized by general measures of functional coverage. These general measures can be useful to the IP integrator in evaluating a producer’s product in a broad sense, but if the intended usage of the IP is known, then the specific measures can be applied. To know the intended usage means that values for all variables of internal and external (to the IP) connectivity have been determined. Chapter 7 will have more to say on this topic.
6.13 Specific Measures for Quadrant I

The standard measures for verification of a single instance in a single context include the familiar measures of code coverage and somewhat less familiar measures based on state machines. In addition to these measures, all of which should be readily available using commercial verification tools, the standard measures include the much stronger measures of VTG point coverage and VTG arc coverage.2 Coverage of standard variables is implicit in VTG coverage, but this coverage could also be included explicitly as standard specific measures (expressed as a fraction of the total count of values of the variables as in Table 6.2). To provide a more complete picture and for eventual comparison with other projects (or to compare IP products) both the complexity and the quantity of regression with CRV are included as standard measures. This set of measures is summarized in Table 6.1 below. Note that these measures must be applied to each clock domain in the target individually.

Table 6.1. Standard Specific Measures

  Type of measure              Value for measure
  Code coverage                % statement, % branch, % condition,
                               % path (optional), % toggle (optional),
                               % trigger (optional)
  For each state machine       % states reached, % arcs traversed
  VTG (uncondensed) coverage   % function points, % function arcs
  VTG (condensed) coverage     % function points, % function arcs
  Complexity                   ZG, ZV
  Regression cycles            NR, QZG, QZV

2 At the time of publication, commercial tools that readily determine VTG coverage are not yet available.
6.14 General Measures for Quadrants II, III, and IV

The sets of measures that provide a broad view of coverage of RTL used in multiple contexts or used to generate multiple instances (or both) are listed in Table 6.2.

Table 6.2. Standard General Measures

  Standard Variables          Within subspace   Overall*
  Connectivity                                  %
    - instance                %
    - context                 %
  Activation                                    %
    - power                   %
    - clocking                %
    - reset                   %
  Condition                                     %
    - internal direct         %
    - internal indirect       %
    - external direct         %
    - external indirect       %
  Stimulus                                      %
    - composition             %
    - time                    %
    - errors in composition   %
    - errors in time          %
  Response                                      %
    - composition             %
    - time                    %

  * These entries are the fraction, expressed as a percentage, of the values of the indicated variable that have been exercised.
6.15 Multiple Clock Domains

The measures listed in Tables 6.1 and 6.2 do not quite give a sufficiently clear picture of coverage because clock-domain crossing (CDC) signals aren't sufficiently addressed by any of these measures. Fortunately, the design issues related to synchronizing signals between two clock domains are well understood.
Commercially available tools are now capable of analyzing RTL statically for correct clock-domain crossings with the exception of handshaking protocols (as discussed in Chapter 4). The use of these tools is routine in many development organizations and their cost need not be prohibitive when risk abatement is considered. The reports generated for RTL containing one or more clock-domain crossings should be free of any errors or warnings. The results of this analysis with these tools should be included as a matter of course with the standard measures. Finally, the specific combinations of clock frequencies used during verification should be declared along with the standard measures and the CDC report.
6.16 Views of Coverage

Having collected standard results, it is then possible to obtain standard views of these results. There are many excellent techniques for visualizing the coverage obtained within the regression results, many provided with the commercially available tools. For the purposes of illustrating the concept of visualizing the results to identify coverage holes, consider a couple of examples. In these views one is attempting to make judgments on whether the degree of exercise is sufficiently high such that the corresponding risk of an unexposed bug is sufficiently low to meet verification goals. Standard views use any of 3 different indicators for how thoroughly the target has been exercised:
• visits: If a value has been assigned to a variable over the course of regression, then that value is said to have been visited at least once. Similarly, if values have been assigned to 2 different variables, then that pair of values is said to have been visited at least once. The degree of exercise expressed as visits is a binary degree: the function point was visited (at least once) or it was never visited.
• cycles: The number of cycles during which a given variable was assigned a particular value is indicative of how thoroughly that function point was exercised. Cycles are meaningful for a single clock domain only. A target comprising 2 clock domains needs a separate cycle count for each domain.
• Q: This is a normalized view of cycles based on the complexity of the clock domain of the target or on the size of its corresponding VTG.
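As a concrete illustration of these three indicators, the following Python sketch computes them for a single variable from an archived regression trace. The trace format, the use of ITEMS_IN_QUEUE as the variable, the complexity figure Z used to normalize cycles into Q, and the minimum-visit threshold are all assumptions made for this example; no particular commercial tool is implied.

from collections import Counter

def exercise_indicators(trace, var_range, z_complexity, min_visits=1):
    """Compute the three degree-of-exercise indicators for one variable.

    trace        -- list of (cycle, value) samples, one per clock cycle
                    of a single clock domain
    var_range    -- iterable of all defined values (function points)
    z_complexity -- complexity figure Z used to normalize cycles into Q
    min_visits   -- visits required before a value is claimed as covered
    """
    cycles_per_value = Counter(value for _cycle, value in trace)

    # visits: binary indicator per function point (visited or not)
    visited = {v: cycles_per_value[v] >= min_visits for v in var_range}

    # cycles: raw cycle count for this clock domain
    total_cycles = len(trace)

    # Q: cycle count normalized by the complexity of the clock domain
    q = total_cycles / z_complexity if z_complexity else 0.0

    coverage = sum(visited.values()) / len(visited)
    return {"visited": visited, "coverage": coverage,
            "cycles": total_cycles, "Q": q}

# Hypothetical usage for the 8-entry queue analyzed in Chapter 3:
trace = [(n, n % 9) for n in range(10_000)]           # fabricated samples
report = exercise_indicators(trace, range(9), z_complexity=250.0)
print(f"coverage={report['coverage']:.0%}  Q={report['Q']:.1f}")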
Fig. 6.5. Example of 1-dimensional view
6.16.1 1-dimensional Views

A standard 1-dimensional view of the values attained by a single variable over the course of regression is readily visualized with the use of a histogram that plots the count of occurrences of a value against the range of values of that variable (see Fig. 6.5). An area of unexercised functionality is readily seen in the histogram. If there is any faulty behavior associated with these unexercised values, it will not be exposed during regression.

6.16.2 Pareto Views

A Pareto chart is simply a histogram whose data have been sorted by the y-axis value. This enables identification of functional regions that may need further exercising. Consider the example of composite state, and in particular the number of times that reachable composite states have been entered. Rather than counting the number of cycles that a given composite
Fig. 6.6. Example of Pareto View
state has been visited (idling state machines spend a lot of cycles without changing state), the number of times the composite state has been entered from a different state is much more enlightening. This example is illustrated in Fig. 6.6. This histogram clearly shows 3 composite states that were never entered over the course of pseudo-random regression. If there are any bugs associated with these composite states, they will not be exposed. Similarly, some composite states were entered rather infrequently, and the risk of an unexposed bug associated with these states is much higher than with those that have been entered much more frequently. Pareto views can provide important information about potential coverage holes when coverage is known not to be complete (exhaustive). This is usually the case for commercial designs. Consider as an example the 8-entry queue analyzed in Chapter 3. Assume that this queue, deeply buried within some complex design, has been exercised exhaustively by a test generator. That is, every VTG arc has been traversed at least once. Why
would there be any interest in a Pareto view of visits to the function points of ITEMS_IN_QUEUE? If such a view revealed that, in fact, the full-queue value had been visited relatively infrequently as compared to other values, that's a clue that perhaps other logic elsewhere within the target hasn't been exercised thoroughly enough and might still contain an unexposed bug. Dynamic behavior must be exercised throughout its range with particular focus on the boundaries (or extremes) of its defined behavior. Is the test generator able to fill and empty the queue? Can it do so frequently? Can it sustain a full or nearly full state for sufficiently long periods of time so that other areas of the target are thoroughly exercised? Without exhaustive coverage these questions should be answered with suitable views of coverage of variables associated with this dynamic behavior, such as the number of items in a queue or the number of bus accesses per unit time. This will be discussed again later in this chapter under the section about time-based views.

6.16.3 2-dimensional Views

A standard 2-dimensional view of the values attained by 2 chosen variables ("cross" items in the e language) over the course of regression is readily visualized with the use of a scatter diagram with a symbol plotted for every occurrence of a pair of values (see Fig. 6.7) recorded over the course of regression. For this particular pair of variables, a dependency exists between their ranges such that 6 pairs of values cannot occur. These create a vacant region in the functional subspace spanned by these 2 variables. Again, an area of unexercised functionality can be readily seen in the diagram, labeled as a coverage hole. Using color as a z-axis value, the count of occurrences can be indicated, and lightly covered areas can be determined from such a diagram.

6.16.4 Time-based Views

The temptation to characterize levels of functional coverage with some single figure of merit is great, but no such measures exist yet. Coverage of many types of functionality is often evaluated qualitatively, based on engineering knowledge of how the target is intended to function.
Fig. 6.7. Example of 2-dimensional view
Consider the example of a shared bus in a hypothetical target. It can be asserted that thoroughly exercising a shared bus requires that the verification software be able to drive it now and then to saturation and then to quiescence and, possibly, to other levels as defined by high- and low-water marks. Fig. 6.8 illustrates the results of exercising such a bus both well and poorly. The trace labeled "good" shows that the tests are able to cause accesses at the maximum rate (15 in this example) as well as at the minimum rate (zero in this example). The trace labeled "poor" is not able to saturate the bus and, indeed, does not exercise frequent (more than 8 per sample period) accesses at all.
Fig. 6.8. Example of time-based view
Because tools are not yet capable of producing exhaustive coverage in a practical timeframe, it remains necessary and prudent to develop good judgment regarding this sort of dynamic behavior. A time-based view of bus utilization that is largely uniform or that is not able to reach and sustain extremes of usage indicates that coverage is poor and unexposed bugs are likely to remain in the RTL. A time-based view that shows lots of burst behavior with sustained high levels and low levels of activity with both rapid and slow transitions between extremes indicates that coverage is much, much better and that the risk of unexposed bugs is much lower.
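A sketch of how such a time-based view might be constructed and judged is shown below. The sample-period width is an assumption; the saturation level of 15 accesses follows the hypothetical bus of Fig. 6.8.

def bus_utilization_view(access_cycles, total_cycles, period=100, saturation=15):
    """Bucket bus accesses into sample periods and flag weak dynamic coverage.

    access_cycles -- cycles on which a bus access was observed
    total_cycles  -- length of the regression run (one clock domain)
    period        -- width of each sample period, in cycles (assumed)
    saturation    -- access count regarded as saturating the bus (cf. Fig. 6.8)
    """
    buckets = [0] * ((total_cycles + period - 1) // period)
    for cycle in access_cycles:
        buckets[cycle // period] += 1

    reached_saturation = any(b >= saturation for b in buckets)
    reached_quiescence = any(b == 0 for b in buckets)
    # "Sustained" here means two or more consecutive periods at an extreme.
    sustained_high = any(a >= saturation and b >= saturation
                         for a, b in zip(buckets, buckets[1:]))
    sustained_low = any(a == 0 and b == 0 for a, b in zip(buckets, buckets[1:]))

    return {"trace": buckets,
            "saturation": reached_saturation, "quiescence": reached_quiescence,
            "sustained_high": sustained_high, "sustained_low": sustained_low}

A "good" regression in the sense of Fig. 6.8 would return True for all four flags; a largely uniform trace would fail the saturation and quiescence checks.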
6.17 Standard Views of Functional Coverage

The views described in this chapter are actually quite simple and obvious. It is not the type of views that must be standardized, but rather that which is viewed that merits standardization. In other words, the standard variables and their values must be made available for arbitrary viewing by interested
parties, whether they are from upper management or from a prospective customer of IP. These different views are all quite common and readily available with commercial verification tools. When the standard results are available to such viewers, then it is easy and economical to produce the views on demand. Data-mining of the standard results and “refining the ore” with these views can provide useful insights into the level of coverage achieved for the different subspaces of the design.
6.18 Summary

A verification methodology that takes full advantage of standard results will increase productivity in the long run. Everything about the verification of the target can be determined from these standard results. Until 100% functional coverage can be achieved in commercially interesting timeframes, it remains necessary to sample the results. If every view examined indicates thorough exercise with no gaping holes, then the risk of an unexposed functional bug is low. If, on the other hand, one out of ten views reveals a coverage hole, then the target has probably not been exercised thoroughly. Standard variables and the measures and views of them make it possible to have objective comparisons of projects and commercial IP. We will examine this more closely in the next chapter.
References

Keating M, Bricaud P (2002) Reuse Methodology Manual: for System-on-a-chip Designs, Third Edition. Kluwer Academic Publishers.
Piziali A (2004) Functional Verification Coverage Measurement and Analysis. Kluwer Academic Publishers.
Raynaud A (2002) Managing Coverage: A Perspective. Synopsys.
Tufte ER (2001) The Visual Display of Quantitative Information, Second Edition. Graphics Press.
Verisity Design, Inc. (2005) Coverage-Driven Functional Verification. Verisity Design, Inc.
Chapter 7 – Assessing Risk
So far in this book we have developed important new theory that explains what it means to achieve 100% functional coverage. This theory is based on a set of standard variables that account for the full range of variability within which our design must function. Standard results capture the values of these variables for analysis. If we achieve functional closure, we can tape-out without risk of a functional bug. But, until verification tools are available that can drive a system throughout its functional space, traversing every arc and visiting every point, we will have to live with a certain amount of risk. In this chapter we will learn how to assess the risk of a functional bug based on results of regression.
7.1 Making Decisions

Risk assessment is performed to facilitate good decision-making. This was first discussed in chapter 1 and, before diving into the topic of risk assessment, a brief detour into decision-making will help place risk analysis into proper context.

Conventional risk assessment has been well developed for industries in which safety is a consideration, primarily safety for people. Aircraft and nuclear power plants and medical implants must be designed not only with functionality in mind, but also with failure in mind. Things wear out with use and must undergo timely maintenance to sustain the required level of safety. Risk assessment for functional verification is somewhat different, but concepts from conventional risk assessment provide insightful background. The interested reader is referred to the detailed treatment in the Fault Tree Handbook by Vesely et al. for a thorough discussion of this subject, but a couple of extended quotations from this publication will provide some useful background.
Fig. 7.1. Decision making based on quantified risk assessment
First, an important reality check: It is possible to postulate an imaginary world in which no decisions are made until all the relevant information is assembled. This is a far cry from the everyday world in which decisions are forced on us by time, and not by the degree of completeness of our knowledge. We all have deadlines to meet. Furthermore, because it is generally impossible to have all the relevant data at the time the decisions must be made, we simply cannot know all the consequences of electing to take a particular course of action. On the difference between good decisions and correct decisions: The existence of the time constraint on the decision making process leads us to make a distinction between good decisions and correct decisions. We can classify a decision as good or
bad whenever we have the advantage of retrospect. I make a decision to buy 1000 shares of XYZ Corporation. Six months later, I find that the stock has risen 20 points. My original decision can now be classified as good. If, however, the stock has plummeted 20 points in the interim, I would have to conclude that my original decision was bad. Nevertheless, that original decision could very well have been correct if all the information available at the time had indicated a rosy future for XYZ Corporation.

We are concerned here with making correct decisions. To do this we require:
1. The identification of that information (or those data) that would be pertinent to the anticipated decision.
2. A systematic program for the acquisition of this pertinent information.
3. A rational assessment or analysis of the data so acquired.

In the preceding chapters the standard framework for functional verification has been defined and described in detail. The information pertinent to the anticipated decision appears in the form of values of standard variables collected during the verification process. Collecting this information in such a manner that standard measures and views can be produced constitutes a systematic program for information acquisition. This places us in the position of now being able to form a rational assessment of risk based on analysis of the information.
7.2 Some Background on Risk Assessment

Conventional risk assessment is concerned with understanding the extent of possible loss. In this sense risk is formed of two components: 1) some unlikely event, and 2) some undesirable consequence. Mathematically this is usually expressed as

risk = p(event) ⋅ c(event),   (7.1)
where p is the probability of the event and c is the consequence of the event. If one purchases a lottery ticket for $1.00 and the probability of not winning is extremely high, then the risk is pretty nearly the $1.00 paid for
the ticket. This example has an easily quantifiable consequence, but many consequences are not so simply described numerically. If one purchases cosmetic surgery for $10,000 but with an associated small risk of death or disability from surgical complications, the risk is considerably more than the original outlay of $10,000. Risk analysis typically progresses through several steps. The first step is identifying threats. Threats come in many forms, ranging from human (a lead engineer resigns) to operational (corporate re-organization delays progress by N weeks) to procedural (handoff of design data between engineering teams not well coordinated) to technical (verification tools do not work as advertised) to natural (hurricanes and earthquakes to name but a few). For the purposes of this book only one threat will be considered, that of an unexposed functional bug. That leaves the problem of estimating the probability of this undesirable event. Estimating this probability is the concern of the remainder of this chapter. In spite of the rather exact definition in Eq. 7.1 for risk as used by those concerned with product safety, traditional usage freely interchanges this term with “probability” and this book will honor this tradition. The “risk of a bug” and the “probability of a bug” will be taken to mean the same thing. Nevertheless the consequences of a given bug will depend on how it might affect the product’s success (see remarks about optional or opportunistic functionality in section 4.6).
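A worked example of Eq. 7.1, using the two purchases described above, might look like the following Python fragment. The probability values are assumed purely for illustration, and monetizing the consequence of surgical complications is of course a gross simplification.

def risk(p_event, c_event):
    # Eq. 7.1: risk is the probability of an unlikely event times its consequence.
    return p_event * c_event

# Lottery ticket: the consequence is essentially the $1.00 price of the ticket.
print(risk(0.9999999, 1.00))          # approximately $1.00
# Cosmetic surgery: a small assumed probability of complications, but a
# consequence valued far above the $10,000 outlay.
print(risk(0.001, 2_000_000.00))      # approximately $2,000 on top of the fee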
7.3 Successful Functional Verification

What constitutes a successful functional verification project? There are as many definitions of success as there are managers, and there is likely to be wide disagreement among the various definitions. A project that deploys a new set of verification tools without a hitch or achieves code coverage goals previously unattained or results in a number of new software inventions might be considered wildly successful by some managers and dismissed as routine by other managers. Failure, on the other hand, is usually much easier to pin down. If a product cannot be shipped due to functional bugs in the product, there will be generally strong agreement that the project to develop the product was a failure. Vesely provides a handy formalism for clearly articulating risk (see Fig. 7.2).
Fig. 7.2. Success and failure spaces
This figure illustrates the complementary nature of success and failure. There are degrees of success that readily map to corresponding degrees of failure. At the two extremes we have total success and total failure. What lies between these two extremes is what is normally experienced in the real world. The natural drive for engineers and managers in pursuit of professional excellence is for total success in what they do. However, experience shows that despite best efforts, the most that can be expected is something a bit less than total success. This point is indicated on the continuum with the designation “maximum anticipated success.” The other points on this continuum are likely to be unconsciously familiar to the reader. In golfing terms, now and then one enjoys the pleasant surprise of hitting that hole-in-one (total success), but the best that one expects on a given hole might be a birdie (maximum anticipated success). At the very least one might expect to break 80 (minimum anticipated success) for 18 holes, but there are days when one just barely breaks 100 (minimum acceptable success). This level of success can be costly in terms of replacing damaged clubs. Analogies are really not necessary here, however, because experienced verification engineers and managers will readily recognize degrees of success for a given verification project. Consider Fig. 7.3. Clearly if the product ships with fully functional first silicon, the project can be considered a total success. But, what might constitute intermediate levels of success? Real-world projects are executed within real-world constraints of scope, schedule, and resources. Few corporations can fund a parade of tape-outs without obtaining silicon worthy of shipping as product. Some limits will be placed on the amount of money (or engineers or simulation engines,
etc.) spent on a project before abandoning it if it doesn’t result in shippable product. The example illustrated in Fig. 7.3 uses a simple budgetary constraint to define these intermediate levels of success. Assume that $3,000,000 has been budgeted for purchasing masks for fabricating the IC. Using the expense assumptions shown in Table 1.2, this budget is sufficient to pay for 3 full mask sets (at $1,000,000 each), or 2 full mask sets and 2 metal-only mask sets (at $500,000 each).
Fig. 7.3. Example of budgeting for success
The budgeting manager knows that the first $1,000,000 will be spent for tape-out 1.0. However, experience says that despite best efforts on the part of everyone on the project, something always seems to require at least one metal turn. This is the maximum anticipated success and will cost the manager an additional $500,000 for the metal masks. At this point half of the budget ($1,500,000) will be spent. Experience also tells this manager that the most one can expect to need (based on prior projects) is a second metal-only tape-out. This is the minimum anticipated success and it costs $2,000,000 to achieve. The very least acceptable outcome is that a full spin is needed after these two metal-only tape-outs. This constitutes the minimum acceptable success and it costs everything that has been budgeted for tape-outs, $3,000,000. Note that this particular progression of tape-outs (from 1.0 to 1.1 to 1.2 to 2.0) is within the budget for mask sets, but the project might progress in other ways, based on decisions made during the project. The four possible progressions within this budget are:
• 1.0, 1.1, 1.2, 2.0 (shown in Fig. 7.3)
• 1.0, 1.1, 2.0, 2.1
• 1.0, 2.0, 2.1, 2.2
• 1.0, 2.0, 3.0
The manager responsible for approval of tape-out expenses will want to consider carefully what will be the best strategy for getting product to market. There is no budget for masks above and beyond what is listed above, so if 3.1 or 4.0 should become necessary, this would be regarded as failure. Budget is only one of several considerations in setting goals for risk. Of the manager's triad of scope, schedule, and resources, any one of these might be regarded as inflexible while the remaining two are regarded as flexible. In Fig. 7.3 resources are treated as inflexible - when the money is spent, that's the end.1

1 Startups are particularly vulnerable to this constraint. When the Nth round of financing is denied, the project is pretty much over.

Many products, especially seasonal consumer products such as games and toys, have financially unforgiving deadlines and schedule is inflexible. The budget column might instead be replaced with one showing schedule dates or perhaps product features (functionality) that can be sacrificed to complete the project. One further observation regarding Fig. 7.3 must be made, and that is that success is defined in terms of the manufactured article being shipped as a product (or as a component of a product). The manager responsible for
approving expenditures for a tape-out will not be interested in a functionally correct and complete design that fails to work properly due to other reasons (excessive heat dissipation, for example), and this manager's budget and organizational responsibility will extend well beyond functional verification, one of several factors affecting the ship-worthiness of the manufactured article. These factors include:
• functional
• electrical
• thermal
• mechanical
• process yield
A complete risk analysis must comprise all factors, but only the functional factor lies within the scope of this book. This same success/failure method could be used with IP vendors on whom the manager’s project is dependent. Can your IP vendor meet the terms that are imposed on your own development? How does your vendor view success in its varying degrees? Is there agreement between vendor and consumer? Is their view compatible with your project?
7.4 Knowledge and Risk

Risk can be said to be a consequence of our lack of knowledge about something. The less we know about an undertaking, the greater the risk of an undesirable outcome. Functional verification addresses the risk (or probability) that a bug is present whose manifestations as faulty or undesired behavior limit the usefulness of the device in its intended applications. Keating says (pp. 153, 156), "The verification goal must be for zero defects ... ideally achieving 100% confidence in the functionality of the design …" This is the correct goal for verification, but in the non-ideal world where ICs are developed and without true functional closure, risk will never actually reach zero. It approaches, but does not reach, zero.
Fig. 7.4. Data-driven risk assessment
A prudent strategy for managers of multiple ICs is to aim for accumulation of empirical data over time to develop a baseline for risk modeling. If you can obtain the standard results, you can apply the standard measures and enjoy the standard views. These measures and views can then enable you to evaluate the benefits, if any, of changes or intended improvements. Create your “organizational baseline” for risk vs. results. Acquired intuition from frequent, detailed interaction with the target of verification can prove invaluable in the search for bugs. Careful consideration of this intuition leads to good engineering judgment, and the verification manager who learns to recognize and to nurture this good judgment within the verification team will be rewarded with better verification results. Turnover of engineering staff erodes the team’s collective engineering judgment, and this should be kept in mind during planning and execution of any verification project.
7.5 Coverage and Risk

In chapter 5 we learned how to normalize functional coverage with respect to complexity of the target. This is what enables us to compare widely different projects. Coverage informs us as to how thoroughly a given target has been exercised. The more we know, the less risk we face at tape-out. Risk simply reflects what we do not know. If we knew perfectly the dynamics that will affect the final resting position of a pair of tumbling cubes, we would excel at shooting craps. But, because we do not know these dynamics perfectly, we rely on probability theory and statistics. Graphically, we can visualize the relationship between knowledge and risk (and its inverse, confidence) as shown in Fig. 7.5. The more thoroughly one exercises the target, the more knowledge one gains about the target and, therefore, the lower the risk of an unexposed bug.
Fig. 7.5. Knowledge reduces risk, increasing confidence
Pseudo-randomly generated tests may be said to sample the functional space randomly. Chapter 3 explained just how large this functional space can be. This justifies the use of statistical quality control for evaluation of risk. Because it is not commercially feasible to explore this vast functional space exhaustively, we rely on methods of statistical quality control to evaluate the probability of a functional bug.

Coverage is measured for each individual clock domain, because code is exercised (evaluated) only as the clock advances. A complete target is exercised as a whole, but coverage is measured and risk is assessed individually for each domain in the target. Consider a simple example of a target comprising 2 clock domains, with one clock running at a frequency 10 times that of the other. Can the logic in the slower clock domain be regarded as having been exercised to the same degree as the logic in the faster domain? Of course not. Because coverage is measured for each clock domain, risk can only be evaluated on the same basis. The overall risk (probability of unexposed functional bug) for the entire target can be computed mathematically as follows. For a target T comprising D domains, the probability p of an unexposed bug for the target is:
pT = 1 − (1 − p1) ⋅ (1 − p2) ⋅ ⋅⋅⋅ ⋅ (1 − pD).   (7.2)
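A direct transcription of Eq. 7.2 into Python is shown below; the two per-domain probabilities in the usage example are assumed values, with the slower, less thoroughly exercised domain carrying the higher risk.

def target_risk(domain_risks):
    """Eq. 7.2: probability of an unexposed bug anywhere in the target,
    given the per-clock-domain probabilities p1..pD."""
    p_no_bug = 1.0
    for p in domain_risks:
        p_no_bug *= (1.0 - p)
    return 1.0 - p_no_bug

# Hypothetical two-domain target: fast domain at 2% risk, slow domain at 10%.
print(target_risk([0.02, 0.10]))      # approximately 0.118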
7.6 Data-driven Risk Assessment

In chapter 1 an example of two projects A and B was discussed, with B having a more favorable outcome based on both time to market and on expenses incurred after first tape-out. The unanswered question from that example was how project C could possibly benefit from knowledge gained in the two prior projects. With standard measures and standard views, this becomes possible. A large design organization with multiple design projects employing standardized verification can accumulate empirical data from which the organization's risk vs. results curves can be developed. It will be possible to associate a given level of risk with, for example, the organization's mean value of Q for regression. Better still, multivariate correlation with coverage percentages over values of standard variables may give even better predictors for the outcome of a tape-out.
7.7 VTG Arc Coverage

Using standardized verification means that VTG coverage, and VTG arc coverage in particular, can provide a genuine indication of how thoroughly a target has been exercised. This new measure lacks direct empirical data, but experience with multiple verification projects using various toolsets can guide informed speculation, giving us a family of conjectural risk curves as shown in Fig. 7.6.

An examination of available data from past projects will bring confirmation to several assumptions underlying the curves in the figure. First, most regression suites exercise a rather small fraction of the total functional space as measured by VTG arcs. So, by exercising just this relatively small fraction of the total arcs, a large fraction of the functional bugs are exposed and the risk drops relatively quickly. Second, even if by exercising this initial fraction of arcs we have reduced the risk of a bug enormously, there remains this nagging doubt that, until 100% of the arcs have been exercised, there could still be that one last bug to be exposed. This nagging doubt gives the risk curve a long tail that eventually does reach 0% risk (probability of unexposed functional bug), but only at 100% coverage when that last reluctant arc has been traversed.
Fig. 7.6. Probability of unexposed bug as function of VTG arc coverage
On the other hand, this risk estimate is valid only to the extent that the functional space comprises the digital design described in the natural-language documents as interpreted into the specific variables, ranges, rules, and guidelines defined precisely in computer language (such as e or Vera). That is, the exhaustive exercise of the target (100% VTG coverage) means only that correctness and completeness of the target have been demonstrated. The regression suite says nothing about the presence of superfluous, hidden, or otherwise undocumented functionality. Designers of secure systems may need to consider these limitations to CRV during the planning process.

Some readers might speculate whether the risk curves in Fig. 7.6 are based upon coverage of the complete, uncondensed functional space or, instead, the condensed functional space. Accumulation of standard results within the industry will enable creation of risk curves based on arc coverage. As software tools incorporate emerging theory, this will become routine. But, until tools are able to assist in VTG generation and analysis, it remains necessary to fall back on currently available tools and data.
7.8 Using Q to Estimate Risk of a Bug

Q, the cycle count normalized by complexity, can be a useful measure for determining the risk of a bug in the device after tape-out. Consider the example of tape-outs of a number of different devices. The number of bugs exposed in silicon is shown in Table 7.1 along with the Q for regression for tape-out 1.0. These are tape-outs for 10 different ICs, not multiple tape-outs of a single IC. They are a sample (perhaps the only data we can lay our hands on) of historical results.

Table 7.1. Post-silicon bugs for 10 hypothetical projects

  Project       Q   bugs
  1         2,000     12
  2         3,000      4
  3         3,200      5
  4         3,500      7
  5         3,800      8
  6         4,300      6
  7         4,700      5
  8         5,500      3
  9         6,200      1
  10        9,000      0
Plotting the number of bugs exposed in the device against the Q for regression prior to tape-out 1.0 can enable the manager to establish the resources needed to achieve a sufficiently low level of risk (Fig. 7.7). Here it is apparent that, using this organization's talent and resources, a level of Q of about 6,000 or higher is needed for a lower-risk tape-out. However, values in the 4,000 to 6,000 range might also be acceptable if time-to-market considerations require taking on somewhat higher risk at tape-out. It's also apparent that a value of Q lower than 4,000 is quite risky for a tape-out because, historically, that quantity of exercise has left too many unexposed bugs. There is a better way to make use of these data, however. A cumulative distribution function, similar to the one for VTG arc coverage in Fig. 7.6, is needed.
Fig. 7.7. Post tape-out bugs vs. Q
Table 7.2. Q to clean tape-out

  Project       Q   bugs   Q to clean tape-out
  1         2,000     12                 2,400
  2         3,000      4                 3,600
  3         3,200      5                 3,840
  4         3,500      7                 4,200
  5         3,800      8                 4,560
  6         4,300      6                 5,160
  7         4,700      5                 5,640
  8         5,500      3                 6,600
  9         6,200      1                 7,440
  10        9,000      0                 9,000
For the purposes of this example, assume that the regression suite for each tape-out had to be expanded by 20% (an arbitrary value) to achieve a clean tape-out. That is, after reaching this level of Q and taping out, no further bugs were exposed after verification of the silicon and after its use as a product. Table 7.2 has the data of Table 7.1 with these additional data added. The lowest value of Q in the data (Table 7.2) establishes a lower bound. The available data indicate that no bug-free IC can be taped-out with less Q than 2,400. Regard the count of bugs as a red herring for a moment, and the remaining data reveal a model for risk at tape-out as a function of Q. Only one tape-out resulted in silicon with no unexposed functional bugs, so the regression suite needed no expansion for regression of successive tape-outs. A lowest value of Q for a clean tape-out has thus been established empirically at 9,000. Applying probability theory to these 10 data points we can derive a cumulative distribution function for the distribution of Q for clean tape-out. This results in the data in Table 7.3. Plotting the results from Table 7.3 results in the risk (probability of a bug) curve of Fig. 7.8.
Table 7.3. Developing the cumulative distribution function

  Q       targets still having unexposed bug(s)   Pr(bug)
  2,000   10 of 10                                100%
  2,400    9 of 10                                 90%
  3,600    8 of 10                                 80%
  3,840    7 of 10                                 70%
  4,200    6 of 10                                 60%
  4,560    5 of 10                                 50%
  5,160    4 of 10                                 40%
  5,640    3 of 10                                 30%
  6,600    2 of 10                                 20%
  7,440    1 of 10                                 10%
  9,000    0 of 10                                  0%
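The following sketch derives the same empirical risk curve from the "Q to clean tape-out" column of Table 7.2: for a given quantity of regression Q, the probability of an unexposed bug is estimated as the fraction of historical targets whose clean-tape-out threshold had not yet been reached. The function name and the choice of a simple step-wise estimate (rather than, say, the straight-line approximation discussed below) are illustrative.

# Q required for a clean tape-out, per project (Table 7.2)
q_clean = [2400, 3600, 3840, 4200, 4560, 5160, 5640, 6600, 7440, 9000]

def pr_bug(q, thresholds=q_clean):
    """Empirical probability that at least one bug remains unexposed at
    regression quantity q: the fraction of historical targets whose
    clean-tape-out threshold exceeds q (cf. Table 7.3)."""
    still_buggy = sum(1 for threshold in thresholds if q < threshold)
    return still_buggy / len(thresholds)

for q in (2000, 4000, 6000, 9000):
    print(q, f"{pr_bug(q):.0%}")
# prints: 2000 100%, 4000 70%, 6000 30%, 9000 0%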
Fig. 7.8. Risk profile based on historical tape-out results
This risk curve, based on a meager 10 data points, may be crude, but it represents a beginning. In fact, a simple straight line joining the two extreme values for Q may be an equally valid approximation for the cumulative distribution function. As more data are collected, organizational risk curves can be refined. And, over time other relationships among available data will be discovered that model risk more usefully so that goals can be set with more predictable achievement times and costs. Now that we have empirical data from past projects we can establish the level of success for which we are aiming according to the risk we are willing to tolerate at tape-out. Consider Fig. 7.9. Associating our expectations for success with risk is largely a matter of engineering and managerial judgment. Fortunately we have historical data from our organization to guide us in setting our goals for risk on our current project.
Fig. 7.9. Success and Risk
We are realists, and recognize that the very best outcome that we can expect (maximum anticipated success) is the best result achieved by our colleagues on a past project – and a little bit more, so we set the level of risk for our project aggressively. We are aiming, of course, for total success and that is in fact the only workable strategy for achieving our maximum anticipated success. If we aim for less, we will most likely never be able to achieve total success.
7.9 Bug Count as a Function of Z

Another potential measure that may have practical value in estimating the number of bugs in a target is to consider the ratio of bugs to complexity. This may be found to be a simple linear relationship or perhaps some other relationship. Nevertheless, an organization that has empirical data on bug counts and complexity may be able to make more accurate predictions of expected bug counts for new verification targets and thereby plan more effectively. This has already been discussed in chapter 5. Fig. 5.8 has an example relating bug count to complexity and using this as a potential indicator of whether or not a project is close to exposing all functional bugs. Using QR and Z does not require the existence of standard results for standard variables. What this means is that data archived from past projects can be analyzed to provide risk curves based on organizational history.
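As a sketch of the kind of organizational model this section suggests, the following Python fragment fits a simple proportional relationship between bug count and complexity Z from archived project data and uses it to predict the expected bug count of a new target. The historical data here are entirely hypothetical, and, as noted above, the true relationship may well not be linear.

def fit_bug_rate(history):
    """Least-squares fit of bugs ≈ k·Z through the origin, from
    (complexity Z, total bugs exposed) pairs archived from past projects."""
    num = sum(z * bugs for z, bugs in history)
    den = sum(z * z for z, _ in history)
    return num / den

# Hypothetical organizational history: (Z, bugs exposed over the project)
history = [(1200, 310), (2500, 640), (4100, 990), (5600, 1450)]
k = fit_bug_rate(history)
print(f"bugs per unit of complexity: {k:.3f}")
print(f"expected bugs for a new target with Z = 3000: {k * 3000:.0f}")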
7.10 Evaluating Commercial IP

Commercial IP is effectively just RTL developed by someone else and for which one pays a significant sum with the expectation that it constitutes a low-risk ingredient in one's own design. One major CAD company remarked in a whitepaper that, "The IP-developer may be intimately familiar with the micro-architecture of the block of functionality they have created, but have only a fuzzy idea about what the target system will look like." In fact, what this observation means is that the IP-developer has a detailed understanding of the variables of internal connectivity, but has not defined the variables of external connectivity with which their product will function correctly. Lacking a definition of the contexts in which the IP has been verified to work correctly, the burden falls on the IP integrator to verify the IP, somewhat defeating the purpose of using IP in the first place.
Standard variables allow the integrator to characterize their needs in terms of external (to the IP) connectivity and condition variables and then examine various offerings precisely in terms of those values. Generating standard results such that standard views and measures can be obtained gives IP developers and buyers a solid platform on which to grow so we can all get back to the business of inventing. Standards bodies will eventually begin to define standard variables and their ranges. Then IP producers will be able to disclose what values or ranges are supported and the degree to which they have been verified. This will make it possible for prospective customers to examine standard results prior to licensing.
Fig. 7.10. Evaluating IP Quality
It is extremely valuable to conduct one’s own risk assessment based on standard results provided by the vendors under consideration. And, if there are particular areas of concern, data-mining these results in greater detail will enable one to form a better assessment of risk for each vendor’s IP. If the IP is being acquired for broad deployment in a variety of applications, then the following measures and views are of interest. If the IP is being acquired for a single use, then the views and measures of unused connectivity are less relevant except as an indication of overall thoroughness of verification in general related to the IP. Of course, there are considerations in the buying decision other than verification, such as gate count, price, and so forth, but here we will restrict our attention to quality of verification.
7.11 Evaluating IP for a Single Application

The evaluation of IP products from multiple producers is greatly simplified when standardized verification is in effect. If the IP is to be used in a single application, this corresponds to quadrant I in Fig. 6.4. In this case the specific measures listed in Table 6.1 apply. Comparing one vendor's measures against another's (or against one's own internal measures) provides a valid indication of the level of risk (probability of a functional bug) entailed in using the product.
7.12 Nearest Neighbor Analysis

If a vendor's standard results do not contain any data that match the intended usage, then an analysis of nearest neighbors may provide some indication of risk. A formal approach to comparing IP (on the basis of verification, that is) for a particular use considers the accumulated relevance of the standard results to those for your intended usage. This formal approach may prove to be of limited practical value, but it illustrates how bugs can remain hidden even in "pre-verified" or "silicon-proven" IP. We begin with some useful definitions from coding theory:
• Hamming weight: the number of positions in a vector with non-zero values. From the standpoint of verification, we will understand this definition to be the number of positions in a vector with non-null values. If a value has not been assigned to a variable in a given subspace, the position corresponding to that variable will have the null value.
• Hamming distance: the number of positions in which 2 vectors differ in value. The Hamming weight of a vector is also its distance from the null vector.
First, consider the particular instance of the IP that corresponds to your needs. Does your instance appear in the standard results? If so, then some or all of the standard results relate directly to the defined instance because this exact instance produced all or some of the results. Formally we can state that the relevance ri = 1. In addition, if only some of the results were generated from your instance, this must also be noted. Call the count of relevant cycles ni. Of course, if the instance is comprised of multiple clock domains, then this analysis must be performed separately for each clock domain. For the purposes of the present discussion, assume that the instance contains only 1 clock domain.
If your instance does not appear in the standard results, then determine the nearest neighbor within the subspace of internal connectivity and compute the Hamming distance di between your instance and this nearest neighbor. This distance is simply the number of positions in which the nearest neighbor differs from your instance. Formally, we could say that for this neighbor's results the relevance ri = 1 − (di/pi), where pi is the number of positions in the vector of variables of internal connectivity. However, not all variables are created equal. Some variables have much greater influence on the behavior of a digital system than others. Consider the variables of connectivity for a USB host controller. A variable corresponding to the number of ports for the controller will not affect the behavior of the logic quite as much as a variable corresponding to the speed (HS vs. FS vs. LS) of a given port. A weighting factor for each variable could help give a better indication of the net relevance. In this case the dot product of the difference vector (with a value of one for each position in which the two vectors differ and a zero for each position in which they are the same) with the weighting vector gives the resulting weighted distance di.2

2 Existing standards bodies would bear the burden of assigning industry-standard verification weights to standard variables.

So, given a specific instance we can determine the relevance ri of the standard results to this instance as well as the number of cycles ni in the standard results that are produced by this instance (or some chosen nearest neighbor). Then, we continue the analysis by considering the context presented to the instance by our intended usage. If the results "contain" this context (i.e., some or all of the results were produced by the chosen instance or its
178
Chapter 7 – Assessing Risk
neighbor in the chosen context), then the relevance rk = 1 (we use k to represent context so that we can later use c to represent conditions). Otherwise, we find a neighboring context, compute its (weighted) distance from the chosen context and compute the resulting relevance. Then we can compute the relevance of the standard results to our instance/context pair as
r = ri ⋅ rk.   (7.3)
We also note the number of cycles nk, which, it will be observed, is less than or equal to the number of cycles ni. This analysis is then continued to consider the internal conditions and external conditions until we arrive at a final value for the relevance of the standard results as well as the number of cycles in the standard results associated with our intended use of the IP. Unless the match is exact, the relevance will be less (perhaps much less) than 1.
In practice this analysis will most likely be a bit more complex than our example, especially when dependent variables enter the picture. For example, given an instance of a USB host controller, for each port accommodated by the controller a new set of variables springs into existence (such as a variable for speed of the port). Computing the relevance can become a bit complicated as these various dependencies are handled. The general analysis proceeds as shown in the following table.

Table 7.4. Relevance of vendor's standard results

  subspace                 relevance   cycles                  net relevance         net relevant cycles
  instance i               ri          ni                      ri                    ni
  context k                rk          nk ≤ ni                 ri ⋅ rk               nk
  internal conditions ci   rci         nci ≤ nk ≤ ni           ri ⋅ rk ⋅ rci         nci
  external conditions ce   rce         nce ≤ nci ≤ nk ≤ ni     ri ⋅ rk ⋅ rci ⋅ rce   nce

The preceding table does not account for variables of activation, but a similar analysis can be conducted to determine how relevant the activation testing is to the IP's intended usage. The table is largely intended to illustrate the relevance of the tests (sequences of stimuli applied to the instance in its logical context, i.e. the excitation).
The point of this analysis is not so much to compute a value for the net relevance and, correspondingly, the risk of an undiscovered bug, as to realize just how quickly uncertainty and risk can accumulate when the many, many differences between instance, context, and conditions are scrutinized. Even "silicon-proven" IP can still contain bugs.
7.13 Summary

Standardized functional verification greatly simplifies the challenging task of developing bug-free digital systems. It provides a single framework (a linear algebra) for describing and comparing any digital system, whether a single IC or a system containing multiple ICs. We started with one basic premise:
From the standpoint of functional verification, the functionality of a digital system, including both the target of verification as well as its surrounding environment, is characterized completely by the variability inherent in the definition of the system.
The fundamental relations among the many values of variables in the functional space are defined as follows.
• Instantiation of a target in a context (i.e. a system) chooses a particular function point in the subspace of connectivity.
• Activation of a system is a particular functional trajectory from inactive to active through the subspace of activation, arriving at some (usually) persistent region of the subspace within which all subsequent activity takes place. Deactivation is a trajectory from active to inactive.
• Initialization of a system is a particular functional trajectory through the subspace of condition, also arriving at some (usually) persistent region of the subspace within which all subsequent activity takes place.
• Excitation drives the functional trajectory of the target's responses.
Some day it will be commonplace to write specifications for digital designs in terms of standard variables. Verification tools will make it easy to view and manipulate VTGs usefully for coverage analysis and risk assessment.
The set F of function points p, where p is a tuple of values of standard variables, is simply
F = {p | p ∈ (r(n), e(n), s(n), c(n), a(n), k(n))}.   (3.7)
The set of directed arcs connecting time-variant values is
A = {(p(n), p(n+1)) | p(n) ∈ F, p(n+1) ∈ F, and r(n+1) = f(r(n), e(n), s(n), c(n), a(n), k(n))}.   (3.8)

Faulty behavior is defined as
y(n) ∉ R, or (y(n−1), y(n)) ∉ A, or both.   (3.9)

The actual size S of the functional space of a given clock domain of a given instance in a given context is defined as
S = E(G).   (3.19)
The risk p of an unexposed functional bug for a target with D clock domains is
pT = 1 − (1 − p1) ⋅ (1 − p2) ⋅ ⋅⋅⋅ ⋅ (1 − pD).   (7.2)
Now we can finally talk about a cumulative distribution function like the one shown in Fig. 7.11.
Fig. 7.11. Functional Closure and Risk
There remains much to be done by way of researching these new concepts more deeply. Some will be found to be of great value, and others are likely to be deserted over time as richer veins of theory are mined to our industry’s advantage, especially those that give us the ability to achieve functional coverage for every tape-out.
References

Montgomery DC (2005) Introduction to Statistical Quality Control, Fifth Edition. Wiley.
Vesely WE, Goldberg FF, Roberts NH, Haasl DF (1981) Fault Tree Handbook (PDF). U.S. Nuclear Regulatory Commission, NUREG-0492, Washington, DC, USA. Retrieved on 2006-08-31.
Appendix – Functional Space of a Queue
This analysis applies to a queue of 8 entries with programmable high- and low-water marks and an internal indirect condition that gives priority to draining the queue. When the number of entries is equal to the high-water mark, the drain priority is set to 1 and remains there until the number of entries is equal to the low-water mark, when drain priority is reset to 0.
A.1 Basic 8-entry Queue

We begin by examining the operation of a simple 8-entry queue without high- and low-water marks to clarify where we are beginning our examination. The response variable ITEMS_IN_QUEUE has a range of [0..8]. That is, it can be empty or contain anywhere between 1 and 8 items. From any value except 8 an item can be added to the queue. Once the queue has 8 items, nothing more can be added. Likewise, from any value except 0 an item can be removed from the queue. At any point in time (that is to say, at any clock cycle) an item can be added to the queue (if it is not already full), removed from the queue (if it is not already empty), or nothing can happen (nothing is either added or removed).

Fig. A.1a illustrates the functional space of this queue in terms of the changes in value of the response variable ITEMS_IN_QUEUE. The value of this variable is shown within the ovals representing the function points of this space. This queue has 9 function points. It has 8 arcs representing the addition of an item to the queue, 8 arcs for the removal of an item from the queue, and 9 hold arcs for when nothing is added or removed. Therefore, this queue has a total of 25 arcs connecting the 9 function points in its functional space. Of these 9 function points we can observe that 7 values are condensable into the condensed function point [1..7], because every adjacent point in this range has congruent arrival and departure arcs. Fig. A.1b illustrates the condensed functional space of this 8-entry queue. It has only 3 function points and only 7 arcs connecting them. For this particular target, condensation greatly simplifies the functional space.
Fig. A.1a. Functional space of an 8-entry queue
Fig. A.1b. Condensed functional space of an 8-entry queue
A.2 Adding an Indirect Condition

Now we add a mechanism to this queue so that priority is given to draining the queue (removing items) when the number of items in the queue reaches some high-water mark. For our example, the high-water mark is set to the constant value of 6. An indirect condition is established on the following clock cycle when the number of items in the queue reaches the high-water mark. This indirect condition persists until the queue has been emptied. Even though the priority has been given to draining the queue, we still permit items to be added to the queue. Of course, it could be designed with a much stricter drain policy that prevents the addition of items to the queue until it has been emptied. Our example is not so strict, so even though priority has shifted to draining the queue, items can still be added (until it becomes full, of course).

Fig. A.2a illustrates the uncondensed functional space of such a queue. The pair of values in each oval represents the value of the response variable ITEMS_IN_QUEUE and the indirect condition variable we shall call DRAIN_PRIORITY. Items can be added to the queue (and removed) in normal operation up until the number of items reaches 6. Then the indirect condition is established (DRAIN_PRIORITY == 1) on the next clock cycle, whether an item is added during that same clock cycle or removed or neither added nor removed. This shifted operation appears on the right-hand side of the figure in which the pair of values shows that drain priority has been established. Items can be removed (preferably) or added to the queue, but when it eventually becomes empty the indirect condition reverts back to normal on the next clock cycle.

Three function points with DRAIN_PRIORITY == 0 are condensable and three function points with DRAIN_PRIORITY == 1 are condensable. Condensation removes 4 hold arcs, 4 add arcs, and 4 remove arcs. Fig. A.2b illustrates the condensed functional space of the queue after this addition of the high-water mark and the indirect condition that establishes drain priority.
Fig. A.2a. Reaching the high-water mark at 6 sets the indirect condition for drain priority
Fig. A.2b. Condensed functional space of queue with high-water mark
We can also add a low-water mark (at 2 for the purposes of our example) such that when the number of items in the queue reaches this value, the priority reverts back to normal at the next clock cycle. This is illustrated in Fig. A.3. Note that none of the function points are condensable, because no two adjacent points have congruent arrival and departure arcs.
Fig. A.3. Queue with low-water mark added at 2
A.3 Programmable High- and Low-water Marks

We can make one more refinement to the basic 8-entry queue by making both the high-water mark and the low-water mark programmable. That is, we can define two new variables of condition called HWM and LWM respectively. Then logic would be designed that creates two software-visible registers to hold these values. Upon activation of the queue these two values would default to some desired values, such as 7 and 2 respectively. Then at initialization either or both of these values can be re-written with other values to suit our intended operation. We can place constraints on the programming of the values such that the following relations always hold true:
LWM ∈ [1..6]    HWM ∈ [2..7]    LWM < HWM        (A.1)
These constraints might be enforced by logic that prevents software from writing incompatible values into the software-visible registers. Alternatively, the burden could be placed on the software engineer to ensure that only compatible values are written, leaving the operation of the queue undefined when incompatible values are programmed. To understand the resulting functional space completely, value transition graphs can be created for each pair of values of LWM and HWM. There are 42 such graphs in total: 21 for the uncondensed functional space (21 = 6 + 5 + 4 + 3 + 2 + 1) and 21 for the condensed functional space. These are shown on facing pages in Figs. A.4 through A.24. The reader is encouraged to examine each figure carefully to understand how to recognize condensable function points and how to recognize incondensable functional spaces.
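As a quick sanity check on these constraints, the short sketch below (Python; the names is_legal_programming and legal_pairs are illustrative) enumerates the allowable register programmings and confirms that there are 21 of them, one per value transition graph:

    # Hedged sketch: enumerate the (LWM, HWM) programmings permitted by Eq. A.1.
    def is_legal_programming(lwm, hwm):
        """Constraint a register-write checker might enforce."""
        return 1 <= lwm <= 6 and 2 <= hwm <= 7 and lwm < hwm

    legal_pairs = [(lwm, hwm) for lwm in range(1, 7)
                              for hwm in range(2, 8)
                              if is_legal_programming(lwm, hwm)]
    print(len(legal_pairs))   # prints 21 = 6 + 5 + 4 + 3 + 2 + 1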
A.4 Size of the Functional Space for this Queue

To better understand the size of the functional space of an 8-entry queue with programmable high- and low-water marks that set and reset an indirect condition of drain priority, we can construct tables of the number of function points and of the number of function arcs.
Table A.1. Function points of the queue

             LWM
  HWM     1    2    3    4    5    6
   2     11
   3     12   11
   4     13   12   11
   5     14   13   12   11
   6     15   14   13   12   11
   7     16   15   14   13   12   11

Total function points = 266

Table A.2. Function arcs of the queue

             LWM
  HWM     1    2    3    4    5    6
   2     31
   3     34   31
   4     37   34   31
   5     40   37   34   31
   6     43   40   37   34   31
   7     46   43   40   37   34   31

Total function arcs = 756
Functional closure over this functional space requires traversal of each of the 756 function arcs, visiting each of the 266 function points during the process.
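The entries and totals in Tables A.1 and A.2 can be reproduced mechanically. The sketch below (Python; the function name vtg is illustrative) enumerates the uncondensed value transition graph for each legal (LWM, HWM) programming, using the next-state behavior described above, and sums the function points and function arcs:

    # Hedged sketch: count points and arcs of the uncondensed VTG for each
    # legal (LWM, HWM) programming of the 8-entry queue.
    CAPACITY = 8

    def vtg(lwm, hwm):
        """Return (function points, function arcs) for one programming."""
        # Reachable points (ITEMS_IN_QUEUE, DRAIN_PRIORITY): drain priority is
        # 0 for occupancies 0..HWM (HWM itself is visited for one cycle before
        # the indirect condition takes effect) and 1 for occupancies LWM..8.
        points = {(n, 0) for n in range(0, hwm + 1)} | \
                 {(n, 1) for n in range(lwm, CAPACITY + 1)}
        arcs = set()
        for n, dp in points:
            # Indirect condition: evaluated from the current occupancy,
            # effective on the next clock cycle.
            if dp == 0 and n == hwm:
                ndp = 1
            elif dp == 1 and n == lwm:
                ndp = 0
            else:
                ndp = dp
            for delta in (0, +1, -1):             # hold, add, remove
                if 0 <= n + delta <= CAPACITY:    # no add when full, no remove when empty
                    arcs.add(((n, dp), (n + delta, ndp)))
        return points, arcs

    total_points = total_arcs = 0
    for lwm in range(1, 7):                       # Eq. A.1: LWM in [1..6]
        for hwm in range(lwm + 1, 8):             # HWM in [2..7], LWM < HWM
            p, a = vtg(lwm, hwm)
            total_points += len(p)
            total_arcs += len(a)
    print(total_points, total_arcs)               # prints 266 756

For a single programming such as LWM=4 and HWM=5, the same routine yields 11 points and 31 arcs, matching Fig. A.19a.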
A.5 Condensation in the Functional Space

Taking advantage of functional condensation, we can construct similar tables for the condensed space as follows. Counts of points and arcs for incondensable sub-spaces are marked with an asterisk.
Table A.3. Function points of the queue with condensation

             LWM
  HWM     1    2    3    4    5    6
   2      8
   3     10    9
   4     12   11   10
   5     14*  13*  12*  11
   6     13   14*  13*  11    9
   7     12   13   14*  12   10    8

Total function points = 239

Table A.4. Function arcs of the queue with condensation

             LWM
  HWM     1    2    3    4    5    6
   2     22
   3     28   25
   4     34   31   28
   5     40*  37*  34*  31
   6     37   40*  37*  31   25
   7     34   37   40*  34   28   22

Total function arcs = 675
Functional closure over the condensed functional space requires traversal of each of the 675 function arcs (and visiting each of the 239 function points along the way).

Now consider the functional space that results when the two variables LWM and HWM are variables of internal topology rather than of stimulus/response. This would be the case for an IP core that could be instantiated in any of 21 ways, corresponding to the allowable values for LWM and HWM. For the IP integrator only one of the 21 possible value transition graphs is applicable, corresponding to the chosen values for LWM and HWM. For the IP producer, on the other hand, all 21 value transition graphs must be considered if indeed all 21 possible instantiations are to be considered valid within the IP core product.

One additional observation is that there is a large vacant region in the functional space for this queue, corresponding to the blank entries in Tables A.3 and A.4.
A.6 No Other Variables?

It is worth mentioning that, for the particular way in which this queue is defined, there is no specification related to the values of items added to or removed from the queue, and there is no behavior defined that depends on the values of these items. Consequently, there are no variables of response defined for them and we do not observe them. Likewise, our specification does not mention any particular width for the queue, and our analysis is independent of the width. There is no variable of topology defined for the width of the queue, but we could certainly modify the specification and define such a variable. If we did so, we would find that the function arcs connecting the points corresponding to the number of items present in the queue are unaffected by this variable of topology.
A.7 VTGs for 8-entry Queue with Programmable HWM & LWM

The following figures contain VTGs for both the uncondensed and the condensed functional space of the queue. Figs. A.4 through A.24 have VTGs for the LWM and HWM relations defined in Eq. A.1 (repeated below).
LWM ∈ [1..6]    HWM ∈ [2..7]    LWM < HWM        (A.1)
These ranges are what might typically be found in a realistic queue design. However, there may be some reason to extend the ranges to the extremes of 0 and 8 entries as in Eq. A.2:
LWM ∈ [0..7]    HWM ∈ [1..8]    LWM < HWM        (A.2)
To complete the picture, Figs. A.25 through A.32 illustrate the queue with LWM == 0, and Figs. A.32 through A.39 have the VTGs with HWM == 8. VTGs are shown both without condensation and with condensation.
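Under the extended relations of Eq. A.2 the count of legal programmings grows from 21 to 36, so 15 additional value transition graphs are needed; these correspond to Figs. A.25 through A.39. A one-line check (Python, same illustrative style as before):

    # Hedged check: Eq. A.2 permits 36 programmings, 15 more than Eq. A.1.
    print(sum(1 for lwm in range(0, 8) for hwm in range(1, 9) if lwm < hwm))   # prints 36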
Fig. A.4a. LWM=1 and HWM=2
Fig. A.4b. LWM=1 and HWM=2, with condensation
Fig. A.5a. LWM=1 and HWM=3
Fig. A.5b. LWM=1 and HWM=3, with condensation
Fig. A.6a. LWM=1 and HWM=4
Fig. A.6b. LWM=1 and HWM=4, with condensation
Fig. A.7a. LWM=1 and HWM=5
Fig. A.7b. LWM=1 and HWM=5, incondensable
Fig. A.8a. LWM=1 and HWM=6
Fig. A.8b. LWM=1 and HWM=6, with condensation
Fig. A.9a. LWM=1 and HWM=7
Fig. A.9b. LWM=1 and HWM=7, with condensation
Fig. A.10a. LWM=2 and HWM=3
Fig. A.10b. LWM=2 and HWM=3, with condensation
Fig. A.11a. LWM=2 and HWM=4
Fig. A.11b. LWM=2 and HWM=4, with condensation
Fig. A.12a. LWM=2 and HWM=5
Fig. A.12b. LWM=2 and HWM=5, incondensable
Fig. A.13a. LWM=2 and HWM=6
Fig. A.13b. LWM=2 and HWM=6, incondensable
Fig. A.14a. LWM=2 and HWM=7
Fig. A.14b. LWM=2 and HWM=7, with condensation
Fig. A.15a. LWM=3 and HWM=4
Fig. A.15b. LWM=3 and HWM=4, with condensation
Fig. A.16a. LWM=3 and HWM=5
Fig. A.16b. LWM=3 and HWM=5, incondensable
Fig. A.17a. LWM=3 and HWM=6
Fig. A.17b. LWM=3 and HWM=6, incondensable
Fig. A.18a. LWM=3 and HWM=7
Fig. A.18b. LWM=3 and HWM=7, incondensable
Fig. A.19a. LWM=4 and HWM=5 (11 function points; 9 hold arcs, 8 add arcs, 8 remove arcs, and 6 shift arcs, for 31 arcs total)
Fig. A.19b. LWM=4 and HWM=5, with condensation (10 function points; 8 hold arcs, 7 add arcs, 7 remove arcs, and 6 shift arcs, for 28 arcs total)
Fig. A.20a. LWM=4 and HWM=6
Fig. A.20b. LWM=4 and HWM=6, with condensation
Fig. A.21a. LWM=4 and HWM=7
Fig. A.21b. LWM=4 and HWM=7, with condensation
Fig. A.22a. LWM=5 and HWM=6
Fig. A.22b. LWM=5 and HWM=6, with condensation
Fig. A.23a. LWM=5 and HWM=7
Fig. A.23b. LWM=5 and HWM=7, with condensation
Fig. A.24a. LWM=6 and HWM=7
Fig. A.24b. LWM=6 and HWM=7, with condensation
Fig. A.25a. LWM=0 and HWM=1
Fig. A.25b. LWM=0 and HWM=1, with condensation
Fig. A.26a. LWM=0 and HWM=2
Fig. A.26b. LWM=0 and HWM=2, with condensation
Fig. A.27a. LWM=0 and HWM=3
Fig. A.27b. LWM=0 and HWM=3, with condensation
Fig. A.28a. LWM=0 and HWM=4
Fig. A.28b. LWM=0 and HWM=4, with condensation
Fig. A.29a. LWM=0 and HWM=5
Fig. A.29b. LWM=0 and HWM=5, with condensation
Fig. A.30a. LWM=0 and HWM=6
Fig. A.30b. LWM=0 and HWM=6, with condensation
Fig. A.31a. LWM=0 and HWM=7
Fig. A.31b. LWM=0 and HWM=7, with condensation
Fig. A.32a. LWM=0 and HWM=8
Fig. A.32b. LWM=0 and HWM=8, with condensation
Fig. A.33a. LWM=1 and HWM=8
Fig. A.33b. LWM=1 and HWM=8, with condensation
Fig. A.34a. LWM=2 and HWM=8
Fig. A.34b. LWM=2 and HWM=8, with condensation
Fig. A.35a. LWM=3 and HWM=8
Fig. A.35b. LWM=3 and HWM=8, with condensation
Fig. A.36a. LWM=4 and HWM=8
Fig. A.36b. LWM=4 and HWM=8, with condensation
Fig. A.37a. LWM=5 and HWM=8
Fig. A.37b. LWM=5 and HWM=8, with condensation
Fig. A.38a. LWM=6 and HWM=8
Fig. A.38b. LWM=6 and HWM=8, with condensation
Fig. A.39a. LWM=7 and HWM=8
Fig. A.39b. LWM=7 and HWM=8, with condensation
Index
Activation of architecture for verification software, 95, 104–107 examples, 22–24 models, 97 standard variables, 23 of system, 51, 179 variables, 21–24, 35 in functional space, 44, 59–60 processor, 22–24 Active target, 24 Actual time variables, 29–30 Aesthetic evaluations, 36 Analysis, standard results for, 138 Arc convergence, 124 Architecture for verification software, 95–110 activation, 104–107 activation models, 97 context models, 97 coverage models, 98 CRV process, 102–104 deterministic generator, 97–98 expected results checkers, 98 flow for soft prototype, 99–100 functionality, 98–99 gate-level simulation, 109–110 halting individual tests, 108 ingredients, 97–98 initialization, 104–107 monitors, 98 production test vectors, 110 protocol checkers, 98 random value assignment, 101 sanity checking and other tests, 108–109 sockets for instances of verification
target, 97 static vs. dynamic test generation, 108 test generator, 97 transactors, 98 Arc transversability, in state machines, 143, 145 Assertions, 87–88 Associated synchronization logic, 21 Asynchronous digital systems, 10 Automated test program generation (ATPG), 110 Autonomous responses, 31 Basis variables, orthogonal system of, 38 Bugs, 2, 4–6, 13, 33–34, 38 count as function of complexity, 135–136 counts and severity, 115 discovery rate, 115–117 distribution in target, 71 driven testing, 77 fixing, 77–78 locality, 115 of omission, 74 tracking, 75 using Q to estimate risk of, 169–174 workarounds, 91, 114 Built-in self-test unit (BIST), in processor, 22 Cadence Encounter Conformal CDC capability, 82 Clock domain, 10 Clock-domain crossings (CDCs), 89, 149–150
verification plan, 81–83 Clocking, variables, 21, 38, 44 Code coverage, 24, 140–142, 144 standard specific measures, 148 Collision-detect multiple-access algorithm, 37 Commercial IP, evaluation of, 174–176 Complexity bug count as function of, 135–136, 174 cycles counts with, 133–134 estimate for, 129 gate count and, 128 size of functional space, 136 standard specific measures, 148 of targets, 121, 126–129 Compliance test, 36 Composition of instruction, 32 variables of stimulus and response, 29–30 Condensation, in functional space, 47–50 Condensed functional space, 44, 47, 128 8-entry queue analysis, 53–59 Condition coverage, 24 direct conditions, 24 examples, 25–27 external conditions, 25–26, 38 indirect conditions, 24–25, 27 internal conditions, 25–26, 38 and responses variables, 31–32, 35, 38 variables, 24–27, 35, 38 in functional space, 45 Connecting dots, in functional space, 50–53 Connectivity external, 40–42, 44 internal, 40–42, 44 quadrants of subspace of, 147 time-invariant variables of, 44 Connectivity variables, 16–21, 35, 38
external, 16–19, 38 context of instance and, 17 specification and standard variables for, 17–19 in functional space, 40–44 internal, 16, 19–21, 38 instance of target and, 17 standard variables of, 20 Constrained pseudo-random functional verification, 5–6 Constrained random verification (CRV) technique, 5–7, 10, 33, 36–7, 50, 62, 64, 71–2, 89, 99–104 of architecture for verification software, 102–104 principles of, 5–6 saving initialized systems for, 105 using previously initialized system, 107 Context models, of architecture for verification software, 97 Contiguous ranges, of variables, 13 Control signal, synchronization, 81 Convergence arc convergence, 125 determination against target, 125 expression convergence, 125 factors to consider in using, 124–126 gap, 123 line convergence, 125 power and, 122–124 scaling regression using, 129–132 state convergence, 125 stratification of, 124–125 of tests, 123 in verification projects, 122–126, 129–132 Corner cases, 94 Counting function points in functional space, 40–47 activation variables, 44 condition variables, 45 connectivity variables, 40–44 error variables, 45–46
Index response variables, 45 special cases variables, 46 stimulus variables, 45 time-variant variables, 44 upper bound, 46–47 Coverage analysis, 21–22, 88 closure, see Functional closure driven testing, 77 measures of, 139–140 models of architecture for verification software, 98 and risk assessment, 166–167 CRV tests, 77, 79, 105, 124 Cybernetic properties, 36 Cycles counts with complexity, 133–134 Data-driven risk assessment, 4, 9, 165, 167 De-activation testing, 104 Debugging, 4 Decision making, and risk assessment, 157–159 Default morph, 29 Deterministic generator, of architecture for verification software, 97–98 Deterministic tests, 35 Development costs, and verification, 2–3 Device Under Test (DUT), 9–10 Device Under Verification (DUV), 9–10 Diagnostic morph, 28 Digital system analysis, 9 bases of analysis, 10 functionality, 10, 37 linear algebra for verification, 10–12 standard framework for interpretation, 12 standardized approach to verification, 6 variables related to functionality
activation, 11–12 condition, 11–12 connectivity, 11–12 stimuli and response, 11–12 Directed random verification, 5 Directed tests, 2, 77 D-MUX, 81 Drop dead date, 111 Dynamically generated tests, 102 Dynamic test generation, 108 Dynamic transition, 36 Energized target, 24 8-Entry queue, 183–185 analysis condensed functional space, 54–56, 58, 185, 186–188 explicit representation of reset for, 59–60 functional closure, 68–69 of functional space, 53–59, 183 uncondensed functional space, 54, 57 with programmable HWM and LWM, 193 Enumerated ranges, of variables, 13–14 Error of commission, 4 in composition of stimulus, 45 in functional space, 45–46 imposition, 33–35, 38 of omission, 4 in time of stimulus, 45 variables, 45–46 Ethernet magic packet, 23 protocol, 23, 89 specifications for, 37 Excitation, 179 Excitation drives, 51 Excitement generation, 35 Expected results checkers, 98 Expression convergence, 125 External activity, 36 External condition variables, 45
External connectivity example, 17–19 standard variable for verification, 18 systems in condensed space of, 49–50 variables, 16–19, 38, 40–42, 44 context of instance and, 17 specification and standard, 17–19 Failure analysis, 64, 78, 88, 103, 114 Fault coverage, 143–144 Faulty behavior, 75–76, 94–95, 103, 106, 114, 180 risk of, 5–6 Firewire devices, 24 Firm prototype, 78, 93 Floating-point unit (FPU), 22 Focused ion beam (FIB) measure, 91 Focused random verification, 5 FPGA, 78, 89, 93 FPGA-based prototype, 78 Functional bug, 6 Functional closure, 1–2, 39–40, 69, 163 8-entry queue analysis, 68 in functional space, 39–40, 68–69 and risk, 181 Functional coverage, 137–138 Functionality interpretation, 35 Functional shmoo testing, 114 Functional space, 39–69 arc transversability in state machines, 143, 145, 148 budgeting for success, 162–163 code coverage, 140–142, 145, 148 complexity and size of, 136 condensation in, 47–50 connecting dots, 50–53 counting function points, 40–47 activation variables, 44 condition variables, 45 connectivity variables, 40–44 error variables, 45–46 response variables, 45
special cases variables, 46 stimulus variables, 45–46 time-variant variables, 44 upper bound, 46–47 8-entry queue analysis, 53–59 fault coverage, 143–144 functional closure, 39–40, 68–69 graph theory, 65–68 measures of coverage, 139–140 mediation by response variable, 59 modeling faulty behavior, 63–64 multiple clock domains, 149–150 of queue, 183–193 relations among values of variables in, 51 reset in VTG, 59–63 hard reset, 61 soft reset, 62 soft origin of, 62–63 special cases variables, 46, 64–65 specific and general measures, 146–149 standard measures of, 145–146, 148 state reachability in state machines, 142–143, 145, 148 statistically sampling, 138–139 success and failure spaces, 161–162 values of variables in, 179 views of coverage, 150–155 1-dimensional views, 151 2-dimensional views, 153–154 indicators for, 150 Pareto views, 151–153 standard views, 155–156 time-based views, 153–155 VTG arc coverage, 144–145, 148 Functional trajectory, 50, 59, 63, 65, 179 Functional verification cost and risk, 1–2 CRV technique for, 5–7 development costs and, 2–3 effectiveness, 1 elements, 7
Index lessons learned from, 3–4 objective of, 4 risk assessment for, see Risk assessment standardized approach, 6–8 terminology, 9 time to market and, 2 Function arcs, 50–51, 88 Gate-level simulation, in architecture for verification software, 109–110 Graph theory, for functional space, 65–68 Halting individual tests, in architecture for verification software, 108 Hamming distance, 177 Hamming weight, 42, 177 Handshake protocol, 82–83, 89 Hard prototype, 78 instrumentation for, 88, 90 internal responses, 31 mechanisms for verification of, 90 Hard resets, 31, 35–36, 104 Holding a split lot at metal measure, 90–91 Human interaction, responsiveness to, 36 Human programmers’ inefficiency, and verification, 5 IC development projects complexity of verification, 5 costs after first tape-out, 3 functional verification, 1–2 human programmers’ inefficiency, 5 multiple variables of internal connectivity in, 21 programming habits and, 5 time from tape-out to shipping product for revenue, 2–3 Incompleteness of verification, indicator of, 140
Indirect conditions, 36 Inherent value boundaries of ranges, 13 Initialization in architecture for verification software, 104–107 process, 25–26, 51, 59 of system, 179 Inspection, verification by, 61 Instruction, composition of, 32–33 Internal condition variables, 45 Internal connectivity, 27 examples, 19–21 variables, 16, 19–21, 38, 40–42, 44 instance of target and, 17 standard, 20 Inverse time variables, 30 IP evaluation of, 174–176 for single application, 176 Knowledge, and risk assessment, 164–166 Linear algebra loosely orthogonal system in, 11 for verification of digital system, 10–12 Line convergence, 125 Measures code coverage, 140–142, 145 of coverage, 139–140 fault coverage, 143–144 functional space coverage, 137–138, 145–146 for quadrant, 148–149 relative strength of, 145 specific and general, 146–149 standard specific, 145–146, 148 strong and weak, 144–145 VTG coverage, 144–145 Mediation, by response variable in functional space, 51, 59 Metamorphosis, 27–29, 38 of target, 27–28
values for variables that determine, 28 Modeling faulty behavior, for functional space, 63–64 Morphing target, 36 Morphs, see Metamorphosis Multi-instance RTL, 19 Multiple clock domains, 149–150 MUX synchronizer, 81–82 Nearest neighbor analysis, 176–179 Normalized cycles, in risk assessment, 134–135 Ordering errors, 33 Performance requirements, 14 Physical prototype, 78 Piece-wise contiguous ranges, of variables, 13–14 Planning, of verification project, 76; see also Verification plan Post-silicon bugs, 169 Post tape-out bugs, 170 Power and convergence, 122–124 saving circuitry in processor, 22 variables, 21, 38, 44 Processor activation variables for, 22–24 built-in self-test unit, 22 internal connectivity in, 19 power-saving circuitry, 22 variable clock-speeds, 22 as verification target, 19 Production test vectors, for architecture verification software, 110 Programmable logic array (PLA), 90 Programmers habits, 5 Programming habits, 5 Project resources estimation, 121–122 Protocol checkers, for architecture verification software, 98 Prototype instrumentation checking accommodated by, 88–90 expected results, 88–89
hard prototype, 88, 90 mechanisms for verification, 90 properties, 89 protocol, 89 soft prototype, 88 state coherency, 89 transformation of data, 89 transport, 89 for verification plan, 87–91 Pseudo-random excitation, 35 Pseudo-random sequence, 35 Pseudo-random test generators, 35 Pseudo-random testing, 4–5 Queue, see also 8-Entry queue adding indirect condition, 186–189 basic 8-entry queue, 183–185 condensation in functional space, 191–192 functional space of, 183–193 programmable high-and low-water mark, 190 size of functional space for, 190–191 Random regression suite, 129 Random testing, 2–4 Random value assignment, in architecture for verification software, 101 Random verification, principles of, 5–6 Re-activation testing, 104 Regression suite, 72 fault coverage for, 143–144 standard specific measures, 145–146, 148 using convergence, 129–132 and verification plan, 72–73 Regression testing, 6 Relative time variables, 30 Reset for explicit representation of 8-entry queue analysis, 59–60 variables, 21–22, 24, 31, 35–36, 38 assertion of, 21, 24, 35, 60, 62
Index de-assertion of, 21, 24, 31, 35, 60 in VTG functional space, 59–63 hard reset, 59, 61–62 soft reset, 62 Response autonomous, 31 composition variables, 29 conditions and, 31–32 example, 32–33 internal, 30–31 sequence abstraction, 30 time variables, 29 transactions abstraction, 30 variables, 29–33, 35 in functional space, 45 Result analysis, with standard views and measures, 139 Risk assessment, 4 background on, 159–160 of bug, 169–174 coverage and, 166–167 data-driven, 165, 167 decision making and, 157–159 functional closure and, 181 knowledge and, 164–165 nearest neighbor analysis for, 176–179 normalized cycles in, 134–135 success and, 173 successful functional verification project and, 160–164 VTG arc coverage, 168–169 RTL (register transfer level) description, 2, 4, 6, 10, 16, 19–20, 64, 71–72, 74 Rules and guidelines examples, 14–16 performance objectives and, 14–15 from specification, 14–15 technical specifications, 14 for verification, 14 Sanity checking, for architecture verification software, 108–109 Signal, with positive and negative slack, 39
Sockets for instances of verification target, 97 Soft origin, of functional space, 62–63 Soft prototype, 78 architecture for software for verification of, 98 flow for, 99–100 instrumentation for, 88 Soft resets, 31, 35–36, 104 Spare gates, 90 Special cases, variables in functional space, 46, 64–65 Specification-driven testing, 77 Specifications, 14, 83–87 Standardized functional verification, 6–8 Standard measures, 91–93, 99, 137, 145–146, 148 result analysis with, 139 Standard results, for analysis, 138 Standard variables of external connectivity, 18 variability, 13 for verification, 12–13 Standard views, 91–93, 99, 137 result analysis with, 139 State convergence, 125 State machines arc transversability in, 143, 145, 148 state reachability in, 142–143, 145, 148 State-of-the-art, verification tools, 51 State reachability, in state machines, 142–143, 145, 148 Statically generated tests, 102 Static test generation, 108 Stimulus composition variables, 29 example, 32–33 internal stimuli, 30–31 sequence abstraction, 30 time variables, 29 transactions abstraction, 30 variables, 29–33, 38 in functional space, 45
Stratification, of convergence, 124–125 Sweep testing, 114 Synchronization of control signal, 81–82 Synchronizer MUX, 81 Synchronous digital systems, 10, 22 System activation of, 51 initialization of, 51 SystemVerilog, assertions in, 88 Target, 10 distribution of bugs in, 71 instantiation of, 51, 179 magical initialization of memory, 106 metamorphosis of, 27–28 morphing, 36 transcendental behaviors of, 36 value state of, 50 Technical specifications, 14 Temporal errors, 33 Test environment, architecture for, 96 Time checks, 88 Time to market, and verification, 2 Time variables actual time, 29–30 inverse time, 30 relative time, 30 of stimulus and response, 29 Time-variant variables, in functional space, 44 Timing closure, 39 Transactors, in architecture for verification software, 98 Uncondensed functional space, 8-entry queue analysis, 54, 57 Upper bound, in functional space, 46–47 Value checks, 88 Value errors, 33 Value transition graph (VTG), 45, 51, 53, 57–66, 68, 95, 114, 122, 128–129, 136, 138, 140, 144, 148, 152, 168–170, 179, 193
Variability, of standard variables, 13 Variable clock-speeds, in processor, 22 Variables of activation, 21–24, 35 boundary values, 47–48 composition of stimulus, 50 of condition, 24–27, 35 of connectivity, 16–21, 35, 50 external, 16–19 internal, 16, 19–21 of error in composition of stimulus, 45 external condition, 45 internal condition, 45 orthogonal system of, 38 ranges, 13–14 contiguous, 13 enumerated, 13–14 inherent value boundaries, 13 piece-wise contiguous, 13–14 relations among values of functional space, 51 special cases, 35–37 standard variables, 12–13 of stimulus and response, 29–33, 35 Verification advantages inherent in, 3–4 challenges to, 5 complexity of, 5 development costs and, 2–3 human programmers’ inefficiency, 5 incompleteness of, 140 by inspection, 61 linear algebra for digital system, 10–12 process, 13 programming habits and, 5 ranges of variables, 13–14 rules and guidelines, 14–16 standard variable of external connectivity used for, 18 standard variables for, 12–13 terminology, 9 time to market and, 2
Index Verification plan, 79–119 architecture, 80, 95–110 change from outside, 79 change management, 80, 110–111 clock domain crossings, 81–83 definition of target, 80–81 design, 80 documents, 80, 118 failure analysis, 114 goals, 80 incorporate discovery and invention, 79 instances and morphs for opportunistic inclusion, 81 instrumentation for prototype, 87–91 interpretation of specification, 83–87 learning, 79 matrix, 80 prevention measures, 90–91 regression suite and, 72–73 resources, 80, 118–119 results, 80, 91–94 scope and schedule, 80, 118–119 setting goals for coverage and risks, 94–95 focusing resources, 94–95 making trade-offs, 94 standard measures, 91–93 teams organization, 111–114 tracking progress, 115–118 bug counts and severity, 115–116 bug discovery rate, 115–117 bug locality, 115 code coverage, 115–116 verifying changes to existing device, 83 Verification process, 99 Verification projects bug count as function of complexity, 135–136 complexity of target and, 126–129 convergence determination against target, 125
factors to consider in using, 124–126 gap, 123 power and, 122–124 stratification of, 124–125 of tests, 123 cycles counts with convergence, 133–134 execution, 74 goal, 72–73 management, 71 plan execution for results, 73–77 bug fixing, 77–78 code construction, 74–76 code revision, 76 final coding, 75–76 graduated testing, 76–77 initial coding, 75–76 preparation, 74 planning, 72 resource allocation, 122 scaling regression using convergence and, 129–132 size and complexity, 136 soft prototype and hard prototype, 78 successful functional verification, 160–164 using cycles in risk management, 134–135 verification plan, 83–123; see also Verification plan Verification software architecture, 95–110 Verification tools, state-of-the-art, 51 Virtual prototype, 78 VTG arc coverage, 168–169 for 8-entry queue with programmable HWM and LWM, 193 for functional space, 144–145, 148 Warm resets, 104 Workarounds bug, 91, 114