Traceless Automated Design Debugging of Liveness Properties … · 2016-06-21 · Modern Very Large...

Traceless Automated Design Debugging of Liveness Properties Using Property Directed Reachability by Ryan Berryhill A thesis submitted in conformity with the requirements for the degree of Master of Applied Science Graduate Department of Electrical and Computer Engineering University of Toronto c Copyright 2016 by Ryan Berryhill


Traceless Automated Design Debugging of Liveness Properties Using Property Directed Reachability

by

Ryan Berryhill

A thesis submitted in conformity with the requirements for the degree of Master of Applied Science

Graduate Department of Electrical and Computer Engineering
University of Toronto

© Copyright 2016 by Ryan Berryhill


Abstract

Traceless Automated Design Debugging of Liveness Properties Using Property Directed

Reachability

Ryan Berryhill

Master of Applied Science

Graduate Department of Electrical and Computer Engineering

University of Toronto

2016

The growth in complexity of digital hardware drives an increase in the importance of au-

tomated computer-aided design (CAD) tools. Verification consumes most of the design effort,

with debugging accounting for half of the verification time. These are therefore important

targets for automation. Traditionally, when a failure is detected through an observation value

mismatch, an error trace is returned. The error trace can be used with a Boolean Satisfiability

(SAT)-based automated debugging tool to aid the engineer in finding the error source. How-

ever, when a state is shown to be unreachable, no error trace is available to guide the tool.

Debugging these errors is a manual process. This thesis presents two novel automated tech-

niques to perform design debugging in the absence of an error trace. The use of PDR avoids

the memory-intensive ILA representation, making it possible to solve larger problem instances.

Experiments demonstrate the practicality of the proposed techniques.


Contents

1 Introduction
1.1 Background and Motivation
1.2 Purpose and Scope
1.3 Thesis Outline

2 Background
2.1 Introduction
2.2 Boolean Satisfiability for CAD
2.2.1 CNF Representation
2.2.2 ILA Representation
2.3 SAT-based Debugging
2.4 Property Directed Reachability
2.4.1 Notation and Terminology
2.4.2 High-Level Overview
2.4.3 Detailed Description
2.5 Summary

3 Traceless Debugging Using Approximation and Unrolling
3.1 Introduction
3.2 Preliminaries
3.2.1 Previous Work
3.2.2 Notation
3.3 Iterative Formulation
3.3.1 Single-Cycle Unreachability
3.3.2 Sample Debugging Problem
3.3.3 Multi-Cycle Unreachability
3.4 Performance-Driven Formulation
3.5 Summary

4 Traceless Debugging Without Unrolling
4.1 Introduction
4.2 Initial Formulation
4.2.1 Constructing the Enhanced Model
4.2.2 Searching for Solutions
4.2.3 Soundness and Completeness
4.3 Incremental Application of PDR
4.4 Efficient Suspect Selection
4.5 Summary

5 Experimental Results
5.1 Algorithm Comparison
5.2 Benefits of Incrementality
5.3 Summary

6 Conclusion and Future Work
6.1 Contributions
6.2 Future Work

Bibliography


List of Tables

2.1 Characteristic functions of elementary gates
3.1 Glossary of symbols
5.1 Summary of presented algorithms
5.2 Runtime and solutions found
5.3 Runtime and solutions found (K = 10)
5.4 Effect of K and N on runtime and solutions found
5.5 Runtime comparison for Unreachability and SEUnreachability
5.6 Number of suspects and solutions in iterations 1 through 5
5.7 Effect of |L| on Unreachability runtime
5.8 Effect of incrementality on runtime


List of Figures

1.1 Typical VLSI design flow
2.1 (a) A sequential circuit (b) ILA representation with 2 time-frames
2.2 Error multiplexer inserted at suspect location li
2.3 Debugging ILA
2.4 Hardware construction for error cardinality n > 1
2.5 Example finite state machine
2.6 Example state-space over-approximations
2.7 (a) A predecessor of an unsafe state in F3 (b) Approximations after refining F3
3.1 Representation of the debugging instance
3.2 Set Fi (a) initially (b) after detecting a spurious result from state t
3.3 (a) Correct implementation of shift register (b) Erroneous implementation in which states are unreachable
3.4 Erroneous shift register circuit with solutions highlighted
3.5 Representation of the multi-cycle debugging instance
4.1 Error-select register and multiplexer at suspect location li
4.2 (a) Original circuit (b) Circuit used to construct Ten (error-select registers omitted)
4.3 State space representation of (a) M and (b) Mblk
4.4 Example circuit with fanout branches highlighted
5.1 Total runtime and time spent checking spurious solutions in MMCUnreachability for wb versus K
5.2 Runtime relative to Unreachability for presented algorithms (Unreachability = 1.0)
5.3 Suspects and solutions per iteration for spi
5.4 Runtime per iteration for spi
5.5 Total runtime and average runtime per SAT call for divider
5.6 Solutions versus N for divider and wb (K = 10)
5.7 Solutions found for spi vs. running time for (a) Unreachability (b) SEUnreachability
5.8 Average percentage of solutions found by algorithms across all experiments
5.9 Solutions found by Unreachability vs. running time for (a) mrisc core (b) usb core


Chapter 1

Introduction

1.1 Background and Motivation

Modern Very Large Scale Integration (VLSI) hardware designs are relentlessly increasing in size

and complexity. Computer aided design (CAD) tools are more important than ever to the design

cycle as the hardware becomes too complex for human understanding. Realizing a modern VLSI

design is a complex process involving both manual tasks and automated procedures carried

out by CAD tools. Figure 1.1 presents a simplified view of a typical VLSI design flow. The

process begins with a behavioral specification of the design, which consists of a natural language

document or a formal specification written in, e.g., C or a behavioral hardware description

language (HDL). The design is then transformed into a register transfer level (RTL) specification

in a language such as VHDL or Verilog. Subsequently, the logic synthesis step converts the RTL

description into a gate-level netlist. The gate-level netlist is used to produce a transistor-level

netlist, which is then placed and routed to give a physical layout. The physical layout is finally

sent to a fabrication facility where the chip is manufactured.

At each stage of the design flow, a verification or testing step is performed to ensure compli-

ance with the specification. Functional verification techniques such as model checking are used

to ensure that the behavior of the synthesized design fits its specification. After optimizing the

gate-level netlist, equivalence checking is used to ensure that the optimized layout is function-

ally equivalent to the logic netlist. Additional timing-based tests are performed to ensure that

the layout meets performance requirements. Finally, a battery of tests is carried out against


[Figure: design flow from Behavioral Specification through Behavioral Synthesis, RTL Description, Logic Synthesis, Logic Netlist, Layout Synthesis, Layout, and Fabrication to Chip, with Functional Verification, Design Debugging, Testing, and Silicon Debugging grouped under Design Testing, Verification, and Debug]

Figure 1.1: Typical VLSI design flow

the fabricated chips before they are packaged and sold.

When a failure occurs at any verification or testing step, some form of debugging is per-

formed to correct it. When functional verification reveals an error, design debugging is carried


out to locate the root cause of the failure so it may be corrected. If an error is revealed during

chip-level testing, silicon debugging is performed to locate the failing part of the chip. Since

chips have already been manufactured, a failure discovered during chip-level testing can be very

expensive to correct. As a result, care is taken to prevent errors from escaping to this stage of

the design process.

Many of the processes in Figure 1.1 have been partially or fully-automated. A common

theme among automated verification and debugging techniques is formulating the problem as a

Boolean Satisfiability (SAT) instance. Many CAD problems such as equivalence checking [20,

29, 16, 1, 32], model checking [9, 28, 8, 40, 11], and design debugging [38, 19, 23, 12] have been

successfully encoded in SAT-based formulations. During the past 20 years, the performance

and capabilities of SAT solvers have improved immensely [30, 37, 17, 4, 2]. Since numerous

CAD problems are formulated as SAT instances, any improvement to the state-of-the-art in

SAT solving immediately benefits all such automated techniques. Along with the increasing

availability of computational power, this has resulted in automation becoming increasingly

applicable to CAD problems.

Since combinational circuitry essentially implements a Boolean formula, transforming it to

a SAT instance is straightforward [27]. For sequential designs, many SAT formulations use a

technique called circuit unrolling, also known as the Iterative Logic Array (ILA) [26], to model

the sequential behavior using only combinational logic. Circuit unrolling constructs an ILA

containing one copy of the circuit for each clock cycle that is to be modeled. While this enables

a simple transformation to a SAT instance, it can consume a large amount of memory to

accommodate the unrolled design.

SAT-based design debugging [38] in particular is based on the use of an ILA. Traditionally,

functional verification reveals an error through means such as a firing assertion, observation

value mismatch, or scoreboard discrepancy, and returns an error trace that exposes the failure.

The error trace is then used to guide the debugging tool, which creates an ILA representation

of the circuit and error trace. The ILA contains one time-frame for each clock cycle in the error

trace, potentially resulting in excessive memory usage.

Additional techniques exist to facilitate debugging problems that would otherwise be com-

putationally infeasible due to the size of the design and the length of the error trace. Trace


compaction [33] can somewhat mitigate the memory use resulting from long traces by finding

a shorter trace that still exposes the failure. A technique known as bounded model debug-

ging [34] can similarly be used to handle long traces by initially modeling only a small portion

of the trace and iteratively adding more clock cycles as needed. Abstraction and refinement

techniques [36, 35] can be used to abstract portions of the design, thereby reducing the memory

required to model each time-frame.

Similarly to SAT solvers, the model checking technique of Property Directed Reachability

(PDR) [9] has seen tremendous advancement in recent years [13, 18, 24]. Unlike traditional

model checking techniques [8], PDR does not use the ILA to model sequential behavior. As a

result, it can avoid the substantial memory use characteristic of some CAD algorithms. While

PDR has been restricted to model checking problems thus far, it has seen applications to

non-hardware model checking problems, in domains such as quantifier-free formulae [41, 43] and

software [22, 42].

Evidently, the problem of automated design debugging is well-studied for cases where an

error trace is available. However, when verification reveals that a state is unreachable in vio-

lation of the design specification, no error trace is available to guide an automated debugging

tool and comparatively little automation is available to aid the engineer. This thesis presents

two automated debugging techniques based on PDR that solve the design debugging problem

in cases where no error trace is present. As such, these techniques automate a debugging task

that previously was handled manually while avoiding the large memory usage possible with

existing SAT-based automated debugging techniques.

In practice, these techniques are expected to be highly valuable. In an industry setting,

liveness checking does not seem to be widely deployed as a verification technique. This is likely

because a failed liveness property is inherently difficult to manually debug. No error trace is

available and the designer may not know what conditions should lead to the property being

reached. However, liveness checking is very easy to deploy. A regression testing suite can simply

count the number of times a state is entered throughout the regression run. When particular

states are found to never be entered, it provides suspicion of a liveness failure. This could then

be confirmed using a formal tool, and finally debugged using the techniques presented in this

thesis.
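As an illustration of the counting idea described above, the following Python sketch tallies how often each state of interest is observed during a regression run and reports those that are never entered. The state names and the sampled data are hypothetical. An empty visit count only raises suspicion of a liveness failure, which a formal tool would still need to confirm.

```python
from collections import Counter

def never_entered(states_of_interest, observed_states):
    """Return the states of interest that never appear in the observed runs."""
    visits = Counter(observed_states)
    return [s for s in states_of_interest if visits[s] == 0]

# Hypothetical regression data: the FSM state sampled on each simulated cycle.
observed = ["IDLE", "LOAD", "RUN", "RUN", "IDLE", "LOAD", "RUN"]

# States that were never entered are candidates for a formal liveness check.
suspicious = never_entered(["IDLE", "LOAD", "RUN", "DONE"], observed)
print(suspicious)
```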


1.2 Purpose and Scope

This thesis presents two PDR-based techniques to automate design debugging in the absence

of an error trace. The first technique formulates the debugging problem in a manner similar to

traditional SAT-based debugging. Rather than using an error trace to constrain the SAT in-

stance, PDR is used to compute an over-approximation of the set of all possible error traces up

to a fixed size. Drawing from traditional SAT-based debugging, a partial ILA is constructed and

constrained using this over-approximation. Due to the inherent nature of over-approximations,

this formulation may lead to spurious solutions that do not actually solve the debugging prob-

lem. This is handled by again using PDR to verify each solution. As an added benefit, verifying

a solution refines the approximations, reducing the chances of finding more spurious solutions.

As only a partial ILA is constructed and the approximations drawn from PDR are only valid

for a fixed trace size, the approach is able to find a subset of all solutions. These limitations are

inherent to the formulation, as it may not be known what length of trace and ILA is needed to

fully solve the problem. Experiments demonstrate that while the provided solution set is not

necessarily complete, with reasonable parameters the technique is often able to find the actual

error source.

Next, a complete and exact approach is presented. The debugging problem is constructed

as a model checking instance passed directly to PDR. As such, it makes no use of the ILA

representation of the design. Further, it is not susceptible to spurious solutions, nor does

it require the application of a priori knowledge regarding the needed trace length and ILA

size. It returns the complete solution set of the problem at the cost of increased runtime

when compared to the approximation-based approach. An enhancement that maintains the

completeness of the solution set while drastically improving performance is also developed.

This performance enhancement prunes a substantial portion of the non-solution space, often

achieving lower runtimes than the approximation-based approach.

Finally, a set of experiments comparing the effectiveness and performance of these techniques

is presented and various practical tradeoffs are contrasted. It is found that the approximate

approach is able to find an average of 60% of the complete solution set using reasonable param-

eter values. However, in most cases the solution set it finds includes the actual error source,


indicating that it tends to find a useful subset of the complete solution set. The exact ap-

proach naturally finds the full set of solutions. This comes at a cost of a non-trivial runtime

increase when compared to the approximate approach. However, with the presented perfor-

mance enhancements the exact approach is able to find the complete solution set with runtime

comparable to the approximate approach.

1.3 Thesis Outline

This thesis is organized as follows. Chapter 2 provides background on SAT, traditional SAT-

based automated debugging, and PDR. Chapter 3 presents the approach based on traditional

debugging and approximation. Chapter 4 presents a complete and exact PDR-based approach

and performance enhancements. Chapter 5 presents the set of experiments comparing these

techniques while demonstrating their practical applicability. Finally, Chapter 6 summarizes the

results of this work and suggests potential topics of future work.


Chapter 2

Background

2.1 Introduction

This chapter provides background relevant to the contributions of this thesis. Section 2.2 gives

an overview of Boolean Satisfiability (SAT) and its application in computer-aided design (CAD)

problems. Section 2.3 defines the design debugging problem and explains traditional SAT-based

debugging. Section 2.4 introduces Property Directed Reachability (PDR). Finally, Section 2.5

summarizes the chapter.

2.2 Boolean Satisfiability for CAD

The Boolean Satisfiability (SAT) problem can be stated as follows: given a propositional formula

$\Phi(x_1, ..., x_n)$, find an assignment to $x_1, ..., x_n$ such that $\Phi$ evaluates to true (1), or indicate that

none exists. Such an assignment, if one exists, is called a satisfying assignment. If a formula

has a satisfying assignment, it is said to be satisfiable. Otherwise it is unsatisfiable. A SAT

solver is tasked with determining whether or not a given formula is satisfiable. If so, it returns

SAT along with a satisfying assignment. Otherwise, it returns UNSAT.
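The interface of a SAT solver described above can be sketched with a brute-force search in Python. This is for illustration only: it enumerates all assignments, whereas practical solvers use far more sophisticated search. The encoding of literals as signed integers (`k` for a variable and `-k` for its negation) is an assumption borrowed from common solver input conventions, not something the text prescribes.

```python
from itertools import product

def solve(clauses, num_vars):
    """Brute-force SAT check.

    Each clause is a list of nonzero integers: k stands for variable x_k
    and -k for its negation. Returns ("SAT", assignment) or ("UNSAT", None).
    """
    for bits in product([False, True], repeat=num_vars):
        # A clause holds if at least one of its literals is true.
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            return "SAT", {i + 1: bits[i] for i in range(num_vars)}
    return "UNSAT", None

# (x1 OR x2) AND (NOT x1) is satisfiable; (x1) AND (NOT x1) is not.
print(solve([[1, 2], [-1]], 2))
print(solve([[1], [-1]], 1))
```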

2.2.1 CNF Representation

In practice, modern SAT solvers operate only on formulas in Conjunctive Normal Form (CNF).

A formula in CNF is a conjunction of clauses, while a clause is a disjunction of literals. A literal


is an instance of a variable $x_i$ or its negation $\overline{x_i}$. Thus, the following formula is in CNF:

$$\Phi = (x_1 \lor x_2 \lor x_3) \land (x_2 \lor x_3) \land (x_2 \lor x_3 \lor x_4) \qquad (2.1)$$

Any propositional formula can be converted to CNF in polynomial time [15]. It is also

possible to convert a logic circuit to CNF. In the CNF representation of a circuit each internal

line, input, and output is represented by a variable. Logically, a gate simply imposes a constraint

on the values of the lines attached to it. For instance, the output of an inverter is always the

negation of its input. These constraints can be converted to CNF in linear time [39]. This is

done by replacing each logic gate with its characteristic function, which is a set of clauses in

CNF that represents the same constraints as the logic gate. The clauses are satisfiable if and

only if the constraints of the gate are met. For instance, a two-input AND gate implementing

the function y = x1 ∧ x2 has the characteristic function given below in Eq. 2.2.

$$(x_1 \lor \overline{y}) \land (x_2 \lor \overline{y}) \land (\overline{x_1} \lor \overline{x_2} \lor y) \qquad (2.2)$$

Each gate can be replaced by its characteristic function. The CNF representations of the

elementary gate types are shown in Table 2.1. Taking the conjunction of the CNF representa-

tion of each gate in a circuit gives a CNF representation of the entire circuit. Any satisfying

assignment to this formula represents a valid assignment of Boolean values to the circuit lines.

By adding constraints on the variables representing the input lines, this representation can be

used to simulate the combinational behavior of the circuit. By adding constraints on the output

lines, this representation can be used to determine if the circuit is capable of producing specific

outputs. Many CAD problems can be formulated in a similar manner.
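The claim that a gate's clauses are satisfied exactly when the gate's constraint holds can be checked exhaustively for small gates. The Python sketch below uses the standard three-clause encoding of a two-input AND gate and also shows how fixing the input variables amounts to simulating the gate. The signed-integer literal encoding and the function names are assumptions made for this example.

```python
from itertools import product

# Clauses for y = x1 AND x2 as signed integers: 1 -> x1, 2 -> x2, 3 -> y.
# (x1 OR NOT y), (x2 OR NOT y), (NOT x1 OR NOT x2 OR y)
AND_CLAUSES = [[1, -3], [2, -3], [-1, -2, 3]]

def satisfied(clauses, assignment):
    """True if every clause has at least one true literal."""
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)

def check_and_gate():
    """The clauses must hold exactly when y equals x1 AND x2."""
    for x1, x2, y in product([False, True], repeat=3):
        holds = satisfied(AND_CLAUSES, {1: x1, 2: x2, 3: y})
        if holds != (y == (x1 and x2)):
            return False
    return True

def simulate(x1, x2):
    """Constrain the inputs and return the only output value the clauses allow."""
    ys = [y for y in (False, True)
          if satisfied(AND_CLAUSES, {1: x1, 2: x2, 3: y})]
    assert len(ys) == 1
    return ys[0]

print(check_and_gate(), simulate(True, True), simulate(True, False))
```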

2.2.2 ILA Representation

Often CAD problems are concerned with the sequential behavior of a circuit. A sequential

circuit contains state elements such as D flip-flops (DFFs) and latches. A DFF has a data

input, clock input, and an output. On a positive edge of the clock input, the output is set to

the data input value at the moment of the positive edge. The output then remains constant

until the next positive clock edge. As SAT solvers have no concept of time and state elements,


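Table 2.1: Characteristic functions of elementary gates

Gate      Function                                     CNF Representation
AND       $y = x_1 \land ... \land x_n$                $\left(\bigwedge_{i=1}^{n} (x_i \lor \overline{y})\right) \land \left(\bigvee_{i=1}^{n} \overline{x_i} \lor y\right)$
NAND      $y = \overline{x_1 \land ... \land x_n}$     $\left(\bigwedge_{i=1}^{n} (x_i \lor y)\right) \land \left(\bigvee_{i=1}^{n} \overline{x_i} \lor \overline{y}\right)$
OR        $y = x_1 \lor ... \lor x_n$                  $\left(\bigwedge_{i=1}^{n} (\overline{x_i} \lor y)\right) \land \left(\bigvee_{i=1}^{n} x_i \lor \overline{y}\right)$
NOR       $y = \overline{x_1 \lor ... \lor x_n}$       $\left(\bigwedge_{i=1}^{n} (\overline{x_i} \lor \overline{y})\right) \land \left(\bigvee_{i=1}^{n} x_i \lor y\right)$
XOR       $y = x_1 \oplus x_2$                         $(\overline{x_1} \lor \overline{x_2} \lor \overline{y}) \land (x_1 \lor x_2 \lor \overline{y}) \land (x_1 \lor \overline{x_2} \lor y) \land (\overline{x_1} \lor x_2 \lor y)$
XNOR      $y = \overline{x_1 \oplus x_2}$              $(\overline{x_1} \lor \overline{x_2} \lor y) \land (x_1 \lor x_2 \lor y) \land (x_1 \lor \overline{x_2} \lor \overline{y}) \land (\overline{x_1} \lor x_2 \lor \overline{y})$
BUFFER    $y = x$                                      $(x \lor \overline{y}) \land (\overline{x} \lor y)$
NOT       $y = \overline{x}$                           $(x \lor y) \land (\overline{x} \lor \overline{y})$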

the representation in the previous section is insufficient to model the sequential behavior of a

circuit.

A common way of modeling sequential circuits is the Iterative Logic Array (ILA) represen-

tation. To explain this representation, it is first necessary to introduce appropriate notation.

Given a sequential circuit, let $X = \{x_1, ..., x_{|X|}\}$ denote its primary inputs, $Y = \{y_1, ..., y_{|Y|}\}$

denote its primary outputs, and $S = \{s_1, ..., s_{|S|}\}$ denote the state variables (flip-flops) of the

circuit. Further, let vectors $X^i = \{x^i_1, ..., x^i_{|X|}\}$, $Y^i = \{y^i_1, ..., y^i_{|Y|}\}$, and $S^i = \{s^i_1, ..., s^i_{|S|}\}$ denote

the values of the primary inputs, primary outputs, and state variables at cycle $i$, respectively.

A circuit’s ILA representation is constructed by replicating the combinational part (tran-

sition relation) k times, where each copy is called a time-frame. The next-state variables of

frame i are connected to the current state variables of frame i + 1. As such, an ILA of k

time-frames models the circuit’s behavior over k clock cycles. The following example illustrates

the construction of an ILA.

Example 2.1. Figure 2.1 depicts the construction of an ILA of two time-frames. In Fig-


Figure 2.1: (a) A sequential circuit (b) ILA representation with 2 time-frames

ure 2.1(a), a sequential circuit is shown with a single state element s1, two inputs x1 and x2,

and a single output y1. To construct an ILA, all of the primary input, primary output, and

internal lines are replicated along with the transition relation of the circuit. This can be seen in

Figure 2.1(b) where each box contains one copy of the transition relation and all lines are dupli-

cated. Additionally for flipflop s1, the input D from time-frame 1 has replaced the output Q in

time-frame 2. The combinational circuit in Figure 2.1(b) models the behavior of the sequential

circuit over 2 clock cycles. For instance, consider constraining the values of the primary input

variables with $X^1$ in frame 1 and $X^2$ in frame 2. Doing so, this ILA could be used to compute

the response $Y^1$ and $Y^2$ that would be generated by the original circuit for clock cycles 1 and

2, respectively.

The ILA is a combinational circuit that effectively models the sequential behavior of a circuit

for a limited number of clock cycles. As such, it can be transformed to a formula in CNF in

the manner described in section 2.2.1. This representation can be used to solve various CAD

problems. For instance, conjoining the clause $(s^3_1)$ to the CNF representation of the ILA in

Figure 2.1(b) gives a formula that is satisfiable if and only if it is possible to reach a state in

which s1 = 1 in 2 clock cycles.
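The unrolling idea can be sketched in a few lines of Python. The transition logic below is hypothetical (the gates of the circuit in Figure 2.1 are not given in the text), the initial state is assumed to be 0, and exhaustive enumeration of input sequences stands in for the SAT query. The structure is the point: one copy of the combinational logic per time-frame, with the next state of frame i feeding the current state of frame i + 1.

```python
from itertools import product

def transition(s, x1, x2):
    """Hypothetical combinational logic: returns (next_state, output)."""
    return s or (x1 and x2), s

def ila(s1, inputs):
    """Chain one copy of `transition` per time-frame.

    Returns the state sequence s^1..s^(k+1) and the outputs y^1..y^k,
    where k is the number of input vectors supplied.
    """
    states, outputs = [s1], []
    for x1, x2 in inputs:
        nxt, y = transition(states[-1], x1, x2)
        states.append(nxt)
        outputs.append(y)
    return states, outputs

def reachable(target, k, s1=False):
    """Look for k input vectors that drive the final state to `target`."""
    frames = list(product([False, True], repeat=2))
    for inputs in product(frames, repeat=k):
        states, _ = ila(s1, inputs)
        if states[-1] == target:
            return list(inputs)
    return None

# Analogue of conjoining the unit clause on the final state variable.
print(reachable(True, 2))
```

A return value of `None` plays the role of UNSAT here: no input sequence of the given length reaches the target state.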

2.3 SAT-based Debugging

In verification and debugging, a failure is incorrect behavior that differs from the specification.

For a particular failure, an error is a design location (i.e., a wire in the design) that can be

changed to correct the failure. Design debugging is the task of locating the error when functional


verification detects a failure. Traditionally, a failure is revealed through means such as firing

assertions, observation value mismatches, scoreboard discrepancies, etc., and an error trace that

demonstrates the failure is returned.

In these cases, an automated debugging utility [38] can be used with the error trace to find

the error. Assume the error trace has $k$ clock cycles, and let vectors $X^1, ..., X^k$ represent the

input values from the error trace. Let vectors $\mathcal{Y}^1, ..., \mathcal{Y}^k$ denote the correct output values for the

trace according to the circuit’s specification (the expected response). Since the trace exhibits a

failure, $\mathcal{Y}^i$ must differ from the observed response $Y^i$ at some clock cycle $i$.

First, the transition relation is enhanced at a set of suspect locations. A suspect location is

a line in the circuit that is suspected of being an error. In the absence of a priori knowledge

about the cause of the failure, every location in the circuit is a suspect. Let L = {l1, ..., l|L|} be

the suspect locations. For each suspect location li, an error-select line ei and a free variable wi

are added. The suspect location is replaced by a multiplexer with output zi, 0-input li, 1-input

wi, and select input ei as depicted in Figure 2.2. At each of its fanout locations, li is replaced

by zi. It can be seen that when ei = 0 the behavior is unaffected. Conversely, the suspect

location is replaced by an unconstrained free variable when ei = 1, allowing it to behave as an

arbitrary Boolean function.

Figure 2.2: Error multiplexer inserted at suspect location li
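The behavior of the error multiplexer can be stated in a few lines. The following is a minimal sketch; the function name `error_mux` is an illustrative assumption, and values are modeled as the integers 0 and 1.

```python
def error_mux(l_i, w_i, e_i):
    """Output z_i of the error multiplexer: l_i when e_i = 0, w_i when e_i = 1."""
    return w_i if e_i else l_i

# With e_i = 0 the original value passes through regardless of w_i.
assert all(error_mux(l, w, 0) == l for l in (0, 1) for w in (0, 1))
# With e_i = 1 the free variable w_i replaces the suspect line.
assert all(error_mux(l, w, 1) == w for l in (0, 1) for w in (0, 1))
```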

A SAT-based debugging tool then constructs a k-frame ILA representation of the circuit.

In this step, the error-select lines are handled separately. While each other line in the enhanced

transition relation is replicated k times, the error-select lines are not replicated. Instead, for

location li, a single copy of ei is the select-input of the inserted multiplexer in every time-frame.

Figure 2.3 depicts a debugging ILA for the circuit of Figure 2.1(a). It can be seen that the

error-select lines are not replicated.

Figure 2.3: Debugging ILA

The debugging ILA is constrained with the input values X1, ..., Xk from the error trace

and the expected response Y1, ..., Yk according to the specification. Since the output of the

circuit does not match the specification, the instance is unsatisfiable if ei = 0 for all 1 ≤ i ≤ |L|

as the ILA will behave exactly the same as the original circuit. Conversely, the instance is

trivially satisfiable if ei = 1 for all 1 ≤ i ≤ |L| since every line can be replaced by an arbitrary

value. Notice that the set of locations associated with the error-select lines assigned to 1 in

any satisfying assignment represents a set of locations that can be simultaneously modified to

correct the failure exposed by the error trace.

In order to get meaningful results the number of simultaneous errors must be constrained.

This is accomplished by adding a cardinality constraint φn on the error-select lines. The constraint

φn enforces that exactly n error-select lines are active in any satisfying assignment. For

the case of n = 1, the CNF representation of φ1 is shown in Eq. 2.3. In that equation, the first

clause ensures that at least one error-select line is active. The remaining clauses ensure

that no pair of error-select lines is simultaneously active.

\[
\phi_1 = (e_1 \lor \dots \lor e_{|L|}) \land \bigwedge_{1 \le i < j \le |L|} (\lnot e_i \lor \lnot e_j) \tag{2.3}
\]
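The structure of this constraint can be checked on a small instance. The sketch below builds the clauses for a hypothetical set of four error-select lines and enumerates all assignments; the signed-integer clause representation and the helper names are assumptions made for illustration only.

```python
from itertools import combinations, product

def exactly_one_cnf(num_lines):
    """Clauses of phi_1 over variables 1..num_lines (a negative integer is a negated literal)."""
    variables = range(1, num_lines + 1)
    clauses = [list(variables)]                                     # at least one e_i is active
    clauses += [[-i, -j] for i, j in combinations(variables, 2)]    # no pair is active together
    return clauses

def satisfies(assignment, clauses):
    # assignment[i - 1] holds the value of e_i
    return all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses)

clauses = exactly_one_cnf(4)
models = [a for a in product([False, True], repeat=4) if satisfies(a, clauses)]
print(len(clauses), len(models))
```

Each surviving assignment activates exactly one error-select line, and the number of pairwise clauses grows quadratically in |L|.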

For cases with a larger error cardinality, the constraint can be implemented using an adder

with |L| one-bit inputs (one per error-select line) and an output wide enough to represent values from 0 to |L|. The

adder’s output is then fed to a comparator that outputs 1 if and only if its input is exactly

equal to n. This hardware construction is shown in Figure 2.4.

Figure 2.4: Hardware construction for error cardinality n > 1

A satisfying assignment therefore has exactly n active error-select lines. This corresponds

to an n-tuple of suspect locations that can be simultaneously modified to correct the erroneous

behavior exposed by the error trace. An all-solutions SAT solver is then used to find every

satisfying assignment to the resulting formula.
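The overall flow for error cardinality n = 1 can be sketched on a toy design. Everything below is hypothetical: the four-line circuit, its bug, and the two-cycle error trace are invented for illustration, the circuit is combinational so that each time-frame is independent, and exhaustive enumeration replaces the all-solutions SAT solver.

```python
from itertools import product

SUSPECTS = ["a", "b", "c", "y"]

def simulate(x, select, w):
    """Evaluate one time-frame; `select` is the active suspect (or None) and w its free value."""
    x1, x2, x3 = x
    def line(name, value):
        # Error multiplexer: the selected suspect is replaced by the free variable.
        return w if select == name else value
    a = line("a", x1 & x2)   # buggy gate: the assumed specification intends x1 | x2
    b = line("b", 1 - x3)
    c = line("c", x1 ^ x3)
    y = line("y", a | b)
    return (y, c)            # primary outputs

# Error trace: input vector and expected response for each clock cycle.
inputs = [(1, 0, 1), (0, 0, 1)]
expected = [(1, 0), (0, 1)]

def solutions_cardinality_one():
    found = []
    for s in SUSPECTS:
        # One shared error-select per suspect, but a fresh free variable w in every frame.
        for ws in product([0, 1], repeat=len(inputs)):
            if all(simulate(x, s, w) == exp for x, w, exp in zip(inputs, ws, expected)):
                found.append(s)
                break
    return found

print(solutions_cardinality_one())
```

In this toy instance the line `c` is not returned, since no value on it can repair output `y`, while `a`, `b`, and `y` are all reported as locations where a change could correct the failure.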

2.4 Property Directed Reachability

2.4.1 Notation and Terminology

Before introducing concepts relevant to Property Directed Reachability (PDR), it is necessary

to give notation and terminology that will be used throughout this thesis. Given a sequential

circuit C, let S = {s1, ..., s|S|} denote the set of current-state variables (flipflops) of C. Similarly,

let S′ = {s′1, ..., s′|S|} denote the set of next-state variables (inputs to flipflops) of C. The set

of initial states of C is denoted by I ⊆ {0, 1}^S . For the purpose of model checking, the circuit

can be modeled as a Finite State Machine (FSM) M = (S, I, T ). What follows formally defines

a state of the circuit.

Definition 2.1. Each assignment to the state variables t ∈ {0, 1}^S is a state of the circuit.

A state t can be represented by a cube, which is simply a conjunction of literals. The cube

is formed by taking the conjunction of the positive literals for each variable assigned to 1 along

with the negative literals for each variable assigned to 0. The transition relation of the circuit

is denoted by T ⊆ {0, 1}^S × {0, 1}^S . A pair of states 〈t0, t1〉 ∈ T if and only if there is an

assignment to the primary input that causes the circuit to transition from state t0 to t1. The

following definition will be used to formally define the reachability of states under the circuit’s

transition relation.

Definition 2.2. A sequence of states t0, ..., tn is a trace of the circuit if and only if 〈ti, ti+1〉 ∈ T

for all 0 ≤ i < n and t0 ∈ I.
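These notions are easy to experiment with when states are written as bit strings. The sketch below is only illustrative: the transition relation `T` and initial set `I` are hypothetical, and an explicit set of state pairs stands in for the symbolic transition relation.

```python
def cube(state):
    """Render a state such as '100' as a conjunction of literals over s1..sn."""
    return " ∧ ".join(("" if bit == "1" else "¬") + f"s{i}"
                      for i, bit in enumerate(state, start=1))

# Hypothetical transition relation T and initial states I over three state variables.
T = {("000", "001"), ("001", "010"), ("010", "011"), ("011", "100"), ("100", "000")}
I = {"000"}

def is_trace(states):
    """Definition 2.2: the sequence starts in I and every consecutive pair is in T."""
    return states[0] in I and all(pair in T for pair in zip(states, states[1:]))

print(cube("100"))
print(is_trace(["000", "001", "010"]))
print(is_trace(["001", "010"]))
```

The last call fails the check only because the sequence does not start in an initial state.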

When considering the reachability of a state, the number of cycles it takes to reach the

state might be relevant. In other cases, it may only be important to know whether or not the

state can be reached at all. There are therefore two notions of reachability that follow from the

definition of a trace, defined below.

Definition 2.3. A state t is reachable if and only if t appears in some trace of the circuit. It

is also i-step reachable if and only if it appears in a trace of length less than i.
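For small explicit state spaces, both notions of reachability can be computed by a breadth-first search. The sketch below assumes that i-step reachable means reachable using at most i transitions, and it uses a hypothetical transition relation; neither assumption is taken from the definitions above.

```python
def reachable_within(I, T, i):
    """States reachable from I using at most i transitions (assumed reading of i-step)."""
    seen = set(I)
    frontier = set(I)
    for _ in range(i):
        frontier = {t1 for (t0, t1) in T if t0 in frontier} - seen
        seen |= frontier
    return seen

def reachable(I, T):
    """Fixed point: keep adding successors until nothing new appears."""
    seen = set(I)
    while True:
        new = {t1 for (t0, t1) in T if t0 in seen} - seen
        if not new:
            return seen
        seen |= new

# Hypothetical FSM over three state variables.
I = {"000"}
T = {("000", "001"), ("001", "010"), ("010", "011"), ("011", "100"),
     ("100", "000"), ("101", "111"), ("110", "111")}

print(sorted(reachable_within(I, T, 2)))
print(sorted(reachable(I, T)))
```

Explicit enumeration of this kind is intractable for realistic designs, which is why symbolic over-approximations are used instead.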

Another aspect of reachability that is often relevant is the reachability of sets of states. For

some predicate P ⊆ {0, 1}^S , any state t ∈ P is referred to as a P -state in this thesis. The

predicate can be represented by a Boolean formula over the state variables of S. For predicate

P , let P ′ denote the same predicate over the corresponding variables of S′. This allows the

construction of SAT instances such as P ∧ T ∧ Q′, which indicate whether any P -state can

transition to a Q-state. The predicate P is said to be (i-step) reachable if and only if any

P -state is (i-step) reachable. The following definition allows reasoning about the reachability

of every state in P .

Definition 2.4. A predicate P is i-step invariant if and only if I ⊆ P and P includes every

i-step reachable state. It is also invariant if and only if it includes every reachable state.

From this definition an i-step invariant over-approximates the set of i-step reachable states,

while an invariant over-approximates the set of all reachable states. This is the intuitive mean-

ing behind the term invariant. Since an invariant over-approximates all reachable states, it

represents a property that always holds during the operation of the FSM. In practice, it may be

difficult to determine whether or not a given predicate is an invariant. The following definition

gives a stronger notion of invariance that can be checked for using a SAT solver.

Definition 2.5. A predicate P is an inductive invariant if and only if P is an invariant that is

also closed under T (i.e., t ∈ P ∧ t′ ∈ ¬P ⇒ 〈t, t′〉 ∉ T ).

An inductive invariant P over-approximates the set of all reachable states since it contains

all initial states and no ¬P -state is reachable from a P -state. The Boolean formula P ∧T ∧¬P ′

is unsatisfiable if and only if P is closed under T , providing a means to determine if a predicate

is an inductive invariant.
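The closure check can be mimicked on an explicit state space, where searching for a transition that leaves P plays the role of the SAT query P ∧ T ∧ ¬P′. The FSM below is hypothetical, and the set-based representation is an assumption made to keep the sketch self-contained.

```python
def closed_under(P, T):
    """True iff no transition leaves P, i.e., P ∧ T ∧ ¬P' has no satisfying pair."""
    return not any(t0 in P and t1 not in P for (t0, t1) in T)

def is_inductive_invariant(P, I, T):
    # Closure together with I ⊆ P implies that P contains every reachable state.
    return set(I) <= set(P) and closed_under(P, T)

STATES = {format(n, "03b") for n in range(8)}
I = {"000"}
T = {("000", "001"), ("001", "010"), ("010", "011"), ("011", "100"),
     ("100", "000"), ("101", "111"), ("110", "111")}

P = STATES - {"111"}                        # invariant here, but not closed: 110 steps to 111
R = {"000", "001", "010", "011", "100"}     # closed under T and contains I

print(is_inductive_invariant(P, I, T), is_inductive_invariant(R, I, T))
```

The predicate `P` in this toy case holds in every reachable state yet fails the closure check, which illustrates why inductive invariance is the stronger notion.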

2.4.2 High-Level Overview

PDR is a model checking algorithm [9]. Model checking refers to the task of determining if a

property holds for a design. Properties can be classified as either liveness or safety properties.

A safety property is a set of states that must be unreachable for the property to hold. A

liveness property is a set of states where at least one member of the set must be reachable

for the property to hold. Given a safety (liveness) property P , an inductive invariant J where

J ∩ P = ∅ is a certificate proving that P holds (does not hold). Conversely, a trace t0, ..., tn

where ti ∈ P for some 0 ≤ i ≤ n is a certificate proving that safety (liveness) property P

does not hold (holds). Throughout this thesis, it is assumed that an algorithm PDR(M,P, k)

exists, where M = (S, I, T ) is an FSM. It returns Reachable if and only if a P -state is k-step

reachable under M . If k =∞, it returns Reachable if and only if a P -state is reachable under

M . Otherwise it returns Unreachable.
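The interface assumed for PDR(M, P, k) can be pinned down with a reference stand-in. The sketch below is not PDR: it is an explicit-state breadth-first search that returns the same answers on tiny examples, under the assumption that k-step reachable means reachable within at most k transitions. The name `pdr_oracle` and the example FSM are hypothetical.

```python
import math

def pdr_oracle(M, P, k=math.inf):
    """Return 'Reachable' iff some P-state is reachable within k transitions."""
    S, I, T = M          # S (the state variables) is unused by this explicit-state stand-in
    seen, frontier, steps = set(I), set(I), 0
    while frontier:      # frontier holds the states first reached after exactly `steps` steps
        if frontier & P:
            return "Reachable"
        if steps >= k:
            break
        frontier = {t1 for (t0, t1) in T if t0 in frontier} - seen
        seen |= frontier
        steps += 1
    return "Unreachable"

M = (("s1", "s2", "s3"), {"000"},
     {("000", "001"), ("001", "010"), ("010", "011"), ("011", "100"),
      ("100", "000"), ("101", "111"), ("110", "111")})

print(pdr_oracle(M, {"100"}, 3), pdr_oracle(M, {"100"}, 4), pdr_oracle(M, {"111"}))
```

Such an oracle is useful mainly for cross-checking results on small designs, since it enumerates states explicitly.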

Given a safety property P ⊆ {0, 1}^S , PDR attempts to find an inductive invariant proving

that the property holds. The property P represents the set of “safe” states, i.e., P holds if

and only if no ¬P -state is reachable. Its complement ¬P represents the set of “unsafe” states.

Unlike many CAD algorithms based on SAT, PDR does not use an ILA representation of the

circuit. Rather, it constructs a series of SAT instances using a single copy of the transition

relation. The SAT instances are aimed at finding states that lead to a violation of the property.

PDR then attempts to prove that these states cannot be reached in a bounded number of

steps. If P holds, these proofs eventually allow the algorithm to discover an inductive invariant

proving that fact. Otherwise, PDR fails to compute a needed invariant and instead finds a

trace leading to an unsafe state.

2.4.3 Detailed Description

In greater detail, the algorithm proceeds as follows. The set of initial states I is represented

as a CNF formula also referred to as I, since these are merely different representations of the

same thing. The given safety property P is similarly represented by a formula in CNF, as is the

transition relation T . The first step is a precheck for zero-step and one-step counter-example

traces. The formula I ∧ (¬P ) checks for zero-step counter-example traces, as it is satisfiable

if and only if ¬P contains an initial state. Subsequently, the satisfiability of I ∧ T ∧ (¬P ′) is

checked to find any one-step counter-example traces as this formula is satisfiable if and only

if an initial state reaches an unsafe state in one step. The following example illustrates the

precheck step and will be referred to throughout this section.

Example 2.2. Consider the FSM depicted in Figure 2.5. In the figure, states are identified

by their binary encodings. For example, the state 100 is the state represented by the cube

(s1 ∧ ¬s2 ∧ ¬s3). Arrows indicate state transitions that are present in the transition relation. For

instance, the figure shows that 〈000, 001〉 ∈ T . The only unsafe state is 111. The precheck step

executes two SAT queries. The first is I ∧ ¬P , which checks if any initial state is unsafe. This

query is UNSAT, since the initial state is not 111. The second SAT query is I ∧ T ∧ ¬P ′, which

checks if any unsafe state is one-step reachable. This query is also UNSAT as 〈000, 111〉 ∉ T . At

this point, the precheck step is done, indicating that P is one-step invariant.
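The two precheck queries can be replayed with explicit sets. In the sketch below, the transition relation is an assumption: only the edges mentioned in the examples of this section (000 to 001, 011 to 100, 101 to 111, and 110 to 111) come from the text, and the remaining edges are invented so that the code is self-contained.

```python
I = {"000"}
NOT_P = {"111"}                      # the only unsafe state
T = {("000", "001"), ("011", "100"), ("101", "111"), ("110", "111"),   # stated in the text
     ("001", "010"), ("010", "011"), ("100", "000")}                   # assumed edges

def precheck(I, T, NOT_P):
    if I & NOT_P:                                        # query I ∧ ¬P
        return "zero-step counter-example"
    if any(t0 in I and t1 in NOT_P for t0, t1 in T):     # query I ∧ T ∧ ¬P'
        return "one-step counter-example"
    return "P is one-step invariant"

print(precheck(I, T, NOT_P))
```

Starting instead from a state such as 110 would make the second query succeed and yield a one-step counter-example.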

Figure 2.5: Example finite state machine

After the precheck step, the algorithm begins computing invariants. It maintains a series

of predicates 〈F0 = I, F1, F2, ...〉 represented as CNF formulas. This is referred to as the

inductive trace. At every step in the process, each clause of each Fi is i-step invariant. Since

the clauses of Fi over-approximate the i-step reachable set, their conjunction (i.e., Fi itself)

also over-approximates the i-step reachable set. Each clause of Fi is also a clause of every Fj

for 1 ≤ j < i, as any i-step invariant clause is also j-step invariant for 1 ≤ j < i. In other

words, Fj+1 has a subset of the clauses of Fj for all j > 0. This implies that Fj ⊆ Fj+1, as Fj

simply has more clauses constraining it than Fj+1. Additionally, F0 is exactly the set of initial

states. Figure 2.6 below illustrates the intuition behind this process. In the example it shows

I ⊆ F1 ⊆ F2 ⊆ F3 and F3 ∩ ¬P = ∅. The latter proves that P is three-step invariant, as no

unsafe state (i.e., no ¬P -state) can be reached in 3 steps from an initial state.

Figure 2.6: Example state-space over-approximations

Initially, PDR sets F1 = P as P is known to be one-step invariant from the SAT queries in

the precheck step. Next, PDR proceeds through a series of iterations 1, 2, ... in which iteration

i attempts to prove that P is (i+ 1)-step invariant. To do so, PDR repeatedly finds a state of

Fi that is one step away from a ¬P -state. This is done by finding a satisfying assignment to

the formula below:

Fi ∧ T ∧ (¬P ′) (2.4)

If Eq. 2.4 is satisfiable, then the satisfying assignment includes a state t ∈ Fi that is one step

from a violation of P . If t is indeed i-step reachable, then P does not hold. Assume for the

moment that t is not i-step reachable. This means that the clause ¬t is an i-step invariant that

can safely be conjoined to Fi. Doing so would prevent finding any further satisfying assignments

to Eq. 2.4 involving t. This procedure is demonstrated in the following example.

Figure 2.7: (a) A predecessor of an unsafe state in F3 (b) Approximations after refining F3

Example 2.3. Consider the FSM of Figure 2.5. After the precheck step explained in Example 2.2,

P is known to be one-step invariant. Therefore, F1 is initialized to P . Subsequently,

the algorithm tries to find a satisfying assignment to the formula F1 ∧ T ∧ ¬P ′. Since F1 = P ,

we have 110 ∈ F1, and the formula is satisfied with 110 as the chosen F1-state. The SAT solver

could also have found an assignment involving the state 101, as it is also an F1-state that is

one step from an unsafe state.
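The query of Eq. 2.4 can be imitated by listing every state of Fi that has a successor in ¬P. This is a sketch over an explicit transition relation; the edges 101 to 111 and 110 to 111 follow the examples, while the rest of `T` and the helper name `bad_predecessors` are assumptions.

```python
STATES = {format(n, "03b") for n in range(8)}
NOT_P = {"111"}
T = {("000", "001"), ("001", "010"), ("010", "011"), ("011", "100"),
     ("100", "000"), ("101", "111"), ("110", "111")}

def bad_predecessors(F_i, T, NOT_P):
    """All F_i-states with a successor in ¬P: the state parts of models of F_i ∧ T ∧ ¬P'."""
    return sorted({t0 for (t0, t1) in T if t0 in F_i and t1 in NOT_P})

F1 = STATES - NOT_P        # F1 is initialized to P
print(bad_predecessors(F1, T, NOT_P))
```

A SAT solver returns only one such state per query, so which of the candidates is reported first depends on the solver.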

Finding and blocking every state in this manner could be highly inefficient. PDR therefore

uses a process known as generalization to compute a more general invariant clause d, which is

then conjoined to Fi. The clause d contains a subset of the literals of ¬t. As d is still i-step

invariant, it blocks state t and may block other states that are not i-step reachable. Figure 2.7

illustrates this procedure. In Figure 2.7(a) a predecessor of an unsafe state is found in F3.

Assume this state is not actually three-step reachable. It is therefore removed from F3 as

depicted in Figure 2.7(b), along with some other states that are not three-step reachable. The

generalization procedure is key to the performance of PDR and is the subject of a great deal

of research [9, 13, 21, 25, 24, 3]. A full discussion is beyond the scope of this thesis, as it can

be considered a “black box” procedure.

In order to determine whether t is merely a “spurious” result of the over-approximate nature

of Fi or is actually reachable, PDR essentially calls itself recursively. In doing so, it may add

new clauses to the formulas Fj where 1 ≤ j < i. If i > 1, then this results in a search for an

Fi−1-state that reaches t in one step, which may lead to further recursive calls. If i = 1, then

this results in a search for an initial state that reaches t in one step. Notice that the inductive

trace F1, F2, ..., Fi may play a substantial role in the depth of recursion required to block state

t. If each Fj (j < i) is a poor over-approximation of the set of j-step reachable states (i.e., it

includes many states that aren’t j-step reachable), then many recursive calls may be required

to block t. However, if the formulas closely represent the j-step reachable sets, then few of these

spurious states are found and the recursive calls may terminate sooner. The following example

demonstrates an update of the inductive trace.

Example 2.4. Continuing the illustration of the algorithm from Example 2.3, consider the

FSM of Figure 2.5. Assume the SAT solver found 110 as an F1-state that is one step from an

unsafe state. This leads to a recursive query to determine if the state is one-step reachable.

PDR executes the SAT query I ∧ T ∧ (s1 ∧ s2 ∧ ¬s3)′ to determine if an initial state can reach

the state 110 in one step. This query is UNSAT since 〈000, 110〉 ∉ T . Therefore, this state

must be removed from F1. Without generalization, this could be accomplished by adding the

clause (¬s1 ∨ ¬s2 ∨ s3) to F1. However, it can be seen that no s1-state is one-step reachable, so

generalization may drop the literals ¬s2 and s3. In this case, F1 = P ∧ (¬s1) is the resulting

updated formula.
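Blocking and generalizing a state can be illustrated with a deliberately naive procedure. The sketch below drops literals from the blocking clause as long as every one-step reachable state still satisfies it. It uses an explicitly enumerated reachable set as an oracle, which is an assumption made for illustration; PDR itself decides this with SAT queries, and practical generalization procedures are considerably more refined.

```python
def satisfies(state, clause):
    """clause: list of (variable index, required value); true if any literal holds."""
    return any((state[v - 1] == "1") == val for v, val in clause)

def blocking_clause(state):
    """The clause ¬t: negate every literal of the cube of state t."""
    return [(v, bit == "0") for v, bit in enumerate(state, start=1)]

def generalize(clause, step_reachable):
    """Drop literals while every state in `step_reachable` still satisfies the clause."""
    kept = list(clause)
    for lit in reversed(clause):             # the result depends on this order
        trial = [l for l in kept if l != lit]
        if trial and all(satisfies(s, trial) for s in step_reachable):
            kept = trial
    return kept

one_step_reachable = {"000", "001"}          # assumed: the initial state and its successor
clause = blocking_clause("110")              # (¬s1 ∨ ¬s2 ∨ s3)
print(generalize(clause, one_step_reachable))
```

With this literal order the result is the single literal ¬s1, encoded as `(1, False)`; trying the literals in a different order can produce a different but equally valid clause.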

This process may lead to a trace that reaches t, thereby disproving the property P . Al-

ternatively, it leads to a proof that t is not i-step reachable, allowing PDR to update Fi in

the manner described above. PDR continues finding such states until Eq. 2.4 is unsatisfiable,

implying that no Fi-state can reach a ¬P -state in one step. Since Fi over-approximates the set

of i-step reachable states, this implies that P is (i + 1)-step invariant. The following example

builds on the previous one to demonstrate this process.

Example 2.5. Continuing from Example 2.4, assume that F1 = P ∧ (¬s1) after generalization.

The algorithm then executes a SAT query using the formula F1 ∧ T ∧ ¬P ′. The only states

that are one step from a ¬P -state are 101 and 110. Therefore, the query is UNSAT, as neither

of these states satisfies the clause (¬s1).

Next, the algorithm begins constructing Fi+1 in preparation for the next iteration. It is

possible to simply set Fi+1 = P , since P is now known to be (i + 1)-step invariant. However,

some of the clauses from Fi may be provably (i+ 1)-step invariant. Additionally, the iteration

may have resulted in new clauses being added to F1, ..., Fi−1, and clauses from any Fj may be

provably (j + 1)-step invariant. To make use of these facts, the algorithm performs a clause

propagation step. For each Fj , PDR attempts to prove that each clause of Fj is (j + 1)-step

invariant. If so, the clause is added to Fj+1. For a clause c ∈ Fj , this is accomplished with the

following SAT query:

Fj ∧ T ∧ ¬c′ (2.5)

This query asks the question “can any Fj-state reach a ¬c-state?” As Fj includes every j-step

reachable state, this can be rephrased as “can any j-step reachable state reach a ¬c-state?” If

the answer is no (i.e., the query is unsatisfiable), then c also over-approximates the (j+ 1)-step

reachable set and can be added to Fj+1. This process is repeated for every clause from each of

F1, ..., Fi. The following example demonstrates this procedure.

Example 2.6. Continuing the illustration of the algorithm from the previous example, recall

that F1 = P ∧ (¬s1). The algorithm now tries to propagate the clause (¬s1). It executes the SAT

query F1 ∧ T ∧ (s1′). This query is satisfiable, because 011 ∈ F1, 100 ∈ (s1), and 〈011, 100〉 ∈ T .

Despite this result, note that the clause (¬s1) does over-approximate two-step reachability and

therefore could be included in F2. However, F1 would require additional clauses to support the

proof of this fact. Therefore, at the beginning of iteration 2, F2 = P .
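The propagation query of Eq. 2.5 can be replayed on explicit sets for the situation in Example 2.6. In this sketch a clause is represented by the set of states that satisfy it; the transition relation contains the edge from 011 to 100 mentioned in the example, and its remaining edges are assumptions.

```python
STATES = {format(n, "03b") for n in range(8)}
T = {("000", "001"), ("001", "010"), ("010", "011"), ("011", "100"),
     ("100", "000"), ("101", "111"), ("110", "111")}

def can_propagate(F_j, T, c_states):
    """True iff F_j ∧ T ∧ ¬c' is unsatisfiable: no F_j-state steps outside clause c."""
    return not any(t0 in F_j and t1 not in c_states for (t0, t1) in T)

P = STATES - {"111"}
not_s1 = {s for s in STATES if s[0] == "0"}     # states satisfying the clause (¬s1)
F1 = P & not_s1

print(can_propagate(F1, T, not_s1), can_propagate(F1, T, P))
```

The clause (¬s1) fails to propagate because 011 is in F1 and steps to 100, whereas the clauses of P propagate from F1 in this toy instance.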

During the process of clause propagation, it is possible that every clause from formula Fj is

propagated to Fj+1 for some j ≤ i. Since Fj is in CNF, it can be rewritten as Fj = c1 ∧ ...∧ cn.

Since every clause of Fj was propagated to Fj+1, Eq. 2.5 is unsatisfiable for every clause of Fj .

This implies that the following formula is unsatisfiable:

Fj ∧ T ∧ (¬c1 ∨ ... ∨ ¬cn)′ (2.6)

Note that (¬c1 ∨ ... ∨ ¬cn)′ = ¬F′j . In other words, Fj ∧ T ∧ ¬F′j is unsatisfiable, meaning Fj

is closed under the transition relation. By Definition 2.5, Fj is an inductive invariant. It also

does not include any unsafe states by construction, so it proves that P holds. If this occurs, the

algorithm terminates. Otherwise, PDR begins a new iteration after clause propagation. The

following example demonstrates the termination of the algorithm and concludes this section.

Example 2.7. Continuing from Example 2.6, iteration 2 begins with F2 = P . As in Example 2.3,

the query F2 ∧ T ∧ ¬P ′ is satisfied by the states 110 ∈ F2 and 111 ∈ ¬P . Similar to

the earlier example, the state 110 is not two-step reachable, so the clause (¬s1 ∨ ¬s2 ∨ s3) can

be conjoined to F2. Assume generalization yields the clause (¬s1 ∨ ¬s2), and let c denote that

clause. Now F2 = P ∧ c and the SAT query F2 ∧ T ∧ ¬P ′ is UNSAT, ending iteration 2. Clause

propagation begins, and the formula F2 ∧ T ∧ ¬c′ is UNSAT. Since F2 only has clauses c and P ,

this implies that F2 is an inductive invariant proving that P holds. Indeed, one can verify that

F2 includes all states other than 101, 110, and 111. From Figure 2.5 it can be seen that F2

models the set of reachable states exactly.

The end result of Example 2.7 demonstrates a more intuitive notion of the inductive invariant

returned by PDR. No state in F2 can reach a state outside of F2. Since F2 includes all initial

states, this implies it over-approximates the reachable states. Additionally, F2 is merely a subset

of P , as it is represented by P with additional clauses conjoined. Since P includes F2 and F2

includes the reachable set, P includes the reachable set, implying it is invariant. The inductive

invariant F2 is simply a stronger version of P , that is, one that includes fewer states. The

benefit of an inductive invariant is that a single SAT query can be used to check for inductive

invariance, as explained in Definition 2.5. However, no simple method is known to check a

formula for invariance in the general case.

2.5 Summary

This chapter introduces background material relevant to the contributions of the thesis. First,

Boolean satisfiability is introduced along with its application to typical CAD problems. Tra-

ditional SAT-based debugging is presented next. Finally, a brief introduction to the model

checking algorithm of Property Directed Reachability is given.

Chapter 3

Traceless Debugging Using

Approximation and Unrolling

3.1 Introduction

When functional verification reveals an error, debugging begins in an attempt to localize and

correct the failure. Dynamic verification involves simulating the design using known input

vectors and observing that the response matches expectations. When dynamic verification

reveals an error, the known input vector and expected response provide an error trace that can

be used to guide a traditional SAT-based automated debugging tool [38]. Static verification

involves using formal methods to prove that the design implements a set of properties. Typically,

this consists of using a model checking algorithm such as PDR [9] to prove that a set of safety

properties holds. When a property fails, the model checker returns an error trace and SAT-based

debugging can be readily applied.

Static and dynamic verification can also reveal other types of errors for which no error trace

is available. With static verification, it is possible to use a model checker to prove that the

design implements a set of liveness properties. In this case, failure implies that a set of states

is unreachable in violation of the design specification. If the model checker reports a failure,

an error is clearly detected but an error trace is not readily available to guide an automated

debugging tool. Similarly, during dynamic verification it is possible to count the number of

times the design enters known desirable states. If it is revealed that a state is never entered,

this provides suspicion that the state is unreachable in violation of the specification. Static

verification could be used to confirm this fact. In this case as well, the error traces needed to

guide a SAT-based automated debugging tool are not readily available.

This chapter presents a novel automated technique to debug these kinds of failures in the

absence of an error trace [5]. The algorithm makes use of the liveness property itself in order

to debug the failure. Instead of using an error trace directly, PDR is used to compute an

over-approximation of the set of reachable states for a bounded number of clock cycles. The

algorithm proceeds through a user-specified number of iterations in which the i-th iteration over-

approximates the set of all i-step reachable states. Subsequently, a partial ILA similar to that

described in section 2.3 is constrained with the over-approximation states and the target states

and then converted to a SAT instance. For conceptual simplicity, in the initial formulation the

partial ILA consists of a single time-frame. The approach is later extended to use ILAs with

an arbitrary number of time-frames. Any solution found in iteration i represents a location

that can be changed such that a target state becomes (i + 1)-step reachable. Due to the use

of over-approximation, some solutions found may be “spurious.” These solutions are detected

using PDR and discarded. As a side effect of the detection process the over-approximations are

refined, potentially making further spurious solutions less likely.

This approach is effective, but finds only a subset of the solutions. In particular, it only

finds solutions that correct the failure by making a target state reachable one cycle after an

already-reachable state. An extension is presented that uses a partial ILA consisting of N

time-frames for a user-specified parameter N . This allows the approach to find solutions that

make a target state reachable N steps following an already-reachable state. This is referred to

as the “multi-cycle” formulation, in contrast to the “single-cycle” formulation that uses a single

time-frame.

Subsequently, a performance enhancement is presented that formulates the technique in a

monolithic fashion without using iterations. Instead of computing an over-approximation of

the set of error traces of length 1, then length 2, and so on, the monolithic approach simply over-approximates

all traces of length less than or equal to a user-specified bound in one step. Experiments

presented in Chapter 5 find that this yields a substantial speedup. This represents a trade-off

between runtime and resolution, as some information is lost by this formulation. In particular,

a solution found by the iterative formulations in iteration i may be used to make a target state

(i + 1)-step reachable. The monolithic formulation finds the same solution, but is unable to

indicate the minimum number of cycles in which the solution can reach the target state.

The remainder of this chapter is organized as follows. Section 3.2 presents related work and

the relevant notation. Section 3.3 presents both the single-cycle and multi-cycle variants of the

iterative approach. Section 3.4 presents the performance-enhanced monolithic formulation of

the approach. Finally, section 3.5 concludes the chapter.

3.2 Preliminaries

3.2.1 Previous Work

The authors in [14] tackle the similar problem of dead or unreachable code. The technique they

propose can be applied when verification or coverage analysis indicates a line of HDL code is

unreachable. It involves using a novel symbolic simulation technique to explore non-existent

code paths and determine which variables are the cause of the unreachable code. Subsequently,

suggested values that would make the code reachable are provided. The approach in [14]

complements the approaches presented in this thesis, as it provides different insight into the

source of the error. That approach provides the user appropriate values for variables to reach

the dead code. In essence, it informs the user what unreachable states may be responsible for

the unreachable code. The approaches presented in this thesis inform the user which locations

are responsible for the unreachable states.

3.2.2 Notation

Before presenting the algorithms, it is necessary to first introduce the relevant notation. Ta-

ble 3.1 contains a summary of key symbols used throughout this chapter, which are described

in greater detail here. The input to the debugging algorithm is an erroneous circuit C, a set

of unreachable target states S, and a set of suspect locations that is assumed in this chapter

to simply include every location in the design. The set of target states can be represented by

a propositional formula also called S. Let T denote the transition relation of circuit C. A

solution to the debugging problem is defined below.

Table 3.1: Glossary of symbols

Symbol       Meaning
C            Erroneous circuit being debugged
K            Iteration limit parameter to Algorithms 3.1, 3.2, 3.3
N            Window-size parameter to Algorithms 3.2 and 3.3
S            Target state for an unreachability debug problem
S            Current-state variables (registers)
S′           Next-state variables (inputs to registers)
X            Primary input
Y            Primary output
T            Transition relation
Ten          Enhanced transition relation used in debugging
′ (prime)    When applied to a formula over S, indicates the same formula over S′
I            Set of initial states of the circuit

Definition 3.1. A solution of error cardinality n is an n-tuple of suspect locations where a

change can be implemented to make some target state reachable.

Naturally, the task of a debugging algorithm is to find which of the suspect locations are

solutions. In the common case where n = 1, a solution is merely an individual suspect location

that can correct the error. If n > 1, then it consists of multiple locations that must be

simultaneously corrected. The algorithms presented in this chapter are sound but may be

incomplete. That is, every n-tuple of locations returned is guaranteed to be a solution. However,

it is not guaranteed that every solution is returned.

3.3 Iterative Formulation

This section presents a novel iterative formulation to debug unreachable states in the absence

of an error trace. It first presents the single-cycle formulation which finds solutions that make

some target state reachable one cycle following an already-reachable state. The algorithm

accepts as input an error cardinality n, a set of target states S, an iteration limit parameter K,

and an erroneous circuit C with transition relation T and initial states I. It returns locations

where a change can be made to make a target state (K + 1)-step reachable. This approach is

then extended to find solutions that can make a target state reachable up to N cycles after an

already-reachable state for a user-specified parameter N .


3.3.1 Single-Cycle Unreachability

The initial formulation of the algorithm involves a sequence of iterations, each of which models

and debugs a single state transition from a reachable state to the target states. As calculating

the exact set of reachable states is intractable, in each iteration an over-approximation is used to

model the set of potentially-reachable states. Due to the use of an over-approximation, spurious

solutions may be found. These are detected and discarded in a process that strengthens the

over-approximation, thereby reducing the chances of finding more spurious solutions.

In greater detail, the i-th iteration searches for solutions that may make a target state

(i + 1)-step reachable. Each iteration consists of two steps: reachability analysis and debugging.

The reachability analysis step simply extracts Fi from PDR, which is the formula

over-approximating the set of i-step reachable states. The PDR instance is directed at proving S is

unreachable, using the circuit’s original transition relation and initial states. The approximation

Fi is merely an initial approximation of the i-step reachable set and may be strengthened

during the debugging step.

The debugging step constructs a SAT-based debugging instance, with the goal of finding

suspect locations that can be changed to allow for a state transition from some i-step reachable

state to a target state. Towards this end, the instance is constructed with a single copy of the

transition relation constrained by Fi at its input and S at its output. Intuitively, the current

set of states is constrained by Fi while the next state is constrained to S. The primary input

and output are left unconstrained, allowing the SAT solver to find solutions for every input

assignment. Finally, a cardinality constraint φn is added to find solutions of error cardinality

n. Letting Ten denote the enhanced transition relation, the resulting debugging instance can

be expressed as the following Boolean formula:

$$F_i \land T_{en} \land S' \land \varphi_n \tag{3.1}$$
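The text does not fix a particular encoding for the cardinality constraint φn. As a minimal sketch, assuming φn is read as "exactly n error-select lines are active" and that error-select lines are numbered as positive integers in the DIMACS style, a naive CNF encoding could look as follows. The names `exactly_n` and `satisfied` are hypothetical helpers introduced only for illustration; a practical tool would likely use a more compact encoding.

```python
from itertools import combinations

def exactly_n(select_vars, n):
    """Naive CNF for 'exactly n of select_vars are true' (illustrative only)."""
    m = len(select_vars)
    clauses = []
    # At most n: in every group of n + 1 selects, at least one is inactive.
    for group in combinations(select_vars, n + 1):
        clauses.append([-v for v in group])
    # At least n: in every group of m - n + 1 selects, at least one is active.
    for group in combinations(select_vars, m - n + 1):
        clauses.append(list(group))
    return clauses

def satisfied(clauses, assignment):
    """Check a CNF against an assignment mapping variable number -> bool."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
```

The number of clauses grows combinatorially with the number of suspect locations, so this form is only suitable for small examples.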

Figure 3.1 depicts the debugging instance of Eq. 3.1, where the shaded region represents the

actual set of i-step reachable states. The SAT solver is free to select candidate states from the

set Fi in order to reach the target state. In this context, a solution of cardinality n consists of n

active error-select lines and an Fi-state. As Fi over-approximates the set of all i-step reachable


states, inherent to the method is that some solutions found may not be exact. If the chosen

state from Fi is not actually reachable, then the active error-select lines do not necessarily

correspond to locations where a change can be implemented to make the target state reachable.

This means that some solutions to Eq. 3.1 may not be solutions to the debugging problem as

defined in Definition 3.1. These are referred to in this thesis as spurious solutions.

Figure 3.1: Representation of the debugging instance

It is necessary to detect such cases and reject them. On the other hand, any solution to

Eq. 3.1 for which the current state is reachable is indeed a solution. This is the intuition behind

the spurious solution detection methodology. Let t ∈ Fi be the chosen current state. To check

for a spurious solution, PDR is called to check if t is i-step reachable. If so, then the solution is

accepted, recorded, and a blocking clause is added to Ten. If instead t is not i-step reachable,

PDR will update its inductive trace by adding new clauses to Fi such that t and potentially

other states that are not i-step reachable are excluded from Fi. The debugging instance is

similarly updated by adding the new clauses and further solutions are sought. In either case,

the formula is then passed back to the SAT solver to find more solutions.
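The interplay between the debugging instance and the spurious-solution check can be sketched with explicit state enumeration. The sketch below is not the thesis implementation: the three-register circuit is a hypothetical toy (it is not the circuit of Figure 3.3), the SAT call is replaced by brute-force enumeration, the bounded PDR query is replaced by an exact breadth-first search that treats "i-step reachable" as reachable within i steps, an active location is modeled as a free value, and a spurious state is removed on its own rather than generalized. Error cardinality is fixed at n = 1.

```python
from itertools import product

LOCS = ("g1", "g2", "g3")

def step(state, x, active=None, free=False):
    """One transition of the toy circuit; `active` names the replaced location."""
    s1, s2, s3 = state
    g1 = free if active == "g1" else x
    g2 = free if active == "g2" else (s1 and s2)  # hypothetical bug: intended g2 = s1
    g3 = free if active == "g3" else s2
    return (g1, g2, g3)

def reachable_within(i, init):
    """Exact set of states reachable in at most i steps (stand-in for PDR)."""
    frontier, seen = {init}, {init}
    for _ in range(i):
        frontier = {step(s, x) for s in frontier for x in (False, True)} - seen
        seen |= frontier
    return seen

def debug_iteration(i, init, is_target, F):
    """One iteration of the single-cycle loop on an over-approximation F."""
    F = set(F)
    exact = reachable_within(i, init)
    solutions, blocked = [], set()
    while True:
        # Stand-in for the SAT call on the single-cycle debugging instance.
        model = next(
            ((t, loc) for t in sorted(F) for loc in LOCS if loc not in blocked
             for x, free in product((False, True), repeat=2)
             if is_target(step(t, x, loc, free))),
            None,
        )
        if model is None:
            return solutions, F
        t, loc = model
        if t in exact:
            solutions.append(loc)   # real solution: record it and block it
            blocked.add(loc)
        else:
            F.discard(t)            # spurious: strengthen the approximation
```

Calling `debug_iteration(1, (False, False, False), lambda s: s[2], F1)` with F1 equal to all states in which s3 = 0 returns the single solution `g3` and shrinks F1 to the two states that are actually reachable. The location `g2`, where the hypothetical bug sits, is not reported because fixing it needs two transitions to reach the target.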

In practice, when a spurious solution is detected, the generalization procedure of PDR may

remove many other states that are not i-step reachable from Fi. This tends to result in a rapid

increase in the accuracy of the approximation and hence the debugging methodology. This is

shown in Figure 3.2 where state t has been found to be spurious in the current iteration of

the algorithm. As a result of PDR detecting that t is not i-step reachable and generalizing

that fact, the more accurate approximation shown in Figure 3.2(b) is derived. The improved

approximation reduces the chances of finding spurious solutions in the current iteration and

may improve the runtime of future iterations.

Figure 3.2: Set Fi (a) initially (b) after detecting a spurious result from state t

Notice that PDR is called to check for i-step reachability rather than checking for

reachability in any number of steps. A state t that is not i-step reachable does not always lead to

a spurious solution. It may be the case that t is reachable in a larger number of cycles than

i. It may also be the case that the solution coincidentally both makes t reachable and makes

S reachable from t. The extension presented in section 3.3.3 better handles the latter case.

The former case can be handled by increasing K, potentially at the cost of increased runtime.

The check for i-step reachability is preferred over the unbounded check for three reasons. The

first is related to precision. Iteration i seeks solutions that could make the target (i + 1)-step

reachable. If the algorithm used an unbounded call, it may admit solutions that make the

target state reachable only in a larger number of clock cycles. The second reason is repeatability,

as different random conditions in the algorithm may yield different results if an unbounded

call is used. This is because an Fi-state that is not i-step reachable may lead to a solution in

one run of the algorithm. However, in another run that state may not be included in Fi as

PDR uses randomness in computing its approximations. Therefore, the solution would not be

found. The final reason is performance-related. Doing an unbounded call to PDR once for each

satisfying assignment found could be very inefficient. Intuitively, proving a state is unreachable

in a bounded number of cycles is a much easier problem than proving it in the unbounded case.

Pseudocode for the entire procedure is shown in Algorithm 3.1. The algorithm is called

SCUnreachability (single-cycle unreachability) as it finds solutions that reach the target

one cycle after a reachable state. It assumes the existence of a procedure ExtractState

that extracts the chosen state of Fi from a satisfying assignment of the debugging instance.

Line 3 executes PDR directed at proving S unreachable, implicitly computing Fi. Line 4 of

the algorithm constructs the initial debugging instance for the current iteration. The loop on

lines 5-13 repeatedly finds satisfying assignments. Line 7 checks if the found solution could be

spurious. If it is a real solution, line 8 records it, while line 9 blocks the solution from being

found again by adding a clause to Ten. If the solution contains active error-select lines e1, ..., en,

the blocking clause is (¬e1 ∨ ... ∨ ¬en). Otherwise, the solution is spurious and PDR updates Fi to block state t. In this case, line 11 uses the newly-updated Fi to update the debugging instance. The iteration continues until the debugging instance is unsatisfiable.

Algorithm 3.1 SCUnreachability(C, S, K, n)
 1: solutions = ∅
 2: for i in 0, 1, ..., K do
 3:     PDR((S, I, T), S, i)
 4:     U = F_i ∧ T_en ∧ S′ ∧ φ_n
 5:     while (Solution = SAT(U)) ≠ UNSAT do
 6:         t = ExtractState(Solution)
 7:         if PDR((S, I, T), t, i) then
 8:             solutions = solutions ∪ {Solution}
 9:             T_en = T_en ∧ BlockingClause
10:         else
11:             U = F_i ∧ T_en ∧ S′ ∧ φ_n
12:         end if
13:     end while
14: end for
15: return solutions
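The blocking clause (¬e1 ∨ ... ∨ ¬en) is simple to build once a solution is known. As a small sketch, assuming error-select lines are numbered as positive integers and clauses are lists of signed integers in the DIMACS style (both are assumptions made for illustration, and the helper names are hypothetical):

```python
def blocking_clause(active_selects):
    """Clause forbidding the given error-select lines from all being active."""
    return [-e for e in active_selects]

def clause_holds(clause, assignment):
    """Evaluate a clause under an assignment mapping variable number -> bool."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)
```

For a solution with active lines 3 and 7, `blocking_clause([3, 7])` gives `[-3, -7]`, which is violated only when both lines are active again.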

Note that all calls to PDR in this algorithm are done incrementally. That is, only the

first call starts “clean.” All other calls reuse the inductive trace F0, F1, ... from previous calls,

resulting in much better performance. This can be done because each call to PDR uses the same

transition relation and set of initial states. Therefore, each Fi is still an over-approximation of

the set of i-step reachable states. As more incremental calls are made to PDR, all reusing and

refining the same inductive trace, each Fi may begin to model the set of i-step reachable states

more closely. As a result, each run of PDR is expected to perform better than earlier runs.

The following theorem demonstrates that the algorithm finds the desired solution set in

each iteration.

Theorem 3.1. In iteration i, Algorithm 3.1 finds exactly the set of all solutions that make the

target state reachable one step from an i-step reachable state.

Proof. The debugging instance of Eq. 3.1 uses Fi as the current state set. Therefore, due to

the exhaustive nature of SAT-based debugging, it finds all solutions that make S reachable one

step from an Fi-state. Since Fi is an over-approximation of the set of i-step reachable states,

this includes every solution that makes S reachable one step after any i-step reachable state.

Additionally, the check for spurious solutions filters out any solutions where the chosen current


state is not i-step reachable. Therefore, the set of solutions found is precisely those that make

the target state reachable one step after an i-step reachable state.

The solution set of the overall algorithm is simply the union of the solution sets from each

individual iteration. Therefore, Theorem 3.1 implies that the algorithm finds every solution

that makes the target state (K + 1)-step reachable in one step from a K-step reachable state.

3.3.2 Sample Debugging Problem

This subsection steps through an example run of Algorithm 3.1 to better demonstrate its

operation. The shift register circuit shown in Figure 3.3 is used to demonstrate the algorithm.

Assume the initial state is the all-zero state represented by the cube (¬s1 ∧ ¬s2 ∧ ¬s3). Figure 3.3(a)

shows a correct implementation of the shift register. In Figure 3.3(b), the highlighted gate

has been changed, introducing unreachable states. The reader can observe that it is impossible

to reach a state in which s2 = 1 or s3 = 1. The latter will be the target state given to the

algorithm (i.e., S = (s3)). To determine an appropriate value for K, note that for a three-bit

shift register, every state should be reachable within three cycles. Therefore, K will be set to

three. As such, the algorithm is called using target state S = (s3), iteration limit K = 3, and

error cardinality n = 1.

The algorithm begins with iteration i = 0. Since i = 0, the exact set of initial states is used

to constrain the debugging problem rather than an approximation. Therefore, it constructs

debugging instance U = I ∧ Ten ∧ (s3′) ∧ φ1. This debugging instance seeks solutions that

allow for a transition from the initial state directly to a target state. Two such solutions are

the output of register s3 and its input. Clearly, changing s3 itself to a constant 1 solves the

problem. Alternatively, setting its input (i.e., the OR gate) to a function that can evaluate to

1 also solves the problem. Additionally, each of the inputs to the OR gate is a solution, because

setting either of them to 1 makes the target state reachable. Finally, the output of register s2

is a solution, as a 1 at that location can propagate to s3 if primary input x2 is 1. The reader

can verify that no other solutions to the debugging instance exist. Blocking clauses are added

to Ten to block these five solutions, concluding the first iteration.

Figure 3.3: (a) Correct implementation of shift register (b) Erroneous implementation in which states are unreachable

Subsequently, the i = 1 iteration begins. PDR is used to compute the initial approximation

F1 of the set of one-step reachable states. The debugging instance F1 ∧ Ten ∧ S′ ∧ φ1 is constructed

and solved. Which satisfying assignments are found depends on the formula F1 that PDR

computed. However, regardless of F1, the algorithm’s solution set is the same. In other words,

the set of real solutions is independent of F1, while the exact set of spurious solutions found is

not.

To illustrate the process of solving the debugging instance, assume F1 = ¬S = (¬s3). This

clearly over-approximates the set of one-step reachable states and is disjoint from the target

state, making it a valid over-approximation for PDR to compute. The debugging instance

will find a solution that allows for a transition from an F1-state to an S-state. In fact, such

transitions are possible in the original circuit. For instance, any state in which s2 = 1 can

transition to a target state. As a result, any design location could be found as a solution to

the debugging instance. However, since states in which s2 = 1 are not one-step reachable, this

satisfying assignment represents a spurious solution. Therefore it will be discarded and F1 will

be updated to block the chosen state.

Assume it is updated to F1 = (¬s2) ∧ (¬s3), as this blocks all states in which s2 = 1. Now

the debugging instance seeks a solution that allows for a transition from an F1-state to an S-state.

All such solutions were found in iteration 0, so no further solutions exist. The iteration

therefore concludes. Iterations 2 and 3 still remain. However, these iterations would proceed

exactly the same as iteration 1. This occurs because there are no two-step reachable states that

are not also one-step reachable. Therefore, no new solutions can be found in these iterations.

As a result, the algorithm terminates with the five solutions found in iteration 0, as shown in

Figure 3.4, where the crosses indicate solutions.

Figure 3.4: Erroneous shift register circuit with solutions highlighted

Notice that the actual error source is not in the solution set. This is because any fix made

at the actual error source would not be able to make the target state reachable one cycle after

an already-reachable state. It would first require a 1 to propagate into s2, and then propagate

to s3 in the following cycle. In other words, it reaches the target state two cycles following

an already-reachable state. The technique presented in the next subsection better handles this

case. Note that if S = (s2) was the target state, the actual error source would be found as a

solution.

3.3.3 Multi-Cycle Unreachability

This section presents an extension of the methodology from the previous section. To find

solutions that reach S in more than one step from an already-reachable state, the approach

models a sequence of N state transitions that originates from a reachable state and ultimately


transitions to S. The iterative methodology remains similar; the primary difference is the

debugging instance, which is expressed as follows:

$$F_i \land T_{en}^{N} \land S' \land \varphi_n \tag{3.2}$$

where $T_{en}^{N}$ represents N copies of the enhanced transition relation configured as an ILA. The user-specified

parameter N is the number of cycles the algorithm is allowed to “look forward” for solutions.

Rather than finding solutions that make S (i + 1)-step reachable, the debugging instance of

Eq. 3.2 can return solutions that make the target state (i+N)-step reachable. As was the case

for Eq. 3.1, a solution to this equation includes a current state in Fi and n active error-select

lines.

Figure 3.5: Representation of the multi-cycle debugging instance

As with the approach of the previous section, due to the use of approximation some of the

solutions to Eq. 3.2 may not be solutions to the unreachability problem. Intuitively, a solution

for which the current state is i-step reachable is non-spurious by the same argument used in

the previous section. Therefore, the extended algorithm uses the same mechanism as used in

Algorithm 3.1 to detect and discard spurious solutions.

Pseudocode for the procedure is shown in Algorithm 3.2. It is referred to as MCUnreachability (multi-cycle unreachability), in contrast with Algorithm 3.1. Line 2 commences the iteration. Lines 3 and 4 perform some initialization steps to compute the values N′ and j that are used in place of N and i in Eq. 3.2. This is intended to handle the special case of iterations 0 ≤ i < N. In these iterations, the debugging instance uses i + 1 copies of the transition relation and F0 = I as the current state set, allowing it to find solutions that make the target state (i + 1)-step reachable. These iterations are analogous to iteration 0 in Algorithm 3.1. Subsequently, line 5 executes PDR directed at proving S is unreachable, which computes Fj. The remaining lines find solutions and check if they are spurious in a manner similar to Algorithm 3.1. As with Algorithm 3.1, all calls to PDR are incremental and reuse the inductive trace from the previous calls.

Algorithm 3.2 MCUnreachability(C, S, K, N, n)
 1: solutions = ∅
 2: for i in 0, 1, ..., K do
 3:     N′ = min(N, i + 1)
 4:     j = i − N′ + 1
 5:     PDR((S, I, T), S, j)
 6:     M = F_j ∧ T_en^{N′} ∧ S′ ∧ φ_n
 7:     while (Solution = SAT(M)) ≠ UNSAT do
 8:         t = ExtractState(Solution)
 9:         if PDR((S, I, T), t, j) then
10:             solutions = solutions ∪ {Solution}
11:             T_en = T_en ∧ BlockingClause
12:         else
13:             M = F_j ∧ T_en^{N′} ∧ S′ ∧ φ_n
14:         end if
15:     end while
16: end for
17: return solutions

The following theorem demonstrates that Algorithm 3.2 returns the desired solution set.

Theorem 3.2. The solution set of Eq. 3.2 after removing potentially spurious solutions is

exactly the set of all solutions that make the target state reachable N cycles following an i-step

reachable state.

By Theorem 3.2, the algorithm finds all solutions that make the target state reachable N

cycles following any (K−N +1)-step reachable state. This includes all solutions that make the

target state reachable 1 cycle after a K-step reachable state. The solution set of Algorithm 3.2

is therefore a superset of the solution set of Algorithm 3.1. Indeed, Algorithm 3.1 is merely a

special case of Algorithm 3.2, as with N = 1, the two algorithms are identical.
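The index computation on lines 3 and 4 of Algorithm 3.2 can be tabulated directly. The short sketch below only reproduces that arithmetic; the function name `frame_schedule` is a hypothetical helper and nothing else of the algorithm is modeled.

```python
def frame_schedule(K, N):
    """Return (i, j, N') for each iteration i of the multi-cycle loop."""
    schedule = []
    for i in range(K + 1):
        n_copies = min(N, i + 1)   # line 3: number of unrolled copies N'
        j = i - n_copies + 1       # line 4: index of the frame F_j used
        schedule.append((i, j, n_copies))
    return schedule
```

For K = 4 and N = 3 this gives (0, 0, 1), (1, 0, 2), (2, 0, 3), (3, 1, 3), (4, 2, 3): the first N iterations start from F0 with a growing window, and in every iteration j + N′ = i + 1. With N = 1 the schedule reduces to j = i and a single copy, which matches Algorithm 3.1.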

In particular, the solution set may include solutions that would be discarded by

Algorithm 3.1. A solution to Eq. 3.1 with a current state that is not i-step reachable may not be

spurious if a fix at the same location can make the current state reachable and make S reachable

from that state. By modeling multiple state transitions, Algorithm 3.2 can handle this case. It

additionally handles cases where reaching S requires first reaching a sequence of up to (N − 1)

other unreachable states.
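The effect of the window size can be illustrated on a small explicit-state model. The following sketch makes several assumptions that are not from the thesis: the three-register circuit is a hypothetical toy, an active location is modeled as a free value that may differ in each unrolled copy, the same error-select choice is shared by all copies, error cardinality is one, and the set of current states is given exactly instead of being an approximation refined by PDR.

```python
from itertools import product

LOCS = ("g1", "g2", "g3")

def step(state, x, active=None, free=False):
    """One transition of the toy circuit; `active` names the replaced location."""
    s1, s2, s3 = state
    g1 = free if active == "g1" else x
    g2 = free if active == "g2" else (s1 and s2)  # hypothetical bug: intended g2 = s1
    g3 = free if active == "g3" else s2
    return (g1, g2, g3)

def multi_cycle_solutions(current_states, n_copies, is_target):
    """Locations that reach the target n_copies transitions after a current state."""
    bits = (False, True)
    found = set()
    for loc, t in product(LOCS, current_states):
        for xs in product(bits, repeat=n_copies):          # primary inputs per copy
            for frees in product(bits, repeat=n_copies):   # free values per copy
                s = t
                for x, f in zip(xs, frees):
                    s = step(s, x, loc, f)
                if is_target(s):
                    found.add(loc)
    return found
```

With the reachable states {000, 100} and target s3 = 1, one copy yields only `g3`, while two copies also yield `g2`, the location of the hypothetical bug. This mirrors the case discussed above in which the target is reached only after first passing through another unreachable state.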


3.4 Performance-Driven Formulation

This section presents a performance-driven enhancement for the methodologies in the

previous section. It is presented as a modification of Algorithm 3.2 but can also be applied to

Algorithm 3.1, which is merely a special case of Algorithm 3.2.

Given iteration limit K, Algorithm 3.2 must solve K + 1 debugging instances. Each

iteration finds a new over-approximation, essentially enlarging the set of current states to consider.

This implies that each iteration has the potential to find solutions that were not possible in

earlier iterations. The proposed modification essentially skips the first K iterations and starts

by executing the final iteration directly. However, it must also account for rare cases where

solutions are only possible in small numbers of cycles. It therefore solves N different debugging

instances, using 1, 2, ..., N copies of the enhanced transition relation. In each case, FK is the

over-approximation used.

Note that in general, for Algorithm 3.2 to be of interest, K must be greater than N .

Otherwise, no approximations are ever used and the algorithm essentially degenerates to a

form of traditional SAT-based debugging. In most applications, it is further expected that K

is much greater than N . Solutions that are only found when N is close to K require reaching

a long sequence of unreachable states in order to reach the target, which is expected to be a

rare case. Indeed, this is the intuition that explains the effectiveness of Algorithm 3.1, as it is

merely the special case of Algorithm 3.2 in which N = 1.

While this approach returns the same solution set as the original algorithm, it sacrifices

resolution. In particular, a solution that Algorithm 3.2 finds in iteration i may be used to make

a target state (i+ 1)-step reachable. The modified approach would still find every solution, but

is unable to indicate the number of clock cycles in which the corrected design could be able to

reach the target state. The benefit of course comes from solving N problem instances when

compared to the K + 1 incremental instances of the previous approach.
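The saving can be made concrete by listing the instances each formulation solves. This is a rough sketch that only counts instances; `monolithic_instances` is a hypothetical helper, and the actual runtime depends on how hard each instance is, which the count does not capture.

```python
def monolithic_instances(K, N):
    """(frame index, number of copies) for each instance of the modified approach."""
    return [(K, copies) for copies in range(1, N + 1)]

def instance_counts(K, N):
    """Number of debugging instances: (iterative formulation, modified approach)."""
    return K + 1, len(monolithic_instances(K, N))
```

For example, with K = 10 and N = 3 the iterative formulation solves 11 instances while the modified approach solves 3, all constrained by the frame with index K.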

Pseudocode for the performance-driven formulation of the algorithm is presented in Algorithm 3.3, which is called MMCUnreachability (monolithic multi-cycle unreachability). Similar to the earlier formulations, line 1 initializes the set of solutions while lines 5 through 13 find and verify solutions from the debugging instance. The key difference is that PDR is called to compute the inductive trace F1, ..., FK only once, on line 2, and the only approximation used is FK on lines 4 and 11. Note that PDR may be called additional times to check for spurious solutions. As in the previous formulations, all calls to PDR are incremental.

Algorithm 3.3 MMCUnreachability(C, S, K, N, n)
 1: solutions = ∅
 2: PDR((S, I, T), S, K)
 3: for i in 1, 2, ..., N do
 4:     M = F_K ∧ T_en^{i} ∧ S′ ∧ φ_n
 5:     while (Solution = SAT(M)) ≠ UNSAT do
 6:         t = ExtractState(Solution)
 7:         if PDR((S, I, T), t, K) then
 8:             solutions = solutions ∪ {Solution}
 9:             T_en = T_en ∧ BlockingClause
10:         else
11:             M = F_K ∧ T_en^{i} ∧ S′ ∧ φ_n
12:         end if
13:     end while
14: end for
15: return solutions

3.5 Summary

This chapter presents an approach based on PDR and SAT-based debugging to debug

unreachable states in the absence of an error trace. Section 3.3.1 introduces the technique and an initial

formulation thereof to find solutions that reach a target state one transition after an

already-reachable state. Section 3.3.3 extends the algorithm to handle cases where up to N transitions

are required. Finally, section 3.4 presents a performance-driven enhancement that accomplishes

the same goal while solving fewer debugging instances, but sacrificing some resolution.


Chapter 4

Traceless Debugging Without Unrolling

This chapter presents an alternative approach that solves the same debugging problem as the

approach of Chapter 3. Unlike that approach, the one presented in this chapter is complete

by nature. Given an unreachable target state and error cardinality n, the methodology returns

every set of n locations where a change can be made to make the target state reachable. The

problem is formulated as an unbounded model checking problem, allowing the algorithm to find

solutions that may reach the target state in any number of clock cycles. The algorithm

additionally returns an inductive invariant proving that no further solutions exist. These benefits

come at the cost of increased runtime compared to the approach of Chapter 3.

4.1 Introduction

The previous chapter presents a set of algorithms to debug unreachable states using a

combination of approximation and unrolling. In those algorithms, the problem is expressed as a

CNF formula in which each satisfying assignment contains a state t of the circuit and a set of

n active error-select lines. The active error-select lines correspond to a set of locations that

can be modified to correct the design if t is reachable. If t is unreachable, then the solution

may be spurious and is therefore discarded. Due to the inherent nature of this approach, it is

not guaranteed to explore the complete solution space of the problem. The user is required to


select parameters that create a trade-off between runtime and the number of solutions found.

As such it is not complete, in that it does not necessarily return every solution to the problem.

This chapter presents a methodology that overcomes these limitations [6]. Rather than

expressing the problem as a CNF formula, it is expressed as an unbounded model checking

problem. Additionally, the algorithms presented do not require the user to set parameters

that balance runtime with the number of solutions found. Instead, they find every solution

to the problem, which comes with the cost of increased runtime when compared to the earlier

approach. Finally, these approaches solve the problem using only a single copy of the transition

relation, avoiding the need to construct a partial debugging ILA.

The algorithm works as follows. First, the transition relation of the circuit is enhanced by

inserting error-select registers. Each error-select register is associated with a suspect location

such that if the register is active, the suspect location is effectively disconnected from its fanout

and replaced by an arbitrary Boolean function. Non-active error-select registers have no effect

on the functionality of the circuit. Under this enhanced transition relation, a target state is

reachable if and only if there is a set of locations that can be changed to make it reachable

in the original design (i.e., a solution). In order to get meaningful results, the number of

simultaneously-active error-select registers is limited to exactly n using a cardinality constraint.

In order to find the solutions, the algorithm calls PDR to check if the target state is reachable

under the enhanced transition relation. If so, PDR returns a counter-example trace in which

exactly n error-select registers are active. This indicates that the n corresponding suspect

locations form a solution. The solution is subsequently blocked and PDR is executed to find

further solutions. This process is repeated until the target state is unreachable, indicating that

no further solutions exist. As an added benefit, PDR returns an inductive invariant that proves

this fact.
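The loop described above can be sketched with an explicit-state search standing in for PDR. This is a simplified illustration under stated assumptions, not the thesis implementation: the three-register circuit is a hypothetical toy, an active error-select register is modeled by letting its location take a free value in every cycle, the model checker is replaced by breadth-first search over the enhanced model, and blocking a solution is modeled by skipping that choice of registers. No inductive invariant is produced.

```python
from collections import deque
from itertools import combinations, product

LOCATIONS = ("g1", "g2", "g3")

def step(state, x, active, free):
    """One transition of the enhanced toy circuit.
    active: locations whose error-select register is set;
    free: dict giving the replacement value at each active location."""
    s1, s2, s3 = state
    g1 = free["g1"] if "g1" in active else x
    g2 = free["g2"] if "g2" in active else (s1 and s2)  # hypothetical bug: intended g2 = s1
    g3 = free["g3"] if "g3" in active else s2
    return (g1, g2, g3)

def target_reachable(active, init, is_target):
    """Explicit-state stand-in for an unbounded model checking call."""
    names = sorted(active)
    seen, queue = {init}, deque([init])
    while queue:
        s = queue.popleft()
        if is_target(s):
            return True
        for x in (False, True):
            for vals in product((False, True), repeat=len(names)):
                nxt = step(s, x, active, dict(zip(names, vals)))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False

def debug_all(n, init, is_target):
    """Repeat: find an unblocked set of n locations that reaches the target, block it."""
    blocked, solutions = set(), []
    while True:
        hit = next(
            (frozenset(c) for c in combinations(LOCATIONS, n)
             if frozenset(c) not in blocked
             and target_reachable(frozenset(c), init, is_target)),
            None,
        )
        if hit is None:
            return solutions      # target unreachable: no further solutions
        solutions.append(hit)
        blocked.add(hit)
```

On this toy, `debug_all(1, (False, False, False), lambda s: s[2])` returns the locations `g2` and `g3`. Because the search is unbounded, `g2` is found even though the target is reached only two cycles after the change takes effect.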

This chapter is organized as follows. Section 4.2 presents the initial formulation of the

algorithm and theorems demonstrating the soundness and completeness of this formulation.

Section 4.3 demonstrates that the potentially numerous calls to PDR in the initial formulation can be executed incrementally, which may decrease runtime substantially. Section 4.4 presents an alternative algorithm that uses the structure of the circuit to prune large portions of non-solution space when the error cardinality is one, potentially resulting in decreased runtime.

4.2 Initial Formulation

This section presents the initial formulation of the debugging algorithm. The algorithm takes

as input an erroneous circuit C, error cardinality n, a set of suspect locations L = {l1, l2, ..., l|L|}

(the suspect set), and an unreachable set of target states S. The transition relation of the circuit

is denoted by T while its initial state set is I. The algorithm works by constructing an enhanced

model of the circuit and solving a series of unbounded model checking problems on this model,

the results of which indicate solutions to the debugging problem. It determines precisely which

n-combinations of locations in L are solutions. As such, it returns a set Lsol ⊆ Ln, where each

element of Lsol is a solution. In order to find every solution in the circuit, L must include every

line in the circuit. For cases in which the error cardinality is one, the algorithm presented in

section 4.4 finds every solution in the circuit using smaller suspect sets.

This section is organized as follows. Section 4.2.1 explains how the enhanced model is

constructed and the intuition behind its behavior. Section 4.2.2 explains how the unbounded

model checking problems are constructed and solutions are found. Finally, section 4.2.3 presents

theorems demonstrating that the algorithm is both sound and complete.

4.2.1 Constructing the Enhanced Model

The algorithm involves solving a series of unbounded model checking problems using an en-

hanced FSM model of the circuit. This subsection explains the construction of the enhanced

model, along with the rationale behind its functionality. The enhanced model behaves like the

original circuit with certain suspect locations replaced by arbitrary Boolean functions. Which

suspect locations are replaced depends on assignments to the error-select registers, which are

new hardware added in the enhanced model of the circuit and depicted in Figure 4.1. Their

exact purpose and functionality are described later. Each model checking problem is crafted so

that the result indicates a solution from Ln or proves that no solutions exist in Ln \ Lsol, at

which point the algorithm terminates.

Figure 4.1: Error-select register and multiplexer at suspect location li

Figure 4.2: (a) Original circuit (b) Circuit used to construct Ten (error-select registers omitted)

Towards this end, the algorithm constructs an enhanced model of the circuit M = (S ∪ E, Ien, Ten).

The enhanced model contains new hardware in the form of error-select registers

E = {e1, ..., e|L|}. It additionally has an enhanced initial state set Ien and an enhanced transition

relation Ten. The exact manner in which these are constructed is explained later in this section.

A trace of the circuit tC,0, ..., tC,n is said to be equivalent to a trace of the model tM,0, ..., tM,n

if and only if the original registers in the set S have the same value assignments in states tM,i

and tC,i for all 0 ≤ i ≤ n.

The enhanced transition relation is constructed from T by adding hardware to facilitate

debugging. For each suspect location li ∈ L, an associated free variable wi and error-select

register ei are added. The error-select register is made immutable (i.e., its value cannot change)

by feeding its output back to its input so that e′i = ei. As explained later, this allows an

association between the reachability of particular states under M with an n-tuple of suspect

locations being a solution. Subsequently, new hardware is added such that li is effectively

replaced by an arbitrary Boolean function when ei = 1. When ei = 0 the behavior of the

circuit is unaffected. This functionality is implemented by a multiplexer where the 0-input is

li, the 1-input is wi, and the select line is ei. The multiplexer output (denoted zi) is connected

to the original fanout of li. This is similar to the error-multiplexer used in [38] and serves a

similar purpose. This construction is shown in Figure 4.1 while its CNF representation is shown

in Eq. 4.1 below. Eq. 4.2 shows the CNF representation of the added hardware that enforces

e′i = ei.

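$$\mathit{mux} = (e_i \lor \lnot l_i \lor z_i)(e_i \lor l_i \lor \lnot z_i)(\lnot e_i \lor \lnot w_i \lor z_i)(\lnot e_i \lor w_i \lor \lnot z_i) \tag{4.1}$$

$$\mathit{reg} = (\lnot e_i \lor e'_i)(e_i \lor \lnot e'_i) \tag{4.2}$$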

The enhanced transition relation is constructed from the circuit with the added hardware.

The multiplexers are each represented by four clauses, while the additional lines setting e′i = ei

are each represented by two clauses. The CNF encoding of Ten therefore has only O(|L|) more

clauses than that of T . To clarify the behavior of Ten, an example follows. It will be extended

throughout the chapter to explain various aspects of the enhanced model used to debug the

circuit.
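The clause counts above can be checked with a small sketch. The following Python fragment is illustrative only: the literal representation and the function names (`mux_clauses`, `reg_clauses`, `check_encodings`) are assumptions made for this sketch and are not taken from the thesis. It writes out the four multiplexer clauses and the two register clauses for one suspect location and compares them against the intended behavior on every assignment.

```python
from itertools import product

# A literal is (name, polarity); polarity False means the variable is negated.
def mux_clauses(e, l, w, z):
    """CNF for z == (w if e else l): four clauses per multiplexer."""
    return [
        [(e, True), (l, False), (z, True)],    # e=0 and l=1 force z=1
        [(e, True), (l, True), (z, False)],    # e=0 and l=0 force z=0
        [(e, False), (w, False), (z, True)],   # e=1 and w=1 force z=1
        [(e, False), (w, True), (z, False)],   # e=1 and w=0 force z=0
    ]

def reg_clauses(e, e_next):
    """CNF for e_next == e: two clauses per error-select register."""
    return [[(e, False), (e_next, True)], [(e, True), (e_next, False)]]

def satisfies(clauses, assignment):
    """True if every clause contains at least one literal made true by the assignment."""
    return all(any(assignment[v] == pol for v, pol in clause) for clause in clauses)

def check_encodings():
    """Compare both encodings against their intended meaning on every assignment."""
    for e, l, w, z in product([False, True], repeat=4):
        a = {"e": e, "l": l, "w": w, "z": z}
        if satisfies(mux_clauses("e", "l", "w", "z"), a) != (z == (w if e else l)):
            return False
    for e, e_next in product([False, True], repeat=2):
        a = {"e": e, "e'": e_next}
        if satisfies(reg_clauses("e", "e'"), a) != (e_next == e):
            return False
    return True
```

Each suspect location contributes six clauses in this sketch, which is consistent with the O(|L|) growth stated above.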

Example 4.1. Consider the circuit of Figure 4.2(a). It has one state element s1, two primary

inputs x1 and x2, and two suspect locations labeled l1 and l2. Assume that the initial

state is s1 = 0 (i.e., I = (¬s1)). It is impossible for the circuit to reach a state where s1 = 1,

which is easily verified by noting that if s1 = 0 the AND-gate can never output a 1. This

unreachability can be diagnosed by using the target state condition S = (s1). In doing so,

the enhanced transition relation is constructed from the circuit shown in Figure 4.2(b). When

e1 = e2 = 0, this circuit behaves the same as the original circuit. When e1 = 1, l1 is replaced

by the free variable w1 which can assume any value during model checking. Similar behavior

applies to e2 and l2. It can be seen that when any ei = 1, this circuit behaves like the original

circuit with li replaced by an unknown function.

To debug unreachable states, the reachability of particular states under the enhanced model

must be associated with the fact that specific locations are solutions. Consider a trace of the

enhanced model. All states in the trace have the same value assignments to the error-select

registers. This occurs because the error-select registers are immutable, which means that after

they assume a value due to the chosen initial state, they remain constant. Assume e1, ..., em are

the active error-select registers in the trace. The enhanced model therefore behaves the same

as the original circuit with locations l1, ..., lm replaced by unknown Boolean functions. It can

be concluded that the original circuit would have an equivalent trace if those locations were

simultaneously replaced by different functions.

For the trace to indicate a solution, it must satisfy additional properties. Specifically, it

must start from an I-state, end on a target state, and have exactly n active error-select registers

e1, ..., en where n is the error cardinality. From the discussion in the previous paragraph, the

original circuit would have an equivalent trace if locations l1, ..., ln were replaced by different

functions. Since the trace starts from an I-state and ends at a target state, replacing those

locations makes a target state reachable. This implies that l1, ..., ln constitutes a solution.

Solutions can therefore be found by finding traces that satisfy these three properties.
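The three properties can be stated as a simple check on a trace. The sketch below is an illustration under assumptions: states are represented as dictionaries from register names to Boolean values, and the helper names (`active_selects`, `indicates_solution`) are hypothetical. It does not verify that consecutive states are related by Ten; it only checks the properties listed above, together with the immutability of the error-select registers.

```python
def active_selects(state, select_names):
    """Set of error-select registers assigned 1 in the given state."""
    return frozenset(e for e in select_names if state[e])

def indicates_solution(trace, select_names, is_initial, is_target, n):
    """Return the active error-select registers if the trace indicates a solution, else None."""
    active = active_selects(trace[0], select_names)
    if len(active) != n:
        return None                      # cardinality property violated
    if not is_initial(trace[0]) or not is_target(trace[-1]):
        return None                      # must start on an I-state and end on a target state
    # Immutability: every state must carry the same error-select assignment.
    if any(active_selects(t, select_names) != active for t in trace):
        return None
    return active

# Illustrative trace for a model with one original register s1 and two error-select registers.
example_trace = [
    {"s1": False, "e1": False, "e2": True},
    {"s1": True, "e1": False, "e2": True},
]
result = indicates_solution(
    example_trace, ["e1", "e2"],
    is_initial=lambda t: not t["s1"],    # I = (not s1)
    is_target=lambda t: t["s1"],         # S = (s1)
    n=1,
)
```

Here `result` names the single active register e2, so the corresponding location l2 would be reported as a solution of cardinality one.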

This motivates the construction of the enhanced set of initial states Ien. The original

registers of the circuit are constrained using I, ensuring that the initial states of the enhanced

model correspond to initial states of the original circuit. Since exactly n error-select registers

must be active, a cardinality constraint φn is applied to the error-select registers. The enhanced

initial state condition is therefore Ien = I∧φn. This completes the construction of the enhanced

FSM M = (S ∪E, Ien, Ten). The following example demonstrates and clarifies the purpose and

behavior of Ien.

Example 4.2. Consider again the example from Figure 4.2. The enhanced initial state condition

Ien is the conjunction of I = (¬s1) and the cardinality constraint φn. Assume for this

example that n = 1. Therefore, Ien = (¬s1) ∧ (e1 ∨ e2) ∧ (¬e1 ∨ ¬e2). Enumerating all of the satisfying

assignments to that formula, the set of states in Ien is {(¬s1 ∧ e1 ∧ ¬e2), (¬s1 ∧ ¬e1 ∧ e2)}. Notice that

these states all share two key properties. The first is that all have s1 = 0, corresponding to

I-states. Additionally, every state of Ien has exactly one active error-select register, satisfying

the cardinality constraint. These are exactly the initial states that may appear at the beginning

of traces that indicate solutions.
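The enumeration in Example 4.2 can be reproduced mechanically. The thesis does not state which CNF encoding of the cardinality constraint φn is used, so the sketch below assumes the naive combinatorial encoding, which is only practical for small suspect sets. The function names `exactly_n` and `initial_states` are hypothetical.

```python
from itertools import combinations, product

def exactly_n(variables, n):
    """Naive CNF for 'exactly n of the variables are true' (an assumed encoding)."""
    k = len(variables)
    clauses = []
    # At most n: among any n+1 variables, at least one is false.
    for subset in combinations(variables, n + 1):
        clauses.append([(v, False) for v in subset])
    # At least n: among any k-n+1 variables, at least one is true.
    for subset in combinations(variables, k - n + 1):
        clauses.append([(v, True) for v in subset])
    return clauses

def initial_states(init_clauses, state_names, select_names, n):
    """Enumerate the states of Ien = I AND phi_n by brute force."""
    clauses = init_clauses + exactly_n(select_names, n)
    names = state_names + select_names
    result = []
    for values in product([False, True], repeat=len(names)):
        a = dict(zip(names, values))
        if all(any(a[v] == pol for v, pol in clause) for clause in clauses):
            result.append(a)
    return result

# Example 4.2: I = (not s1), E = {e1, e2}, n = 1.
states = initial_states([[("s1", False)]], ["s1"], ["e1", "e2"], 1)
```

For two error-select registers and n = 1, `exactly_n` produces the two clauses (¬e1 ∨ ¬e2) and (e1 ∨ e2), and `states` holds the two initial states listed in the example.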

4.2.2 Searching for Solutions

As discussed earlier, specific traces of the enhanced model correspond to solutions. This sub-

section explains how PDR is used to find traces that satisfy the relevant properties. In the

enhanced model M = (S ∪ E, Ien, Ten), the initial state condition Ien ensures that the traces

PDR finds begin on an I-state with exactly n active error-select registers, as required. The

enhanced transition relation ensures that the same n error-select registers are active in every

state in the trace. All that remains is to force the trace to end on a target state. To do so,

PDR is executed using S as its unsafe state set. If any target state is reachable, PDR will

return a counter-example trace that meets the requirements previously described. As such, if

e1, ..., en are the active error-select registers in the counter-example then l1, ..., ln is a solution

of cardinality n. Continuing the illustration of the algorithm from example 4.2, the following

example demonstrates the procedure of finding a solution.

Example 4.3. Recall from example 4.2 that the target state condition is S = (s1) and the

initial state condition is I = (¬s1). PDR(M,S,∞) returns the following two-step

counter-example trace: 〈t0, t1〉 = 〈(¬s1 ∧ ¬e1 ∧ e2), (s1 ∧ ¬e1 ∧ e2)〉. Notice that t0 corresponds to an I-state,

t1 is a target state, and e2 is the active error-select register. In states t0 and t1, the model

behaves identically to the circuit with l2 replaced by an unknown function. Since t0 is an initial

state and t1 is a target state, replacing l2 with a different function makes a target state reachable

in the original circuit. This indicates that location l2 is a solution. Indeed, the reader can verify

that replacing the AND-gate that drives l2 with an OR-gate makes the target state reachable.

Other corrections to the problem are also possible.

After finding a solution, it is blocked, allowing the algorithm to find any remaining solutions.

For a solution l1, ..., ln of cardinality n with associated error-select registers e1, ..., en, this is

accomplished by conjoining the clause (¬e1 ∨ ... ∨ ¬en) to Ien. This prevents PDR from finding any further traces that indicate the same solution.

Eventually, the algorithm reaches a point at which the target state is unreachable under

the enhanced model. This occurs under one of two conditions. The more common case is

when all solutions have been found and all remaining states of Ien cannot reach a target state.

Alternatively, it occurs when all possible solutions are blocked, and therefore no states satisfy

Ien. In both cases, PDR will terminate indicating the target state is unreachable. The following

example demonstrates blocking a solution and terminating when no further solutions exist.

Example 4.4. Continuing with the example of Figure 4.2, after solution l2 is found, the

enhanced initial state condition becomes Ien = (¬s1) ∧ (e1 ∨ e2) ∧ (¬e1 ∨ ¬e2) ∧ (¬e2), leaving (¬s1 ∧ e1 ∧ ¬e2)

as the only remaining initial state. It is easily verified that this state cannot reach any target

states. This implies that location l1 is not a solution, which is indeed true. To reach a state

where s1 = 1, the output of the AND-gate must be 1. In the initial state s1 = 0 and s1 is an

input to the AND-gate, so it will always output 0 regardless of the value at l1. Therefore, there

is no way to modify the circuit at l1 to rectify the unreachability of the target state.

Algorithm 4.1 Unreachability(C, S, L, n)
 1: Lsol = ∅
 2: S = state element set of C
 3: I = initial state condition of C
 4: Ten, E = ConstructModel(C, L)
 5: Ien = I ∧ φn
 6: M = (S ∪ E, Ien, Ten)
 7: while PDR(M, S, ∞) == Reachable do
 8:     e1, ..., en = active error-select registers in counter-example
 9:     B = (¬e1 ∨ ... ∨ ¬en)
10:     Lsol = Lsol ∪ {l1, ..., ln}
11:     Mblk = (S ∪ E, Ien ∧ B, Ten)
12:     M = Mblk
13: end while
14: invariant = inductive invariant extracted from PDR
15: return (Lsol, invariant)

Pseudocode for the procedure is shown in Algorithm 4.1. In that description, algorithm

ConstructModel receives input C and L and returns the enhanced transition relation and

error-select register set. Lines 4 through 6 construct the enhanced FSM model. Lines 7 to 13

contain the main loop that finds solutions. If a solution exists it is extracted on line 8 and

added to Lsol on line 10. Subsequently, line 11 constructs a new model Mblk in which the

solution is blocked. The distinction between M and Mblk is included to simplify the discussion

of the performance optimization presented in section 4.3. As the number of suspect locations

is finite, the loop must terminate eventually. At this point, PDR indicates S is unreachable

and the inductive invariant is extracted on line 14. Finally Lsol and the invariant proving the

completeness of the solution set are returned on line 15. In contrast with the algorithms of the

previous chapter, Algorithm 4.1 is referred to merely as Unreachability as it is complete and

therefore determines the exact set of solutions definitively.
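The find-and-block loop can be exercised on the running example with a short sketch. Several assumptions apply and should be kept in mind. The gate type driving l1 is not recoverable from the text, so an OR-gate over x1 and x2 is assumed. An explicit-state search stands in for PDR, which is feasible only because the example has a handful of states, and consequently no inductive invariant is produced. The function names are hypothetical.

```python
from itertools import product

BOOLS = (False, True)

def step(s1, e, x1, x2, w):
    """One transition of the enhanced toy circuit; e = (e1, e2), w = (w1, w2)."""
    l1 = x1 or x2                  # assumed gate; the gate type at l1 is an assumption
    z1 = w[0] if e[0] else l1      # error-select multiplexer at l1
    l2 = z1 and s1                 # AND-gate driving l2
    z2 = w[1] if e[1] else l2      # error-select multiplexer at l2
    return z2                      # next value of s1; e itself never changes

def target_reachable(initial_states):
    """Explicit-state search standing in for PDR: is a state with s1 = 1 reachable?"""
    seen, frontier = set(initial_states), list(initial_states)
    while frontier:
        s1, e = frontier.pop()
        if s1:
            return e               # error-select assignment of a counter-example
        for x1, x2, w1, w2 in product(BOOLS, repeat=4):
            nxt = (step(s1, e, x1, x2, (w1, w2)), e)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return None

def unreachability(n=1):
    """Loop of Algorithm 4.1 on the toy circuit: find a solution, block it, repeat."""
    # Ien: s1 = 0 and exactly n error-select registers active.
    init = [(False, e) for e in product(BOOLS, repeat=2) if sum(e) == n]
    solutions = []
    while (e := target_reachable(init)) is not None:
        solutions.append(tuple(f"l{i + 1}" for i, bit in enumerate(e) if bit))
        # Under the cardinality constraint, conjoining the blocking clause removes
        # exactly the initial states with this error-select assignment.
        init = [state for state in init if state[1] != e]
    return solutions
```

With n = 1 the sketch reports l2 as the only solution and then terminates because the remaining initial state, in which only e1 is active, cannot reach s1 = 1. This matches the outcome of Examples 4.3 and 4.4.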

4.2.3 Soundness and Completeness

This section presents two theorems demonstrating that Algorithm 4.1 is both sound and complete

with respect to its input set. In this context, soundness implies that every n-tuple of

suspect locations in Lsol is a solution. Completeness requires that every solution in Ln is included

in Lsol. Theorem 4.1 below uses the nature of traces of M to prove that Unreachability is

sound.

Theorem 4.1. Upon termination every element of Lsol is a solution.

Proof. Line 7 finds a counter-example trace t0, ..., tm of M . As it is a counter-example trace,

it starts at an initial state and ends at a target state, implying t0 ∈ Ien and tm ∈ S. As

Ien = I ∧ φn, the cardinality constraint φn ensures that exactly n error-select registers are

assigned to 1 in state t0. Let e1, ..., en denote the active error-select registers.

Since the error-select registers are immutable (i.e., their value assignments never change),

each state in the trace also has e1, ..., en active and all other error-select registers inactive.

Further, the fact that t0 ∈ Ien ensures that t0 corresponds to an initial state of C. Therefore,

an equivalent trace also exists for C if l1, ..., ln are replaced by unknown Boolean functions. As

tm is a target state, S can be made reachable in C by replacing those locations, indicating that

they are a solution. All elements of Lsol are found in this manner, implying that every element

of Lsol is a solution.

Since Lsol is the solution set of Algorithm 4.1, Theorem 4.1 proves that the algorithm is

sound. Theorem 4.2 below shows that the approach is also complete. That is, it returns all

solutions from Ln.

Theorem 4.2. Upon termination Lsol contains every solution from Ln.

Proof. Lines 7 to 13 are executed to find solutions until all target states are unreachable.

First, consider the case when Lsol includes all $\binom{|L|}{n}$ possible solutions at the termination of

Algorithm 4.1. Clearly, this includes every n-combination in Ln, and therefore every solution.

Now consider the opposite case, in which some n-combinations of L are not in Lsol when

Algorithm 4.1 terminates because all target states are unreachable. Let Lrem denote the set of n-combinations of L that are not elements of Lsol. It suffices

to show that the unreachability of all target states implies that no solutions exist in Lrem.

Consider the final call to PDR that returns Unreachable, implying that all target states are

unreachable. This means that there are no traces of M that end in a target state.

Consider a fixed initial state IC of C. There are |Lrem| corresponding initial states of M ,

each with a different set of n active error-select registers. Since all target states are unreachable,

none of these states can reach a target state under M . This implies that for every element

(l1, ..., ln) ∈ Lrem, it is impossible to replace l1, ..., ln with different Boolean functions such that

S is reachable from IC in C. Since IC is an arbitrary initial state of C, this holds for every

initial state of C.

Therefore, none of the elements of Lrem are solutions which implies that when Algorithm 4.1

terminates Lsol contains every solution from Ln.

As the algorithm only examines locations from the suspect set L, it cannot find solutions

that are not in that set. If every solution in the circuit is needed, the user may choose L to

include every location. As a larger suspect set may increase runtime, the algorithm offers the

user a trade-off where one can limit the suspect set L to locations suspected to be error sources.

For instance, an engineer may introduce a bug when modifying a specific module. In this case,

it may be desirable to restrict the suspect set to said module and treat the rest of the design

as correct. An additional case where the suspect set is restricted to specific locations is the

algorithm presented in section 4.4. That algorithm makes multiple calls to Algorithm 4.1, each

with a different suspect set.

4.3 Incremental Application of PDR

Algorithm 4.1 makes one call to PDR for each solution it finds. In the worst case this will require

$O\left(\binom{|L|}{n}\right)$ calls to PDR. In the common case where n = 1, this simplifies to O(|L|). For each

algorithm presented in Chapter 3, it was noted that each call to PDR is executed incrementally

by reusing and refining the inductive trace from the previous call. This is possible because each

call to PDR uses exactly the same model, as the same initial state set and transition relation

are used throughout the algorithm. Algorithm 4.1 modifies the model between calls to PDR,

so it is not immediately obvious that the calls can be done incrementally.

Intuitively, each call to PDR in Algorithm 4.1 uses a very similar model. The only difference

between consecutive PDR calls is that a single solution is blocked. In other words, a combination

of n error-select registers is forced to 0 in Ien. After making such a minor change to the model,

it is expected that many of the invariants would remain valid [13]. In fact, as shown later in

this section, all of the invariants generated by PDR remain valid. This allows each call to PDR

to reuse the entire inductive trace from previous calls.

As is explained in section 2.4, PDR maintains an inductive trace F = 〈F0, F1, ...〉. Each

Fi is a predicate represented by a CNF formula. Each Fi is i-step invariant, as is each clause

of Fi. As i-step invariants of M = (S ∪ E, Ien, Ten), each clause c also includes every state of

Ien, i.e., Ien ⊆ c. The work of [13] presents an invariant finder that extracts the portion of the

invariants computed for one model that are also invariant for another model. This provides a

means for the reuse of invariants after modifying the model in the general case. However, due

to the nature of the model updates in Algorithm 4.1 it is possible to reuse the entire inductive

trace without any additional verification. To reuse a clause c of Fi with the new model, it

must maintain the properties above with respect to the new model. That is, c must include

every initial state and every i-step reachable state of the new model. The rest of this section

demonstrates that this is the case for every clause of every Fi in Algorithm 4.1.

Consider the state of Algorithm 4.1 immediately after executing line 11. At this point

M is the FSM model used to find a solution and Mblk is the FSM model after blocking that

solution, while B is the blocking clause for the solution. Let I′en = Ien ∧ B denote the initial state

condition of Mblk. Since I′en is simply Ien with an additional constraint, it is immediately obvious that

I′en ⊆ Ien. Since Ien ⊆ c, it follows that I′en ⊆ c.

This leaves only to show that all i-step invariant clauses of M are also i-step invariant for

Mblk. This ultimately arises from the fact that the reachable state set of Mblk is a strict subset

of that of M . As a result, any over-approximation of the set of states reachable under M is

also an over-approximation of the set of reachable states of Mblk. An i-step invariant is simply

an over-approximation of the set of i-step reachable states, so intuitively the i-step invariant

clauses of M are also i-step invariant clauses of Mblk. Lemma 4.3 below provides a first step

towards proving this by showing that the clause conjoined to Ien does not make any new states

reachable.

Figure 4.3: State space representation of (a) M and (b) Mblk

Lemma 4.3. All B-states that are not i-step reachable under M are not i-step reachable under

Mblk for all i ≥ 0.

Proof. Consider a state t ∈ B that is not i-step reachable under M . Assume towards a con-

tradiction that it is i-step reachable under Mblk. For some m ≤ i the model Mblk must have

a trace t0, ..., tm where t0 ∈ I′en and tm = t. As all literals of B are error-select registers and

t is a B-state, t0 is also a B-state. This is because the error-select registers cannot change their

value assignments.

Both models M and Mblk have the same transition relation. Therefore each transition in

the trace is valid under M . As a result, t is only unreachable under M if t0 ∉ Ien. This is a

contradiction as t0 ∈ I′en and it has already been shown that I′en ⊆ Ien. Therefore, all B-states

that are not i-step reachable under M are not i-step reachable under Mblk.

As shown, the model updates in Algorithm 4.1 do not make any unreachable B-states

reachable. Further, they clearly make all ¬B-states unreachable. These two facts imply that

no states unreachable under M are reachable under Mblk. Letting R (Rblk) denote the set of

states reachable under M (Mblk), it is clear that Rblk ⊆ R. It only remains to show how this

implies that all i-step invariants of M are i-step invariant for Mblk. To visualize this fact, consider

a clause c that is invariant for M . As an invariant clause of M , it must over-approximate R.

This is depicted in Figure 4.3(a) where the set of c-states contains R. Figure 4.3(b) shows that

the set of c-states also over-approximates Rblk.

The above discussion focuses on invariant clauses but the same reasoning applies to i-step

invariant clauses. The following theorem proves this claim.

Theorem 4.4. All clauses that are i-step invariant under M are i-step invariant under Mblk.

Proof. Let c be a clause that is i-step invariant under M . Assume towards a contradiction

that c is not i-step invariant under Mblk. This implies that there is a state t ∉ c that is i-step

reachable under Mblk. Additionally, since c is i-step invariant for M and t ∉ c, t must not be

i-step reachable under M .

Since t is i-step reachable under Mblk and not M , by Lemma 4.3 it is a ¬B-state. No

¬B-states are reachable under Mblk, contradicting the assumption that c is not i-step invariant

under Mblk.

Theorem 4.4 proves that it is possible to reuse the inductive trace from previous calls to

PDR. That is, the execution of PDR on line 7 of Algorithm 4.1 can be done incrementally. This

can potentially result in a substantial reduction of the algorithm’s runtime.
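The containment argument behind Theorem 4.4 can be observed on a small model. The sketch below is an illustration, not part of the thesis: it reuses the toy circuit of Figure 4.2 with an assumed OR-gate at l1, represents states as tuples (s1, e1, e2), and computes the sets of states reachable in at most i steps by explicit enumeration rather than with PDR. The clause c is a hypothetical example.

```python
from itertools import product

BOOLS = (False, True)

def successors(state):
    """Successors in a toy enhanced model; the state is (s1, e1, e2)."""
    s1, e1, e2 = state
    result = set()
    for x1, x2, w1, w2 in product(BOOLS, repeat=4):
        z1 = w1 if e1 else (x1 or x2)   # assumed gate at l1
        z2 = w2 if e2 else (z1 and s1)  # AND-gate at l2 behind its multiplexer
        result.add((z2, e1, e2))        # error-select registers never change
    return result

def reachable_within(initial, steps):
    """Layers R_0, ..., R_steps; R_i holds the states reachable in at most i steps."""
    layers = [set(initial)]
    for _ in range(steps):
        layers.append(layers[-1] | {t for s in layers[-1] for t in successors(s)})
    return layers

def is_i_step_invariant(clause, layer):
    """A clause is a predicate over states; it must hold on every state of the layer."""
    return all(clause(state) for state in layer)

I_en = [(False, True, False), (False, False, True)]   # n = 1
I_blk = [s for s in I_en if not s[2]]                 # block solution l2 with B = (not e2)
R = reachable_within(I_en, 3)
R_blk = reachable_within(I_blk, 3)

# Hypothetical clause c = (not s1 OR e2), which is invariant for M.
c = lambda s: (not s[0]) or s[2]
subset_holds = all(R_blk[i] <= R[i] for i in range(len(R)))
c_carries_over = all(is_i_step_invariant(c, layer) for layer in R + R_blk)
```

In this model every layer of Mblk is contained in the corresponding layer of M, so the clause that holds on the layers of M also holds on those of Mblk without being checked again.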

4.4 Efficient Suspect Selection

Algorithm 4.1, presented in section 4.2, is both sound and complete with respect to

its input set of suspect locations. However, in order to find every solution in the circuit, it is

required that the suspect set includes every location in the circuit. This section presents an

iterative approach to solve the same problem in which each iteration calls Algorithm 4.1 with

a different suspect set [7]. Each iteration’s suspect set is constructed to limit the number of

suspects considered across all iterations. In a given iteration, each solution found is used to

add suspect locations to the suspect set of the next iteration. The algorithm presented in this

section is sound. It is also complete if the error cardinality is one. It is therefore assumed for

the rest of this section that the error cardinality n = 1.

Figure 4.4: Example circuit with fanout branches highlighted

The algorithm begins with a preprocessing step. This step consists of computing the set of

all fanout branches. A fanout point is simply a line in the circuit that fans out to more than

one other location. Figure 4.4 illustrates this concept graphically. Let fanout(l) denote the

set of locations to which l fans out. The set of fanout branches is F = {l : |fanout(l)| > 1}.

Additionally, let set R denote the set of all registers that appear in the target state predicate

S. The rationale behind the use of these sets is explained later in this section.
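The preprocessing step amounts to inverting the fanin relation and filtering. The sketch below assumes the circuit is given as a dictionary mapping each location to the list of locations that drive it; this representation, the example netlist, and the function names are hypothetical.

```python
def fanout_map(fanin):
    """Invert a fanin map {location: [driver locations]} into a fanout map."""
    fanout = {loc: set() for loc in fanin}
    for loc, drivers in fanin.items():
        for d in drivers:
            fanout.setdefault(d, set()).add(loc)
    return fanout

def initial_suspects(fanin, target_registers):
    """Sets F and R from the preprocessing step, returned as their union."""
    fanout = fanout_map(fanin)
    F = {loc for loc, outs in fanout.items() if len(outs) > 1}
    R = set(target_registers)
    return F | R

# Hypothetical netlist: a and b are inputs, g1 and g2 are gates, s is a register.
example_fanin = {"a": [], "b": [], "g1": ["a", "b"], "g2": ["g1", "a"], "s": ["g2"]}
L1 = initial_suspects(example_fanin, ["s"])
```

In this netlist only `a` fans out to more than one location, so the initial suspect set contains `a` and the target register `s`.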

After preprocessing to compute sets F and R the algorithm proceeds through a series of

iterations, each of which calls Algorithm 4.1 once. Each iteration uses a different suspect set.

The suspect sets are constructed in a manner intended to limit the total number of suspects

considered across all iterations. The suspect set of iteration i is denoted Li. The initial suspect

set is constructed as L1 = R ∪ F , thereby including all fanout branches and all registers that

appear in the target state predicate. Subsequently Algorithm 4.1 is executed using this suspect

set and returning a set of solutions S1. A new suspect set L2 is computed from S1 and used in

the subsequent iteration. In general, after iteration i the suspect set of the subsequent iteration

Li+1 is computed as shown in Eq. 4.3 below.

$$L_{i+1} = \{\, l \in S_i : \mathit{fanin}(l) \,\} \setminus \bigcup_{j=1}^{i} L_j \tag{4.3}$$

where fanin(l) denotes the set of all fanin for location l. Suspect set Li+1 contains the fanin of

every solution found in iteration i minus the suspect sets for all previous iterations. Therefore

it does not include any locations used as suspects in a previous iteration. This ensures that

no location is a suspect more than once. It also guarantees that the algorithm terminates, as

otherwise a group of solutions that form a cycle in the circuit could result in an infinite loop.

The reasoning behind this approach is intuitive. Consider a location l that is also a solution.

As l is a solution, the Boolean function at l can be changed to correct the design. This implies

one of two possibilities. First, it may also be possible to replace an element of fanin(l) to

correct the problem, and therefore one or more elements of fanin(l) may also be solutions.

This occurs if the needed change at l is equivalent to modifying only one of its fanin locations.

Alternatively, it may not be possible to correct the problem at any element of fanin(l) and

therefore l is a solution but no elements of its fanin are. As a result, the fact that l is a solution

is insufficient information to decide whether or not the elements of fanin(l) are solutions.

On the other hand, if a location l′ is not a solution then there is no way to modify the design

at location l′ to make S reachable. Consider a location l ∈ fanin(l′). If l has other fanout

besides l′, then it may be possible for l to be a solution even if every member of its fanout is

not. This can occur if multiple elements of the fanout need to be simultaneously corrected to

fix the error. Similarly, if l ∈ R then it may be the case that l is a solution but no element of

its fanout is. However, if |fanout(l)| = 1, l ∉ R and the single fanout of l is not a solution, then

l is also not a solution. The lemma below formalizes this intuition.

Lemma 4.5. For a location l ∉ R with |fanout(l)| = 1, if the single element of fanout(l) is

not a solution, then l is not a solution.

Proof. Suppose that l is a solution and that the single element l′ ∈ fanout(l) is not a solution.

This implies that it is possible to replace l by some other Boolean function to make some S-state

reachable. Since l ∉ R but l is a solution, l must be in the cone-of-influence of some element

of R. Otherwise, a change at l would not be observable at R and could not correct the error.

This implies that either l′ ∈ R or l′ is also in the cone-of-influence of an element of R since l

fans out to l′ and nothing else.

However, there is no way to replace l′ with a different Boolean function to make an S-state

reachable. Since l′ is the only fanout of l, this implies that it is possible to replace l in a manner

that changes the behavior at R but not l′. This is a contradiction since the behavior of the

circuit must also change at l′ to be observable at R.

This demonstrates the rationale behind constructing the initial suspect set as F ∪ R. The

set F includes every location l with |fanout(l)| > 1. As a result, every l ∉ L1 satisfies l ∉ R

and |fanout(l)| ≤ 1. By Lemma 4.5, every l ∉ L1 can be removed from consideration if the

single element of fanout(l) is not a solution. Essentially, the initial suspect set is constructed

to handle every case that the lemma cannot.

Pseudocode for the approach is shown in Algorithm 4.2. The algorithm is named SE-

Unreachability (single error unreachability) as it assumes that the error cardinality is one.

Lines 1 and 2 construct sets R and F , respectively. Line 3 constructs the initial suspect set.

Lines 5 through 8 contain the main loop that repeatedly calls Algorithm 4.1. Within the main

Algorithm 4.2 SEUnreachability(C, S)

1: R = state elements in the formula defining S
2: F = {l : |fanout(l)| > 1}
3: L_1 = F ∪ R
4: i = 1
5: while S_i = Unreachability(C, S, L_i, 1) ≠ ∅ do
6:     L_{i+1} = {l ∈ S_i : fanin(l)} \ ⋃_{j=1}^{i} L_j
7:     i = i + 1
8: end while
9: S = ⋃_{j=1}^{i} S_j
10: return S

loop, line 6 constructs the suspect set for the next iteration according to Eq. 4.3. Finally, line 9

constructs the solution set which is returned on line 10.
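The loop structure of Algorithm 4.2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the call to Unreachability (Algorithm 4.1) is replaced by a hypothetical oracle that returns the solutions contained in a given suspect set, the netlist is a toy example, and the set on line 6 is read as the union of the fanins of the solutions found in the current iteration. None of the names below come from the prototype tool.

```python
from typing import Callable, Dict, List, Set, Tuple

Netlist = Dict[str, Set[str]]            # fanin[n] = locations that drive n
Oracle = Callable[[Set[str]], Set[str]]  # stand-in for Unreachability(C, S, L, 1)


def se_unreachability(
    fanin: Netlist, r: Set[str], unreachability: Oracle
) -> Tuple[Set[str], List[Set[str]]]:
    # Lines 1-3: count fanouts, then form F and the initial suspect set L_1 = F ∪ R.
    fanout_count = {n: 0 for n in fanin}
    for inputs in fanin.values():
        for src in inputs:
            fanout_count[src] = fanout_count.get(src, 0) + 1
    suspects = {n for n, count in fanout_count.items() if count > 1} | set(r)

    examined: Set[str] = set()         # union of L_1 .. L_i
    solutions: Set[str] = set()        # union of S_1 .. S_i
    suspect_sets: List[Set[str]] = []  # records each L_i for inspection

    # Lines 5-8: iterate until an iteration finds no solutions.
    while True:
        suspect_sets.append(set(suspects))
        examined |= suspects
        found = unreachability(suspects)  # S_i
        if not found:
            break
        solutions |= found
        # Line 6: fanin of the new solutions, minus everything already examined.
        next_suspects: Set[str] = set()
        for loc in found:
            next_suspects |= fanin.get(loc, set())
        suspects = next_suspects - examined

    # Lines 9-10: the accumulated solutions are returned.
    return solutions, suspect_sets


# Hypothetical toy netlist and assumed ground-truth solution set.
FANIN: Netlist = {
    "a": set(), "b": {"a"}, "c": {"b"}, "r": {"c"},
    "d": set(), "e": {"d"}, "f": {"d"},
}
TRUE_SOLUTIONS = {"a", "b", "c", "r"}
SOLUTIONS, SUSPECT_SETS = se_unreachability(
    FANIN, {"r"}, lambda suspects: suspects & TRUE_SOLUTIONS
)
```

On this toy netlist the sketch examines five of the seven locations: the dead-end locations e and f are never passed to the oracle, which mirrors how the algorithm avoids large non-solution regions.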

As mentioned, the algorithm assumes that the error cardinality n = 1. The completeness of

the algorithm rests on this assumption, as Lemma 4.5 only holds when n = 1. The remainder

of this section focuses on demonstrating the soundness and completeness of the algorithm

under this assumption. In this context, soundness implies that every location returned is

a solution. Completeness implies that every solution in the circuit is found. The following

theorem demonstrates the soundness of the algorithm. It follows trivially from the soundness

of Algorithm 4.1.

Theorem 4.6. Every location in S is a solution.

Proof. Since the set S only includes locations identified as solutions by Unreachability, every

location in S is a solution by Theorem 4.1.

As S is the solution set of Algorithm 4.2, Theorem 4.6 proves the algorithm is sound.

Theorem 4.7 below proves that the algorithm is complete, which follows from the construction of

the initial suspect set and Lemma 4.5.

Theorem 4.7. When Algorithm 4.2 terminates, S includes every solution.

Proof. The initial suspect set is L1 = F ∪ R. Since Unreachability is complete by

Theorem 4.2, S includes every solution in F ∪ R after iteration 1.

Consider an arbitrary location l ∉ L1. This implies l ∉ R and |fanout(l)| ≤ 1. If

|fanout(l)| = 0 and l ∉ R, l is not a solution as it is not in the cone-of-influence of any

element of R. Alternatively, |fanout(l)| = 1, and by Lemma 4.5 l is only a solution if it is in

the fanin of another solution. On line 6, the algorithm constructs a new suspect set including

the fanin of all solutions found in the previous iteration. It continues in this manner until it

reaches an iteration in which no solutions are found. As a result, any location in the fanin of a

solution is included in a suspect set passed to Unreachability. Therefore, by Theorem 4.2,

it is identified as a solution and included in S. Since l was an arbitrary location, this applies to

every l ∉ (F ∪ R) and S therefore includes every solution when the algorithm terminates.

Since S is the solution set of Algorithm 4.2, Theorem 4.7 proves that the algorithm is

complete. In contrast with the previous approach, the algorithm does not require the user to

specify a set of suspect locations. Since the algorithm selects the suspect sets it examines, it

essentially performs this step for the user.

4.5 Summary

This chapter presents an algorithm to debug unreachable states using repeated executions of

PDR. Section 4.2 presents the initial formulation of the algorithm and presents theorems

demonstrating its soundness and completeness. Section 4.3 proves that the algorithm is functionally

equivalent when the repeated calls to PDR are executed incrementally. Section 4.4 presents an

optimization of the algorithm that uses the structure of the circuit to potentially prune large

portions of non-solution space when the error cardinality is restricted to one.

Chapter 5

Experimental Results

A prototype unreachability debugging engine is implemented based on the algorithms in this

thesis. The tool is developed using a reference implementation of PDR [9]. The SAT engine

used in Algorithms 3.1, 3.2, and 3.3 is MiniSat v2.2.0 [17]. The PDR implementation also

uses the same SAT engine internally. All experiments are executed on a single core of an

i5-3570K 3.4 GHz workstation with 16 GB of RAM. Experiments are performed on designs from

OpenCores [31]. Each experiment uses an error cardinality of one. All experimental runs are

timed out after 4 hours. Each problem instance is created by injecting a design error such

as complementing conditions in if-statements, introducing incorrect state transitions, changing

operators in expressions, etc. These are typical design errors introduced unintentionally by

human designers. Each design error is chosen such that it makes at least one state unreachable

in violation of the design specification.

In this chapter, the five presented algorithms are referred to by their intuitive names (e.g.,

Unreachability) rather than indices (e.g., Algorithm 4.1) so that the reader may more easily

distinguish between them. For convenience, Table 5.1 below summarizes the algorithms’ names,

the sections of the thesis in which they may be found, and their distinguishing features.

5.1 Algorithm Comparison

This section presents experiments comparing and contrasting the algorithms presented in this

thesis. Towards that end, Table 5.2 and Table 5.3 show the runtime and number of solutions

Table 5.1: Summary of presented algorithms

Name               Index          Section  Distinguishing Feature
SCUnreachability   Algorithm 3.1  3.3.1    Single-cycle unreachability
MCUnreachability   Algorithm 3.2  3.3.3    Multi-cycle unreachability
MMCUnreachability  Algorithm 3.3  3.4      Optimized monolithic formulation
Unreachability     Algorithm 4.1  4.2      Canonical complete algorithm
SEUnreachability   Algorithm 4.2  4.4      Single error unreachability

found for each of the algorithms. The first five columns show the name of the problem instance,

number of gates in the design, number of registers in the design, number of suspect locations

used in every algorithm, and the number of solutions present, respectively. The remaining

columns show the runtime, speedup (relative to Unreachability), number of solutions found,

and percentage of all solutions found for each algorithm. For Unreachability, percentage

of solutions found and speedup are omitted, as it is the baseline against which the other

algorithms are compared. The number of gates and registers are derived from an AND-INVERTER

representation [10] of the circuit. The size of the complete solution set is determined by a run

of Unreachability, as it is proven to find the complete solution set of the problem. Speedups

are computed relative to Unreachability, as it is considered to be the canonical baseline

approach.

As expected, the results confirm that Unreachability and SEUnreachability find the

same solution sets. It is additionally confirmed that MMCUnreachability and

MCUnreachability find the same solution sets. Finally, it can also be seen that the algorithms of

Chapter 3 return a subset of the solutions returned by the algorithms of Chapter 4. These

results confirm that the functionality of the developed algorithms is as expected.

From the results in the tables, it can be seen that SEUnreachability achieves a 30.7x

median speedup over Unreachability. In some cases, these two complete approaches are able to outperform

the approximation-based approaches while still finding the complete solution set. This

demonstrates the importance of selecting the parameters K and N correctly in the

approximation-based algorithms. For these experiments, the values were chosen to be reasonable across the

entire set of experiments rather than for the specific needs of the design. This can result in

the algorithm performing more work than needed. In general, these parameters could be set

using a structural analysis of the circuit. For instance, if there is a pipeline in the circuit, N

Table 5.2: Runtime and solutions found

                                      Unreachability    SEUnreachability                   SCUnreachability (K = 10)
benchmark   #gate  #reg  |L|    #sol  time (sec)  #sol  time (sec)  speedup  #sol  %sol    time (sec)  speedup  #sol  %sol
ac97_ctrl   12607  2325  14967  13    490.8       13    18.2        27.0x    13    100%    220.9       2.2x     13    100.0%
divider     3555   360   3915   38    419.4       38    12.2        34.3x    38    100%    424.7       1.0x     5     13.2%
mrisc_core  8206   1328  9573   18    276.4       18    6.2         44.8x    18    100%    5.5         50.4x    18    100.0%
spi         1020   136   1156   23    7.6         23    0.7         11.5x    23    100%    2.9         2.6x     11    47.8%
usb_core    5010   534   5545   6     644.4       6     2.3         279.0x   6     100%    3.2         201.1x   6     100.0%
wb          390    61    451    193   3.6         193   0.4         8.2x     193   100%    0.3         10.7x    2     1.0%
GEOMEAN     -      -     -      -     -           -     -           32.1x    -     -       -           9.3x     -     -
AVERAGE     -      -     -      -     -           -     -           -        -     100%    -           -        -     60.3%
MEDIAN      -      -     -      -     -           -     -           30.7x    -     100%    -           6.7x     -     73.9%

Table 5.3: Runtime and solutions found (K = 10)

                                      MMCUnreachability (N = 1)          MCUnreachability (N = 5)           MMCUnreachability (N = 5)
benchmark   #gate  #reg  |L|    #sol  time (sec)  speedup  #sol  %sol    time (sec)  speedup  #sol  %sol    time (sec)  speedup  #sol  %sol
ac97_ctrl   12607  2325  14967  13    441.8       1.1x     13    100.0%  28.0        17.6x    13    100.0%  24.7        19.9x    13    100.0%
divider     3555   360   3915   38    43.4        9.7x     5     13.2%   249.9       1.7x     21    55.3%   6.1         68.6x    21    55.3%
mrisc_core  8206   1328  9573   18    3.4         82.3x    18    100.0%  17.3        16.0x    18    100.0%  15.1        18.3x    18    100.0%
spi         1020   136   1156   23    4.8         1.6x     11    47.8%   12.0        0.6x     23    100.0%  1.7         4.5x     23    100.0%
usb_core    5010   534   5545   6     1.7         375.5x   6     100.0%  10.3        62.4x    6     100.0%  8.7         74.1x    6     100.0%
wb          390    61    451    193   0.6         5.7x     2     1.0%    1.2         2.9x     193   100.0%  1.0         3.5x     193   100.0%
GEOMEAN     -      -     -      -     -           12.0x    -     -       -           6.2x     -     -       -           17.6x    -     -
AVERAGE     -      -     -      -     -           -        -     60.3%   -           -        -     92.6%   -           -        -     92.6%
MEDIAN      -      -     -      -     -           7.7x     -     73.9%   -           9.5x     -     100.0%  -           19.1x    -     100.0%

Table 5.4: Effect of K and N on runtime and solutions found

                MCUnreachability (N = 1)           MMCUnreachability (N = 1)          MCUnreachability (N = 5)           MMCUnreachability (N = 5)
benchmark   K   time (sec)  #solutions  #spurious  time (sec)  #solutions  #spurious  time (sec)  #solutions  #spurious  time (sec)  #solutions  #spurious
ac97_ctrl   5   121.1       13          655        103.8       13          505        25.2        13          0          24.7        13          0
ac97_ctrl   10  220.9       13          1101       441.8       13          807        28.0        13          0          24.7        13          0
ac97_ctrl   15  1854.7      13          1543       545.6       13          476        30.7        13          0          24.7        13          0
ac97_ctrl   20  4955.6      13          2201       8208.3      13          461        33.5        13          0          24.7        13          0
divider     5   213.8       5           0          43.4        5           0          46.5        21          0          6.1         21          0
divider     10  424.7       5           0          43.4        5           0          249.9       21          0          6.1         21          0
divider     15  637.9       5           0          43.5        5           0          454.4       21          0          6.1         21          0
divider     20  848.9       5           0          43.6        5           0          655.2       21          0          6.2         21          0
mrisc_core  5   4.3         18          0          3.3         18          0          15.5        18          0          15.2        18          0
mrisc_core  10  5.5         18          0          3.4         18          0          17.3        18          0          15.1        18          0
mrisc_core  15  6.7         18          0          3.4         18          0          18.9        18          0          15.2        18          0
mrisc_core  20  8.0         18          0          3.4         18          0          20.8        18          0          15.3        18          0
spi         5   1.4         6           55         3.1         6           78         3.4         23          0          1.7         23          0
spi         10  2.9         11          84         4.8         11          55         12.0        23          0          1.7         23          0
spi         15  4.1         16          127        3.3         16          31         20.5        23          0          1.7         23          0
spi         20  5.9         21          175        5.2         21          27         29.1        23          0          1.7         23          0
usb_core    5   2.3         6           0          1.7         6           0          8.9         6           0          8.7         6           0
usb_core    10  3.2         6           0          1.7         6           0          10.3        6           0          8.7         6           0
usb_core    15  4.1         6           0          1.7         6           0          11.8        6           0          8.7         6           0
usb_core    20  4.9         6           0          1.8         6           0          13.3        6           0          8.7         6           0
wb          5   0.3         2           42         0.5         2           44         1.1         193         29         1.0         193         0
wb          10  0.3         5           57         0.6         5           43         1.2         193         29         1.0         193         0
wb          15  0.4         193         134        0.7         193         41         1.4         193         61         1.0         193         0
wb          20  0.5         193         153        0.8         193         43         1.5         193         61         1.0         193         0

should be greater than or equal to the number of stages. Otherwise, the algorithm may only

find solutions in stages of the pipeline close to the error’s observation point rather than the

pipeline stage in which the bug exists. It is more difficult to set K appropriately. In practice,

it is expected that simulation metrics, such as the average number of cycles to reach particular

states via simulation, could be used. The algorithm Unreachability can be thought of as

setting K and N optimally in a manner that finds the complete solution set but never does

more work than needed. It can therefore outperform the approximation-based approaches if

these parameters are not set carefully. Similarly, SEUnreachability essentially chooses the

suspect set L in a manner that guarantees completeness while reducing the amount of work

needed.

Table 5.4 demonstrates the effect of the parameters K and N on runtime and the number of

solutions found. In Table 5.4, the first two columns show the name of the problem instance and

the value of K, respectively. The remaining columns show the runtime, number of solutions

found, and number of spurious solutions found for MCUnreachability with N = 1

(equivalent to SCUnreachability), MMCUnreachability with N = 1, MCUnreachability

with N = 5, and MMCUnreachability with N = 5, respectively.

It can be seen that increasing the value of K increases the runtime for the algorithms

SCUnreachability and MCUnreachability, as expected. Increasing the value of K results in

more iterations, meaning more debugging instances must be solved. This alone increases

runtime. In many cases, increasing K results in finding more solutions or more spurious solutions as

well. This results in additional calls to PDR in order to determine whether or not the debugging

solutions are real or spurious. Both of these factors contribute to the increased runtime.

The algorithm MMCUnreachability exhibits somewhat different runtime behavior. It

can be seen that its runtime is often constant with increasing values of K. The exceptions

to this occur when increasing this parameter results in finding additional solutions or spurious

solutions. Since MMCUnreachability does not use iterations, increasing K does not result in

solving additional debugging instances. Therefore, it makes sense that the runtime would only

increase as a result of finding more solutions or spurious solutions, as this results in additional

calls to PDR.

Note that, as the table demonstrates, it may be the case that increasing values of K result

Figure 5.1: Total runtime and time spent checking spurious solutions in MMCUnreachability for wb versus K

[Plot omitted. x-axis: K (5 to 20); y-axis: time (s), 0.0 to 1.0; series: spurious check time, total runtime.]

in fewer spurious solutions that are simply more expensive to detect. Figure 5.1 plots the total

runtime and time spent checking spurious solutions for wb. It can be seen that despite fewer

spurious solutions being found at higher values of K, more time is spent performing the check.

This means that each solution becomes more expensive to check. This is reasonable, as proving

a state is unreachable in K cycles is harder for larger values of K. It can additionally be seen

that the time spent doing things other than the spurious solution checks is essentially constant

with increasing values of K, confirming that most of the increase is due to this effect.

To summarize the performance characteristics of the presented algorithms, Figure 5.2 plots

the runtime relative to Unreachability for each algorithm on each design. For the

algorithms of Chapter 3, results are shown for N = 1 and K = 10. Overall, it appears that in

most cases, SEUnreachability is the best-performing algorithm. In every case, it

outperforms Unreachability. In most cases, it is able to outperform the algorithms presented in

Chapter 3. In two cases (ac97_ctrl and spi) it outperforms every one of the other algorithms.

In the cases where it is not the best-performing algorithm, the other algorithms tend to only

slightly outperform it. This once again demonstrates the importance of selecting the

parameters appropriately. As SEUnreachability effectively sets every parameter automatically and

in a manner that reduces the work it must perform, it is expected to outperform the other

algorithms unless their parameters are chosen intelligently. Therefore, it seems that in most

automated applications, such as automatically debugging failures in a regression suite,

SEUnreachability is the most appropriate choice. Other algorithms may be more appropriate in

cases where either a limited solution set is desired or extra information is available that allows

parameters to be set appropriately. Alternatively, if an error cardinality greater than one is

needed, then the other algorithms must be used.

Figure 5.2: Runtime relative to Unreachability for presented algorithms (Unreachability = 1.0)

[Plot omitted. x-axis: designs ac97_ctrl, divider, mrisc_core, spi, usb_core, wb; y-axis: relative runtime, 0.0 to 1.4; series: Unreachability, SEUnreachability, MCUnreachability, MMCUnreachability.]

It can also be seen that in the case of ac97_ctrl, the performance-driven approach of

MMCUnreachability has higher runtime than the other approximation-based approaches. This

also appears to be related to spurious solutions, as this algorithm finds many spurious solutions

for ac97_ctrl. The checks for spurious solutions in early iterations refine the approximations,

making spurious solutions less likely in later iterations. However, it appears that in this case, it is

much more expensive to detect spurious solutions when skipping the early iterations. Evidently,

refining the approximations early on can heavily impact the performance of the algorithms. In

some cases this can result in MMCUnreachability performing substantially worse than the

iterative approaches.

On the other hand, the performance enhancement of SEUnreachability appears to

be consistently successful. Table 5.5 compares it against Unreachability. The first three

columns show the name of the problem instance, the number of gates in the design, and

number of registers, respectively. The next two columns show the size of the suspect set and

runtime for Unreachability. The remaining columns relate to SEUnreachability. They

show the number of iterations executed, total number of suspects considered across all

iterations, runtime, total percentage of suspects considered (|⋃Li|/|L|), and speedup relative to

Unreachability, respectively.

Across all experiments, the performance enhancement of SEUnreachability offers a

geometric mean speedup of 32.1x with a median of 30.7x. Critically, SEUnreachability safely

ignores the majority of all design locations. Across all experiments, it considers an average of

only 26.1% of the design locations as suspects, and a median of 20.6%. Since the runtime of

Unreachability appears to be heavily dependent on the size of its input suspect set,

eliminating the majority of locations from consideration naturally yields a substantial reduction in

runtime.
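For reference, the summary statistics quoted above can be approximately reproduced from the per-design speedups. The sketch below assumes the six SEUnreachability speedups reported in Table 5.5 as its input; because those values are already rounded, the recomputed statistics are only expected to agree with the reported ones after rounding. The variable and function names are illustrative.

```python
import math
import statistics

# Per-design SEUnreachability speedups as reported (rounded) in Table 5.5.
speedups = {
    "ac97_ctrl": 27.0,
    "divider": 34.3,
    "mrisc_core": 44.8,
    "spi": 11.5,
    "usb_core": 279.0,
    "wb": 8.2,
}


def geometric_mean(values):
    """Geometric mean computed in log space to avoid a large intermediate product."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


geomean = geometric_mean(speedups.values())
median = statistics.median(speedups.values())
print(f"geometric mean: {geomean:.1f}x, median: {median:.2f}x")
```

The geometric mean is the natural summary here because a single large ratio, such as the one for usb_core, would dominate an arithmetic mean.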

Table 5.6 shows the number of solutions and number of suspects per iteration for the first five

iterations of SEUnreachability. The first column shows the name of the problem instance,

while the remaining columns show the number of suspect locations and solutions found for each

of the first five iterations. A dash indicates that the algorithm did not proceed through the

relevant iteration for that design. It can be seen that the initial suspect set contains few solutions

in most cases, meaning that very few suspects are considered in subsequent iterations. This

is intuitive, as in general only a small portion of all design locations are solutions. Figure 5.3

plots this data for spi, visualizing the drastic drop-off in suspect set sizes.

The design wb is an exception, as a relatively large portion of the design locations are

solutions. Despite this, the algorithm only needs to consider a total of 237 suspect locations

in order to find 193 solutions. Even in this pathological case, SEUnreachability is able to

ignore nearly half of the design locations and achieve an 8x speedup over Unreachability.
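The fraction of locations examined can be recomputed directly from the suspect counts. The following sketch assumes the |L| and |⋃Li| values listed in Table 5.5; the dictionary layout and function name are illustrative only.

```python
# (|L|, |⋃Li|) per design, as listed in Table 5.5.
suspect_counts = {
    "ac97_ctrl": (14967, 2697),
    "divider": (3915, 1056),
    "mrisc_core": (9573, 1708),
    "spi": (1156, 246),
    "usb_core": (5545, 1140),
    "wb": (451, 237),
}


def percent_considered(total: int, considered: int) -> float:
    """Percentage of all design locations that were ever placed in a suspect set."""
    return 100.0 * considered / total


percentages = {
    name: round(percent_considered(total, considered), 1)
    for name, (total, considered) in suspect_counts.items()
}
ignored_wb = round(100.0 - percent_considered(*suspect_counts["wb"]), 1)
```

For wb this gives 52.5% of locations considered, so roughly 47% are never examined, consistent with the statement that nearly half of the design is ignored.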

As mentioned previously, the runtime of Unreachability is expected to be heavily

dependent on the size of the suspect set it is given. This is demonstrated in Table 5.7. The

first two columns show the name of the problem instance and the size of the suspect set L. The

Figure 5.3: Suspects and solutions per iteration for spi

[Plot omitted. x-axis: iteration number (1 to 7); y-axis: suspects (0 to 20); series: solutions, non-solutions; annotation: 222.]

Table 5.5: Runtime comparison for Unreachability and SEUnreachability

            Unreachability   SEUnreachability
benchmark   |L|    time (s)  #iter  |⋃Li|   time (s)  |⋃Li|/|L|   speedup
ac97_ctrl   14967  490.8     4      2697    18.2      18.0%       27.0x
divider     3915   419.4     3      1056    12.2      27.0%       34.3x
mrisc_core  9573   276.4     6      1708    6.2       17.8%       44.8x
spi         1156   7.6       7      246     0.7       21.3%       11.5x
usb_core    5545   644.4     3      1140    2.3       20.6%       279.0x
wb          451    3.6       8      237     0.4       52.5%       8.2x
GEOMEAN     -      -         -      -       -         -           32.1x
AVERAGE     -      -         -      -       -         26.1%       -
MEDIAN      -      -         -      -       -         20.6%       30.7x

remaining columns show the number of SAT calls made by PDR, the runtime of SAT, and the

total runtime of the algorithm. In each experiment, |L| suspects are chosen at random from the

set of all design locations. Each suspect location can make additional states reachable, making

it more difficult to approximate the reachable state space. This results in a greater number of

calls to the SAT solver, as the table shows. Additionally, each suspect location increases the

complexity of the transition relation by adding more clauses and more variables. Figure 5.5(a)

Table 5.6: Number of suspects and solutions in iterations 1 through 5

benchmark   |S1|/|L1|  |S2|/|L2|  |S3|/|L3|  |S4|/|L4|  |S5|/|L5|
ac97_ctrl   5/2689     2/2        2/2        2/2        2/2
divider     10/1028    10/10      18/18      -          -
mrisc_core  4/1688     4/4        2/2        3/4        3/6
spi         7/229      2/2        2/2        2/2        4/4
usb_core    3/1136     1/2        1/1        1/1        -
wb          33/76      33/34      33/33      34/34      4/4

Figure 5.4: Runtime per iteration for spi

[Plot omitted. x-axis: iteration number (1 to 7); y-axis: iteration runtime (s), 0.0 to 0.6.]

Figure 5.5: Total runtime and average runtime per SAT call for divider

[Plots omitted. Panel (a): total runtime versus |L|, with |L| from 0 to 4000. Panel (b): average SAT runtime versus |L|, with |L| from 0 to 4000.]

plots the total runtime versus |L| for divider, while Figure 5.5(b) plots the average runtime of

each SAT call in Unreachability versus |L|. It demonstrates the impact of the more complex

transition relation that results from increasing the number of suspects. It can be seen that

increasing |L| substantially increases both the total runtime and the runtime of each SAT query

made by PDR.

Since the runtime of Unreachability is dependent on the size of its given suspect set,

earlier iterations of SEUnreachability are expected to take substantially longer than later

iterations. Figure 5.4 plots the runtime of each iteration for spi, confirming this intuition. It

can be seen that the first iteration consumes substantially more runtime than later iterations.

Table 5.7: Effect of |L| on Unreachability runtime

benchmark   |L|    #SAT calls  SAT runtime (s)  total runtime (s)
usb_core    125    13          0.01             0.12
usb_core    250    13          0.01             0.15
usb_core    500    121         0.24             0.48
usb_core    1000   331         1.51             2.02
usb_core    2000   299         5.11             6.10
usb_core    4000   8213        22.89            28.58
divider     125    142         0.10             0.20
divider     250    169         0.33             0.48
divider     500    311         1.95             2.21
divider     1000   391         8.50             9.12
divider     2000   868         57.82            59.57
divider     4000   1042        459.01           465.74

As shown in Figure 5.3, this iteration has the largest number of suspects by far. This appears

to confirm that larger suspect sets result in increased runtime, as expected. A larger suspect set

results in more hardware being added to the enhanced FSM model of the circuit that is used by

Unreachability. It therefore increases the complexity of the individual PDR instances the

algorithm must solve. It is additionally expected that suspect sets with many non-solutions are

difficult to solve for. To find a solution, PDR only needs to find a counter-example trace that

ends on a target state. However, to prove locations are not solutions it is necessary to prove

that no such traces exist. Intuitively, this seems to be an inherently difficult problem. When

a large number of non-solution locations are in the suspect set, proving no counter-examples

exist can be an expensive operation due to the complexity of the model used in PDR and the

large number of potential state transitions.

In addition to runtime, it is instructive to compare the solution sets found by the algorithms

presented in this thesis. It can be seen that in many cases, using a window size N = 1 is sufficient

to find the complete solution set. This demonstrates the effectiveness of the

SCUnreachability algorithm. However, both wb and divider present interesting counter-examples to this

observation. Figure 5.6 plots the number of solutions found versus N for both of these designs.

It can be seen that the two designs exhibit drastically different behavior in this regard. For

divider, the number of solutions steadily increases until it plateaus at N = 9. This occurs

because the error is in a pipelined portion of the design. As a result, increasing N allows

solutions to be found in earlier pipeline stages. Conversely for wb, very few solutions are found

for N < 3 and many new solutions appear at N = 3. This suggests that the design error can

be corrected in a manner that requires reaching a sequence of other unreachable states before

reaching the target state. This demonstrates the importance of the other algorithms presented

in this thesis, as while SCUnreachability is sufficient in many cases, it is inadequate to

debug certain types of errors.

Figure 5.6: Solutions versus N for divider and wb (K = 10)

[Plot omitted. x-axis: N (2 to 10); y-axis: solutions found (0 to 200); series: wb, divider.]

Figure 5.7 plots the cumulative number of solutions found over time for spi for both

Unreachability and SEUnreachability, confirming the intuition that proving locations are not solutions is the expensive part of the search. It can be seen in

Figure 5.7(a) that the former appears to find many solutions towards the beginning of the run.

These solutions result from counter-examples that PDR is able to find relatively quickly.

After exhausting the easy counter-examples, it becomes more difficult to find later solutions, as

indicated by the increased time between solutions being found. Finally, after every solution is

found, it takes a significant amount of time for the algorithm to prove that no further solutions

exist before it terminates.

Conversely, Figure 5.7(b) shows that SEUnreachability finds few of its solutions at the

start of the run. This is because the first iteration has the largest suspect set, thereby making

it much slower for Unreachability to solve. It can be seen that after finding all seven of

the solutions for iteration 1, there is a large delay before finding another solution. This delay

results from the time required to prove that the non-solution locations in set L1 are in fact not

solutions. As L1 is the largest suspect set, this takes a substantial amount of time. In iterations

2 and later, the suspect sets are all much smaller than L1. As a result, each iteration is very

Figure 5.7: Solutions found for spi vs. running time for (a) Unreachability (b) SEUnreachability

[Plots omitted. Panel (a): solutions found (0 to 25) versus time (s), 0 to 8. Panel (b): solutions found (0 to 25) versus time (s), 0.0 to 0.6.]

fast and many solutions are found in a short period of time. This confirms the effectiveness of

this performance optimization.

To compare the solution sets found by the approaches, Figure 5.8 plots the percentage of

the complete solution set found by the presented algorithms across the set of experiments. It

can be seen that when N = 5, increasing K does not result in finding additional solutions for

the presented experiments. In these cases, N = 5 gives the solver sufficient freedom to find

a large portion of the solutions. On the other hand, when N = 1, increasing K can result in

finding substantially more solutions. In these cases, the solver needs the additional freedom

afforded by enlarging the approximation of the reachable state set to find a larger portion of

the solution set.

5.2 Benefits of Incrementality

This section presents experiments quantifying the runtime performance gained by applying PDR

incrementally in each of the algorithms presented in Chapter 4 of this thesis. The algorithms

of Chapter 3 are “inherently incremental.” That is, if each call to PDR were made with a

new inductive trace, it is unclear what information should be extracted to constrain the SAT-

based debugging problems. For instance, if a spurious solution was detected, it would make

sense to use only a single clause from PDR that blocks the relevant unreachable state in the

Figure 5.8: Average percentage of solutions found by algorithms across all experiments

[Plot omitted. Bars: Unreachability; SEUnreachability; K=5,N=1; K=10,N=1; K=15,N=1; K=20,N=1; K=5,N=5; K=10,N=5; K=15,N=5; K=20,N=5. Axis: percentage of solutions found, 0 to 100.]

debugging step. However, it would also be reasonable to use the entire inductive invariant.

Essentially, in the algorithms of Chapter 4, incrementality is a performance optimization. For

the algorithms of Chapter 3, it is a critical design decision that heavily impacts the construction

of the algorithm. Those algorithms are therefore not considered in this section.

The algorithms of Chapter 4, however, are more naturally and intuitively non-incremental.

Without the analysis of section 4.3, it may not be clear that they can be applied incrementally.

Additionally, it is immediately clear how the non-incremental version of Unreachability

works. It simply does not re-use the inductive trace between calls to PDR. Similarly, for

SEUnreachability, the non-incremental version of the algorithm simply makes calls to the

non-incremental version of Unreachability. Unlike the algorithms of Chapter 3, these algorithms

export no aspects of the internal state of PDR to other problem domains, making both

incrementality and non-incrementality natural choices.
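The difference between the two modes can be illustrated with a small explicit-state sketch. It is not the thesis implementation: breadth-first search stands in for PDR, the retained visited set and queue stand in for the inductive trace, and the names `ReachabilityEngine`, `run`, and `CHAIN` are hypothetical.

```python
from collections import deque


class ReachabilityEngine:
    """Toy explicit-state stand-in for a PDR engine (hypothetical class)."""

    def __init__(self, successors, init):
        self.successors = successors  # dict: state -> iterable of next states
        self.init = init
        self.expansions = 0           # work counter: number of states expanded
        self.reset()

    def reset(self):
        """Discard everything learned so far (the non-incremental behaviour)."""
        self.seen = {self.init}
        self.queue = deque([self.init])

    def query(self, target):
        """Return True if target is reachable, resuming from retained state."""
        if target in self.seen:
            return True
        while self.queue:
            state = self.queue.popleft()
            self.expansions += 1
            for nxt in self.successors.get(state, ()):
                if nxt not in self.seen:
                    self.seen.add(nxt)
                    self.queue.append(nxt)
            if target in self.seen:
                return True
        return False


def run(successors, init, targets, incremental):
    """Answer one reachability query per target; return (answers, total work)."""
    engine = ReachabilityEngine(successors, init)
    answers = []
    for target in targets:
        if not incremental:
            engine.reset()  # start every call from scratch
        answers.append(engine.query(target))
    return answers, engine.expansions


# A 10-state chain 0 -> 1 -> ... -> 9; state 42 does not exist and is unreachable.
CHAIN = {i: [i + 1] for i in range(9)}
INCREMENTAL = run(CHAIN, 0, [3, 6, 9, 42], incremental=True)
NON_INCREMENTAL = run(CHAIN, 0, [3, 6, 9, 42], incremental=False)
print(INCREMENTAL, NON_INCREMENTAL)
```

Both modes return the same answers. On this chain the incremental run expands 10 states in total while the non-incremental run expands 28, because every call repeats the exploration done by the previous ones.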

Table 5.8 shows comprehensive results. The first column shows the name of the problem

instance. The remaining six columns show the incremental runtime, non-incremental runtime,

and incremental speedup for Unreachability and SEUnreachability, respectively.

Table 5.8: Effect of incrementality on runtime

                    Unreachability                        SEUnreachability
Benchmark     Incr. (s)  Non-incr. (s)  Speedup     Incr. (s)  Non-incr. (s)  Speedup
ac97 ctrl         490.8         6877.2    14.0x          18.2           32.7     1.8x
divider           419.4         8106.3    19.3x          12.2           66.4     5.4x
mrisc core        276.4          282.2     1.0x           6.2            6.4     1.0x
spi                 7.6          457.4    60.5x           0.7           16.1    24.4x
usb core          644.4          643.0     1.0x           2.3            2.4     1.0x
wb                  3.6         1672.6   470.6x           0.4           22.4    51.9x
GEOMEAN                                   16.7x                                  3.6x
MEDIAN                                    14.1x                                  4.9x

As the table demonstrates, incrementality provides substantial performance benefits in most cases.

Overall, for Unreachability, it provides a geometric mean 16.7x speedup with a median of

14.1x. For SEUnreachability, it gives a 3.6x geometric mean speedup and median speedup

of 4.9x. As expected, the speedup for Unreachability is higher. Since that algorithm uses

the same inductive trace throughout its entire execution, incrementality naturally benefits it

more. The algorithm SEUnreachability discards the inductive trace between internal calls

to Unreachability, so it gains less.
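The summary statistics can be recomputed from the per-benchmark speedup columns. The sketch below does so from the rounded speedups as printed in Table 5.8. It assumes Python 3.8 or later for `statistics.geometric_mean`, and the variable names are illustrative rather than taken from the thesis.

```python
from statistics import geometric_mean, median

# Per-benchmark incremental speedups as printed in Table 5.8, in row order:
# ac97 ctrl, divider, mrisc core, spi, usb core, wb.
UNREACHABILITY = [14.0, 19.3, 1.0, 60.5, 1.0, 470.6]
SE_UNREACHABILITY = [1.8, 5.4, 1.0, 24.4, 1.0, 51.9]

# Geometric mean and median of each column.
GEO_U = geometric_mean(UNREACHABILITY)
MED_U = median(UNREACHABILITY)
GEO_SE = geometric_mean(SE_UNREACHABILITY)
MED_SE = median(SE_UNREACHABILITY)

print(f"Unreachability:   geomean {GEO_U:.2f}x, median {MED_U:.2f}x")
print(f"SEUnreachability: geomean {GEO_SE:.2f}x, median {MED_SE:.2f}x")
```

From the rounded entries, the Unreachability column gives a geometric mean of about 14.05 and a median of 16.65, and the SEUnreachability column gives about 4.81 and 3.6. Up to rounding, these are the same values as the reported summaries, but attached to the opposite statistic, which suggests that the geometric mean and median labels may be transposed in the table. The per-benchmark conclusions are unaffected either way.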

There are some cases where incrementality does not give a speedup. In the cases of usb core

and mrisc core, the incremental speedup is negligible. This suggests that the solutions in

these designs are very easy to find. That is, the algorithm finds all solutions very quickly and

spends the rest of its time attempting to prove that no more solutions exist. Since the calls

that find solutions terminate very quickly, applying incrementality does not speed them up

significantly. Towards confirming this intuition, Figure 5.9 plots the number of solutions found

versus running time for the non-incremental version of Unreachability with mrisc core and

usb core. Notice that for usb core, all solutions are found within four seconds, implying that the

rest of the 644.4 seconds are spent proving that there are no other solutions. A similar pattern

can be observed for mrisc core, where all of the solutions are found in the first 20 seconds of

execution.

[Figure 5.9: Solutions found by Unreachability vs. running time for (a) mrisc core and (b) usb core. Both panels plot Solutions Found against Time (s).]

This further explains the substantial speedup obtained by the algorithms of Chapter 3 in

these cases, as shown in Table 5.2. Since those algorithms do not attempt to prove that they

have found all solutions, they avoid this substantial overhead. A similar but less extreme effect

is observed on these designs with SEUnreachability. This is because, while that algorithm

does need to prove that it has found all solutions, it uses a much smaller suspect set. As such, it

saves substantial runtime in this step, explaining the large speedups observed on these designs.

5.3 Summary

This chapter presents experiments comparing and contrasting the algorithms presented in this

thesis. Section 5.1 compares the algorithms in terms of their runtime and the solution sets they

compute. It is found that in most cases, SEUnreachability is the most suitable algorithm

when no additional information is given that allows setting the parameters in the other

algorithms intelligently. In cases where such information is available or an error cardinality greater

than one is needed, the other algorithms may be more suitable.

Section 5.2 quantifies the speedup obtained by applying PDR incrementally. It is found that

incrementality gives substantial speedups in most cases. However, in cases where solutions are

very easy to find, incrementality provides very little benefit. In these cases, Unreachability

performs very poorly relative to the other algorithms. This suggests that Unreachability is

most suited to finding solutions that are difficult to find, such as when the approximation-based

algorithms fail to reveal solutions.

Chapter 6

Conclusion and Future Work

6.1 Contributions

Verification has become the primary bottleneck in the modern VLSI design cycle, and debugging

is the most time-consuming task within verification. As a result, the automation of debugging

tasks is of critical importance. This thesis presents a set of automated techniques that leverage

Boolean Satisfiability and Property Directed Reachability to automate a previously-manual

debugging task. Specifically, the techniques automate the debugging of errors that manifest

themselves in the form of unreachable states. In this case, all that is known about the debugging

problem is that some state is unreachable in violation of the design specification. As such,

no error trace is available to guide traditional SAT-based automated debugging techniques.

The presented techniques handle this case, and can be divided into two broad classifications:

approximation-based approaches and complete approaches.

The approximation-based approaches represent a practical step forward in the field of au-

tomated debugging. While they are not guaranteed to find the complete solution set to the

problem, in many cases they can find a useful subset of the solutions more quickly than the

complete approaches. In greater detail, these approaches work as follows. First, PDR is used

to compute an over-approximation of the set of states reachable in a specific bounded number

of clock cycles. Subsequently, a SAT-based debugging instance is constructed that models a

sequence of state transitions beginning at one of the approximation states and ending at the

target state which is erroneously unreachable. Each satisfying assignment may correspond to

a solution to the debugging problem. However, due to the use of over-approximation, it is

also possible to find spurious solutions when the chosen approximation state is not reachable.

These are detected using PDR and discarded. As a side effect of the spurious solution detection

process, the approximation is refined, making it less likely that more spurious solutions are

found. The initial formulation of the approach can find solutions that may make the target

state reachable one step after an already-reachable state. This is later extended to handle a

user-specified bounded number of steps, referred to as the window size.
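The window-size-one loop described above (propose candidates from an over-approximation, detect spurious ones, refine) can be sketched on a toy explicit-state model. This is an illustration under strong simplifying assumptions, not the thesis formulation: each hypothetical design location owns a single transition, a "correction" redirects that transition to the target state, brute-force enumeration replaces the SAT instance, and breadth-first search replaces PDR. The supplied approximation is assumed to be a superset of the truly reachable states.

```python
from collections import deque


def reachable_states(edges, init):
    """Exact reachable set via breadth-first search (toy stand-in for PDR)."""
    successors = {}
    for src, dst in edges.values():
        successors.setdefault(src, set()).add(dst)
    seen, queue = {init}, deque([init])
    while queue:
        state = queue.popleft()
        for nxt in successors.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def window_one_debug(edges, init, approximation):
    """Return (solutions, spurious, refined approximation) for window size one.

    edges maps a hypothetical location name to its single (src, dst) transition.
    A location is a candidate when its source state lies in the approximation,
    because redirecting its transition to the target would reach the target in
    one step. A candidate is spurious when its source is not truly reachable.
    """
    exact = reachable_states(edges, init)  # oracle playing the role of PDR
    approx = set(approximation)
    solutions, spurious = [], []
    for location in sorted(edges):
        src, _ = edges[location]
        if src not in approx:
            continue                        # blocked by an earlier refinement
        if src in exact:
            solutions.append(location)
        else:
            spurious.append(location)
            approx.discard(src)             # refine: block the unreachable state
    return solutions, spurious, approx


EDGES = {"g1": (0, 1), "g2": (1, 2), "g3": (3, 0), "g4": (3, 2)}
SOLUTIONS, SPURIOUS, REFINED = window_one_debug(
    EDGES, init=0, approximation={0, 1, 2, 3}
)
print(SOLUTIONS, SPURIOUS, sorted(REFINED))
```

In this example state 3 is in the approximation but not reachable, so `g3` is reported as spurious and state 3 is removed from the approximation. The refinement then prevents `g4`, which starts from the same state, from being proposed at all.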

The complete approaches use formal techniques to compute the complete solution set to

the problem. As such, they are capable of finding every design location where a change can be

made to correct the error. In greater detail, the canonical complete formulation of the approach

works as follows. An enhanced FSM model of the design is constructed. In the enhanced

model, particular states are reachable if and only if specific design locations are solutions to the

debugging problem. Multiple calls to PDR are used to find traces that reach these states. This

technique is shown to benefit from the incremental application of PDR, where each execution of

the PDR solver reuses the inductive trace from the previous call. It is also shown that applying

the underlying PDR engine incrementally in this fashion preserves the completeness of the

solution set. In addition to the initial formulation of the complete approach, an optimization

is presented that uses the structure of the circuit to prune a potentially large portion of the

non-solution space. This optimization is proven to still find the complete solution set to the

problem under the assumption that only one design error causes the observed unreachability.

A set of experiments is presented to compare the approaches and reveal practical tradeoffs

between them. The initial formulation of the approximation-based approach in which the

window size is one is sufficient to find an average of 60% of the complete solution set and a

median of 74%. This approach is found to be 9x faster than the canonical complete approach

across the set of experiments. Extending the approach to use a window size of five allows it to

find an average of 92% of the solutions, and to find all of the solutions in 5 out of 6 benchmarks.

However, the average speedup is reduced to 6x.

Turning to the complete approaches, it is found that the use of incrementality offers an

average speedup of 17x across the set of experiments in the canonical complete algorithm.

It also provides a speedup of 4x for the optimized algorithm that assumes a single design

error. Additionally, this optimized approach offers an impressive 32x speedup over the canonical

approach and is often able to outperform the approximation-based approaches. However, unlike

the other approaches, it is limited to cases in which only a single design error is present.

6.2 Future Work

The contributions of this thesis rely heavily on PDR. While PDR is explicitly a model checking

algorithm, it has extensive capabilities and can be seen as a powerful reasoning engine much like

Boolean Satisfiability. As SAT solvers have improved dramatically in terms of performance in

recent years, it is expected that PDR engines will do the same. This presents many promising

future directions for research into applying PDR to other verification and debugging problems

beyond its original target of model checking and the debugging formulations presented in this

thesis. In particular, traditional debugging techniques that make use of an error trace create

SAT instances with numerous copies of the transition relation. Due to the high degree of dupli-

cation in this problem, it is expected that modifying the formulation somewhat and applying

PDR in place of SAT could result in performance gains. A similar approach [26] does the same

using solvers for Quantified Boolean Formulas in place of SAT.

An additional direction is in leveraging variants of PDR with more powerful reasoning

capabilities in the algorithms presented in this thesis. QUIP (Quest for an Inductive Proof) [24]

is such an engine. It extends PDR with the ability to detect so-called “good” and “bad” clauses

during the execution of the algorithm. Good clauses are those that will end up in the inductive

invariant PDR ultimately returns, while bad clauses are those that have no chance of being

part of the inductive invariant. As the algorithms in this thesis leverage PDR incrementally, it

is expected that some of the clauses from earlier runs of PDR are not entirely relevant to later

runs. As such, a solver such as QUIP could be used to detect and purge these clauses so the

solver does not waste time processing them. In particular, the clause propagation step of PDR

could waste substantial time trying to propagate bad clauses.

Finally, a direction that has yet to be explored is alternative formulations of the debugging

problem that better leverage all of the capabilities of the PDR engine, particularly its ability to

compute inductive invariants. In the algorithms presented in this thesis, the inductive invariant

PDR returns is simply an additional feature that the user may analyze themselves. However,

inductive invariants have many useful properties that may allow additional information to be

extracted. By formulating the debug problem differently, it may be possible to obtain more

meaningful inductive invariants that provide extra information to the user or to additional

debugging and verification algorithms.

Bibliography

[1] F. V. Andrade, L. M. Silva, and A. O. Fernandes. Improving sat-based combinational

equivalence checking through circuit preprocessing. In Computer Design, 2008. ICCD

2008. IEEE International Conference on, pages 40–45, Oct 2008.

[2] S. Asghar, E. Aubanel, and D. Bremner. A dynamic moldable job scheduling based parallel

sat solver. In Parallel Processing (ICPP), 2013 42nd International Conference on, pages

110–119, Oct 2013.

[3] J. D. Backes and M. D. Riedel. Using cubes of non-state variables with property directed

reachability. In Design, Automation Test in Europe Conference Exhibition (DATE), 2013,

pages 807–810, March 2013.

[4] B. Benhamou, T. Nabhani, R. Ostrowski, and M. R. Saidi. Enhancing clause learning by

symmetry in sat solvers. In Tools with Artificial Intelligence (ICTAI), 2010 22nd IEEE

International Conference on, volume 1, pages 329–335, Oct 2010.

[5] Ryan Berryhill and Andreas Veneris. Automated rectification methodologies to functional

state-space unreachability. In Proceedings of the 2015 Design, Automation & Test in Europe

Conference & Exhibition, DATE ’15, pages 1401–1406, 2015.

[6] Ryan Berryhill and Andreas Veneris. A complete approach to unreachable state diagnos-

ability via property directed reachability. In Proceedings of the 2016 Asia and South Pacific

Design Automation Conference, ASP-DAC ’16, 2016.

[7] Ryan Berryhill and Andreas Veneris. Efficient selection of suspect sets in unreachable

state diagnosis. In Proceedings of the 2016 Int’l Symposium on Artificial Intelligence and

Mathematics, ISAIM ’16, 2016.

[8] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, and Y. Zhu. Bounded model checking.

In Advances in Computers, volume 58, pages 118–149, 2003.

[9] A.R. Bradley. Sat-based model checking without unrolling. In Intl Conf. on Verification,

Model Checking, and Abstract Interpretation, pages 70–87, 2011.

[10] Robert Brummayer and Armin Biere. Local two-level and-inverter graph minimization

without blowup. In Proceedings of the 2nd Doctoral Workshop on Mathematical and En-

gineering Methods in Computer Science, MEMICS ’06, 2006.

[11] G. Cabodi, M. Palena, and P. Pasini. Interpolation with guided refinement: Revisiting

incrementality in sat-based unbounded model checking. In Formal Methods in Computer-

Aided Design (FMCAD), 2014, pages 43–50, Oct 2014.

[12] Kai-Hui Chang, I.L. Markov, and V. Bertacco. Automating post-silicon debugging and

repair. In Computer-Aided Design, 2007. ICCAD 2007. IEEE/ACM International Con-

ference on, pages 91–98, Nov 2007.

[13] Hana Chockler, Alexander Ivrii, Arie Matsliah, Shiri Moran, and Ziv Nevo. Incremental

formal verification of hardware. In Proceedings of the International Conference on For-

mal Methods in Computer-Aided Design, FMCAD ’11, pages 135–143, Austin, TX, 2011.

FMCAD Inc.

[14] Hong-Zu Chou, Kai-Hui Chang, and Sy-Yen Kuo. Facilitating unreachable code diagnosis

and debugging. In Proceedings of the 16th Asia and South Pacific Design Automation

Conference, ASPDAC ’11, pages 485–490, Piscataway, NJ, USA, 2011. IEEE Press.

[15] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduc-

tion to Algorithms, Third Edition. The MIT Press, 3rd edition, 2009.

[16] S. Disch and C. Scholl. Combinational equivalence checking using incremental sat solving,

output ordering, and resets. In Design Automation Conference, 2007. ASP-DAC ’07. Asia

and South Pacific, pages 938–943, Jan 2007.

[17] N. Een and N. Sorensson. An extensible SAT-solver. In SAT, pages 502–518, 2003.

[18] Niklas Een, Alan Mishchenko, and Robert Brayton. Efficient implementation of property

directed reachability. In Proceedings of the International Conference on Formal Methods

in Computer-Aided Design, FMCAD ’11, pages 125–134, Austin, TX, 2011. FMCAD Inc.

[19] G. Fey, S. Staber, R. Bloem, and R. Drechsler. Automatic fault localization for property

checking. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions

on, 27(6):1138–1149, June 2008.

[20] E. Goldberg, M. Prasad, and R. Brayton. Using sat for combinational equivalence checking.

In Design, Automation and Test in Europe, DATE ’01, pages 114–121, 2001.

[21] Zyad Hassan, Aaron R. Bradley, and Fabio Somenzi. Better generalization in ic3. In

Formal Methods in Computer-Aided Design, FMCAD’13, pages 157–164. IEEE, 2013.

[22] Krystof Hoder and Nikolaj Bjørner. Generalized property directed reachability. In Pro-

ceedings of the 15th International Conference on Theory and Applications of Satisfiability

Testing, SAT’12, pages 157–171, Berlin, Heidelberg, 2012. Springer-Verlag.

[23] Shi-Yu Huang and Kwang-Ting Cheng. Formal Equivalence Checking and Design Debugging.

Kluwer Academic Publishers, Norwell, MA, USA, 1998.

[24] Alexander Ivrii and Arie Gurfinkel. Pushing to the top. In Formal Methods in Computer-

Aided Design, FMCAD ’15, 2015.

[25] Alexander Ivrii, Arie Gurfinkel, and Anton Belov. Small inductive safe invariants. In

Formal Methods in Computer-Aided Design, FMCAD ’14, pages 21:115–21:122, 2014.

[26] H. Mangassarian, A. Veneris, S. Safarpour, M. Benedetti, and D. Smith. A performance-

driven qbf-based iterative logic array representation with applications to verification,

debug and test. In Intl Conf. on CAD, 2007.

[27] Joao P. Marques-Silva and Karem A. Sakallah. Boolean satisfiability in electronic design

automation. In Design Automation Conference, DAC ’00, pages 675–680, 2000.

[28] K. McMillan. Interpolation and sat-based model checking. In Computer Aided Verification,

2003.

[29] A. Mishchenko, S. Chatterjee, R. K. Brayton, and N. Een. Improvements to combinational

equivalence checking. In Intl Conf. on CAD (ICCAD), pages 836–843, 2006.

[30] Matthew W. Moskewicz, Conor F. Madigan, Ying Zhao, Lintao Zhang, and Sharad Malik.

Chaff: Engineering an efficient sat solver. In Design Automation Conference, DAC ’01,

pages 530–535, 2001.

[31] OpenCores.org. http://www.opencores.org, 2007.

[32] V. Paruthi and A. Kuehlmann. Equivalence checking combining a structural sat-solver,

bdds, and simulation. In Computer Design, 2000. Proceedings. 2000 International Confer-

ence on, pages 459–464, 2000.

[33] S. Safarpour, A. Veneris, and H. Mangassarian. Trace compaction using sat-based reach-

ability analysis. In Design Automation Conference, 2007. ASP-DAC ’07. Asia and South

Pacific, pages 932–937, 2007.

[34] S. Safarpour, A. Veneris, and F. Najm. Managing verification error traces with bounded

model debugging. In Design Automation Conference (ASP-DAC), 2010 15th Asia and

South Pacific, pages 601–606, 2010.

[35] Sean Safarpour and Andreas Veneris. Abstraction and refinement techniques in automated

design debugging. In Proceedings of the Conference on Design, Automation and Test in

Europe, DATE ’07, pages 1182–1187, 2007.

[36] Sean Safarpour and Andreas Veneris. Automated design debugging with abstraction and

refinement. Trans. Comp.-Aided Des. Integ. Cir. Sys., 28(10):1597–1608, October 2009.

[37] Joao P. Marques Silva and Karem A. Sakallah. Grasp - a new search algorithm for

satisfiability. In International Conference on Computer-aided Design, ICCAD ’96, pages

220–227, 1996.

[38] A. Smith, A. Veneris, M. F. Ali, and A. Viglas. Fault diagnosis and logic debugging

using boolean satisfiability. IEEE Trans. Comput.-Aided Design Integr. Circuits Syst,

24(10):1606–1621, Oct. 2005.

[39] G. S. Tseitin. On the complexity of derivations in the propositional calculus. Studies in

Mathematics and Mathematical Logic, Part II:115–125, 1968.

[40] Yakir Vizel, Orna Grumberg, and Sharon Shoham. Tools and Algorithms for the Construc-

tion and Analysis of Systems: 19th International Conference, TACAS 2013, Held as Part of

the European Joint Conferences on Theory and Practice of Software, ETAPS 2013, Rome,

Italy, March 16-24, 2013. Proceedings, chapter Intertwined Forward-Backward Reacha-

bility Analysis Using Interpolants, pages 308–323. Springer Berlin Heidelberg, Berlin,

Heidelberg, 2013.

[41] T. Welp and A. Kuehlmann. QF_BV model checking with property directed reachability. In

Design, Automation Test in Europe Conference Exhibition (DATE), 2013, pages 791–796,

March 2013.

[42] T. Welp and A. Kuehlmann. Property directed invariant refinement for program verifi-

cation. In Design, Automation and Test in Europe Conference and Exhibition (DATE),

2014, pages 1–6, March 2014.

[43] T. Welp and A. Kuehlmann. Property directed reachability for QF_BV with mixed type

atomic reasoning units. In Design Automation Conference (ASP-DAC), 2014 19th Asia

and South Pacific, pages 738–743, Jan 2014.