Traceless Automated Design Debugging of Liveness Properties … · 2016-06-21 · Modern Very Large...
Traceless Automated Design Debugging of Liveness Properties Using Property Directed Reachability
by
Ryan Berryhill
A thesis submitted in conformity with the requirements for the degree of Master of Applied Science
Graduate Department of Electrical and Computer Engineering, University of Toronto
© Copyright 2016 by Ryan Berryhill
Abstract
Traceless Automated Design Debugging of Liveness Properties Using Property Directed
Reachability
Ryan Berryhill
Master of Applied Science
Graduate Department of Electrical and Computer Engineering
University of Toronto
2016
The growth in complexity of digital hardware drives an increase in the importance of au-
tomated computer-aided design (CAD) tools. Verification consumes most of the design effort,
with debugging accounting for half of the verification time. These are therefore important
targets for automation. Traditionally, when a failure is detected through an observation value
mismatch, an error trace is returned. The error trace can be used with a Boolean Satisfiability
(SAT)-based automated debugging tool to aid the engineer in finding the error source. How-
ever, when a state is shown to be unreachable, no error trace is available to guide the tool.
Debugging these errors is a manual process. This thesis presents two novel automated techniques
based on Property Directed Reachability (PDR) to perform design debugging in the absence of an
error trace. The use of PDR avoids the memory-intensive Iterative Logic Array (ILA) representation,
making it possible to solve larger problem instances.
Experiments demonstrate the practicality of the proposed techniques.
Contents
1 Introduction
  1.1 Background and Motivation
  1.2 Purpose and Scope
  1.3 Thesis Outline
2 Background
  2.1 Introduction
  2.2 Boolean Satisfiability for CAD
    2.2.1 CNF Representation
    2.2.2 ILA Representation
  2.3 SAT-based Debugging
  2.4 Property Directed Reachability
    2.4.1 Notation and Terminology
    2.4.2 High-Level Overview
    2.4.3 Detailed Description
  2.5 Summary
3 Traceless Debugging Using Approximation and Unrolling
  3.1 Introduction
  3.2 Preliminaries
    3.2.1 Previous Work
    3.2.2 Notation
  3.3 Iterative Formulation
    3.3.1 Single-Cycle Unreachability
    3.3.2 Sample Debugging Problem
    3.3.3 Multi-Cycle Unreachability
  3.4 Performance-Driven Formulation
  3.5 Summary
4 Traceless Debugging Without Unrolling
  4.1 Introduction
  4.2 Initial Formulation
    4.2.1 Constructing the Enhanced Model
    4.2.2 Searching for Solutions
    4.2.3 Soundness and Completeness
  4.3 Incremental Application of PDR
  4.4 Efficient Suspect Selection
  4.5 Summary
5 Experimental Results
  5.1 Algorithm Comparison
  5.2 Benefits of Incrementality
  5.3 Summary
6 Conclusion and Future Work
  6.1 Contributions
  6.2 Future Work
Bibliography
List of Tables
2.1 Characteristic functions of elementary gates
3.1 Glossary of symbols
5.1 Summary of presented algorithms
5.2 Runtime and solutions found
5.3 Runtime and solutions found (K = 10)
5.4 Effect of K and N on runtime and solutions found
5.5 Runtime comparison for Unreachability and SEUnreachability
5.6 Number of suspects and solutions in iterations 1 through 5
5.7 Effect of |L| on Unreachability runtime
5.8 Effect of incrementality on runtime
List of Figures
1.1 Typical VLSI design flow
2.1 (a) A sequential circuit (b) ILA representation with 2 time-frames
2.2 Error multiplexer inserted at suspect location $l_i$
2.3 Debugging ILA
2.4 Hardware construction for error cardinality n > 1
2.5 Example finite state machine
2.6 Example state-space over-approximations
2.7 (a) A predecessor of an unsafe state in $F_3$ (b) Approximations after refining $F_3$
3.1 Representation of the debugging instance
3.2 Set $F_i$ (a) initially (b) after detecting a spurious result from state t
3.3 (a) Correct implementation of shift register (b) Erroneous implementation in which states are unreachable
3.4 Erroneous shift register circuit with solutions highlighted
3.5 Representation of the multi-cycle debugging instance
4.1 Error-select register and multiplexer at suspect location $l_i$
4.2 (a) Original circuit (b) Circuit used to construct $T_{en}$ (error-select registers omitted)
4.3 State space representation of (a) M and (b) $M_{blk}$
4.4 Example circuit with fanout branches highlighted
5.1 Total runtime and time spent checking spurious solutions in MMCUnreachability for wb versus K
5.2 Runtime relative to Unreachability for presented algorithms (Unreachability = 1.0)
5.3 Suspects and solutions per iteration for spi
5.4 Runtime per iteration for spi
5.5 Total runtime and average runtime per SAT call for divider
5.6 Solutions versus N for divider and wb (K = 10)
5.7 Solutions found for spi vs. running time for (a) Unreachability (b) SEUnreachability
5.8 Average percentage of solutions found by algorithms across all experiments
5.9 Solutions found by Unreachability vs. running time for (a) mrisc core (b) usb core
Chapter 1
Introduction
1.1 Background and Motivation
Modern Very Large Scale Integration (VLSI) hardware designs are relentlessly increasing in size
and complexity. Computer-aided design (CAD) tools are more important than ever to the design
cycle as the hardware becomes too complex for human understanding. Realizing a modern VLSI
design is a complex process involving both manual tasks and automated procedures carried
out by CAD tools. Figure 1.1 presents a simplified view of a typical VLSI design flow. The
process begins with a behavioral specification of the design, which consists of a natural language
document or a formal specification written in e.g., C or a behavioral hardware description
language (HDL). The design is then transformed into a register transfer level (RTL) specification
in a language such as VHDL or Verilog. Subsequently, the logic synthesis step converts the RTL
description into a gate-level netlist. The gate-level netlist is used to produce a transistor-level
netlist, which is then placed and routed to give a physical layout. The physical layout is finally
sent to a fabrication facility where the chip is manufactured.
At each stage of the design flow, a verification or testing step is performed to ensure compli-
ance with the specification. Functional verification techniques such as model checking are used
to ensure that the behavior of the synthesized design fits its specification. After optimizing the
gate-level netlist, equivalence checking is used to ensure that the optimized layout is functionally
equivalent to the logic netlist. Additional timing-based tests are performed to ensure that
the layout meets performance requirements. Finally, a battery of tests is carried out against
the fabricated chips before they are packaged and sold.
Figure 1.1: Typical VLSI design flow (blocks: Behavioral Specification, Behavioral Synthesis, RTL Description, Logic Synthesis, Logic Netlist, Layout Synthesis, Layout, Fabrication, Chip; under "Design Testing, Verification, and Debug": Functional Verification, Design Debugging, Testing, Silicon Debugging)
When a failure occurs at any verification or testing step, some form of debugging is per-
formed to correct it. When functional verification reveals an error, design debugging is carried
out to locate the root cause of the failure so it may be corrected. If an error is revealed during
chip-level testing, silicon debugging is performed to locate the failing part of the chip. Since
chips have already been manufactured, a failure discovered during chip-level testing can be very
expensive to correct. As a result, care is taken to prevent errors from escaping to this stage of
the design process.
Many of the processes in Figure 1.1 have been partially or fully automated. A common
theme among automated verification and debugging techniques is formulating the problem as a
Boolean Satisfiability (SAT) instance. Many CAD problems such as equivalence checking [20,
29, 16, 1, 32], model checking [9, 28, 8, 40, 11], and design debugging [38, 19, 23, 12] have been
successfully encoded in SAT-based formulations. During the past 20 years, the performance
and capabilities of SAT solvers have improved immensely [30, 37, 17, 4, 2]. Since numerous
CAD problems are formulated as SAT instances, any improvement to the state-of-the-art in
SAT solving immediately benefits all such automated techniques. Along with the increasing
availability of computational power, this has resulted in automation becoming increasingly
applicable to CAD problems.
Since combinational circuitry essentially implements a Boolean formula, transforming it to
a SAT instance is straightforward [27]. For sequential designs, many SAT formulations use a
technique called circuit unrolling, also known as the Iterative Logic Array (ILA) [26], to model
the sequential behavior using only combinational logic. Circuit unrolling constructs an ILA
containing one copy of the circuit for each clock cycle that is to be modeled. While this enables
a simple transformation to a SAT instance, it can consume a large amount of memory to
accommodate the unrolled design.
SAT-based design debugging [38] in particular is based on the use of an ILA. Traditionally,
functional verification reveals an error through means such as a firing assertion, observation
value mismatch, or scoreboard discrepancy, and returns an error trace that exposes the failure.
The error trace is then used to guide the debugging tool, which creates an ILA representation
of the circuit and error trace. The ILA contains one time-frame for each clock cycle in the error
trace, potentially resulting in excessive memory usage.
Additional techniques exist to facilitate debugging problems that would otherwise be com-
putationally infeasible due to the size of the design and the length of the error trace. Trace
compaction [33] can somewhat mitigate the memory use resulting from long traces by finding
a shorter trace that still exposes the failure. A technique known as bounded model debug-
ging [34] can similarly be used to handle long traces by initially modeling only a small portion
of the trace and iteratively adding more clock cycles as needed. Abstraction and refinement
techniques [36, 35] can be used to abstract portions of the design, thereby reducing the memory
required to model each time-frame.
Similarly to SAT solvers, the model checking technique of Property Directed Reachability
(PDR) [9] has seen tremendous advancement in recent years [13, 18, 24]. Unlike traditional
model checking techniques [8], PDR does not use the ILA to model sequential behavior. As a
result, it can avoid the substantial memory use characteristic of some CAD algorithms. While
PDR has been restricted to model checking problems thus far, it has seen applications to
non-hardware model checking problems, in domains such as quantifier-free formulae [41, 43] and
software [22, 42].
Evidently, the problem of automated design debugging is well-studied for cases where an
error trace is available. However, when verification reveals that a state is unreachable in vio-
lation of the design specification, no error trace is available to guide an automated debugging
tool and comparatively little automation is available to aid the engineer. This thesis presents
two automated debugging techniques based on PDR that solve the design debugging problem
in cases where no error trace is present. As such, these techniques automate a debugging task
that previously was handled manually while avoiding the large memory usage possible with
existing SAT-based automated debugging techniques.
In practice, these techniques are expected to be highly valuable. In an industry setting,
liveness checking does not seem to be widely deployed as a verification technique. This is likely
because a failed liveness property is inherently difficult to manually debug. No error trace is
available and the designer may not know what conditions should lead to the property being
reached. However, liveness checking is very easy to deploy. A regression testing suite can simply
count the number of times a state is entered throughout the regression run. When particular
states are found to never be entered, it provides suspicion of a liveness failure. This could then
be confirmed using a formal tool, and finally debugged using the techniques presented in this
thesis.
1.2 Purpose and Scope
This thesis presents two PDR-based techniques to automate design debugging in the absence
of an error trace. The first technique formulates the debugging problem in a manner similar to
traditional SAT-based debugging. Rather than using an error trace to constrain the SAT in-
stance, PDR is used to compute an over-approximation of the set of all possible error traces up
to a fixed size. Drawing from traditional SAT-based debugging, a partial ILA is constructed and
constrained using this over-approximation. Due to the inherent nature of over-approximations,
this formulation may lead to spurious solutions that do not actually solve the debugging prob-
lem. This is handled by again using PDR to verify each solution. As an added benefit, verifying
a solution refines the approximations, reducing the chances of finding more spurious solutions.
As only a partial ILA is constructed and the approximations drawn from PDR are only valid
for a fixed trace size, the approach is able to find a subset of all solutions. These limitations are
inherent to the formulation, as it may not be known what length of trace and ILA is needed to
fully solve the problem. Experiments demonstrate that while the provided solution set is not
necessarily complete, with reasonable parameters the technique is often able to find the actual
error source.
Next, a complete and exact approach is presented. The debugging problem is constructed
as a model checking instance passed directly to PDR. As such, it makes no use of the ILA
representation of the design. Further, it is not susceptible to spurious solutions, nor does
it require the application of a priori knowledge regarding the needed trace length and ILA
size. It returns the complete solution set of the problem at the cost of increased runtime
when compared to the approximation-based approach. An enhancement that maintains the
completeness of the solution set while drastically improving performance is also developed.
This performance enhancement prunes a substantial portion of the non-solution space, often
achieving lower runtimes than the approximation-based approach.
Finally, a set of experiments comparing the effectiveness and performance of these techniques
is presented and various practical tradeoffs are contrasted. It is found that the approximate
approach is able to find an average of 60% of the complete solution set using reasonable param-
eter values. However, in most cases the solution set it finds includes the actual error source,
indicating that it tends to find a useful subset of the complete solution set. The exact ap-
proach naturally finds the full set of solutions. This comes at a cost of a non-trivial runtime
increase when compared to the approximate approach. However, with the presented perfor-
mance enhancements the exact approach is able to find the complete solution set with runtime
comparable to the approximate approach.
1.3 Thesis Outline
This thesis is organized as follows. Chapter 2 provides background on SAT, traditional SAT-
based automated debugging, and PDR. Chapter 3 presents the approach based on traditional
debugging and approximation. Chapter 4 presents a complete and exact PDR-based approach
and performance enhancements. Chapter 5 presents the set of experiments comparing these
techniques while demonstrating their practical applicability. Finally, Chapter 6 summarizes the
results of this work and suggests potential topics of future work.
Chapter 2
Background
2.1 Introduction
This chapter provides background relevant to the contributions of this thesis. Section 2.2 gives
an overview of Boolean Satisfiability (SAT) and its application in computer-aided design (CAD)
problems. Section 2.3 defines the design debugging problem and explains traditional SAT-based
debugging. Section 2.4 introduces Property Directed Reachability (PDR). Finally, Section 2.5
summarizes the chapter.
2.2 Boolean Satisfiability for CAD
The Boolean Satisfiability (SAT) problem can be stated as follows: given a propositional formula
Φ(x1, ..., xn), find an assignment to x1, ..., xn such that Φ evaluates to true (1), or indicate that
none exists. Such an assignment, if one exists, is called a satisfying assignment. If a formula
has a satisfying assignment, it is said to be satisfiable. Otherwise it is unsatisfiable. A SAT
solver is tasked with determining whether or not a given formula is satisfiable. If so, it returns
SAT along with a satisfying assignment. Otherwise, it returns UNSAT.
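The SAT/UNSAT interface described above can be illustrated with a minimal sketch. It assumes the formula is given as a list of clauses over integer-numbered variables (the DIMACS-style convention, where `-v` denotes the negation of variable `v`) and decides satisfiability by exhaustive enumeration. Real solvers use far more sophisticated search; the function name `solve` and the example formulas are illustrative only.

```python
from itertools import product

def solve(clauses, num_vars):
    """Return a satisfying assignment {var: bool}, or None if UNSAT.

    Clauses use DIMACS-style integers: v means variable v is true,
    -v means variable v is false. Exhaustive search, exponential in num_vars.
    """
    for values in product([False, True], repeat=num_vars):
        assignment = {v + 1: values[v] for v in range(num_vars)}
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return assignment
    return None

# (x1 or x2) and (not x1 or x2) and (not x2 or x3): satisfiable
sat_example = [[1, 2], [-1, 2], [-2, 3]]
# (x1) and (not x1): unsatisfiable
unsat_example = [[1], [-1]]

print(solve(sat_example, 3))    # a satisfying assignment
print(solve(unsat_example, 1))  # None, i.e. UNSAT
```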
2.2.1 CNF Representation
In practice, modern SAT solvers operate only on formulas in Conjunctive Normal Form (CNF).
A formula in CNF is a conjunction of clauses, while a clause is a disjunction of literals. A literal
is an instance of a variable $x_i$ or its negation $\overline{x}_i$. Thus, the following formula is in CNF:
Φ = (x1 ∨ x2 ∨ x3) ∧ (x2 ∨ x3) ∧ (x2 ∨ x3 ∨ x4) (2.1)
Any propositional formula can be converted to CNF in polynomial time [15]. It is also
possible to convert a logic circuit to CNF. In the CNF representation of a circuit each internal
line, input, and output is represented by a variable. Logically, a gate simply imposes a constraint
on the values of the lines attached to it. For instance, the output of an inverter is always the
negation of its input. These constraints can be converted to CNF in linear time [39]. This is
done by replacing each logic gate with its characteristic function, which is a set of clauses in
CNF that represents the same constraints as the logic gate. The clauses are satisfied if and
only if the constraints of the gate are met. For instance, a two-input AND gate implementing
the function y = x1 ∧ x2 has the characteristic function given below in Eq. 2.2.
$$(x_1 \vee \overline{y}) \wedge (x_2 \vee \overline{y}) \wedge (\overline{x}_1 \vee \overline{x}_2 \vee y) \qquad (2.2)$$
Each gate can be replaced by its characteristic function. The CNF representations of the
elementary gate types are shown in Table 2.1. Taking the conjunction of the CNF representa-
tion of each gate in a circuit gives a CNF representation of the entire circuit. Any satisfying
assignment to this formula represents a valid assignment of Boolean values to the circuit lines.
By adding constraints on the variables representing the input lines, this representation can be
used to simulate the combinational behavior of the circuit. By adding constraints on the output
lines, this representation can be used to determine if the circuit is capable of producing specific
outputs. Many CAD problems can be formulated in a similar manner.
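As a sketch of how gate-level characteristic functions compose into a circuit formula, the following example encodes a small hypothetical circuit (an AND gate feeding an inverter) and checks by enumeration that every satisfying assignment is a consistent labeling of the circuit lines. The variable numbering, helper names, and the circuit itself are assumptions made for illustration. Adding a unit clause on the output then answers which inputs can produce that output.

```python
from itertools import product

def and_gate_cnf(x1, x2, y):
    """Clauses (DIMACS-style integers) for y = x1 AND x2."""
    return [[x1, -y], [x2, -y], [-x1, -x2, y]]

def not_gate_cnf(x, y):
    """Clauses for y = NOT x."""
    return [[x, y], [-x, -y]]

def satisfied(clauses, assignment):
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)

# Hypothetical circuit: n = x1 AND x2, out = NOT n.
X1, X2, N, OUT = 1, 2, 3, 4
circuit = and_gate_cnf(X1, X2, N) + not_gate_cnf(N, OUT)

def consistent_assignments(clauses, num_vars):
    """Enumerate all assignments satisfying the clauses (exhaustive)."""
    for bits in product([False, True], repeat=num_vars):
        a = {i + 1: bits[i] for i in range(num_vars)}
        if satisfied(clauses, a):
            yield a

# Every satisfying assignment is a valid labeling of the circuit lines.
for a in consistent_assignments(circuit, 4):
    assert a[N] == (a[X1] and a[X2]) and a[OUT] == (not a[N])

# Constrain the output to 0 and ask which inputs can produce it.
inputs_for_out0 = [(a[X1], a[X2])
                   for a in consistent_assignments(circuit + [[-OUT]], 4)]
print(inputs_for_out0)
```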
2.2.2 ILA Representation
Often CAD problems are concerned with the sequential behavior of a circuit. A sequential
circuit contains state elements such as D flip-flops (DFFs) and latches. A DFF has a data
input, clock input, and an output. On a positive edge of the clock input, the output is set to
the data input value at the moment of the positive edge. The output then remains constant
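until the next positive clock edge. As SAT solvers have no concept of time and state elements,
the representation in the previous section is insufficient to model the sequential behavior of a
circuit.
Table 2.1: Characteristic functions of elementary gates
Gate     Function                                        CNF Representation
AND      $y = x_1 \wedge \dots \wedge x_n$               $\left(\bigwedge_{i=1}^{n} (x_i \vee \overline{y})\right) \wedge \left(\bigvee_{i=1}^{n} (\overline{x}_i \vee y)\right)$
NAND     $y = \overline{x_1 \wedge \dots \wedge x_n}$    $\left(\bigwedge_{i=1}^{n} (x_i \vee y)\right) \wedge \left(\bigvee_{i=1}^{n} (\overline{x}_i \vee \overline{y})\right)$
OR       $y = x_1 \vee \dots \vee x_n$                   $\left(\bigwedge_{i=1}^{n} (\overline{x}_i \vee y)\right) \wedge \left(\bigvee_{i=1}^{n} (x_i \vee \overline{y})\right)$
NOR      $y = \overline{x_1 \vee \dots \vee x_n}$        $\left(\bigwedge_{i=1}^{n} (\overline{x}_i \vee \overline{y})\right) \wedge \left(\bigvee_{i=1}^{n} (x_i \vee y)\right)$
XOR      $y = x_1 \oplus x_2$                            $(\overline{x}_1 \vee \overline{x}_2 \vee \overline{y}) \wedge (x_1 \vee x_2 \vee \overline{y}) \wedge (\overline{x}_1 \vee x_2 \vee y) \wedge (x_1 \vee \overline{x}_2 \vee y)$
XNOR     $y = \overline{x_1 \oplus x_2}$                 $(\overline{x}_1 \vee \overline{x}_2 \vee y) \wedge (x_1 \vee x_2 \vee y) \wedge (\overline{x}_1 \vee x_2 \vee \overline{y}) \wedge (x_1 \vee \overline{x}_2 \vee \overline{y})$
BUFFER   $y = x$                                         $(x \vee \overline{y}) \wedge (\overline{x} \vee y)$
NOT      $y = \overline{x}$                              $(x \vee y) \wedge (\overline{x} \vee \overline{y})$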
A common way of modeling sequential circuits is the Iterative Logic Array (ILA) represen-
tation. To explain this representation, it is first necessary to introduce appropriate notation.
Given a sequential circuit, let $X = \{x_1, ..., x_{|X|}\}$ denote its primary input, $Y = \{y_1, ..., y_{|Y|}\}$
denote its primary output, and $S = \{s_1, ..., s_{|S|}\}$ denote the state variables (flipflops) of the
circuit. Further, let vectors $X^i = \{x^i_1, ..., x^i_{|X|}\}$, $Y^i = \{y^i_1, ..., y^i_{|Y|}\}$, and $S^i = \{s^i_1, ..., s^i_{|S|}\}$ denote
the values of the primary input, primary output, and state variables at cycle i, respectively.
A circuit’s ILA representation is constructed by replicating the combinational part (tran-
sition relation) k times, where each copy is called a time-frame. The next-state variables of
frame i are connected to the current state variables of frame i + 1. As such, an ILA of k
time-frames models the circuit’s behavior over k clock cycles. The following example illustrates
the construction of an ILA.
Figure 2.1: (a) A sequential circuit (b) ILA representation with 2 time-frames
Example 2.1. Figure 2.1 depicts the construction of an ILA of two time-frames. In Figure 2.1(a),
a sequential circuit is shown with a single state element s1, two inputs x1 and x2,
and a single output y1. To construct an ILA, all of the primary input, primary output, and
internal lines are replicated along with the transition relation of the circuit. This can be seen in
Figure 2.1(b) where each box contains one copy of the transition relation and all lines are dupli-
cated. Additionally for flipflop s1, the input D from time-frame 1 has replaced the output Q in
time-frame 2. The combinational circuit in Figure 2.1(b) models the behavior of the sequential
circuit over 2 clock cycles. For instance, consider constraining the values of the primary input
variables with $X^1$ in frame 1 and $X^2$ in frame 2. Doing so, this ILA could be used to compute
the response $Y^1$ and $Y^2$ that would be generated by the original circuit for clock cycles 1 and
2, respectively.
The ILA is a combinational circuit that effectively models the sequential behavior of a circuit
for a limited number of clock cycles. As such, it can be transformed to a formula in CNF in
the manner described in Section 2.2.1. This representation can be used to solve various CAD
problems. For instance, conjoining the clause $(s^3_1)$ to the CNF representation of the ILA in
Figure 2.1(b) gives a formula that is satisfiable if and only if it is possible to reach a state in
which s1 = 1 in 2 clock cycles.
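The unrolling procedure can be sketched as follows. Because the gate-level contents of the circuit in Figure 2.1 are not reproduced here, the sketch assumes a hypothetical circuit with next-state function s' = s XOR (x1 AND x2), and additionally assumes an initial state s = 0. Each time-frame receives fresh variables, and the next-state variable of one frame serves as the current-state variable of the next. A brute-force check stands in for the SAT solver.

```python
from itertools import product

def and_cnf(a, b, y):
    return [[a, -y], [b, -y], [-a, -b, y]]

def xor_cnf(a, b, y):
    return [[-a, -b, -y], [a, b, -y], [-a, b, y], [a, -b, y]]

def unroll(k):
    """Build a k-frame ILA for the assumed circuit s' = s XOR (x1 AND x2).

    Returns (clauses, state_vars, num_vars); state_vars[i] is the variable
    holding the state at the start of frame i + 1.
    """
    counter = 0
    def fresh():
        nonlocal counter
        counter += 1
        return counter
    clauses = []
    state_vars = [fresh()]
    for _ in range(k):
        x1, x2, g, nxt = fresh(), fresh(), fresh(), fresh()
        clauses += and_cnf(x1, x2, g)
        clauses += xor_cnf(state_vars[-1], g, nxt)
        state_vars.append(nxt)  # next state of this frame feeds the next frame
    return clauses, state_vars, counter

def satisfiable(clauses, num_vars):
    """Exhaustive satisfiability check (stand-in for a SAT solver)."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

clauses, s, n = unroll(2)
initial = [[-s[0]]]                # assumed reset state: s = 0
target = [[s[2]]]                  # unit clause: state is 1 after 2 cycles
print(satisfiable(clauses + initial + target, n))
```

The memory cost discussed in Chapter 1 is visible even here: the number of variables and clauses grows linearly with the number of time-frames k.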
2.3 SAT-based Debugging
In verification and debugging, a failure is incorrect behavior that differs from the specification.
For a particular failure, an error is a design location (i.e., a wire in the design) that can be
changed to correct the failure. Design debugging is the task of locating the error when functional
verification detects a failure. Traditionally, a failure is revealed through means such as firing
assertions, observation value mismatches, scoreboard discrepancies, etc., and an error trace that
demonstrates the failure is returned.
In these cases, an automated debugging utility [38] can be used with the error trace to find
the error. Assume the error trace has k clock cycles, and let vectors $X^1, ..., X^k$ represent the
input values from the error trace. Let vectors $\mathcal{Y}^1, ..., \mathcal{Y}^k$ denote the correct output values for the
trace according to the circuit's specification (the expected response). Since the trace exhibits a
failure, $\mathcal{Y}^i$ must differ from the observed response $Y^i$ at some clock cycle i.
First, the transition relation is enhanced at a set of suspect locations. A suspect location is
a line in the circuit that is suspected of being an error. In the absence of a priori knowledge
about the cause of the failure, every location in the circuit is a suspect. Let L = {l1, ..., l|L|} be
the suspect locations. For each suspect location li, an error-select line ei and a free variable wi
are added. The suspect location is replaced by a multiplexer with output zi, 0-input li, 1-input
wi, and select input ei as depicted in Figure 2.2. At each of its fanout locations, li is replaced
by zi. It can be seen that when ei = 0 the behavior is unaffected. Conversely, the suspect
location is replaced by an unconstrained free variable when ei = 1, allowing it to behave as an
arbitrary Boolean function.
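The effect of the error multiplexer can be sketched behaviorally. The example below assumes a hypothetical suspect location at the output of a two-input AND gate. The names `mux` and `enhanced_and` are illustrative and do not come from any tool.

```python
def mux(select, in0, in1):
    """Two-input multiplexer: returns in1 when select is 1, else in0."""
    return in1 if select else in0

def enhanced_and(x1, x2, e, w):
    """AND gate whose output line l is a suspect location.

    e is the error-select line and w is the free variable; the returned
    value z replaces l at every fanout of the gate.
    """
    l = x1 & x2
    z = mux(e, l, w)
    return z

# With e = 0 the enhanced gate matches the original for every input.
for x1 in (0, 1):
    for x2 in (0, 1):
        for w in (0, 1):
            assert enhanced_and(x1, x2, 0, w) == (x1 & x2)

# With e = 1 the output follows w, so any Boolean function of the inputs
# can be imitated by choosing w per input pattern (here: OR).
for x1 in (0, 1):
    for x2 in (0, 1):
        assert enhanced_and(x1, x2, 1, x1 | x2) == (x1 | x2)
```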
Figure 2.2: Error multiplexer inserted at suspect location $l_i$
A SAT-based debugging tool then constructs a k-frame ILA representation of the circuit.
In this step, the error-select lines are handled separately. While each other line in the enhanced
transition relation is replicated k times, the error-select lines are not replicated. Instead, for
location li, a single copy of ei is the select-input of the inserted multiplexer in every time-frame.
Figure 2.3 depicts a debugging ILA for the circuit of Figure 2.1(a). It can be seen that the
error-select lines are not replicated.
Figure 2.3: Debugging ILA
The debugging ILA is constrained with the input values $X^1, ..., X^k$ from the error trace
and the expected response $\mathcal{Y}^1, ..., \mathcal{Y}^k$ according to the specification. Since the output of the
circuit does not match the specification, the instance is unsatisfiable if ei = 0 for all 1 ≤ i ≤ |L|
as the ILA will behave exactly the same as the original circuit. Conversely, the instance is
trivially satisfiable if ei = 1 for all 1 ≤ i ≤ |L| since every line can be replaced by an arbitrary
value. Notice that the set of locations associated with the error-select lines assigned to 1 in
any satisfying assignment represents a set of locations that can be simultaneously modified to
correct the failure exposed by the error trace.
In order to get meaningful results, the number of simultaneous errors must be constrained.
This is accomplished by adding a cardinality constraint φn on the error-select lines. The
constraint φn enforces that exactly n error-select lines are active in any satisfying assignment. For
the case of n = 1, the CNF representation of φ1 is shown in Eq. 2.3. In that equation, the first
clause ensures that at least one error-select line is active. The remaining clauses ensure
that no pair of error-select lines is simultaneously active.
$$\phi_1 = (e_1 \vee \dots \vee e_{|L|}) \wedge \bigwedge_{1 \le i < j \le |L|} (\overline{e}_i \vee \overline{e}_j) \qquad (2.3)$$
For cases with a larger error cardinality, the constraint can be implemented using an adder
with n one-bit inputs and the appropriate output size to accommodate values from 0 to n. The
adder’s output is then fed to a comparator that outputs 1 if and only if its input is exactly
equal to n. This hardware construction is shown in Figure 2.4.
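Both forms of the cardinality constraint can be sketched in a few lines. The pairwise encoding below is the standard CNF form of an exactly-one constraint, and `exactly_n` is only a behavioral view of the adder-and-comparator construction, not a gate-level encoding of it. Variable numbers and function names are assumptions for illustration.

```python
from itertools import combinations, product

def exactly_one(error_selects):
    """CNF for phi_1: one at-least-one clause plus pairwise at-most-one clauses."""
    clauses = [list(error_selects)]
    clauses += [[-a, -b] for a, b in combinations(error_selects, 2)]
    return clauses

def exactly_n(values, n):
    """Behavioral view of the adder plus comparator: sum the bits, compare to n."""
    return sum(values) == n

def models(clauses, num_vars):
    """Enumerate all 0/1 assignments satisfying the clauses (exhaustive)."""
    for bits in product([0, 1], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            yield bits

phi1 = exactly_one([1, 2, 3, 4])   # four hypothetical error-select lines
solutions = list(models(phi1, 4))
print(solutions)                   # each assignment has exactly one active line
```

The pairwise encoding needs one clause per pair of error-select lines, so its size grows quadratically with the number of suspects.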
Figure 2.4: Hardware construction for error cardinality n > 1
A satisfying assignment therefore has exactly n active error-select lines. This corresponds
to an n-tuple of suspect locations that can be simultaneously modified to correct the erroneous
behavior exposed by the error trace. An all-solutions SAT solver is then used to find every
satisfying assignment to the resulting formula.
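The overall flow can be illustrated end to end on a toy problem. The sketch below is not the SAT-based tool of [38]: it replaces the debugging ILA and the all-solutions SAT solver with direct simulation and exhaustive enumeration of the free variables, which is only feasible for tiny examples. The circuit is hypothetical: the intended design is a toggle register (s' = s XOR x, y = s), while the erroneous implementation uses AND in place of XOR. The suspect names are also hypothetical.

```python
from itertools import product

SUSPECTS = ["x_buf", "g", "y"]   # hypothetical suspect locations

def simulate(inputs, active, free_values):
    """Simulate the erroneous circuit over len(inputs) cycles from state 0.

    active is the suspect whose error-select line is 1 (or None); the same
    selection applies in every time-frame. free_values[t] is the free
    variable w for that suspect in time-frame t.
    """
    def line(name, value, t):
        # Error multiplexer: the free variable replaces the line when selected.
        return free_values[t] if name == active else value

    s, outputs = 0, []
    for t, x in enumerate(inputs):
        x_buf = line("x_buf", x, t)
        g = line("g", s & x_buf, t)   # bug: the specification uses XOR here
        y = line("y", s, t)
        outputs.append(y)
        s = g
    return outputs

def debug(inputs, expected):
    """Return every single suspect (error cardinality 1) that can fix the trace."""
    k = len(inputs)
    return [loc for loc in SUSPECTS
            if any(simulate(inputs, loc, w) == expected
                   for w in product([0, 1], repeat=k))]

trace_inputs = [1, 0, 0]
expected = [0, 1, 1]   # response of the intended circuit s' = s XOR x
print(simulate(trace_inputs, None, (0, 0, 0)))  # observed (erroneous) response
print(debug(trace_inputs, expected))            # suspects that can correct it
```

In this toy instance the enumeration reports the erroneous gate output `g` and the output line `y`, while `x_buf` is ruled out because no choice of values there can change the state from 0.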
2.4 Property Directed Reachability
2.4.1 Notation and Terminology
Before introducing concepts relevant to Property Directed Reachability (PDR), it is necessary
to give notation and terminology that will be used throughout this thesis. Given a sequential
circuit C, let S = {s1, ..., s|S|} denote the set of current-state variables (flipflops) of C. Similarly,
let S′ = {s′1, ..., s′|S|} denote the set of next-state variables (inputs to flipflops) of C. The set
of initial states of C is denoted by I ⊆ {0, 1}^S. For the purpose of model checking, the circuit
can be modeled as a Finite State Machine (FSM) M = (S, I, T), where T is the transition relation
defined below. What follows formally defines a state of the circuit.
Definition 2.1. Each assignment to the state variables t ∈ {0, 1}^S is a state of the circuit.
A state t can be represented by a cube, which is simply a conjunction of literals. The cube
is formed by taking the conjunction of the positive literals for each variable assigned to 1 along
with the negative literals for each variable assigned to 0. The transition relation of the circuit
is denoted by T ⊆ {0, 1}^S × {0, 1}^S. A pair of states 〈t0, t1〉 ∈ T if and only if there is an
assignment to the primary input that causes the circuit to transition from state t0 to t1. The
following definition will be used to formally define the reachability of states under the circuit’s
transition relation.
Definition 2.2. A sequence of states t0, ..., tn is a trace of the circuit if and only if 〈ti, ti+1〉 ∈ T
for all 0 ≤ i < n and t0 ∈ I.
When considering the reachability of a state, the number of cycles it takes to reach the
state might be relevant. In other cases, it may only be important to know whether or not the
state can be reached at all. There are therefore two notions of reachability that follow from the
definition of a trace, defined below.
Definition 2.3. A state t is reachable if and only if t appears in some trace of the circuit. It
is i-step reachable if and only if it appears in some trace t0, ..., tn with n ≤ i, i.e., it can be
reached from an initial state in at most i transitions.
Another aspect of reachability that is often relevant is the reachability of sets of states. For
some predicate P ⊆ {0, 1}^S, any state t ∈ P is referred to as a P-state in this thesis. The
predicate can be represented by a Boolean formula over the state variables of S. For predicate
P, let P′ denote the same predicate over the corresponding variables of S′. This allows the
construction of SAT instances such as P ∧ T ∧ Q′, which indicate whether any P-state can
transition to a Q-state. The predicate P is said to be (i-step) reachable if and only if some
P-state is (i-step) reachable. The following definition allows reasoning about the reachability
of every state in P .
Definition 2.4. A predicate P is i-step invariant if and only if I ⊆ P and P includes every
i-step reachable state. It is also invariant if and only if it includes every reachable state.
From this definition an i-step invariant over-approximates the set of i-step reachable states,
while an invariant over-approximates the set of all reachable states. This is the intuitive mean-
ing behind the term invariant. Since an invariant over-approximates all reachable states, it
represents a property that always holds during the operation of the FSM. In practice, it may be
difficult to determine whether or not a given predicate is an invariant. The following definition
gives a stronger notion of invariance that can be checked for using a SAT solver.
Definition 2.5. A predicate P is an inductive invariant if and only if P is an invariant that is
also closed under T (i.e., t ∈ P ∧ t′ ∈ ¬P ⇒ 〈t, t′〉 ∉ T).
An inductive invariant P over-approximates the set of all reachable states since it contains
all initial states and no ¬P-state is reachable from a P-state. The Boolean formula P ∧ T ∧ ¬P′
is unsatisfiable if and only if P is closed under T. Together with a check that I ⊆ P (i.e., that
I ∧ ¬P is unsatisfiable), this provides a means to determine if a predicate is an inductive invariant.
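The definitions above can be made concrete with a small explicit-state sketch in Python. The eight-state FSM used here is hypothetical: its transitions are assumptions chosen for illustration and are not taken from the figures of this thesis. Sets of states stand in for the Boolean formulas, and a direct scan of the transition relation stands in for the SAT query P ∧ T ∧ ¬P′.

```python
# Hypothetical eight-state FSM; the transitions are illustrative assumptions.
STATES = {format(k, "03b") for k in range(8)}
INIT = {"000"}
TRANS = {("000", "001"), ("001", "010"), ("010", "011"), ("011", "100"),
         ("100", "000"), ("101", "111"), ("110", "111"), ("111", "111")}


def reachable_within(init, trans, steps):
    """States appearing in some trace that uses at most `steps` transitions."""
    seen = set(init)
    frontier = set(init)
    for _ in range(steps):
        frontier = {b for (a, b) in trans if a in frontier} - seen
        seen |= frontier
    return seen


def is_inductive_invariant(pred, init, trans):
    """Check I ⊆ P and that no transition leaves P (P ∧ T ∧ ¬P' has no model)."""
    contains_init = set(init) <= set(pred)
    closed = all(b in pred for (a, b) in trans if a in pred)
    return contains_init and closed


P = STATES - {"111"}                              # predicate "not 111"
J = reachable_within(INIT, TRANS, len(STATES))    # exact reachable set

print(is_inductive_invariant(P, INIT, TRANS))
print(is_inductive_invariant(J, INIT, TRANS))
```

In this sketch P is an invariant, since it contains every reachable state, but it is not inductive because the unreachable P-state 101 transitions to 111. The stronger predicate J is closed under the transition relation and therefore serves as an inductive invariant.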
2.4.2 High-Level Overview
PDR is a model checking algorithm [9]. Model checking refers to the task of determining if a
property holds for a design. Properties can be classified as either liveness or safety properties.
A safety property is a set of states that must be unreachable for the property to hold. A
liveness property is a set of states where at least one member of the set must be reachable
for the property to hold. Given a safety (liveness) property P , an inductive invariant J where
J ∩ P = ∅ is a certificate proving that P holds (does not hold). Conversely, a trace t0, ..., tn
where ti ∈ P for some 0 ≤ i ≤ n is a certificate proving that safety (liveness) property P
does not hold (holds). Throughout this thesis, it is assumed that an algorithm PDR(M,P, k)
exists, where M = (S, I, T ) is an FSM. It returns Reachable if and only if a P -state is k-step
reachable under M . If k =∞, it returns Reachable if and only if a P -state is reachable under
M . Otherwise it returns Unreachable.
Given a safety property P ⊆ {0, 1}^S, PDR attempts to find an inductive invariant proving
that the property holds. Here, P denotes the set of “safe” states (the complement of the set of
states that must be unreachable), i.e., P holds if and only if no ¬P-state is reachable. Its
complement ¬P represents the set of “unsafe” states.
Unlike many CAD algorithms based on SAT, PDR does not use an ILA representation of the
circuit. Rather, it constructs a series of SAT instances using a single copy of the transition
relation. The SAT instances are aimed at finding states that lead to a violation of the property.
PDR then attempts to prove that these states cannot be reached in a bounded number of
steps. If P holds, these proofs eventually allow the algorithm to discover an inductive invariant
proving that fact. Otherwise, PDR fails to compute a needed invariant and instead finds a
trace leading to an unsafe state.
2.4.3 Detailed Description
In greater detail, the algorithm proceeds as follows. The set of initial states I is represented
as a CNF formula also referred to as I, since these are merely different representations of the
same thing. The given safety property P is similarly represented by a formula in CNF, as is the
transition relation T . The first step is a precheck for zero-step and one-step counter-example
traces. The formula I ∧ (¬P ) checks for zero-step counter-example traces, as it is satisfiable
if and only if ¬P contains an initial state. Subsequently, the satisfiability of I ∧ T ∧ (¬P ′) is
checked to find any one-step counter-example traces as this formula is satisfiable if and only
if an initial state reaches an unsafe state in one step. The following example illustrates the
precheck step and will be referred to throughout this section.
Example 2.2. Consider the FSM depicted in Figure 2.5. In the figure, states are identified
by their binary encodings. For example, the state 100 is the state represented by the cube
(s1 ∧ ¬s2 ∧ ¬s3). Arrows indicate state transitions that are present in the transition relation. For
instance, the figure shows that 〈000, 001〉 ∈ T . The only unsafe state is 111. The precheck step
executes two SAT queries. The first is I ∧ ¬P , which checks if any initial state is unsafe. This
query is UNSAT, since the initial state is not 111. The second SAT query is I ∧ T ∧ ¬P ′, which
checks if any unsafe state is one-step reachable. This query is also UNSAT as 〈000, 111〉 ∉ T. At
this point, the precheck step is done, indicating that P is one-step invariant.
Figure 2.5: Example finite state machine over the states 000, ..., 111, with regions labeled I, P, and ¬P
After the precheck step, the algorithm begins computing invariants. It maintains a series
of predicates 〈F0 = I, F1, F2, ...〉 represented as CNF formulas. This is referred to as the
inductive trace. At every step in the process, each clause of each Fi is i-step invariant. Since
the clauses of Fi over-approximate the i-step reachable set, their conjunction (i.e., Fi itself)
also over-approximates the i-step reachable set. Each clause of Fi is also a clause of every Fj
for 1 ≤ j < i, as any i-step invariant clause is also j-step invariant for 1 ≤ j < i. In other
words, Fj+1 has a subset of the clauses of Fj for all j > 0. This implies that Fj ⊆ Fj+1, as Fj
simply has more clauses constraining it than Fj+1. Additionally, F0 is exactly the set of initial
states. Figure 2.6 below illustrates the intuition behind this process. In the example it shows
I ⊆ F1 ⊆ F2 ⊆ F3 and F3 ∩ ¬P = ∅. The latter proves that P is three-step invariant, as no
unsafe state (i.e., no ¬P -state) can be reached in 3 steps from an initial state.
Figure 2.6: Example state-space over-approximations (nested regions I, F1, F2, F3, with ¬P outside F3)
Initially, PDR sets F1 = P as P is known to be one-step invariant from the SAT queries in
the precheck step. Next, PDR proceeds through a series of iterations 1, 2, ... in which iteration
i attempts to prove that P is (i+ 1)-step invariant. To do so, PDR repeatedly finds a state of
Fi that is one step away from a ¬P -state. This is done by finding a satisfying assignment to
the formula below:
\[ F_i \land T \land \lnot P' \tag{2.4} \]
If Eq. 2.4 is satisfiable, then the satisfying assignment includes a state t ∈ Fi that is one step
from a violation of P . If t is indeed i-step reachable, then P does not hold. Assume for the
moment that t is not i-step reachable. This means that the clause ¬t is an i-step invariant that
can safely be conjoined to Fi. Doing so would prevent finding any further satisfying assignments
to Eq. 2.4 involving t. This procedure is demonstrated in the following example.
Example 2.3. Consider the FSM of Figure 2.5. After the precheck step explained in Example 2.2,
P is known to be one-step invariant. Therefore, F1 is initialized to P. Subsequently,
the algorithm tries to find a satisfying assignment to the formula F1 ∧ T ∧ ¬P′. Since F1 = P,
we have 110 ∈ F1, and the formula is satisfied with 110 as the chosen F1-state. The SAT solver
could also have found an assignment involving the state 101, as it is also an F1-state that is
one step from an unsafe state.
Figure 2.7: (a) A predecessor of an unsafe state in F3 (b) Approximations after refining F3
Finding and blocking every state in this manner could be highly inefficient. PDR therefore
uses a process known as generalization to compute a more general invariant clause d, which is
then conjoined to Fi. The clause d contains a subset of the literals of ¬t. As d is still i-step
invariant, it blocks state t and may block other states that are not i-step reachable. Figure 2.7
illustrates this procedure. In Figure 2.7(a) a predecessor of an unsafe state is found in F3.
Assume this state is not actually three-step reachable. It is therefore removed from F3 as
depicted in Figure 2.7(b), along with some other states that are not three-step reachable. The
generalization procedure is key to the performance of PDR and is the subject of a great deal
of research [9, 13, 21, 25, 24, 3]. A full discussion is beyond the scope of this thesis, as it can
be considered a “black box” procedure.
In order to determine whether t is merely a “spurious” result of the over-approximate nature
of Fi or is actually reachable, PDR essentially calls itself recursively. In doing so, it may add
new clauses to the formulas Fj where 1 ≤ j < i. If i > 1, then this results in a search for an
Fi−1-state that reaches t in one step, which may lead to further recursive calls. If i = 1, then
this results in a search for an initial state that reaches t in one step. Notice that the inductive
trace F1, F2, ..., Fi may play a substantial role in the depth of recursion required to block state
t. If each Fj (j < i) is a poor over-approximation of the set of j-step reachable states (i.e., it
includes many states that aren’t j-step reachable), then many recursive calls may be required
to block t. However, if the formulas closely represent the j-step reachable sets, then few of these
spurious states are found and the recursive calls may terminate sooner. The following example
demonstrates an update of the inductive trace.
Example 2.4. Continuing the illustration of the algorithm from Example 2.3, consider the
FSM of Figure 2.5. Assume the SAT solver found 110 as an F1-state that is one step from an
unsafe state. This leads to a recursive query to determine if the state is one-step reachable.
PDR executes the SAT query I ∧ T ∧ (s1 ∧ s2 ∧ ¬s3)′ to determine if an initial state can reach
the state 110 in one step. This query is UNSAT since 〈000, 110〉 ∉ T. Therefore, this state
must be removed from F1. Without generalization, this could be accomplished by adding the
clause (¬s1 ∨ ¬s2 ∨ s3) to F1. However, it can be seen that no s1-state is one-step reachable, so
generalization may drop the literals ¬s2 and s3. In this case, F1 = P ∧ (¬s1) is the resulting
updated formula.
This process may lead to a trace that reaches t, thereby disproving the property P . Al-
ternatively, it leads to a proof that t is not i-step reachable, allowing PDR to update Fi in
the manner described above. PDR continues finding such states until Eq. 2.4 is unsatisfiable,
implying that no Fi-state can reach a ¬P -state in one step. Since Fi over-approximates the set
of i-step reachable states, this implies that P is (i + 1)-step invariant. The following example
builds on the previous one to demonstrate this process.
Example 2.5. Continuing from Example 2.4, assume that F1 = P ∧ (¬s1) after generalization.
The algorithm then executes a SAT query using the formula F1 ∧ T ∧ ¬P′. The only states
that are one step from a ¬P-state are 101 and 110. Therefore, the query is UNSAT, as neither
of these states satisfies the clause (¬s1).
Next, the algorithm begins constructing Fi+1 in preparation for the next iteration. It is
possible to simply set Fi+1 = P , since P is now known to be (i + 1)-step invariant. However,
some of the clauses from Fi may be provably (i+ 1)-step invariant. Additionally, the iteration
may have resulted in new clauses being added to F1, ..., Fi−1, and clauses from any Fj may be
provably (j + 1)-step invariant. To make use of these facts, the algorithm performs a clause
propagation step. For each Fj , PDR attempts to prove that each clause of Fj is (j + 1)-step
invariant. If so, the clause is added to Fj+1. For a clause c ∈ Fj , this is accomplished with the
following SAT query:
\[ F_j \land T \land \lnot c' \tag{2.5} \]
This query asks the question “can any Fj-state reach a ¬c-state?” As Fj includes every j-step
reachable state, this can be rephrased as “can any j-step reachable state reach a ¬c-state?” If
the answer is no (i.e., the query is unsatisfiable), then c also over-approximates the (j+ 1)-step
reachable set and can be added to Fj+1. This process is repeated for every clause from each of
F1, ..., Fi. The following example demonstrates this procedure.
Example 2.6. Continuing the illustration of the algorithm from the previous example, recall
that F1 = P ∧ (¬s1). The algorithm now tries to propagate the clause c = (¬s1). It executes the SAT
query F1 ∧ T ∧ ¬c′, i.e., F1 ∧ T ∧ (s1′). This query is satisfiable, because 011 ∈ F1, 100 ∈ (s1), and 〈011, 100〉 ∈ T.
Despite this result, note that the clause (¬s1) does over-approximate two-step reachability and
therefore could be included in F2. However, F1 would require additional clauses to support the
proof of this fact. Therefore, at the beginning of iteration 2, F2 = P .
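The propagation query of Eq. 2.5 can be sketched in Python as follows. The FSM is hypothetical (its transitions are assumptions, apart from 〈011, 100〉 which is mentioned in Example 2.6), the clause encoding is an illustrative choice, and an explicit scan over the transition relation replaces the SAT solver. The function returns every transition that witnesses satisfiability of Fj ∧ T ∧ ¬c′; an empty list corresponds to an UNSAT answer.

```python
# Hypothetical FSM; a state string lists the bits s1 s2 s3 from left to right.
STATES = [format(k, "03b") for k in range(8)]
TRANS = sorted([("000", "001"), ("001", "010"), ("010", "011"), ("011", "100"),
                ("100", "000"), ("101", "111"), ("110", "111"), ("111", "111")])


def holds(clause, state):
    """A clause is a list of (bit_index, polarity) literals; index 0 is s1."""
    return any((state[k] == "1") == polarity for k, polarity in clause)


def propagation_witnesses(clause, frame_clauses, trans, states):
    """Transitions (a, b) with a in F_j and b violating `clause` (models of Eq. 2.5)."""
    frame = {s for s in states if all(holds(c, s) for c in frame_clauses)}
    return [(a, b) for (a, b) in trans if a in frame and not holds(clause, b)]


P = [(0, False), (1, False), (2, False)]   # (¬s1 ∨ ¬s2 ∨ ¬s3): every state except 111
C = [(0, False)]                           # (¬s1)

print(propagation_witnesses(C, [P, C], TRANS, STATES))
print(propagation_witnesses(P, [P, C], TRANS, STATES))
```

With F1 = P ∧ (¬s1), the clause (¬s1) cannot be propagated because the transition from 011 to 100 is a witness, while the clause P itself has no witness and can be carried forward.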
During the process of clause propagation, it is possible that every clause from formula Fj is
propagated to Fj+1 for some j ≤ i. Since Fj is in CNF, it can be rewritten as Fj = c1 ∧ ...∧ cn.
Since every clause of Fj was propagated to Fj+1, Eq. 2.5 is unsatisfiable for every clause of Fj .
This implies that the following formula is unsatisfiable:
\[ F_j \land T \land (\lnot c_1 \lor \dots \lor \lnot c_n)' \tag{2.6} \]
Note that (¬c1 ∨ ... ∨ ¬cn)′ = ¬Fj′. In other words, Fj ∧ T ∧ ¬Fj′ is unsatisfiable, meaning Fj
is closed under the transition relation. By Definition 2.5, Fj is an inductive invariant. It also
does not include any unsafe states by construction, so it proves that P holds. If this occurs, the
algorithm terminates. Otherwise, PDR begins a new iteration after clause propagation. The
following example demonstrates the termination of the algorithm and concludes this section.
Example 2.7. Continuing from Example 2.6, iteration 2 begins with F2 = P . As in Exam-
ple 2.3, the query F2 ∧ T ∧ ¬P ′ is satisfied by the states 110 ∈ F2 and 111 ∈ ¬P . Similar to
the earlier example, the state 110 is not two-step reachable, so the clause (¬s1 ∨ ¬s2 ∨ s3) can
be conjoined to F2. Assume generalization yields the clause (¬s1 ∨ ¬s2), and let c denote that
clause. Now F2 = P ∧ c and the SAT query F2 ∧ T ∧ ¬P′ is UNSAT, ending iteration 2. Clause
propagation begins, and the formula F2 ∧ T ∧ ¬c′ is UNSAT. Since F2 only has clauses c and P,
this implies that F2 is an inductive invariant proving that P holds. Indeed, one can verify that
F2 includes all states other than 101, 110, and 111. From Figure 2.5 it can be seen that F2
models the set of reachable states exactly.
The end result of Example 2.7 demonstrates a more intuitive notion of the inductive invariant
returned by PDR. No state in F2 can reach a state outside of F2. Since F2 includes all initial
states, this implies it over-approximates the reachable states. Additionally, F2 is merely a subset
of P , as it is represented by P with additional clauses conjoined. Since P includes F2 and F2
includes the reachable set, P includes the reachable set, implying it is invariant. The inductive
invariant F2 is simply a stronger version of P , that is, one that includes fewer states. The
benefit of an inductive invariant is that a single SAT query can be used to check for inductive
invariance, as explained in Definition 2.5. However, no simple method is known to check a
formula for invariance in the general case.
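The overall flow described in this section can be summarized by the following Python sketch. It is an explicit-state analogue written under several assumptions rather than an implementation of PDR: frames are stored as sets of states instead of CNF formulas, set operations replace the SAT queries, each blocked state plays the role of an ungeneralized clause, and the FSM is hypothetical. The name pdr_explicit and its return values are likewise illustrative.

```python
def pdr_explicit(init, trans, safe):
    """Explicit-state analogue of PDR; returns ("holds", invariant) or ("fails", None)."""
    init, safe = frozenset(init), frozenset(safe)

    def succ(t):
        return {b for (a, b) in trans if a == t}

    def pred(t):
        return {a for (a, b) in trans if b == t}

    # Precheck for zero-step and one-step counter-examples.
    if not init <= safe or any(not succ(t) <= safe for t in init):
        return "fails", None

    frames = [set(init), set(safe)]            # F0 = I, F1 = P

    def block(t, i):
        """Remove t from F1..Fi and return True, or return False if t is i-step reachable."""
        if t in init:
            return False
        while i > 0:
            candidates = pred(t) & frames[i - 1]   # predecessors still allowed by F(i-1)
            if not candidates:
                break
            if not block(min(candidates), i - 1):  # recursive reachability check
                return False
        for j in range(1, i + 1):
            frames[j].discard(t)
        return True

    i = 1
    while True:
        # Block every Fi-state that is one step from an unsafe state (Eq. 2.4).
        while True:
            bad = [t for t in frames[i] if not succ(t) <= safe]
            if not bad:
                break
            if not block(min(bad), i):
                return "fails", None
        frames.append(set(safe))               # F(i+1) starts as P
        # Propagate blocked states forward (Eq. 2.5) and test for a fixed point.
        for j in range(1, i + 1):
            for t in frames[j + 1] - frames[j]:
                if not pred(t) & frames[j]:
                    frames[j + 1].discard(t)
            if frames[j] == frames[j + 1]:
                return "holds", frozenset(frames[j])
        i += 1


# Hypothetical FSM; the transitions are illustrative assumptions.
STATES = {format(k, "03b") for k in range(8)}
INIT = {"000"}
TRANS = {("000", "001"), ("001", "010"), ("010", "011"), ("011", "100"),
         ("100", "000"), ("101", "111"), ("110", "111"), ("111", "111")}

print(pdr_explicit(INIT, TRANS, STATES - {"111"}))
print(pdr_explicit(INIT, TRANS, STATES - {"100"}))
```

In the first call the frames F1 and F2 coincide after propagation, so the sketch reports that the property holds and returns the common frame as an inductive invariant. In the second call the unsafe state 100 is reachable, and the recursive blocking step eventually traces a predecessor chain back to the initial state.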
2.5 Summary
This chapter introduces background material relevant to the contributions of the thesis. First,
Boolean satisfiability is introduced along with its application to typical CAD problems. Tra-
ditional SAT-based debugging is presented next. Finally, a brief introduction to the model
checking algorithm of Property Directed Reachability is given.
Chapter 3
Traceless Debugging Using
Approximation and Unrolling
3.1 Introduction
When functional verification reveals an error, debugging begins in an attempt to localize and
correct the failure. Dynamic verification involves simulating the design using known input
vectors and observing that the response matches expectations. When dynamic verification
reveals an error, the known input vector and expected response provide an error trace that can
be used to guide a traditional SAT-based automated debugging tool [38]. Static verification
involves using formal methods to prove that the design implements a set of properties. Typically,
this consists of using a model checking algorithm such as PDR [9] to prove that a set of safety
properties holds. When a property fails, the model checker returns an error trace and SAT-based
debugging can be readily applied.
Static and dynamic verification can also reveal other types of errors for which no error trace
is available. With static verification, it is possible to use a model checker to prove that the
design implements a set of liveness properties. In this case, failure implies that a set of states
is unreachable in violation of the design specification. If the model checker reports a failure,
an error is clearly detected but an error trace is not readily available to guide an automated
debugging tool. Similarly, during dynamic verification it is possible to count the number of
times the design enters known desirable states. If it is revealed that a state is never entered,
this provides suspicion that the state is unreachable in violation of the specification. Static
verification could be used to confirm this fact. In this case as well, the error traces needed to
guide a SAT-based automated debugging tool are not readily available.
This chapter presents a novel automated technique to debug these kinds of failures in the
absence of an error trace [5]. The algorithm makes use of the liveness property itself in order
to debug the failure. Instead of using an error trace directly, PDR is used to compute an
over-approximation of the set of reachable states for a bounded number of clock cycles. The
algorithm proceeds through a user-specified number of iterations in which the i-th iteration over-
approximates the set of all i-step reachable states. Subsequently, a partial ILA similar to that
described in section 2.3 is constrained with the over-approximation states and the target states
and then converted to a SAT instance. For conceptual simplicity, in the initial formulation the
partial ILA consists of a single time-frame. The approach is later extended to use ILAs with
an arbitrary number of time-frames. Any solution found in iteration i represents a location
that can be changed such that a target state becomes (i + 1)-step reachable. Due to the use
of over-approximation, some solutions found may be “spurious.” These solutions are detected
using PDR and discarded. As a side effect of the detection process the over-approximations are
refined, potentially making further spurious solutions less likely.
This approach is effective, but finds only a subset of the solutions. In particular, it only
finds solutions that correct the failure by making a target state reachable one cycle after an
already-reachable state. An extension is presented that uses a partial ILA consisting of N
time-frames for a user-specified parameter N . This allows the approach to find solutions that
make a target state reachable N steps following an already-reachable state. This is referred to
as the “multi-cycle” formulation, in contrast to the “single-cycle” formulation that uses a single
time-frame.
Subsequently, a performance enhancement is presented that formulates the technique in a
monolithic fashion without using iterations. Instead of computing an over-approximation of
the set of error traces of length 1, then length 2, etc., the monolithic approach simply over-
approximates all traces of length less than or equal to a user-specified bound in one step. Experiments
presented in Chapter 5 find that this yields a substantial speedup. This represents a trade-off
between runtime and resolution, as some information is lost by this formulation. In particular,
a solution found by the iterative formulations in iteration i may be used to make a target state
(i + 1)-step reachable. The monolithic formulation finds the same solution, but is unable to
indicate the minimum number of cycles in which the solution can reach the target state.
The remainder of this chapter is organized as follows. Section 3.2 presents related work and
the relevant notation. Section 3.3 presents both the single-cycle and multi-cycle variants of the
iterative approach. Section 3.4 presents the performance-enhanced monolithic formulation of
the approach. Finally, section 3.5 concludes the chapter.
3.2 Preliminaries
3.2.1 Previous Work
The authors in [14] tackle the similar problem of dead or unreachable code. The technique they
propose can be applied when verification or coverage analysis indicates a line of HDL code is
unreachable. It involves using a novel symbolic simulation technique to explore non-existent
code paths and determine which variables are the cause of the unreachable code. Subsequently,
suggested values that would make the code reachable are provided. The approach in [14]
complements the approaches presented in this thesis, as it provides different insight into the
source of the error. That approach provides the user appropriate values for variables to reach
the dead code. In essence, it informs the user what unreachable states may be responsible for
the unreachable code. The approaches presented in this thesis inform the user which locations
are responsible for the unreachable states.
3.2.2 Notation
Before presenting the algorithms, it is necessary to first introduce the relevant notation. Ta-
ble 3.1 contains a summary of key symbols used throughout this chapter, which are described
in greater detail here. The input to the debugging algorithm is an erroneous circuit C, a set
of unreachable target states 𝒮, and a set of suspect locations that is assumed in this chapter
to simply include every location in the design. The set of target states can be represented by
a propositional formula also called 𝒮. Let T denote the transition relation of circuit C. A
solution to the debugging problem is defined below.
Table 3.1: Glossary of symbols
Symbol      Meaning
C           Erroneous circuit being debugged
K           Iteration limit parameter to Algorithms 3.1, 3.2, 3.3
N           Window-size parameter to Algorithms 3.2 and 3.3
𝒮           Target states for an unreachability debug problem
S           Current-state variables (registers)
S′          Next-state variables (inputs to registers)
X           Primary input
Y           Primary output
T           Transition relation
Ten         Enhanced transition relation used in debugging
′ (prime)   When applied to a formula over S, indicates the same formula over S′
I           Set of initial states of the circuit
Definition 3.1. A solution of error cardinality n is an n-tuple of suspect locations where a
change can be implemented to make some target state reachable.
Naturally, the task of a debugging algorithm is to find which of the suspect locations are
solutions. In the common case where n = 1, a solution is merely an individual suspect location
that can correct the error. If n > 1, then it consists of multiple locations that must be
simultaneously corrected. The algorithms presented in this chapter are sound but may be
incomplete. That is, every n-tuple of locations returned is guaranteed to be a solution. However,
it is not guaranteed that every solution is returned.
3.3 Iterative Formulation
This section presents a novel iterative formulation to debug unreachable states in the absence
of an error trace. It first presents the single-cycle formulation which finds solutions that make
some target state reachable one cycle following an already-reachable state. The algorithm
accepts as input an error cardinality n, a set of target states 𝒮, an iteration limit parameter K,
and an erroneous circuit C with transition relation T and initial states I. It returns locations
where a change can be made to make a target state (K + 1)-step reachable. This approach is
then extended to find solutions that can make a target state reachable up to N cycles after an
already-reachable state for a user-specified parameter N .
3.3.1 Single-Cycle Unreachability
The initial formulation of the algorithm involves a sequence of iterations, each of which models
and debugs a single state transition from a reachable state to the target states. As calculating
the exact set of reachable states is intractable, in each iteration an over-approximation is used to
model the set of potentially-reachable states. Due to the use of an over-approximation, spurious
solutions may be found. These are detected and discarded in a process that strengthens the
over-approximation, thereby reducing the chances of finding more spurious solutions.
In greater detail, the i-th iteration searches for solutions that may make a target state
(i + 1)-step reachable. Each iteration consists of two steps: reachability analysis and debug-
ging. The reachability analysis step simply extracts Fi from PDR, which is the formula over-
approximating the set of i-step reachable states. The PDR instance is directed at proving 𝒮 is
unreachable, using the circuit’s original transition relation and initial states. The approxima-
tion Fi is merely an initial approximation of the i-step reachable set and may be strengthened
during the debugging step.
The debugging step constructs a SAT-based debugging instance, with the goal of finding
suspect locations that can be changed to allow for a state transition from some i-step reachable
state to a target state. Towards this end, the instance is constructed with a single copy of the
transition relation constrained by Fi at its input and 𝒮 at its output. Intuitively, the current
set of states is constrained by Fi while the next state is constrained to 𝒮. The primary input
and output are left unconstrained, allowing the SAT solver to find solutions for every input
assignment. Finally, a cardinality constraint φn is added to find solutions of error cardinality
n. Letting Ten denote the enhanced transition relation, the resulting debugging instance can
be expressed as the following Boolean formula:
\[ F_i \land T_{en} \land \mathcal{S}' \land \phi_n \tag{3.1} \]
Figure 3.1 depicts the debugging instance of Eq. 3.1, where the shaded region represents the
actual set of i-step reachable states. The SAT solver is free to select candidate states from the
set Fi in order to reach the target state. In this context, a solution of cardinality n consists of n
active error-select lines and an Fi-state. As Fi over-approximates the set of all i-step reachable
states, inherent to the method is that some solutions found may not be exact. If the chosen
state from Fi is not actually reachable, then the active error-select lines do not necessarily
correspond to locations where a change can be implemented to make the target state reachable.
This means that some solutions to Eq. 3.1 may not be solutions to the debugging problem as
defined in Definition 3.1. These are referred to in this thesis as spurious solutions.
Figure 3.1: Representation of the debugging instance (Ten with its current state constrained by Fi and its next state constrained by 𝒮)
It is necessary to detect such cases and reject them. On the other hand, any solution to
Eq. 3.1 for which the current state is reachable is indeed a solution. This is the intuition behind
the spurious solution detection methodology. Let t ∈ Fi be the chosen current state. To check
for a spurious solution, PDR is called to check if t is i-step reachable. If so, then the solution is
accepted, recorded, and a blocking clause is added to Ten. If instead t is not i-step reachable,
PDR will update its inductive trace by adding new clauses to Fi such that t and potentially
other states that are not i-step reachable are excluded from Fi. The debugging instance is
similarly updated by adding the new clauses and further solutions are sought. In either case,
the formula is then passed back to the SAT solver to find more solutions.
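The single-cycle procedure described above can be illustrated with an explicit-state Python sketch. It rests on several simplifying assumptions that are not part of the thesis: the circuit has no primary inputs, each next-state function is treated as one suspect location, activating an error-select line simply frees the corresponding next-state bit, the initial approximation Fi is the crude set of all states, and a set of exactly reachable states stands in for the calls to PDR. All names are hypothetical.

```python
from itertools import combinations, product


def sc_unreachability(next_bits, init, targets, K, n=1):
    """Explicit-state analogue of single-cycle unreachability debugging.

    next_bits[k](state) gives the next value of state bit k; each such
    function is treated as one suspect location (an assumption of this sketch).
    """
    num_bits = len(next_bits)
    all_states = set(product((0, 1), repeat=num_bits))

    def step(state):
        return tuple(f(state) for f in next_bits)

    def can_reach_target(state, locations):
        # An active error-select line frees the next-state bit of its location.
        nxt = step(state)
        return any(all(nxt[k] == tgt[k] for k in range(num_bits) if k not in locations)
                   for tgt in targets)

    solutions = set()
    reachable = set(init)                  # states reachable within i steps
    for i in range(K + 1):
        approx = set(all_states)           # crude stand-in for F_i
        for locations in combinations(range(num_bits), n):
            if locations in solutions:     # plays the role of the blocking clause
                continue
            for t in sorted(approx):
                if not can_reach_target(t, locations):
                    continue
                if t in reachable:         # stands in for the i-step PDR check on t
                    solutions.add(locations)
                    break
                approx.discard(t)          # spurious: refine the approximation
        reachable |= {step(s) for s in reachable}
    return solutions


# Two-bit example with state (s0, s1); s1 can never become 1 from the initial state.
next_bits = [lambda s: 1 - s[0],           # location 0: s0 toggles every cycle
             lambda s: s[1] & s[0]]        # location 1: hypothetical faulty logic
print(sc_unreachability(next_bits, init={(0, 0)}, targets={(1, 1)}, K=1))
```

In this example, freeing location 0 only reaches the target from the state (1, 1), which is not reachable, so that candidate is discarded as spurious. Freeing location 1 reaches the target from the initial state, so location 1 is returned as the only solution of cardinality one.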
In practice, when a spurious solution is detected, the generalization procedure of PDR may
remove many other states that are not i-step reachable from Fi. This tends to result in a rapid
increase in the accuracy of the approximation and hence the debugging methodology. This is
shown in Figure 3.2 where state t has been found to be spurious in the current iteration of
the algorithm. As a result of PDR detecting that t is not i-step reachable and generalizing
that fact, the more accurate approximation shown in Figure 3.2(b) is derived. The improved
approximation reduces the chances of finding spurious solutions in the current iteration and
may improve the runtime of future iterations.
Figure 3.2: Set Fi (a) initially (b) after detecting a spurious result from state t
Notice that PDR is called to check for i-step reachability rather than checking for reach-
ability in any number of steps. A state t that is not i-step reachable does not always lead to
a spurious solution. It may be the case that t is reachable in a larger number of cycles than
i. It may also be the case that the solution coincidentally both makes t reachable and makes
𝒮 reachable from t. The extension presented in section 3.3.3 better handles the latter case.
The former case can be handled by increasing K, potentially at the cost of increased runtime.
The check for i-step reachability is preferred over the unbounded check for three reasons. The
first is related to precision. Iteration i seeks solutions that could make the target (i + 1)-step
reachable. If the algorithm used an unbounded call, it may admit solutions that make the tar-
get state reachable only in a larger number of clock cycles. The second reason is repeatability,
as different random conditions in the algorithm may yield different results if an unbounded
call is used. This is because an Fi-state that is not i-step reachable may lead to a solution in
one run of the algorithm. However, in another run that state may not be included in Fi as
PDR uses randomness in computing its approximations. Therefore, the solution would not be
found. The final reason is performance-related. Doing an unbounded call to PDR once for each
satisfying assignment found could be very inefficient. Intuitively, proving a state is unreachable
in a bounded number of cycles is a much easier problem than proving it in the unbounded case.
Pseudocode for the entire procedure is shown in Algorithm 3.1. The algorithm is called
SCUnreachability (single-cycle unreachability) as it finds solutions that reach the target
one cycle after a reachable state. It assumes the existence of a procedure ExtractState
that extracts the chosen state of Fi from a satisfying assignment of the debugging instance.
Line 3 executes PDR directed at proving S unreachable, implicitly computing Fi. Line 4 of
the algorithm constructs the initial debugging instance for the current iteration. The loop on
lines 5-13 repeatedly finds satisfying assignments. Line 7 checks if the found solution could be
spurious. If it is a real solution, line 8 records it, while line 9 blocks the solution from being
found again by adding a clause to Ten. If the solution contains active error-select lines e1, ..., en,
the blocking clause is (¬e1 ∨ ... ∨ ¬en). Otherwise, the solution is spurious and PDR updates
Fi to block state t. In this case, line 11 uses the newly-updated Fi to update the debugging
instance. The iteration continues until the debugging instance is unsatisfiable.
Algorithm 3.1 SCUnreachability(C, S, K, n)
1: solutions = ∅
2: for i in 0, 1, ..., K do
3:     PDR((S, I, T), S, i)
4:     U = Fi ∧ Ten ∧ S′ ∧ φn
5:     while (Solution = SAT(U)) ≠ UNSAT do
6:         t = ExtractState(Solution)
7:         if PDR((S, I, T), t, i) then
8:             solutions = solutions ∪ {Solution}
9:             Ten = Ten ∧ BlockingClause
10:        else
11:            U = Fi ∧ Ten ∧ S′ ∧ φn
12:        end if
13:    end while
14: end for
15: return solutions
Note that all calls to PDR in this algorithm are done incrementally. That is, only the
first call starts “clean.” All other calls reuse the inductive trace F0, F1, ... from previous calls,
resulting in much better performance. This can be done because each call to PDR uses the same
transition relation and set of initial states. Therefore, each Fi is still an over-approximation of
the set of i-step reachable states. As more incremental calls are made to PDR, all reusing and
refining the same inductive trace, each Fi may begin to model the set of i-step reachable states
more closely. As a result, each run of PDR is expected to perform better than earlier runs.
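The control flow of Algorithm 3.1 can be illustrated with a minimal Python sketch on a toy design. Everything concrete in it is an assumption made for illustration: the two-register circuit, the location names g1 and g2, and all helper names are hypothetical. Exhaustive enumeration stands in for the SAT solver, and an exact breadth-first reachability computation stands in for the bounded PDR query. The approximation Fi is taken to be the set of all non-target states, which is valid but deliberately imprecise.

```python
from itertools import product

LOCATIONS = ("g1", "g2")  # hypothetical suspect locations
INIT = (0, 0)             # all-zero initial state (s1, s2)


def step(state, x, active=None, w=0):
    """One cycle of a hypothetical buggy 2-bit shift register.

    `active` names the location replaced by the free value `w`, if any.
    """
    s1, s2 = state
    g1 = w if active == "g1" else (x & s1)  # bug: a correct design would use x
    g2 = w if active == "g2" else s1
    return (g1, g2)


def is_target(state):
    return state[1] == 1  # target: s2 = 1


def reachable_within(i):
    """Exact states of the original circuit reachable in at most i steps
    (stand-in for the bounded PDR check)."""
    seen, frontier = {INIT}, {INIT}
    for _ in range(i):
        frontier = {step(s, x) for s in frontier for x in (0, 1)} - seen
        seen |= frontier
    return seen


def sc_unreachability(K):
    solutions = set()
    for i in range(K + 1):
        # Iteration 0 uses I exactly; later iterations use a coarse F_i.
        F_i = {INIT} if i == 0 else {
            s for s in product((0, 1), repeat=2) if not is_target(s)}
        exact = reachable_within(i)
        while True:
            # Stand-in for SAT(U): pick a current state and an unblocked
            # location that allow a transition into the target.
            hit = next(((t, loc) for t in sorted(F_i) for loc in LOCATIONS
                        if loc not in solutions
                        for x in (0, 1) for w in (0, 1)
                        if is_target(step(t, x, loc, w))), None)
            if hit is None:
                break              # debugging instance is unsatisfiable
            t, loc = hit
            if t in exact:
                solutions.add(loc)  # real solution: record and block it
            else:
                F_i.discard(t)      # spurious: refine F_i to exclude t
    return solutions


result = sc_unreachability(2)
```

In this sketch the state (1, 0) is chosen from Fi, found not to be i-step reachable, and removed, after which no further satisfying choices remain. Only g2 is reported. A fix at g1, the location of the injected bug, is not reported because it needs two cycles to reach the target.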
The following theorem demonstrates that the algorithm finds the desired solution set in
each iteration.
Theorem 3.1. In iteration i, Algorithm 3.1 finds exactly the set of all solutions that make the
target state reachable one step from an i-step reachable state.
Proof. The debugging instance of Eq. 3.1 uses Fi as the current state set. Therefore, due to
the exhaustive nature of SAT-based debugging, it finds all solutions that make S reachable one
step from an Fi-state. Since Fi is an over-approximation of the set of i-step reachable states,
this includes every solution that makes S reachable one step after any i-step reachable state.
Additionally, the check for spurious solutions filters out any solutions where the chosen current
state is not i-step reachable. Therefore, the set of solutions found is precisely those that make
the target state reachable one step after an i-step reachable state.
The solution set of the overall algorithm is simply the union of the solution sets from each
individual iteration. Therefore, Theorem 3.1 implies that the algorithm finds every solution
that makes the target state (K + 1)-step reachable in one step from a K-step reachable state.
3.3.2 Sample Debugging Problem
This subsection steps through an example run of Algorithm 3.1 to better demonstrate its
operation. The shift register circuit shown in Figure 3.3 is used to demonstrate the algorithm.
Assume the initial state is the all-zero state represented by the cube (¬s1 ∧ ¬s2 ∧ ¬s3). Figure 3.3(a)
shows a correct implementation of the shift register. In Figure 3.3(b), the highlighted gate
has been changed, introducing unreachable states. The reader can observe that it is impossible
to reach a state in which s2 = 1 or s3 = 1. The latter will be the target state given to the
algorithm (i.e., S = (s3)). To determine an appropriate value for K, note that for a three-bit
shift register, every state should be reachable within three cycles. Therefore, K will be set to
three. As such, the algorithm is called using target state S = (s3), iteration limit K = 3, and
error cardinality n = 1.
The algorithm begins with iteration i = 0. Since i = 0, the exact set of initial states is used
to constrain the debugging problem rather than an approximation. Therefore, it constructs
debugging instance U = I ∧ Ten ∧ (s′3) ∧ φ1. This debugging instance seeks solutions that
allow for a transition from the initial state directly to a target state. Two such solutions are
the output of register s3 and its input. Clearly, changing s3 itself to a constant 1 solves the
problem. Alternatively, setting its input (i.e., the OR gate) to a function that can evaluate to
1 also solves the problem. Additionally, each of the inputs to the OR gate is a solution, because
setting either of them to 1 makes the target state reachable. Finally, the output of register s2
is a solution, as a 1 at that location can propagate to s3 if primary input x2 is 1. The reader
can verify that no other solutions to the debugging instance exist. Blocking clauses are added
to Ten to block these five solutions, concluding the first iteration.
Figure 3.3: (a) Correct implementation of shift register (b) Erroneous implementation in which states are unreachable
Subsequently, the i = 1 iteration begins. PDR is used to compute the initial approximation
F1 of the set of one-step reachable states. The debugging instance F1 ∧ Ten ∧ S′ ∧ φ1 is constructed
and solved. Which satisfying assignments are found depends on the formula F1 that PDR
computed. However, regardless of F1, the algorithm’s solution set is the same. In other words,
the set of real solutions is independent of F1, while the exact set of spurious solutions found is
not.
To illustrate the process of solving the debugging instance, assume F1 = ¬S = (¬s3). This
clearly over-approximates the set of one-step reachable states and is disjoint from the target
state, making it a valid over-approximation for PDR to compute. The debugging instance
will find a solution that allows for a transition from an F1-state to an S-state. In fact, such
transitions are possible in the original circuit. For instance, any state in which s2 = 1 can
transition to a target state. As a result, any design location could be found as a solution to
the debugging instance. However, since states in which s2 = 1 are not one-step reachable, this
satisfying assignment represents a spurious solution. Therefore it will be discarded and F1 will
be updated to block the chosen state.
Assume it is updated to F1 = (¬s2) ∧ (¬s3), as this blocks all states in which s2 = 1. Now
the debugging instance seeks a solution that allows for a transition from an F1-state to an S-state.
All such solutions were found in iteration 0, so no further solutions exist. The iteration
therefore concludes. Iterations 2 and 3 still remain. However, these iterations would proceed
exactly the same as iteration 1. This occurs because there are no two-step reachable states that
are not also one-step reachable. Therefore, no new solutions can be found in these iterations.
As a result, the algorithm terminates with the five solutions found in iteration 0, as shown in
Figure 3.4, where the crosses indicate solutions.
Figure 3.4: Erroneous shift register circuit with solutions highlighted
Notice that the actual error source is not in the solution set. This is because any fix made
at the actual error source would not be able to make the target state reachable one cycle after
an already-reachable state. It would first require a 1 to propagate into s2, and then propagate
to s3 in the following cycle. In other words, it reaches the target state two cycles following
an already-reachable state. The technique presented in the next subsection better handles this
case. Note that if S = (s2) was the target state, the actual error source would be found as a
solution.
3.3.3 Multi-Cycle Unreachability
This section presents an extension of the methodology from the previous section. To find
solutions that reach S in more than one step from an already-reachable state, the approach
models a sequence of N state transitions that originates from a reachable state and ultimately
transitions to S. The iterative methodology remains similar; the primary difference is the
debugging instance, which is expressed as follows:
$$F_i \land T_{en}^{N} \land S' \land \varphi_n \tag{3.2}$$
where $T_{en}^{N}$ represents N copies of the transition relation configured as an ILA. The user-specified
parameter N is the number of cycles the algorithm is allowed to “look forward” for solutions.
Rather than finding solutions that make S (i + 1)-step reachable, the debugging instance of
Eq. 3.2 can return solutions that make the target state (i+N)-step reachable. As was the case
for Eq. 3.1, a solution to this equation includes a current state in Fi and n active error-select
lines.
Figure 3.5: Representation of the multi-cycle debugging instance
As with the approach of the previous section, due to the use of approximation some of the
solutions to Eq. 3.2 may not be solutions to the unreachability problem. Intuitively, a solution
for which the current state is i-step reachable is non-spurious by the same argument used in
the previous section. Therefore, the extended algorithm uses the same mechanism as used in
Algorithm 3.1 to detect and discard spurious solutions.
Pseudocode for the procedure is shown in Algorithm 3.2. It is referred to as MCUnreachability
(multi-cycle unreachability), in contrast with Algorithm 3.1. Line 2 commences the
iteration. Lines 3 and 4 perform some initialization steps to compute the values N′ and j that
take the place of N and i in Eq. 3.2. This is intended to handle the special case of iterations 0 ≤ i < N. In
these iterations, the debugging instance uses i+ 1 copies of the transition relation and F0 = I
as the current state set, allowing it to find solutions that make the target state (i + 1)-step
reachable. These iterations are analogous to iteration 0 in Algorithm 3.1. Subsequently, line 5
executes PDR directed at proving S is unreachable which computes Fj . The remaining lines
find solutions and check if they are spurious in a manner similar to Algorithm 3.1. As with
Algorithm 3.1, all calls to PDR are incremental and reuse the inductive trace from the previous
calls.
Algorithm 3.2 MCUnreachability(C, S, K, N, n)
1: solutions = ∅
2: for i in 0, 1, ..., K do
3:     N′ = min(N, i + 1)
4:     j = i − N′ + 1
5:     PDR((S, I, T), S, j)
6:     M = Fj ∧ (Ten)^N′ ∧ S′ ∧ φn
7:     while (Solution = SAT(M)) ≠ UNSAT do
8:         t = ExtractState(Solution)
9:         if PDR((S, I, T), t, j) then
10:            solutions = solutions ∪ {Solution}
11:            Ten = Ten ∧ BlockingClause
12:        else
13:            M = Fj ∧ (Ten)^N′ ∧ S′ ∧ φn
14:        end if
15:    end while
16: end for
17: return solutions
The following theorem demonstrates that Algorithm 3.2 returns the desired solution set.
Theorem 3.2. The solution set of Eq. 3.2 after removing potentially spurious solutions is
exactly the set of all solutions that make the target state reachable N cycles following an i-step
reachable state.
By Theorem 3.2, the algorithm finds all solutions that make the target state reachable N
cycles following any (K−N +1)-step reachable state. This includes all solutions that make the
target state reachable 1 cycle after a K-step reachable state. The solution set of Algorithm 3.2
is therefore a superset of the solution set of Algorithm 3.1. Indeed, Algorithm 3.1 is merely a
special case of Algorithm 3.2, as with N = 1, the two algorithms are identical.
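The index arithmetic on lines 3 and 4 of Algorithm 3.2 can be tabulated with a short Python sketch. The function name mc_schedule is hypothetical and the sketch only reproduces the two assignments from the pseudocode. It lists, for each iteration i, the approximation index j and the number of transition-relation copies N′.

```python
def mc_schedule(K, N):
    """Return (i, j, n_copies) for every iteration of Algorithm 3.2."""
    schedule = []
    for i in range(K + 1):
        n_copies = min(N, i + 1)   # line 3: N' = min(N, i + 1)
        j = i - n_copies + 1       # line 4: j = i - N' + 1
        schedule.append((i, j, n_copies))
    return schedule


example = mc_schedule(K=4, N=3)
```

In every row j plus the number of copies equals i + 1, so iteration i always targets (i + 1)-step reachability. The first N rows use j = 0, that is, the exact initial state set, and with N = 1 the schedule reduces to that of Algorithm 3.1.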
In particular, the solution set may include solutions that would be discarded by Algo-
rithm 3.1. A solution to Eq. 3.1 with a current state that is not i-step reachable may not be
spurious if a fix at the same location can make the current state reachable and make S reachable
from that state. By modeling multiple state transitions, Algorithm 3.2 can handle this case. It
additionally handles cases where reaching S requires first reaching a sequence of up to (N − 1)
other unreachable states.
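The effect of modeling several transitions can be illustrated with a minimal Python sketch. The two-register circuit and the location names g1 and g2 are hypothetical, and exhaustive enumeration over inputs and free values stands in for the SAT solver. For simplicity the sketch covers only the iterations in which the current state set is the exact initial state, so no approximation or spurious-solution check is involved.

```python
from itertools import product

INIT = (0, 0)  # all-zero initial state (s1, s2)


def step(state, x, active, w):
    """One cycle of a hypothetical buggy 2-bit shift register with the
    location `active` replaced by the free value `w`."""
    s1, s2 = state
    g1 = w if active == "g1" else (x & s1)  # bug: a correct design would use x
    g2 = w if active == "g2" else s1
    return (g1, g2)


def fixes_reaching_target(n_cycles):
    """Locations whose replacement lets s2 = 1 hold exactly n_cycles
    after the initial state (n_cycles copies of the transition relation)."""
    found = set()
    for loc in ("g1", "g2"):
        for xs, ws in product(product((0, 1), repeat=n_cycles), repeat=2):
            state = INIT
            for x, w in zip(xs, ws):
                state = step(state, x, loc, w)
            if state[1] == 1:
                found.add(loc)
    return found


one_cycle = fixes_reaching_target(1)
two_cycles = fixes_reaching_target(2)
```

With one copy only g2 is found. With two copies g1 is found as well, since a 1 injected at g1 first reaches s1 and then propagates to s2 in the following cycle.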
3.4 Performance-Driven Formulation
This section presents a performance-driven enhancement for the methodologies in the previ-
ous section. It is presented as a modification of Algorithm 3.2 but can also be applied to
Algorithm 3.1, which is merely a special case of Algorithm 3.2.
Given iteration limit K, Algorithm 3.2 must solve K + 1 debugging instances. Each itera-
tion finds a new over-approximation essentially enlarging the set of current states to consider.
This implies that each iteration has the potential to find solutions that were not possible in
earlier iterations. The proposed modification essentially skips the first K iterations and starts
by executing the final iteration directly. However, it must also account for rare cases where
solutions are only possible in small numbers of cycles. It therefore solves N different debugging
instances, using 1, 2, ..., N copies of the enhanced transition relation. In each case, FK is the
over-approximation used.
Note that in general, for Algorithm 3.2 to be of interest, K must be greater than N .
Otherwise, no approximations are ever used and the algorithm essentially degenerates to a
form of traditional SAT-based debugging. In most applications, it is further expected that K
is much greater than N . Solutions that are only found when N is close to K require reaching
a long sequence of unreachable states in order to reach the target, which is expected to be a
rare case. Indeed, this is the intuition that explains the effectiveness of Algorithm 3.1, as it is
merely the special case of Algorithm 3.2 in which N = 1.
While this approach returns the same solution set as the original algorithm, it sacrifices
resolution. In particular, a solution that Algorithm 3.2 finds in iteration i may be used to make
a target state (i+ 1)-step reachable. The modified approach would still find every solution, but
is unable to indicate the number of clock cycles in which the corrected design could be able to
reach the target state. The benefit of course comes from solving N problem instances when
compared to the K + 1 incremental instances of the previous approach.
Pseudocode for the performance-driven formulation of the algorithm is presented in Al-
gorithm 3.3, which is called MMCUnreachability (monolithic multi-cycle unreachability).
Similar to the earlier formulations, line 1 initializes the set of solutions while lines 5 through 13
find and verify solutions from the debugging instance. The key difference is that PDR is
called to compute the inductive trace F1, ..., FK only once, on line 2, and the
only approximation used is FK on lines 4 and 11. Note that PDR may be called additional
times to check for spurious solutions. As in the previous formulations, all calls to PDR are
incremental.
Algorithm 3.3 MMCUnreachability(C, S, K, N, n)
1: solutions = ∅
2: PDR((S, I, T), S, K)
3: for i in 1, 2, ..., N do
4:     M = FK ∧ (Ten)^i ∧ S′ ∧ φn
5:     while (Solution = SAT(M)) ≠ UNSAT do
6:         t = ExtractState(Solution)
7:         if PDR((S, I, T), t, K) then
8:             solutions = solutions ∪ {Solution}
9:             Ten = Ten ∧ BlockingClause
10:        else
11:            M = FK ∧ (Ten)^i ∧ S′ ∧ φn
12:        end if
13:    end while
14: end for
15: return solutions
3.5 Summary
This chapter presents an approach based on PDR and SAT-based debugging to debug unreach-
able states in the absence of an error trace. Section 3.3.1 introduces the technique and an initial
formulation thereof to find solutions that reach a target state one transition after an already-
reachable state. Section 3.3.3 extends the algorithm to handle cases where up to N transitions
are required. Finally, section 3.4 presents a performance-driven enhancement that accomplishes
the same goal while solving fewer debugging instances, but sacrificing some resolution.
Chapter 4
Traceless Debugging Without Unrolling
This chapter presents an alternative approach that solves the same debugging problem as the
approach of Chapter 3. Unlike that approach, the one presented in this chapter is complete
by nature. Given an unreachable target state and error cardinality n, the methodology returns
every set of n locations where a change can be made to make the target state reachable. The
problem is formulated as an unbounded model checking problem, allowing the algorithm to find
solutions that may reach the target state in any number of clock cycles. The algorithm addi-
tionally returns an inductive invariant proving that no further solutions exist. These benefits
come at the cost of increased runtime compared to the approach of Chapter 3.
4.1 Introduction
The previous chapter presents a set of algorithms to debug unreachable states using a com-
bination of approximation and unrolling. In those algorithms, the problem is expressed as a
CNF formula in which each satisfying assignment contains a state t of the circuit and a set of
n active error-select lines. The active error-select lines correspond to a set of locations that
can be modified to correct the design if t is reachable. If t is unreachable, then the solution
may be spurious and is therefore discarded. Due to the inherent nature of this approach, it is
not guaranteed to explore the complete solution space of the problem. The user is required to
select parameters that create a trade-off between runtime and the number of solutions found.
As such it is not complete, in that it does not necessarily return every solution to the problem.
This chapter presents a methodology that overcomes these limitations [6]. Rather than
expressing the problem as a CNF formula, it is expressed as an unbounded model checking
problem. Additionally, the algorithms presented do not require the user to set parameters
that balance runtime with the number of solutions found. Instead, they find every solution
to the problem, which comes with the cost of increased runtime when compared to the earlier
approach. Finally, these approaches solve the problem using only a single copy of the transition
relation, avoiding the need to construct a partial debugging ILA.
The algorithm works as follows. First, the transition relation of the circuit is enhanced by
inserting error-select registers. Each error-select register is associated with a suspect location
such that if the register is active, the suspect location is effectively disconnected from its fanout
and replaced by an arbitrary Boolean function. Non-active error-select registers have no effect
on the functionality of the circuit. Under this enhanced transition relation, a target state is
reachable if and only if there is a set of locations that can be changed to make it reachable
in the original design (i.e., a solution). In order to get meaningful results, the number of
simultaneously-active error-select registers is limited to exactly n using a cardinality constraint.
In order to find the solutions, the algorithm calls PDR to check if the target state is reachable
under the enhanced transition relation. If so, PDR returns a counter-example trace in which
exactly n error-select registers are active. This indicates that the n corresponding suspect
locations form a solution. The solution is subsequently blocked and PDR is executed to find
further solutions. This process is repeated until the target state is unreachable, indicating that
no further solutions exist. As an added benefit, PDR returns an inductive invariant that proves
this fact.
This chapter is organized as follows. Section 4.2 presents the initial formulation of the
algorithm and theorems demonstrating the soundness and completeness of this formulation.
Section 4.3 demonstrates that the potentially numerous calls to PDR in the initial formulation
can be executed incrementally, which can decrease runtime substantially. Section 4.4
presents an alternative algorithm that uses the structure of the circuit to prune
large portions of non-solution space when the error cardinality is one, which may further
decrease runtime.
4.2 Initial Formulation
This section presents the initial formulation of the debugging algorithm. The algorithm takes
as input an erroneous circuit C, error cardinality n, a set of suspect locations L = {l1, l2, ..., l|L|}
(the suspect set), and an unreachable set of target states S. The transition relation of the circuit
is denoted by T while its initial state set is I. The algorithm works by constructing an enhanced
model of the circuit and solving a series of unbounded model checking problems on this model,
the results of which indicate solutions to the debugging problem. It determines precisely which
n-combinations of locations in L are solutions. As such, it returns a set Lsol ⊆ L^n, where each
element of Lsol is a solution. In order to find every solution in the circuit, L must include every
line in the circuit. For cases in which the error cardinality is one, the algorithm presented in
section 4.4 finds every solution in the circuit using smaller suspect sets.
This section is organized as follows. Section 4.2.1 explains how the enhanced model is
constructed and the intuition behind its behavior. Section 4.2.2 explains how the unbounded
model checking problems are constructed and solutions are found. Finally, section 4.2.3 presents
theorems demonstrating that the algorithm is both sound and complete.
4.2.1 Constructing the Enhanced Model
The algorithm involves solving a series of unbounded model checking problems using an en-
hanced FSM model of the circuit. This subsection explains the construction of the enhanced
model, along with the rationale behind its functionality. The enhanced model behaves like the
original circuit with certain suspect locations replaced by arbitrary Boolean functions. Which
suspect locations are replaced depends on assignments to the error-select registers, which are
new hardware added in the enhanced model of the circuit and depicted in Figure 4.1. Their
exact purpose and functionality is described later. Each model checking problem is crafted so
that the result indicates a solution from L^n or proves that no solutions exist in L^n \ Lsol, at
which point the algorithm terminates.
Figure 4.1: Error-select register and multiplexer at suspect location li
Figure 4.2: (a) Original circuit (b) Circuit used to construct Ten (error-select registers omitted)
Towards this end, the algorithm constructs an enhanced model of the circuit M = (S ∪
E, Ien, Ten). The enhanced model contains new hardware in the form of error-select registers
E = {e1, ..., e|L|}. It additionally has an enhanced initial state set Ien and an enhanced transition
relation Ten. The exact manner in which these are constructed is explained later in this section.
A trace of the circuit tC,0, ..., tC,n is said to be equivalent to a trace of the model tM,0, ..., tM,n
if and only if the original registers in the set S have the same value assignments in states tM,i
and tC,i for all 0 ≤ i ≤ n.
The enhanced transition relation is constructed from T by adding hardware to facilitate
debugging. For each suspect location li ∈ L, an associated free variable wi and error-select
register ei are added. The error-select register is made immutable (i.e., its value cannot change)
by feeding its output back to its input so that e′i = ei. As explained later, this allows an
association between the reachability of particular states under M with an n-tuple of suspect
locations being a solution. Subsequently, new hardware is added such that li is effectively
replaced by an arbitrary Boolean function when ei = 1. When ei = 0 the behavior of the
circuit is unaffected. This functionality is implemented by a multiplexer where the 0-input is
li, the 1-input is wi, and the select line is ei. The multiplexer output (denoted zi) is connected
to the original fanout of li. This is similar to the error-multiplexer used in [38] and serves a
similar purpose. This construction is shown in Figure 4.1 while its CNF representation is shown
in Eq. 4.1 below. Eq. 4.2 shows the CNF representation of the added hardware that enforces
e′i = ei.
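$$\mathit{mux} = (e_i \lor \lnot l_i \lor z_i)(e_i \lor l_i \lor \lnot z_i)(\lnot e_i \lor \lnot w_i \lor z_i)(\lnot e_i \lor w_i \lor \lnot z_i) \tag{4.1}$$
$$\mathit{reg} = (\lnot e_i \lor e'_i)(e_i \lor \lnot e'_i) \tag{4.2}$$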
The enhanced transition relation is constructed from the circuit with the added hardware.
The multiplexers are each represented by four clauses, while the additional lines setting e′i = ei
are each represented by two clauses. The CNF encoding of Ten therefore has only O(|L|) more
clauses than that of T . To clarify the behavior of Ten, an example follows. It will be extended
throughout the chapter to explain various aspects of the enhanced model used to debug the
circuit.
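The clause counts above can be checked by brute force. The following Python sketch is a sanity check under stated assumptions: it writes the multiplexer relation zi = (wi if ei else li) as four clauses and the register relation e′i = ei as two clauses, then compares each CNF against its intended function over all assignments. The function names are hypothetical.

```python
from itertools import product


def mux_cnf(e, l, w, z):
    """Four clauses: z follows l when e = 0 and follows w when e = 1."""
    return bool((e or not l or z) and (e or l or not z)
                and (not e or not w or z) and (not e or w or not z))


def reg_cnf(e, e_next):
    """Two clauses forcing the error-select register to hold its value."""
    return bool((not e or e_next) and (e or not e_next))


# Compare each CNF with its intended function on every assignment.
mux_ok = all(mux_cnf(e, l, w, z) == (z == (w if e else l))
             for e, l, w, z in product((0, 1), repeat=4))
reg_ok = all(reg_cnf(e, e_next) == (e == e_next)
             for e, e_next in product((0, 1), repeat=2))
```

Both checks succeed, so six clauses per suspect location suffice, in line with the O(|L|) growth noted above.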
Example 4.1. Consider the circuit of Figure 4.2(a). It has one state element s1, two primary
inputs x1 and x2, and two suspect locations labeled l1 and l2. Assume that the initial
state is s1 = 0 (i.e., I = (¬s1)). It is impossible for the circuit to reach a state where s1 = 1,
which is easily verified by noting that if s1 = 0 the AND-gate can never output a 1. This
unreachability can be diagnosed by using the target state condition S = (s1). In doing so,
the enhanced transition relation is constructed from the circuit shown in Figure 4.2(b). When
e1 = e2 = 0, this circuit behaves the same as the original circuit. When e1 = 1, l1 is replaced
by the free variable w1 which can assume any value during model checking. Similar behavior
applies to e2 and l2. It can be seen that when any ei = 1, this circuit behaves like the original
circuit with li replaced by an unknown function.
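The behavior described in Example 4.1 can be mimicked with a small Python sketch. The figure itself is not reproduced here, so the gate driving l1 is an assumption (an OR of x1 and x2). The example states only that l2 is driven by an AND-gate with s1 as one input. The function name enhanced_step is hypothetical.

```python
from itertools import product


def enhanced_step(s1, x1, x2, e1, e2, w1, w2):
    """Next value of s1 under the enhanced transition relation.

    e1 and e2 are the error-select values (held constant across cycles);
    w1 and w2 are the free variables that replace l1 and l2 when selected.
    """
    l1 = x1 | x2               # assumed gate: not specified in the text
    z1 = w1 if e1 else l1      # multiplexer at suspect location l1
    l2 = z1 & s1               # AND-gate with s1 as an input
    z2 = w2 if e2 else l2      # multiplexer at suspect location l2
    return z2


# With both error-selects inactive the model matches the original circuit.
matches_original = all(
    enhanced_step(s1, x1, x2, 0, 0, w1, w2) == ((x1 | x2) & s1)
    for s1, x1, x2, w1, w2 in product((0, 1), repeat=5))

# With e2 active, the free variable w2 can drive s1 to 1 from the initial state.
l2_can_fix = enhanced_step(0, 0, 0, 0, 1, 0, 1) == 1

# With e1 active, s1 stays 0 from the initial state for every input and w1.
l1_cannot_fix = all(
    enhanced_step(0, x1, x2, 1, 0, w1, 0) == 0
    for x1, x2, w1 in product((0, 1), repeat=3))
```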
To debug unreachable states, the reachability of particular states under the enhanced model
must be associated with the fact that specific locations are solutions. Consider a trace of the
enhanced model. All states in the trace have the same value assignments to the error-select
registers. This occurs because the error-select registers are immutable, which means that after
they assume a value due to the chosen initial state, they remain constant. Assume e1, ..., em are
the active error-select registers in the trace. The enhanced model therefore behaves the same
as the original circuit with locations l1, ..., lm replaced by unknown Boolean functions. It can
be concluded that the original circuit would have an equivalent trace if those locations were
simultaneously replaced by different functions.
For the trace to indicate a solution, it must satisfy additional properties. Specifically, it
must start from an I-state, end on a target state, and have exactly n active error-select registers
e1, ..., en where n is the error cardinality. From the discussion in the previous paragraph, the
original circuit would have an equivalent trace if locations l1, ..., ln were replaced by different
functions. Since the trace starts from an I-state and ends at a target state, replacing those
locations makes a target state reachable. This implies that l1, ..., ln constitutes a solution.
Solutions can therefore be found by finding traces that satisfy these three properties.
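The three properties can be stated as a small predicate. The following Python sketch is illustrative only: states are tuples holding the original registers followed by the error-select registers, and the function name and argument layout are hypothetical. It does not check that consecutive states are related by the enhanced transition relation, which a model checker guarantees for the traces it returns.

```python
def indicates_solution(trace, n, num_regs, is_initial, is_target):
    """Check that a trace starts in an I-state, ends in a target state,
    and keeps the same n error-select registers active throughout."""
    selects = [state[num_regs:] for state in trace]
    return (is_initial(trace[0][:num_regs])
            and is_target(trace[-1][:num_regs])
            and sum(selects[0]) == n
            and all(s == selects[0] for s in selects))


# States are (s1, e1, e2) for a one-register design with two suspects.
is_initial = lambda regs: regs == (0,)
is_target = lambda regs: regs == (1,)

good = indicates_solution([(0, 0, 1), (1, 0, 1)], 1, 1, is_initial, is_target)
no_fix = indicates_solution([(0, 0, 0), (1, 0, 0)], 1, 1, is_initial, is_target)
```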
This motivates the construction of the enhanced set of initial states Ien. The original
registers of the circuit are constrained using I, ensuring that the initial states of the enhanced
model correspond to initial states of the original circuit. Since exactly n error-select registers
must be active, a cardinality constraint φn is applied to the error-select registers. The enhanced
initial state condition is therefore Ien = I∧φn. This completes the construction of the enhanced
FSM M = (S ∪E, Ien, Ten). The following example demonstrates and clarifies the purpose and
behavior of Ien.
Example 4.2. Consider again the example from Figure 4.2. The enhanced initial state condition
Ien is the conjunction of I = (¬s1) and the cardinality constraint φn. Assume for this
example that n = 1. Therefore, Ien = (¬s1) ∧ (e1 ∨ e2) ∧ (¬e1 ∨ ¬e2). Enumerating all of the satisfying
assignments to that formula, the set of states in Ien is {(¬s1 ∧ e1 ∧ ¬e2), (¬s1 ∧ ¬e1 ∧ e2)}. Notice that
these states all share two key properties. The first is that all have s1 = 0, corresponding to
I-states. Additionally, every state of Ien has exactly one active error-select register, satisfying
the cardinality constraint. These are exactly the initial states that may appear at the beginning
of traces that indicate solutions.
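The enumeration in Example 4.2 can be reproduced with a few lines of Python. The sketch assumes that Ien for n = 1 consists of the unit clause fixing s1 = 0, an at-least-one clause, and an at-most-one clause over e1 and e2. The signed-integer literal encoding (1 for s1, 2 for e1, 3 for e2) and the helper names are choices made for illustration.

```python
from itertools import product

# Clauses as lists of signed integers: a negative literal is a negated variable.
I_EN = [[-1], [2, 3], [-2, -3]]


def satisfies(assignment, clauses):
    """True if every clause contains at least one satisfied literal."""
    return all(any((lit > 0) == assignment[abs(lit)] for lit in clause)
               for clause in clauses)


def models(clauses, num_vars=3):
    """Enumerate all satisfying assignments as tuples (s1, e1, e2)."""
    found = []
    for bits in product((False, True), repeat=num_vars):
        assignment = dict(enumerate(bits, start=1))
        if satisfies(assignment, clauses):
            found.append(tuple(int(b) for b in bits))
    return found


initial_states = models(I_EN)
```

Exactly two states satisfy the formula, each with s1 = 0 and a single active error-select register.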
4.2.2 Searching for Solutions
As discussed earlier, specific traces of the enhanced model correspond to solutions. This sub-
section explains how PDR is used to find traces that satisfy the relevant properties. In the
enhanced model M = (S ∪ E, Ien, Ten), the initial state condition Ien ensures that the traces
PDR finds begin on an I-state with exactly n active error-select registers, as required. The
enhanced transition relation ensures that the same n error-select registers are active in every
state in the trace. All that remains is to force the trace to end on a target state. To do so,
PDR is executed using S as its unsafe state set. If any target state is reachable, PDR will
return a counter-example trace that meets the requirements previously described. As such, if
e1, ..., en are the active error-select registers in the counter-example then l1, ..., ln is a solution
of cardinality n. Continuing the illustration of the algorithm from example 4.2, the following
example demonstrates the procedure of finding a solution.
Example 4.3. Recall from example 4.2 that the target state condition is S = (s1) and the
initial state condition is I = (¬s1). PDR(M, S, ∞) returns the following two-step counter-example
trace: 〈t0, t1〉 = 〈(¬s1 ∧ ¬e1 ∧ e2), (s1 ∧ ¬e1 ∧ e2)〉. Notice that t0 corresponds to an I-state,
t1 is a target state, and e2 is the active error-select register. In states t0 and t1, the model
behaves identically to the circuit with l2 replaced by an unknown function. Since t0 is an initial
state and t1 is a target state, replacing l2 with a different function makes a target state reachable
in the original circuit. This indicates that location l2 is a solution. Indeed, the reader can verify
that replacing the AND-gate that drives l2 with an OR-gate makes the target state reachable.
Other corrections to the problem are also possible.
After finding a solution, it is blocked, allowing the algorithm to find any remaining solutions.
For a solution l1, ..., ln of cardinality n with corresponding error-select registers e1, ..., en,
this is accomplished by conjoining the clause (¬e1 ∨ ... ∨ ¬en) to Ien. This prevents PDR from
finding any further traces that indicate the same solution.
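The effect of a blocking clause on the initial states can be checked by brute-force enumeration. The following is a minimal sketch rather than part of the thesis tooling: it assumes a hypothetical circuit with one register s1, two suspect locations, an initial condition I = ¬s1, and an exactly-one cardinality constraint over e1 and e2. The function names are invented for illustration.

```python
from itertools import product

def exactly_n(bits, n):
    """Cardinality constraint phi_n: exactly n error-select registers are active."""
    return sum(bits) == n

def initial_states(blocking_clauses, n=1):
    """Enumerate assignments (s1, e1, e2) satisfying I_en and all blocking clauses.

    Each blocking clause is a tuple of 0-based error-select indices and encodes
    (not e_i or not e_j or ...), i.e. it forbids all of them being active at once.
    """
    states = []
    for s1, e1, e2 in product((0, 1), repeat=3):
        e = (e1, e2)
        if s1 != 0 or not exactly_n(e, n):   # assumed I = (not s1), then phi_n
            continue
        if all(any(not e[i] for i in clause) for clause in blocking_clauses):
            states.append((s1, e1, e2))
    return states

before = initial_states([])        # nothing blocked yet
after = initial_states([(1,)])     # block the solution at l2 by adding (not e2)
```

Under these assumptions, `before` holds the two initial states with exactly one active error-select register, and `after` holds only the state in which e1 is active.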
Eventually, the algorithm reaches a point at which the target state is unreachable under
the enhanced model. This occurs under one of two conditions. The more common case is
when all solutions have been found and all remaining states of Ien cannot reach a target state.
Alternatively, it occurs when all possible solutions are blocked, and therefore no states satisfy
Ien. In both cases, PDR will terminate indicating the target state is unreachable. The following
example demonstrates blocking a solution and terminating when no further solutions exist.
Example 4.4. Continuing with the example of Figure 4.2, after solution l2 is found, the
enhanced initial state condition becomes Ien = (¬s1) ∧ (e1 ∨ e2) ∧ (¬e1 ∨ ¬e2) ∧ (¬e2), leaving
(¬s1 ∧ e1 ∧ ¬e2) as the only remaining initial state. It is easily verified that this state cannot
reach any target states. This implies that location l1 is not a solution, which is indeed true. To
reach a state where s1 = 1, the output of the AND-gate must be 1. In the initial state s1 = 0 and
s1 is an input to the AND-gate, so it will always output 0 regardless of the value at l1. Therefore,
there is no way to modify the circuit at l1 to rectify the unreachability of the target state.
Algorithm 4.1 Unreachability(C, S, L, n)
 1: Lsol = ∅
 2: S = state element set of C
 3: I = initial state condition of C
 4: Ten, E = ConstructModel(C, L)
 5: Ien = I ∧ φn
 6: M = (S ∪ E, Ien, Ten)
 7: while PDR(M, S, ∞) == Reachable do
 8:     e1, ..., en = active error-select registers in counter-example
 9:     B = (¬e1 ∨ ... ∨ ¬en)
10:     Lsol = Lsol ∪ {l1, ..., ln}
11:     Mblk = (S ∪ E, Ien ∧ B, Ten)
12:     M = Mblk
13: end while
14: invariant = inductive invariant extracted from PDR
15: return (Lsol, invariant)
Pseudocode for the procedure is shown in Algorithm 4.1. In that description, algorithm
ConstructModel receives input C and L and returns the enhanced transition relation and
error-select register set. Lines 4 through 6 construct the enhanced FSM model. Lines 7 to 13
contain the main loop that finds solutions. If a solution exists it is extracted on line 8 and
added to Lsol on line 10. Subsequently, line 11 constructs a new model Mblk in which the
solution is blocked. The distinction between M and Mblk is included to simplify the discussion
of the performance optimization presented in section 4.3. As the number of suspect locations
is finite, the loop must terminate eventually. At this point, PDR indicates S is unreachable
and the inductive invariant is extracted on line 14. Finally Lsol and the invariant proving the
completeness of the solution set are returned on line 15. In contrast with the algorithms of the
previous chapter, Algorithm 4.1 is referred to merely as Unreachability as it is complete and
therefore determines the exact set of solutions definitively.
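The find-and-block loop can be illustrated on a toy example. The sketch below is not the thesis implementation: it replaces PDR with an explicit-state search, tests each n-combination of suspects separately instead of solving one enhanced model, and does not produce an inductive invariant. The circuit (l1 = x, l2 = AND(l1, s1), next s1 = l2, with s1 initially 0 and target s1 = 1) is a hypothetical stand-in chosen to match the conclusions of Examples 4.3 and 4.4.

```python
from itertools import combinations, product

def next_s1(s1, x, active, free):
    """One clock cycle of the hypothetical circuit.

    Locations listed in `active` are driven by free values instead of their
    gates, mimicking an active error-select register.
    """
    l1 = free["l1"] if "l1" in active else x
    l2 = free["l2"] if "l2" in active else (l1 and s1)
    return l2

def target_reachable(active, init=False, target=True):
    """Explicit-state search standing in for the call PDR(M, S, inf)."""
    seen, frontier = {init}, [init]
    while frontier:
        s1 = frontier.pop()
        if s1 == target:
            return True
        for x in (False, True):
            for values in product((False, True), repeat=len(active)):
                nxt = next_s1(s1, x, active, dict(zip(active, values)))
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return False

def unreachability(suspects, n):
    """Collect every n-combination of suspects that makes the target reachable."""
    blocked, solutions = set(), []
    while True:
        candidates = (c for c in combinations(suspects, n) if c not in blocked)
        hit = next((c for c in candidates if target_reachable(c)), None)
        if hit is None:          # target unreachable for all remaining combinations
            return solutions
        solutions.append(hit)    # corresponds to line 10 of Algorithm 4.1
        blocked.add(hit)         # corresponds to lines 9, 11 and 12

print(unreachability(["l1", "l2"], 1))   # prints [('l2',)]
```

In this toy setting only l2 is reported for n = 1, since a free value at l1 is still masked by the AND-gate while s1 = 0.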
4.2.3 Soundness and Completeness
This section presents two theorems demonstrating that Algorithm 4.1 is both sound and complete
with respect to its input set. In this context, soundness means that every n-tuple of
suspect locations in Lsol is a solution. Completeness requires that every solution in Ln is included
in Lsol. Theorem 4.1 below uses the nature of traces of M to prove that Unreachability is
sound.
Theorem 4.1. Upon termination every element of Lsol is a solution.
Proof. Line 7 finds a counter-example trace t0, ..., tm of M . As it is a counter-example trace,
it starts at an initial state and ends at a target state, implying t0 ∈ Ien and tm ∈ S. As
Ien = I ∧ φn, the cardinality constraint φn ensures that exactly n error-select registers are
assigned to 1 in state t0. Let e1, ..., en denote the active error-select registers.
Since the error-select registers are immutable (i.e., their value assignments never change),
each state in the trace also has e1, ..., en active and all other error-select registers inactive.
Further, the fact that t0 ∈ Ien ensures that t0 corresponds to an initial state of C. Therefore,
an equivalent trace also exists for C if l1, ..., ln are replaced by unknown Boolean functions. As
tm is a target state, S can be made reachable in C by replacing those locations, indicating that
they are a solution. All elements of Lsol are found in this manner, implying that every element
of Lsol is a solution.
Since Lsol is the solution set of Algorithm 4.1, Theorem 4.1 proves that the algorithm is
sound. Theorem 4.2 below shows that the approach is also complete. That is, it returns all
solutions from Ln.
Theorem 4.2. Upon termination Lsol contains every solution from Ln.
Proof. Lines 7 to 13 are executed to find solutions until all target states are unreachable.
First, consider the case when Lsol includes all $\binom{|L|}{n}$ possible solutions at the termination of
Algorithm 4.1. Clearly, this includes every n-combination in Ln, and therefore every solution.
Now assume the opposite case, Algorithm 4.1 terminates when all target states are unreach-
able. Let Lrem denote the set of n-combinations of L that are not elements of Lsol. It suffices
to show that the unreachability of all target states implies that no solutions exist in Lrem.
Consider the final call to PDR that returns Unreachable, implying that all target states are
unreachable. This means that there are no traces of M that end in a target state.
Consider a fixed initial state IC of C. There are |Lrem| corresponding initial states of M ,
each with a different set of n active error-select registers. Since all target states are unreachable,
none of these states can reach a target state under M . This implies that for every element
(l1, ..., ln) ∈ Lrem, it is impossible to replace l1, ..., ln with different Boolean functions such that
S is reachable from IC in C. Since IC is an arbitrary initial state of C, this holds for every
initial state of C.
Therefore, none of the elements of Lrem are solutions which implies that when Algorithm 4.1
terminates Lsol contains every solution from Ln.
As the algorithm only examines locations from the suspect set L, it cannot find solutions
that are not in that set. If every solution in the circuit is needed, the user may choose L to
include every location. As a larger suspect set may increase runtime, the algorithm offers the
user a trade-off where one can limit the suspect set L to locations suspected to be error sources.
For instance, an engineer may introduce a bug when modifying a specific module. In this case,
it may be desirable to restrict the suspect set to said module and treat the rest of the design
as correct. An additional case where the suspect set is restricted to specific locations is the
algorithm presented in section 4.4. That algorithm makes multiple calls to Algorithm 4.1, each
with a different suspect set.
4.3 Incremental Application of PDR
Algorithm 4.1 makes one call to PDR for each solution it finds. In the worst case this will require
O($\binom{|L|}{n}$) calls to PDR. In the common case where n = 1, this simplifies to O(|L|). For each
algorithm presented in Chapter 3, it was noted that each call to PDR is executed incrementally
by reusing and refining the inductive trace from the previous call. This is possible because each
call to PDR uses exactly the same model, as the same initial state set and transition relation
are used throughout the algorithm. Algorithm 4.1 modifies the model between calls to PDR,
so it is not immediately obvious that the calls can be done incrementally.
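To give a sense of scale for the worst-case bound, the number of n-combinations of suspects can be computed directly. This is only an illustrative sketch; the function name is hypothetical, and the suspect count 1156 is the |L| value reported for the spi design in Table 5.2.

```python
import math

def worst_case_solution_calls(num_suspects, n):
    """Upper bound on solution-finding PDR calls: one per n-combination of suspects."""
    return math.comb(num_suspects, n)

single_error = worst_case_solution_calls(1156, 1)   # equals |L| when n = 1
double_error = worst_case_solution_calls(1156, 2)   # grows quadratically for n = 2
```

The gap between the two values suggests why reusing work across calls matters more as the error cardinality grows.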
Intuitively, each call to PDR in Algorithm 4.1 uses a very similar model. The only difference
between consecutive PDR calls is that a single solution is blocked. In other words, a combination
of n error-select registers is forced to 0 in Ien. After making such a minor change to the model,
it is expected that many of the invariants would remain valid [13]. In fact, as shown later in
this section, all of the invariants generated by PDR remain valid. This allows each call to PDR
to reuse the entire inductive trace from previous calls.
As is explained in section 2.4, PDR maintains an inductive trace F = 〈F0, F1, ...〉. Each
Fi is a predicate represented by a CNF formula. Each Fi is i-step invariant, as is each clause
of Fi. As i-step invariants of M = (S ∪ E, Ien, Ten), each clause c also includes every state of
Ien, i.e., Ien ⊆ c. The work of [13] presents an invariant finder that extracts the portion of the
invariants computed for one model that are also invariant for another model. This provides a
means for the reuse of invariants after modifying the model in the general case. However, due
to the nature of the model updates in Algorithm 4.1 it is possible to reuse the entire inductive
trace without any additional verification. To reuse a clause c of Fi with the new model, it
must maintain the properties above with respect to the new model. That is, c must include
every initial state and every i-step reachable state of the new model. The rest of this section
demonstrates that this is the case for every clause of every Fi in Algorithm 4.1.
Consider the state of Algorithm 4.1 immediately after executing line 11. At this point
M is the FSM model used to find a solution and Mblk is the FSM model after blocking that
solution, while B is the blocking clause for the solution. Let I′en = Ien ∧ B denote the initial
state condition of Mblk. Since I′en is simply Ien with additional constraints, it is immediately
obvious that I′en ⊆ Ien. Since Ien ⊆ c, it is trivially true that I′en ⊆ c.
This leaves only to show that all i-step invariant clauses of M are also i-step invariant for
Mblk. This ultimately arises from the fact that the reachable state set of Mblk is a strict subset
of that of M . As a result, any over-approximation of the set of states reachable under M is
also an over-approximation of the set of reachable states of Mblk. An i-step invariant is simply
an over-approximation of the set of i-step reachable states, so intuitively the i-step invariant
clauses of M are also i-step invariant clauses of Mblk. Lemma 4.3 below provides a first step
towards proving this by showing that conjoining the blocking clause B to Ien does not make any
new states reachable.
Figure 4.3: State space representation of (a) M and (b) Mblk
Lemma 4.3. All B-states that are not i-step reachable under M are not i-step reachable under
Mblk for all i ≥ 0.
Proof. Consider a state t ∈ B that is not i-step reachable under M. Assume towards a
contradiction that it is i-step reachable under Mblk. For some m ≤ i the model Mblk must have
a trace t0, ..., tm where t0 ∈ I′en and tm = t. As all literals of B are error-select registers and
t is a B-state, t0 is also a B-state. This is because the error-select registers cannot change their
value assignments.
Both models M and Mblk have the same transition relation. Therefore each transition in
the trace is valid under M. As a result, t is only unreachable under M if t0 ∉ Ien. This is a
contradiction as t0 ∈ I′en and it has already been shown that I′en ⊆ Ien. Therefore, all B-states
that are not i-step reachable under M are not i-step reachable under Mblk.
As shown, the model updates in Algorithm 4.1 do not make any unreachable B-states
reachable. Further, they clearly make all ¬B-states unreachable. These two facts imply that
no states unreachable under M are reachable under Mblk. Letting R (Rblk) denote the set of
states reachable under M (Mblk), it is clear that Rblk ⊆ R. It only remains to show how this
implies that all i-step invariants of M are i-step invariant for Mblk. To visualize this fact, consider
a clause c that is invariant for M. As an invariant clause of M, it must over-approximate R.
This is depicted in Figure 4.3(a) where the set of c-states contains R. Figure 4.3(b) shows that
the set of c-states also over-approximates Rblk.
The above discussion focuses on invariant clauses but the same reasoning applies to i-step
invariant clauses. The following theorem proves this claim.
Theorem 4.4. All clauses that are i-step invariant under M are i-step invariant under Mblk.
Proof. Let c be a clause that is i-step invariant under M. Assume towards a contradiction
that c is not i-step invariant under Mblk. This implies that there is a state t ∉ c that is i-step
reachable under Mblk. Additionally, since c is i-step invariant for M and t ∉ c, t must not be
i-step reachable under M.
Since t is i-step reachable under Mblk and not M , by Lemma 4.3 it is a ¬B-state. No
¬B-states are reachable under Mblk, contradicting the assumption that c is not i-step invariant
under Mblk.
Theorem 4.4 proves that it is possible to reuse the inductive trace from previous calls to
PDR. That is, the execution of PDR on line 7 of Algorithm 4.1 can be done incrementally. This
can potentially result in a substantial reduction of the algorithm’s runtime.
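The containment argument behind Theorem 4.4 can be checked exhaustively on a small model. The sketch below is illustrative only: it uses a hypothetical enhanced model with states (s1, e1, e2), the toy circuit l1 = x, l2 = AND(l1, s1), next s1 = l2, and explicit enumeration in place of PDR. It compares the i-step reachable sets before and after conjoining B = (¬e2) to the initial condition.

```python
from itertools import product

def successors(state):
    """Successors under the assumed enhanced transition relation.

    Error-select registers keep their values; an active register replaces the
    corresponding location with a free value.
    """
    s1, e1, e2 = state
    result = set()
    for x, f1, f2 in product((False, True), repeat=3):
        l1 = f1 if e1 else x
        l2 = f2 if e2 else (l1 and s1)
        result.add((l2, e1, e2))
    return result

def reach_within(init, steps):
    """Set of states reachable from `init` in at most `steps` transitions."""
    seen = set(init)
    frontier = set(init)
    for _ in range(steps):
        frontier = {t for s in frontier for t in successors(s)} - seen
        seen |= frontier
    return seen

I_en = {(False, True, False), (False, False, True)}   # not s1, exactly one e active
I_blk = {s for s in I_en if not s[2]}                  # I_en with B = (not e2) conjoined

# R_blk(i) is contained in R(i) for each bound i checked here.
subset_holds = [reach_within(I_blk, i) <= reach_within(I_en, i) for i in range(4)]

# A clause that holds on every state of R(i) therefore also holds on R_blk(i).
clause = lambda s: (not s[0]) or s[2]                  # (not s1 or e2)
valid_for_M = all(clause(s) for s in reach_within(I_en, 3))
valid_for_Mblk = all(clause(s) for s in reach_within(I_blk, 3))
```

On this model every entry of `subset_holds` is true, and the example clause is valid for both models, which is the behavior the theorem predicts.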
4.4 Efficient Suspect Selection
The algorithm presented in the previous section is both sound and complete with respect to
its input set of suspect locations. However, in order to find every solution in the circuit, it is
required that the suspect set includes every location in the circuit. This section presents an
iterative approach to solve the same problem in which each iteration calls Algorithm 4.1 with
a different suspect set [7]. Each iteration’s suspect set is constructed to limit the number of
suspects considered across all iterations. In a given iteration, each solution found is used to
add suspect locations to the suspect set of the next iteration. The algorithm presented in this
section is sound. It is also complete if the error cardinality is one. It is therefore assumed for
the rest of this section that the error cardinality n = 1.
Figure 4.4: Example circuit with fanout branches highlighted
The algorithm begins with a preprocessing step. This step consists of computing the set of
all fanout branches. A fanout branch is simply a line in the circuit that fans out to more than
one other location. Figure 4.4 illustrates this concept graphically. Let fanout(l) denote the
set of locations to which l fans out. The set of fanout branches is F = {l : |fanout(l)| > 1}.
Additionally, let set R denote the set of all registers that appear in the target state predicate
S. The rationale behind the use of these sets is explained later in this section.
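The preprocessing step amounts to inverting the fanin relation and keeping the locations with more than one fanout. A minimal sketch follows, assuming the circuit is available as a dictionary from each location to the locations that drive it; this representation, the function names, and the example netlist are all hypothetical.

```python
from collections import defaultdict

def fanout_sets(fanin):
    """Invert a fanin map {location: [driver locations]} into a fanout map."""
    fanout = defaultdict(set)
    for location, drivers in fanin.items():
        for driver in drivers:
            fanout[driver].add(location)
    return fanout

def multi_fanout_locations(fanin):
    """Compute F = {l : |fanout(l)| > 1}."""
    return {l for l, outs in fanout_sets(fanin).items() if len(outs) > 1}

# Hypothetical netlist: `a` drives two gates; `b`, `g1` and `g2` drive one each.
example_fanin = {"g1": ["a", "b"], "g2": ["a", "g1"], "s1": ["g2"]}
F = multi_fanout_locations(example_fanin)
```

Here `F` contains only `a`, the single location that fans out to more than one other location.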
After preprocessing to compute sets F and R the algorithm proceeds through a series of
iterations, each of which calls Algorithm 4.1 once. Each iteration uses a different suspect set.
The suspect sets are constructed in a manner intended to limit the total number of suspects
considered across all iterations. The suspect set of iteration i is denoted Li. The initial suspect
set is constructed as L1 = R ∪ F , thereby including all fanout branches and all registers that
appear in the target state predicate. Subsequently Algorithm 4.1 is executed using this suspect
set and returning a set of solutions S1. A new suspect set L2 is computed from S1 and used in
the subsequent iteration. In general, after iteration i the suspect set of the subsequent iteration
Li+1 is computed as shown in Eq. 4.3 below.
$$L_{i+1} = \{\, l \in S_i : \mathit{fanin}(l) \,\} \setminus \bigcup_{j=1}^{i} L_j \qquad (4.3)$$
where fanin(l) denotes the set of all fanin for location l. Suspect set Li+1 contains the fanin of
every solution found in iteration i minus the suspect sets for all previous iterations. Therefore
it does not include any locations used as suspects in a previous iteration. This ensures that
no location is a suspect more than once. It also guarantees that the algorithm terminates, as
otherwise a group of solutions that form a cycle in the circuit could result in an infinite loop.
The reasoning behind this approach is intuitive. Consider a location l that is also a solution.
As l is a solution, the Boolean function at l can be changed to correct the design. This implies
one of two possibilities. First, it may also be possible to replace an element of fanin(l) to
correct the problem, and therefore one or more elements of fanin(l) may also be solutions.
This occurs if the needed change at l is equivalent to modifying only one of its fanin locations.
Alternatively, it may not be possible to correct the problem at any element of fanin(l) and
therefore l is a solution but no elements of its fanin are. As a result, the fact that l is a solution
is insufficient information to decide whether or not the elements of fanin(l) are solutions.
On the other hand, if a location l′ is not a solution then there is no way to modify the design
at location l′ to make S reachable. Consider a location l ∈ fanin(l′). If l has other fanout
besides l′, then it may be possible for l to be a solution even if every member of its fanout is
not. This can occur if multiple elements of the fanout need to be simultaneously corrected to
fix the error. Similarly, if l ∈ R then it may be the case that l is a solution but no element of
its fanout is. However, if |fanout(l)| = 1, l ∉ R and the single fanout of l is not a solution, then
l is also not a solution. The lemma below formalizes this intuition.
Lemma 4.5. For a location l ∉ R with |fanout(l)| = 1, if the single element of fanout(l) is
not a solution, then l is not a solution.
Proof. Suppose that l is a solution and that the single element l′ ∈ fanout(l) is not a solution.
This implies that it is possible to replace l by some other Boolean function to make some S-state
reachable. Since l ∉ R but l is a solution, l must be in the cone-of-influence of some element
of R. Otherwise, a change at l would not be observable at R and could not correct the error.
This implies that either l′ ∈ R or l′ is also in the cone-of-influence of an element of R since l
fans out to l′ and nothing else.
However, there is no way to replace l′ with a different Boolean function to make an S-state
reachable. Since l′ is the only fanout of l, this implies that it is possible to replace l in a manner
that changes the behavior at R but not l′. This is a contradiction since the behavior of the
circuit must also change at l′ to be observable at R.
This demonstrates the rationale behind constructing the initial suspect set as F ∪ R. The
set F includes every location l with |fanout(l)| > 1. As a result, every l ∉ L1 satisfies l ∉ R
and |fanout(l)| ≤ 1. By Lemma 4.5, every l ∉ L1 can be removed from consideration if the
single element of fanout(l) is not a solution. Essentially, the initial suspect set is constructed
to handle every case that the lemma cannot.
Pseudocode for the approach is shown in Algorithm 4.2. The algorithm is named SE-
Unreachability (single error unreachability) as it assumes that the error cardinality is one.
Lines 1 and 2 construct sets R and F , respectively. Line 3 constructs the initial suspect set.
Lines 5 through 8 contain the main loop that repeatedly calls Algorithm 4.1. Within the main
loop, line 6 constructs the suspect set for the next iteration according to Eq. 4.3. Finally, line 9
constructs the solution set which is returned on line 10.
Algorithm 4.2 SEUnreachability(C, S)
 1: R = state elements in the formula defining S
 2: F = {l : |fanout(l)| > 1}
 3: L1 = F ∪ R
 4: i = 1
 5: while Si = Unreachability(C, S, Li, 1) ≠ ∅ do
 6:     Li+1 = {l ∈ Si : fanin(l)} \ ⋃_{j=1}^{i} Lj
 7:     i = i + 1
 8: end while
 9: S = ⋃_{j=1}^{i} Sj
10: return S
As mentioned, the algorithm assumes that the error cardinality n = 1. The completeness of
the algorithm rests on this assumption, as Lemma 4.5 only holds when n = 1. The remainder
of this section focuses on demonstrating the soundness and completeness of the algorithm
under this assumption. In this context, soundness implies that every location returned is
a solution. Completeness implies that every solution in the circuit is found. The following
theorem demonstrates the soundness of the algorithm. It follows trivially from the soundness
of Algorithm 4.1.
Theorem 4.6. Every location in S is a solution.
Proof. Since the set S only includes locations identified as solutions by Unreachability, every
location in S is a solution by Theorem 4.1.
As S is the solution set of Algorithm 4.2, Theorem 4.6 proves the algorithm is sound.
Theorem 4.7 below proves that the algorithm is complete, which follows from the construction of
the initial suspect set and Lemma 4.5.
Theorem 4.7. When Algorithm 4.2 terminates, S includes every solution.
Proof. The initial suspect set is L1 = F ∪ R. Since Unreachability is complete by Theo-
rem 4.2, S includes every solution in F ∪R after iteration 1.
Consider an arbitrary location l ∉ L1. This implies l ∉ R and |fanout(l)| ≤ 1. If
|fanout(l)| = 0 and l ∉ R, l is not a solution as it is not in the cone-of-influence of any
element of R. Alternatively, |fanout(l)| = 1, and by Lemma 4.5 l is only a solution if it is in
the fanin of another solution. On line 6, the algorithm constructs a new suspect set including
the fanin of all solutions found in the previous iteration. It continues in this manner until it
reaches an iteration in which no solutions are found. As a result, any location in the fanin of a
solution is included in a suspect set passed to Unreachability. Therefore, by Theorem 4.2,
if l is a solution it is identified as such and included in S. Since l was an arbitrary location, this
applies to every l ∉ (F ∪ R) and S therefore includes every solution when the algorithm terminates.
Since S is the solution set of Algorithm 4.2, Theorem 4.7 proves that the algorithm is
complete. In contrast with the previous approach, the algorithm does not require the user to
specify a set of suspect locations. Since the algorithm selects the suspect sets it examines, it
essentially performs this step for the user.
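The iteration over suspect sets can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis implementation: the circuit is given as a hypothetical fanin dictionary, and the call to Algorithm 4.1 is replaced by a caller-supplied function that returns the subset of a suspect set consisting of solutions. The update of the suspect set reads Eq. 4.3 as the union of the fanin of each new solution, minus all earlier suspect sets.

```python
def se_unreachability(fanin, target_registers, unreachability):
    """Iterate suspect sets in the manner of Algorithm 4.2 (error cardinality one)."""
    # Preprocessing: distinct fanout locations of each driver, then the set F.
    fanout = {}
    for location, drivers in fanin.items():
        for driver in drivers:
            fanout.setdefault(driver, set()).add(location)
    multi_fanout = {l for l, outs in fanout.items() if len(outs) > 1}

    suspects = multi_fanout | set(target_registers)   # L1 = F ∪ R
    tried, solutions = set(), set()
    while suspects:
        found = unreachability(suspects)               # stand-in for Algorithm 4.1
        if not found:
            break
        tried |= suspects
        solutions |= found
        # Eq. 4.3: fanin of the new solutions, minus every earlier suspect set.
        suspects = {d for l in found for d in fanin.get(l, [])} - tried
    return solutions

# Hypothetical chain a, b -> g1 -> g2 -> s1, with `a` also feeding g2.
example_fanin = {"s1": ["g2"], "g2": ["g1", "a"], "g1": ["a", "b"]}
true_solutions = {"s1", "g2", "g1"}       # assumed ground truth for the oracle
calls = []

def oracle(suspect_set):
    calls.append(set(suspect_set))
    return suspect_set & true_solutions

result = se_unreachability(example_fanin, ["s1"], oracle)
```

In this example the oracle is consulted four times with pairwise disjoint suspect sets, so no location is examined twice, and the loop stops once an iteration yields no solutions.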
4.5 Summary
This chapter presents an algorithm to debug unreachable states using repeated executions of
PDR. Section 4.2 presents the initial formulation of the algorithm and presents theorems demon-
strating its soundness and completeness. Section 4.3 proves that the algorithm is functionally
equivalent when the repeated calls to PDR are executed incrementally. Section 4.4 presents an
optimization of the algorithm that uses the structure of the circuit to potentially prune large
portions of non-solution space when the error cardinality is restricted to one.
Chapter 5
Experimental Results
A prototype unreachability debugging engine is implemented based on the algorithms in this
thesis. The tool is developed using a reference implementation of PDR [9]. The SAT engine
used in Algorithms 3.1, 3.2, and 3.3 is MiniSat v2.2.0 [17]. The PDR implementation also
uses the same SAT engine internally. All experiments are executed on a single core of an i5-
3570K 3.4 GHz workstation with 16GB of RAM. Experiments are performed on designs from
OpenCores [31]. Each experiment uses an error cardinality of one. All experimental runs are
timed out after 4 hours. Each problem instance is created by injecting a design error such
as complementing conditions in if-statements, introducing incorrect state transitions, changing
operators in expressions, etc. These are typical design errors introduced unintentionally by
human designers. Each design error is chosen such that it makes at least one state unreachable
in violation of the design specification.
In this chapter, the five presented algorithms are referred to by their intuitive names (e.g.,
Unreachability) rather than indices (e.g., Algorithm 4.1) so that the reader may more easily
distinguish between them. For convenience, Table 5.1 below summarizes the algorithms’ names,
the sections of the thesis in which they may be found, and their distinguishing features.
Table 5.1: Summary of presented algorithms
Name               Index          Section  Distinguishing Feature
SCUnreachability   Algorithm 3.1  3.3.1    Single-cycle unreachability
MCUnreachability   Algorithm 3.2  3.3.3    Multi-cycle unreachability
MMCUnreachability  Algorithm 3.3  3.4      Optimized monolithic formulation
Unreachability     Algorithm 4.1  4.2      Canonical complete algorithm
SEUnreachability   Algorithm 4.2  4.4      Single error unreachability
5.1 Algorithm Comparison
This section presents experiments comparing and contrasting the algorithms presented in this
thesis. Towards that end, Table 5.2 and Table 5.3 show the runtime and number of solutions
found for each of the algorithms. The first five columns show the name of the problem instance,
number of gates in the design, number of registers in the design, number of suspect locations
used in every algorithm, and the number of solutions present, respectively. The remaining
columns show the runtime, speedup (relative to Unreachability), number of solutions found,
and percentage of all solutions found for each algorithm. For Unreachability, percentage
of solutions found and speedup are omitted, as it is the baseline against which the other algo-
rithms are compared. The number of gates and registers are derived from an AND-INVERTER
representation [10] of the circuit. The size of the complete solution set is determined by a run
of Unreachability, as it is proven to find the complete solution set of the problem. Speedups
are computed relative to Unreachability, as it is considered to be the canonical baseline
approach.
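The summary rows of the tables can be reproduced from the per-design speedups. The sketch below uses the SEUnreachability speedups as printed in Table 5.2; the variable names are illustrative. Recomputing speedups from the rounded runtimes in the table gives slightly different values, so the reported speedups are used directly.

```python
import statistics

# SEUnreachability speedups over Unreachability, copied from Table 5.2.
se_speedup = {
    "ac97 ctrl": 27.0,
    "divider": 34.3,
    "mrisc core": 44.8,
    "spi": 11.5,
    "usb core": 279.0,
    "wb": 8.2,
}

values = list(se_speedup.values())
geomean = statistics.geometric_mean(values)   # compare with the GEOMEAN row
median = statistics.median(values)            # compare with the MEDIAN row
```

The geometric mean comes out at roughly 32.1 and the median at 30.65, in line with the 32.1x and 30.7x entries of the table. The geometric mean is far less sensitive than an arithmetic mean to the single large speedup on usb core.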
As expected, the results confirm that Unreachability and SEUnreachability find the
same solution sets. It is additionally confirmed that MMCUnreachability and MCUn-
reachability find the same solution sets. Finally, it can also be seen that the algorithms of
Chapter 3 return a subset of the solutions returned by the algorithms of Chapter 4. These
results confirm that the functionality of the developed algorithms is as expected.
Table 5.2: Runtime and solutions found
                                         Unreachability     SEUnreachability                   SCUnreachability (K = 10)
benchmark   # gate  # reg  |L|    # sol  time (sec)  # sol  time (sec)  speedup  # sol  % sol  time (sec)  speedup  # sol  % sol
ac97 ctrl   12607   2325   14967  13     490.8       13     18.2        27.0x    13     100%   220.9       2.2x     13     100.0%
divider     3555    360    3915   38     419.4       38     12.2        34.3x    38     100%   424.7       1.0x     5      13.2%
mrisc core  8206    1328   9573   18     276.4       18     6.2         44.8x    18     100%   5.5         50.4x    18     100.0%
spi         1020    136    1156   23     7.6         23     0.7         11.5x    23     100%   2.9         2.6x     11     47.8%
usb core    5010    534    5545   6      644.4       6      2.3         279.0x   6      100%   3.2         201.1x   6      100.0%
wb          390     61     451    193    3.6         193    0.4         8.2x     193    100%   0.3         10.7x    2      1.0%
GEOMEAN     -       -      -      -      -           -      -           32.1x    -      -      -           9.3x     -      -
AVERAGE     -       -      -      -      -           -      -           -        -      100%   -           -        -      60.3%
MEDIAN      -       -      -      -      -           -      -           30.7x    -      100%   -           6.7x     -      73.9%
Table 5.3: Runtime and solutions found (K = 10)
                                         MMCUnreachability (N = 1)           MCUnreachability (N = 5)            MMCUnreachability (N = 5)
benchmark   # gate  # reg  |L|    # sol  time (sec)  speedup  # sol  % sol   time (sec)  speedup  # sol  % sol   time (sec)  speedup  # sol  % sol
ac97 ctrl   12607   2325   14967  13     441.8       1.1x     13     100.0%  28.0        17.6x    13     100.0%  24.7        19.9x    13     100.0%
divider     3555    360    3915   38     43.4        9.7x     5      13.2%   249.9       1.7x     21     55.3%   6.1         68.6x    21     55.3%
mrisc core  8206    1328   9573   18     3.4         82.3x    18     100.0%  17.3        16.0x    18     100.0%  15.1        18.3x    18     100.0%
spi         1020    136    1156   23     4.8         1.6x     11     47.8%   12.0        0.6x     23     100.0%  1.7         4.5x     23     100.0%
usb core    5010    534    5545   6      1.7         375.5x   6      100.0%  10.3        62.4x    6      100.0%  8.7         74.1x    6      100.0%
wb          390     61     451    193    0.6         5.7x     2      1.0%    1.2         2.9x     193    100.0%  1.0         3.5x     193    100.0%
GEOMEAN     -       -      -      -      -           12.0x    -      -       -           6.2x     -      -       -           17.6x    -      -
AVERAGE     -       -      -      -      -           -        -      60.3%   -           -        -      92.6%   -           -        -      92.6%
MEDIAN      -       -      -      -      -           7.7x     -      73.9%   -           9.5x     -      100.0%  -           19.1x    -      100.0%
Table 5.4: Effect of K and N on runtime and solutions found
                N = 1                                                                     N = 5
                MCUnreachability                     MMCUnreachability                    MCUnreachability                     MMCUnreachability
benchmark   K   time (sec)  # solutions  # spurious  time (sec)  # solutions  # spurious  time (sec)  # solutions  # spurious  time (sec)  # solutions  # spurious
ac97 ctrl   5   121.1       13           655         103.8       13           505         25.2        13           0           24.7        13           0
ac97 ctrl   10  220.9       13           1101        441.8       13           807         28.0        13           0           24.7        13           0
ac97 ctrl   15  1854.7      13           1543        545.6       13           476         30.7        13           0           24.7        13           0
ac97 ctrl   20  4955.6      13           2201        8208.3      13           461         33.5        13           0           24.7        13           0
divider     5   213.8       5            0           43.4        5            0           46.5        21           0           6.1         21           0
divider     10  424.7       5            0           43.4        5            0           249.9       21           0           6.1         21           0
divider     15  637.9       5            0           43.5        5            0           454.4       21           0           6.1         21           0
divider     20  848.9       5            0           43.6        5            0           655.2       21           0           6.2         21           0
mrisc core  5   4.3         18           0           3.3         18           0           15.5        18           0           15.2        18           0
mrisc core  10  5.5         18           0           3.4         18           0           17.3        18           0           15.1        18           0
mrisc core  15  6.7         18           0           3.4         18           0           18.9        18           0           15.2        18           0
mrisc core  20  8.0         18           0           3.4         18           0           20.8        18           0           15.3        18           0
spi         5   1.4         6            55          3.1         6            78          3.4         23           0           1.7         23           0
spi         10  2.9         11           84          4.8         11           55          12.0        23           0           1.7         23           0
spi         15  4.1         16           127         3.3         16           31          20.5        23           0           1.7         23           0
spi         20  5.9         21           175         5.2         21           27          29.1        23           0           1.7         23           0
usb core    5   2.3         6            0           1.7         6            0           8.9         6            0           8.7         6            0
usb core    10  3.2         6            0           1.7         6            0           10.3        6            0           8.7         6            0
usb core    15  4.1         6            0           1.7         6            0           11.8        6            0           8.7         6            0
usb core    20  4.9         6            0           1.8         6            0           13.3        6            0           8.7         6            0
wb          5   0.3         2            42          0.5         2            44          1.1         193          29          1.0         193          0
wb          10  0.3         5            57          0.6         5            43          1.2         193          29          1.0         193          0
wb          15  0.4         193          134         0.7         193          41          1.4         193          61          1.0         193          0
wb          20  0.5         193          153         0.8         193          43          1.5         193          61          1.0         193          0
From the results in the tables, it can be seen that SEUnreachability achieves a 30.7x
median speedup over Unreachability. In some cases, these approaches are able to outperform
the approximation-based approaches while still finding the complete solution set. This
demonstrates the importance of selecting the parameters K and N correctly in the
approximation-based algorithms. For these experiments, the values were chosen to be reasonable across the
entire set of experiments rather than for the specific needs of the design. This can result in
the algorithm performing more work than needed. In general, these parameters could be set
using a structural analysis of the circuit. For instance, if there is a pipeline in the circuit, N
should be greater than or equal to the number of stages. Otherwise, the algorithm may only
find solutions in stages of the pipeline close to the error’s observation point rather than the
pipeline stage in which the bug exists. It is more difficult to set K appropriately. In practice,
it is expected that simulation metrics, such as the average number of cycles to reach particular
states via simulation could be used. The algorithm Unreachability can be thought of as
setting K and N optimally in a manner that finds the complete solution set but never does
more work than needed. It can therefore outperform the approximation-based approaches if
these parameters are not set carefully. Similarly, SEUnreachability essentially chooses the
suspect set L in a manner that guarantees completeness while reducing the amount of work
needed.
Table 5.4 demonstrates the effect of the parameters K and N on runtime and the number of
solutions found. In Table 5.4, the first two columns show the name of the problem instance and
the value of K, respectively. The remaining columns show the runtime, number of solutions
found, and number of spurious solutions found for MCUnreachability with N = 1 (equiv-
alent to SCUnreachability), MMCUnreachability with N = 1, MCUnreachability
with N = 5, and MMCUnreachability with N = 5, respectively.
It can be seen that increasing the value of K increases the runtime for the algorithms SCUn-
reachability and MCUnreachability, as expected. Increasing the value of K results in
more iterations, meaning more debugging instances must be solved. This alone increases run-
time. In many cases, increasing K results in finding more solutions or more spurious solutions as
well. This results in additional calls to PDR in order to determine whether or not the debugging
solutions are real or spurious. Both of these factors contribute to the increased runtime.
The algorithm MMCUnreachability exhibits somewhat different runtime behavior. It
can be seen that its runtime is often constant with increasing values of K. The exceptions
to this occur when increasing this parameter results in finding additional solutions or spurious
solutions. Since MMCUnreachability does not use iterations, increasing K does not result in
solving additional debugging instances. Therefore, it makes sense that the runtime would only
increase as a result of finding more solutions or spurious solutions, as this results in additional
calls to PDR.
[Figure 5.1: Total runtime and time spent checking spurious solutions in MMCUnreachability for wb versus K. Axes: K vs. Time (s); series: Spurious check time, Total runtime.]
Note that, as the table demonstrates, it may be the case that increasing values of K result
in fewer spurious solutions that are simply more expensive to detect. Figure 5.1 plots the total
runtime and time spent checking spurious solutions for wb. It can be seen that despite fewer
spurious solutions being found at higher values of K, more time is spent performing the check.
This means that each solution becomes more expensive to check. This is reasonable, as proving
a state is unreachable in K cycles is harder for larger values of K. It can additionally be seen
that the time spent doing things other than the spurious solution checks is essentially constant
with increasing values of K, confirming that most of the increase is due to this effect.
To summarize the performance characteristics of the presented algorithms, Figure 5.2 plots
the runtime relative to Unreachability for each algorithm on each design. For the algo-
rithms of Chapter 3, results are shown for N = 1 and K = 10. Overall, it appears that in
most cases, SEUnreachability is the best performing algorithm. In every case, it outper-
forms Unreachability. In most cases, it is able to outperform the algorithms presented in
Chapter 3. In two cases (ac97_ctrl and spi) it outperforms every one of the other algorithms.
In the cases where it is not the best-performing algorithm, the other algorithms tend to only
slightly outperform it. This once again demonstrates the importance of selecting the parame-
ters appropriately. As SEUnreachability effectively sets every parameter automatically and
in a manner that reduces the work it must perform, it is expected to outperform the other
algorithms unless their parameters are chosen intelligently. Therefore, it seems that in most
automated applications, such as automatically debugging failures in a regression suite, SEUnreachability
is the most appropriate choice. Other algorithms may be more appropriate in
cases where either a limited solution set is desired or extra information is available that allows
parameters to be set appropriately. Alternatively, if an error cardinality greater than one is
needed, then the other algorithms must be used.
[Figure 5.2: Runtime relative to Unreachability for presented algorithms (Unreachability = 1.0). Benchmarks: ac97_ctrl, divider, mrisc_core, spi, usb_core, wb; series: Unreachability, SEUnreachability, MCUnreachability, MMCUnreachability.]
It can also be seen that in the case of ac97_ctrl, the performance-driven approach of MMCUnreachability
has higher runtime than the other approximation-based approaches. This
also appears to be related to spurious solutions, as this algorithm finds many spurious solutions
for ac97_ctrl. The checks for spurious solutions in early iterations refine the approximations,
making spurious solutions less likely in later iterations. However, it appears that in this case, it is
much more expensive to detect spurious solutions when skipping the early iterations. Evidently,
refining the approximations early on can heavily impact the performance of the algorithms. In
some cases this can result in MMCUnreachability performing substantially worse than the
iterative approaches.
On the other hand, the performance enhancement of SEUnreachability appears to
be consistently successful. Table 5.5 compares it against Unreachability. The first three
columns show the name of the problem instance, the number of gates in the design, and num-
ber of registers, respectively. The next two columns show the size of the suspect set and
runtime for Unreachability. The remaining columns relate to SEUnreachability. They
show the number of iterations executed, total number of suspects considered across all iterations,
runtime, total percentage of suspects considered (|⋃Li|/|L|), and speedup relative to
Unreachability, respectively.
Across all experiments, the performance enhancement of SEUnreachability offers a ge-
ometric mean speedup of 32.1x with a median of 30.7x. Critically, SEUnreachability safely
ignores the majority of all design locations. Across all experiments, it considers an average of
only 26.1% of the design locations as suspects, and a median of 20.6%. Since the runtime of
Unreachability appears to be heavily dependent on the size of its input suspect set, elimi-
nating the majority of locations from consideration naturally yields a substantial reduction in
runtime.
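
The summary statistics quoted above can be re-derived from the per-benchmark entries of Table 5.5. The Python sketch below is illustrative only: the variable and function names are chosen here, and the inputs are the rounded values printed in the table, so the results should be expected to match the reported figures only up to rounding.

```python
import math
import statistics

# Speedup column of Table 5.5 (SEUnreachability relative to Unreachability).
speedups = {
    "ac97_ctrl": 27.0, "divider": 34.3, "mrisc_core": 44.8,
    "spi": 11.5, "usb_core": 279.0, "wb": 8.2,
}
# (suspects considered across all iterations, |L|) pairs from the same table.
suspects = {
    "ac97_ctrl": (2697, 14967), "divider": (1056, 3915), "mrisc_core": (1708, 9573),
    "spi": (246, 1156), "usb_core": (1140, 5545), "wb": (237, 451),
}

def geometric_mean(values):
    """Geometric mean computed through logarithms to avoid a large product."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

geomean_speedup = geometric_mean(speedups.values())
median_speedup = statistics.median(list(speedups.values()))
fractions = [used / total for used, total in suspects.values()]
mean_fraction = statistics.mean(fractions)

print(f"geometric mean speedup: {geomean_speedup:.1f}x")
print(f"median speedup: {median_speedup:.2f}x")
print(f"mean fraction of suspects considered: {100 * mean_fraction:.1f}%")
```

The geometric mean is used for the speedups because a single outlier such as usb_core would dominate an arithmetic mean.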
Table 5.6 shows the number of solutions and number of suspects per iteration for the first five
iterations of SEUnreachability. The first column shows the name of the problem instance,
while the remaining columns show the number of suspect locations and solutions found for each
of the first five iterations. A blank cell indicates that the algorithm did not proceed through the
relevant iteration for that design. It can be seen that the initial suspect set contains few solutions
in most cases, meaning that very few suspects are considered in subsequent iterations. This
is intuitive, as in general only a small portion of all design locations are solutions. Figure 5.3
plots this data for spi, visualizing the drastic drop-off in suspect set sizes.
The design wb is an exception, as a relatively large portion of the design locations are
solutions. Despite this, the algorithm only needs to consider a total of 237 suspect locations
in order to find 193 solutions. Even in this pathological case, SEUnreachability is able to
ignore nearly half of the design locations and achieve an 8x speedup over Unreachability.
[Figure 5.3: Suspects and solutions per iteration for spi. Axes: Iteration Number vs. Suspects; series: Solutions, Non-solutions; bar annotation: 222.]

Table 5.5: Runtime comparison for Unreachability and SEUnreachability
              Unreachability      SEUnreachability
benchmark     |L|     time (s)    #iter  |⋃Li|  time (s)  |⋃Li|/|L|  speedup
ac97_ctrl     14967   490.8       4      2697   18.2      18.0%      27.0x
divider       3915    419.4       3      1056   12.2      27.0%      34.3x
mrisc_core    9573    276.4       6      1708   6.2       17.8%      44.8x
spi           1156    7.6         7      246    0.7       21.3%      11.5x
usb_core      5545    644.4       3      1140   2.3       20.6%      279.0x
wb            451     3.6         8      237    0.4       52.5%      8.2x
GEOMEAN                                                              32.1x
AVERAGE                                                   26.1%
MEDIAN                                                    20.6%      30.7x

Table 5.6: Number of suspects and solutions in iterations 1 through 5
benchmark     |S1|/|L1|  |S2|/|L2|  |S3|/|L3|  |S4|/|L4|  |S5|/|L5|
ac97_ctrl     5/2689     2/2        2/2        2/2        2/2
divider       10/1028    10/10      18/18      -          -
mrisc_core    4/1688     4/4        2/2        3/4        3/6
spi           7/229      2/2        2/2        2/2        4/4
usb_core      3/1136     1/2        1/1        1/1        -
wb            33/76      33/34      33/33      34/34      4/4

As mentioned previously, the runtime of Unreachability is expected to be heavily dependent
on the size of the suspect set it is given. This is demonstrated in Table 5.7. The
first two columns show the name of the problem instance and the size of the suspect set L. The
remaining columns show the number of SAT calls made by PDR, the run-time of SAT, and the
total runtime of the algorithm. In each experiment, |L| suspects are chosen at random from the
set of all design locations. Each suspect location can make additional states reachable, making
it more difficult to approximate the reachable state space. This results in a greater number of
calls to the SAT solver, as the table shows. Additionally, each suspect location increases the
complexity of the transition relation by adding more clauses and more variables. Figure 5.5(a)
plots the total runtime versus |L| for divider, while Figure 5.5(b) plots the average runtime of
each SAT call in Unreachability versus |L|. It demonstrates the impact of the more complex
transition relation that results from increasing the number of suspects. It can be seen that increasing
|L| substantially increases both the total runtime and the runtime of each SAT query
made by PDR.
[Figure 5.4: Runtime per iteration for spi. Axes: Iteration Number vs. Iteration Runtime (s).]
[Figure 5.5: Total runtime and average runtime per SAT call for divider. Panel (a): |L| vs. Total Runtime; panel (b): |L| vs. Average SAT runtime.]
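
The per-call trend can be approximated directly from the divider rows of Table 5.7 by dividing the SAT runtime by the number of SAT calls. The Python sketch below is a rough check under an assumption the text does not state explicitly, namely that Table 5.7 and Figure 5.5(b) report the same runs; the names `divider_rows` and `average_sat_time` are chosen here for illustration.

```python
# (|L|, number of SAT calls, total SAT runtime in seconds) for divider, from Table 5.7.
divider_rows = [
    (125, 142, 0.10), (250, 169, 0.33), (500, 311, 1.95),
    (1000, 391, 8.50), (2000, 868, 57.82), (4000, 1042, 459.01),
]

def average_sat_time(rows):
    """Return (|L|, mean seconds per SAT call) for each row."""
    return [(size, sat_seconds / sat_calls) for size, sat_calls, sat_seconds in rows]

per_call = average_sat_time(divider_rows)
for size, seconds in per_call:
    print(f"|L| = {size:5d}: {1000 * seconds:8.2f} ms per SAT call")
```

Under that assumption, the cost of an individual SAT call grows with every doubling of |L|, which is the effect attributed above to the larger transition relation.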

Table 5.7: Effect of |L| on Unreachability runtime
benchmark   |L|    #SAT calls   SAT runtime (s)   total runtime (s)
usb_core    125    13           0.01              0.12
usb_core    250    13           0.01              0.15
usb_core    500    121          0.24              0.48
usb_core    1000   331          1.51              2.02
usb_core    2000   299          5.11              6.10
usb_core    4000   8213         22.89             28.58
divider     125    142          0.10              0.20
divider     250    169          0.33              0.48
divider     500    311          1.95              2.21
divider     1000   391          8.50              9.12
divider     2000   868          57.82             59.57
divider     4000   1042         459.01            465.74

Since the runtime of Unreachability is dependent on the size of its given suspect set,
earlier iterations of SEUnreachability are expected to take substantially longer than later
iterations. Figure 5.4 plots the runtime of each iteration for spi, confirming this intuition. It
can be seen that the first iteration consumes substantially more runtime than later iterations.
As shown in Figure 5.3, this iteration has the largest number of suspects by far. This appears
to confirm that larger suspect sets result in increased runtime, as expected. A larger suspect set
results in more hardware being added to the enhanced FSM model of the circuit that is used by
Unreachability. It therefore increases the complexity of the individual PDR instances the
algorithm must solve. It is additionally expected that suspect sets with many non-solutions are
difficult to solve for. To find a solution, PDR only needs to find a counter-example trace that
ends on a target state. However, to prove locations are not solutions it is necessary to prove
that no such traces exist. Intuitively, this seems to be an inherently difficult problem. When
a large number of non-solution locations are in the suspect set, proving no counter-examples
exist can be an expensive operation due to the complexity of the model used in PDR and the
large number of potential state transitions.
In addition to runtime, it is instructive to compare the solution sets found by the algorithms
presented in this thesis. It can be seen that in many cases, using a window size N = 1 is sufficient
to find the complete solution set. This demonstrates the effectiveness of the SCUnreachability
algorithm. However, both wb and divider present interesting counter-examples to this
observation. Figure 5.6 plots the number of solutions found versus N for both of these designs.
It can be seen that the two designs exhibit drastically different behavior in this regard. For
divider, the number of solutions steadily increases until it plateaus at N = 9. This occurs
because the error is in a pipelined portion of the design. As a result, increasing N allows so-
lutions to be found in earlier pipeline stages. Conversely for wb, very few solutions are found
for N < 3 and many new solutions appear at N = 3. This suggests that the design error can
be corrected in a manner that requires reaching a sequence of other unreachable states before
reaching the target state. This demonstrates the importance of the other algorithms presented
in this thesis, as while SCUnreachability is sufficient in many cases, it is inadequate to
debug certain types of errors.
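
The behaviour described for wb can be pictured with a toy example. The Python sketch below is a hypothetical illustration and not one of the thesis benchmarks: it assumes a 3-bit shift register whose input gate is stuck at 0 in the buggy design, takes the all-ones state as the target, and enumerates states explicitly instead of building a SAT instance. The names `step` and `found_within_window` are invented for this sketch.

```python
def step(state, g_value):
    """One shift-register step: the suspect gate's value g_value is shifted into s0."""
    _s2, s1, s0 = state
    return (s1, s0, g_value)

def found_within_window(start_states, target, n):
    """True if the target is reached in at most n steps with the suspect gate left free."""
    frontier = set(start_states)
    for _ in range(n):
        frontier = {step(s, v) for s in frontier for v in (0, 1)}
        if target in frontier:
            return True
    return False

# In the buggy design the gate always outputs 0, so only the all-zero state is reachable.
reachable_in_buggy_design = {(0, 0, 0)}
target = (1, 1, 1)
for n in (1, 2, 3):
    print(n, found_within_window(reachable_in_buggy_design, target, n))
```

In this toy, window sizes 1 and 2 fail because the intermediate states (0, 0, 1) and (0, 1, 1) are themselves unreachable in the buggy design; a window of 3 is the first that succeeds.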
[Figure 5.6: Solutions versus N for divider and wb (K = 10). Axes: N vs. Solutions Found; series: wb, divider.]
Figure 5.7 plots the cumulative number of solutions found over time for spi for both Un-
reachability and SEUnreachability, confirming this intuition. It can be seen in Fig-
ure 5.7(a) that the former appears to find many solutions towards the beginning of the run.
These solutions result from counter-examples that PDR is able to find relatively quickly. Af-
ter exhausting the easy counter-examples, it becomes more difficult to find later solutions, as
indicated by the increased time between solutions being found. Finally, after every solution is
found, it takes a significant amount of time for the algorithm to prove that no further solutions
exist before it terminates.
Conversely, Figure 5.7(b) shows that SEUnreachability finds few of its solutions at the
start of the run. This is because the first iteration has the largest suspect set, thereby making
it much slower for Unreachability to solve. It can be seen that after finding all seven of
the solutions for iteration 1, there is a large delay before finding another solution. This delay
results from the time required to prove that the non-solution locations in set L1 are in fact not
solutions. As L1 is the largest suspect set, this takes a substantial amount of time. In iterations
2 and later, the suspect sets are all much smaller than L1. As a result, each iteration is very
fast and many solutions are found in a short period of time. This confirms the effectiveness of
this performance optimization.
[Figure 5.7: Solutions found for spi vs. running time for (a) Unreachability (b) SEUnreachability. Axes: Time (s) vs. Solutions Found.]
To compare the solution sets found by the approaches, Figure 5.8 plots the percentage of
the complete solution set found by the presented algorithms across the set of experiments. It
can be seen that when N = 5, increasing K does not result in finding additional solutions for
the presented experiments. In these cases, N = 5 gives the solver sufficient freedom to find
a large portion of the solutions. On the other hand, when N = 1, increasing K can result in
finding substantially more solutions. In these cases, the solver needs the additional freedom
afforded by enlarging the approximation of the reachable state set to find a larger portion of
the solution set.
[Figure 5.8: Average percentage of solutions found by algorithms across all experiments. Categories: Unreachability, SEUnreachability, and K = 5, 10, 15, 20 with N = 1 and N = 5; axis: percentage from 0 to 100.]
5.2 Benefits of Incrementality
This section presents experiments quantifying the runtime performance gained by applying PDR
incrementally in each of the algorithms presented in Chapter 4 of this thesis. The algorithms
of Chapter 3 are “inherently incremental.” That is, if each call to PDR were made with a
new inductive trace, it is unclear what information should be extracted to constrain the SAT-
based debugging problems. For instance, if a spurious solution was detected, it would make
sense to use only a single clause from PDR that blocks the relevant unreachable state in the
debugging step. However, it would also be reasonable to use the entire inductive invariant.
Essentially, in the algorithms of Chapter 4, incrementality is a performance optimization. For
the algorithms of Chapter 3, it is a critical design decision that heavily impacts the construction
of the algorithm. Those algorithms are therefore not considered in this section.
The algorithms of Chapter 4, however, are more naturally and intuitively non-incremental.
Without the analysis of Section 4.3, it may not be clear that they can be applied incrementally.
Additionally, it is immediately clear how the non-incremental version of Unreachability
works. It simply does not re-use the inductive trace between calls to PDR. Similarly, for
SEUnreachability, the non-incremental version of the algorithm simply makes calls to the
non-incremental version of Unreachability. Unlike the other algorithms, no aspects of the
internal state of PDR are exported to other problem domains, making incrementality or non-
incrementality both natural choices.
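
The difference between the two modes can be pictured with a loose explicit-state analogy. In the Python sketch below, which is a toy under stated assumptions rather than PDR itself, a persisted breadth-first search frontier plays the role of the inductive trace: the incremental checker keeps it between queries, while the non-incremental checker rebuilds it on every call. The class `ReachabilityChecker` and the ring-shaped state graph are hypothetical.

```python
from collections import deque

class ReachabilityChecker:
    """Answers 'is target reachable?' queries, optionally reusing earlier search state."""

    def __init__(self, initial, successors, incremental):
        self.initial = initial
        self.successors = successors
        self.incremental = incremental
        self.expansions = 0          # total states expanded across all queries
        self._reset()

    def _reset(self):
        self.seen = {self.initial}
        self.queue = deque([self.initial])

    def query(self, target):
        if not self.incremental:
            self._reset()            # discard everything learned by earlier queries
        if target in self.seen:
            return True
        while self.queue:
            state = self.queue.popleft()
            self.expansions += 1
            for nxt in self.successors(state):
                if nxt not in self.seen:
                    self.seen.add(nxt)
                    self.queue.append(nxt)
            if target in self.seen:
                return True
        return False

def ring(state):
    """Hypothetical 50-state ring: 0 -> 1 -> ... -> 49 -> 0."""
    return [(state + 1) % 50]

targets = [10, 20, 30, 99]           # 99 lies outside the ring, so it is unreachable
incremental = ReachabilityChecker(0, ring, incremental=True)
from_scratch = ReachabilityChecker(0, ring, incremental=False)
answers_inc = [incremental.query(t) for t in targets]
answers_scratch = [from_scratch.query(t) for t in targets]
print(answers_inc == answers_scratch, incremental.expansions, from_scratch.expansions)
```

Both modes return the same answers in this toy; only the amount of repeated exploration differs, which mirrors the sense in which incrementality is a performance optimization for the algorithms of Chapter 4.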

Table 5.8: Effect of incrementality on runtime
              Unreachability                         SEUnreachability
benchmark     Incr. (s)  Non-incr. (s)  Speedup      Incr. (s)  Non-incr. (s)  Speedup
ac97_ctrl     490.8      6877.2         14.0x        18.2       32.7           1.8x
divider       419.4      8106.3         19.3x        12.2       66.4           5.4x
mrisc_core    276.4      282.2          1.0x         6.2        6.4            1.0x
spi           7.6        457.4          60.5x        0.7        16.1           24.4x
usb_core      644.4      643.0          1.0x         2.3        2.4            1.0x
wb            3.6        1672.6         470.6x       0.4        22.4           51.9x
GEOMEAN                                 16.7x                                  3.6x
MEDIAN                                  14.1x                                  4.9x

Table 5.8 shows comprehensive results. The first column shows the name of the problem
instance. The remaining six columns show the incremental runtime, non-incremental runtime,
and incremental speedup for Unreachability and SEUnreachability, respectively. As the
table demonstrates, incrementality provides substantial performance benefits in most cases.
Overall, for Unreachability, it provides a geometric mean 16.7x speedup with a median of
14.1x. For SEUnreachability, it gives a 3.6x geometric mean speedup and median speedup
of 4.9x. As expected, the speedup for Unreachability is higher. Since that algorithm uses
the same inductive trace throughout its entire execution, incrementality naturally benefits it
more. The algorithm SEUnreachability discards the inductive trace between internal calls
to Unreachability, so it gains less.
There are some cases where incrementality does not give a speedup. In the cases of usb_core
and mrisc_core, the incremental speedup is negligible. This suggests that the solutions in
these designs are very easy to find. That is, the algorithm finds all solutions very quickly and
spends the rest of its time attempting to prove that no more solutions exist. Since the calls
that find solutions terminate very quickly, applying incrementality does not speed them up
significantly. To confirm this intuition, Figure 5.9 plots the number of solutions found
versus running time for the non-incremental version of Unreachability with mrisc_core and
usb_core. Notice that for usb_core, all solutions are found within four seconds, implying that the
rest of the 644.4 seconds are spent proving that there are no other solutions. A similar pattern
can be observed for mrisc_core, where all of the solutions are found in the first 20 seconds of
execution.
This further explains the substantial speedup obtained by the algorithms of Chapter 3 in
these cases, as shown in Table 5.2. Since those algorithms do not attempt to prove that they
have found all solutions, they avoid this substantial overhead. A similar but less extreme effect
is observed on these designs with SEUnreachability. This is because, while that algorithm
does need to prove that it has found all solutions, it uses a much smaller suspect set. As such, it
saves substantial runtime in this step, explaining the large speedups observed on these designs.
[Figure 5.9: Solutions found by Unreachability vs. running time for (a) mrisc_core (b) usb_core. Axes: Time (s) vs. Solutions Found.]
5.3 Summary
This chapter presents experiments comparing and contrasting the algorithms presented in this
thesis. Section 5.1 compares the algorithms in terms of their runtime and the solution sets they
compute. It is found that in most cases, SEUnreachability is the most suitable algorithm
when no additional information is given that allows setting the parameters in the other algorithms
intelligently. In cases where such information is available or an error cardinality greater
than one is needed, the other algorithms may be more suitable.
Section 5.2 quantifies the speedup obtained by applying PDR incrementally. It is found that
incrementality gives substantial speedups in most cases. However, in cases where solutions are
very easy to find, incrementality provides very little benefit. In these cases, Unreachability
performs very poorly relative to the other algorithms. This suggests that Unreachability is
most suited to finding solutions that are difficult to find, such as when the approximation-based
algorithms fail to reveal solutions.
Chapter 6
Conclusion and Future Work
6.1 Contributions
Verification has become the primary bottleneck in the modern VLSI design cycle, and debugging
is the most time-consuming task within verification. As a result, the automation of debugging
tasks is of critical importance. This thesis presents a set of automated techniques that leverage
Boolean Satisfiability and Property Directed Reachability to automate a previously-manual
debugging task. Specifically, the techniques automate the debugging of errors that manifest
themselves in the form of unreachable states. In this case, all that is known about the debugging
problem is that some state is unreachable in violation of the design specification. As such,
no error trace is available to guide traditional SAT-based automated debugging techniques.
The presented techniques handle this case, and can be divided into two broad classifications:
approximation-based approaches and complete approaches.
The approximation-based approaches represent a practical step forward in the field of au-
tomated debugging. While they are not guaranteed to find the complete solution set to the
problem, in many cases they can find a useful subset of the solutions more quickly than the
complete approaches. In greater detail, these approaches work as follows. First, PDR is used
to compute an over-approximation of the set of states reachable in a specific bounded number
of clock cycles. Subsequently, a SAT-based debugging instance is constructed that models a
sequence of state transitions beginning at one of the approximation states and ending at the
target state which is erroneously unreachable. Each satisfying assignment may correspond to
a solution to the debugging problem. However, due to the use of over-approximation, it is
also possible to find spurious solutions when the chosen approximation state is not reachable.
These are detected using PDR and discarded. As a side effect of the spurious solution detection
process, the approximation is refined, making it less likely that more spurious solutions are
found. The initial formulation of the approach can find solutions that may make the target
state reachable one step after an already-reachable state. This is later extended to handle a
user-specified bounded number of steps, referred to as the window size.
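
The over-approximate, check, and refine loop summarized above can be sketched on a toy design. The Python code below is a hypothetical, explicit-state analogue and not the thesis implementation: it assumes a 2-bit design with two gate locations (`g_not`, `g_and`), uses the set of all states as a trivial over-approximation, fixes the window size at one, and lets an exact breadth-first search stand in for the PDR call that detects spurious solutions. All names are invented for this sketch.

```python
from collections import deque

LOCATIONS = ("g_not", "g_and")
ALL_STATES = [(s1, s0) for s1 in (0, 1) for s0 in (0, 1)]

def next_state(state, override=None):
    """Buggy 2-bit design: s0' = NOT s0 and s1' = s1 AND s0 (assume XOR was intended)."""
    s1, s0 = state
    gates = {"g_not": 1 - s0, "g_and": s1 & s0}
    gates.update(override or {})     # a suspect location may be forced to any value
    return (gates["g_and"], gates["g_not"])

def is_reachable(initial, goal):
    """Exact reachability in the unmodified design; stands in for a PDR call."""
    seen, queue = {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        nxt = next_state(state)
        if nxt not in seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

def approximate_debug(initial, target):
    approx = set(ALL_STATES)         # trivial over-approximation of the reachable states
    solutions, spurious = [], []
    for loc in LOCATIONS:
        for state in sorted(ALL_STATES):
            if state not in approx:
                continue             # already refined away by an earlier check
            # Window size 1: can freeing `loc` reach the target in one step from `state`?
            if not any(next_state(state, {loc: v}) == target for v in (0, 1)):
                continue
            if is_reachable(initial, state):
                solutions.append(loc)
                break
            spurious.append((loc, state))
            approx.discard(state)    # refinement: this start state is unreachable
    return solutions, spurious

print(approximate_debug(initial=(0, 0), target=(1, 1)))
```

In this toy, the candidate at `g_not` starts from the unreachable state (1, 1) and is therefore discarded as spurious, while `g_and` is confirmed from the reachable state (0, 0).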
The complete approaches use formal techniques to compute the complete solution set to
the problem. As such, they are capable of finding every design location where a change can be
made to correct the error. In greater detail, the canonical complete formulation of the approach
works as follows. An enhanced FSM model of the design is constructed. In the enhanced
model, particular states are reachable if and only if specific design locations are solutions to the
debugging problem. Multiple calls to PDR are used to find traces that reach these states. This
technique is shown to benefit from the incremental application of PDR, where each execution of
the PDR solver reuses the inductive trace from the previous call. It is also shown that applying
the underlying PDR engine incrementally in this fashion preserves the completeness of the
solution set. In addition to the initial formulation of the complete approach, an optimization
is presented that uses the structure of the circuit to prune a potentially large portion of the
non-solution space. This optimization is proven to still find the complete solution set to the
problem under the assumption that only one design error causes the observed unreachability.
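
The idea behind the complete formulation can be illustrated on a toy design. The Python sketch below is a simplified stand-in under stated assumptions: it uses a hypothetical 2-bit design with two gate locations, frees one suspect location at a time instead of building a single enhanced FSM, and replaces PDR with an exhaustive breadth-first search. The function names are invented for this sketch.

```python
from collections import deque

def next_state(state, override=None):
    """Buggy 2-bit design: s0' = NOT s0 and s1' = s1 AND s0 (assume XOR was intended)."""
    s1, s0 = state
    gates = {"g_not": 1 - s0, "g_and": s1 & s0}
    gates.update(override or {})     # the suspect location may take any value
    return (gates["g_and"], gates["g_not"])

def reachable(initial, target, suspect=None):
    """Exhaustive search; with a suspect, its output is a free input at every step."""
    seen, queue = {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        if state == target:
            return True
        overrides = [None] if suspect is None else [{suspect: v} for v in (0, 1)]
        for override in overrides:
            nxt = next_state(state, override)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def complete_solutions(initial, target, locations):
    """A location is reported iff freeing it makes the target state reachable."""
    return [loc for loc in locations if reachable(initial, target, loc)]

print(reachable((0, 0), (1, 1)))
print(complete_solutions((0, 0), (1, 1), ["g_not", "g_and"]))
```

Because the search is exhaustive, a location that is not reported has been shown not to be a solution in this toy, which is the property that distinguishes the complete approaches from the approximation-based ones.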
A set of experiments is presented to compare the approaches and reveal practical tradeoffs
between them. The initial formulation of the approximation-based approach in which the
window size is one is sufficient to find an average of 60% of the complete solution set and a
median of 74%. This approach is found to be 9x faster than the canonical complete approach
across the set of experiments. Extending the approach to use a window size of five allows it to
find an average of 92% of the solutions, and to find all of the solutions in 5 out of 6 benchmarks.
However, the average speedup is reduced to 6x.
Turning to the complete approaches, it is found that the use of incrementality offers an
average speedup of 17x across the set of experiments in the canonical complete algorithm.
It also provides a speedup of 4x for the optimized algorithm that assumes a single design
error. Additionally, this optimized approach offers an impressive 32x speedup over the canonical
approach and is often able to outperform the approximation-based approaches. However, unlike
the other approaches, it is limited to cases in which only a single design error is present.
6.2 Future Work
The contributions of this thesis rely heavily on PDR. While PDR is explicitly a model checking
algorithm, it has extensive capabilities and can be seen as a powerful reasoning engine much like
Boolean Satisfiability. As SAT solvers have improved dramatically in terms of performance in
recent years, it is expected that PDR engines will do the same. This presents many promising
future directions for research into applying PDR to other verification and debugging problems
beyond its original target of model checking and the debugging formulations presented in this
thesis. In particular, traditional debugging techniques that make use of an error trace create
SAT instances with numerous copies of the transition relation. Due to the high degree of dupli-
cation in this problem, it is expected that modifying the formulation somewhat and applying
PDR in place of SAT could result in performance gains. A similar approach [26] does the same
using solvers for Quantified Boolean Formulas in place of SAT.
An additional direction is in leveraging variants of PDR with more powerful reasoning
capabilities in the algorithms presented in this thesis. QUIP (Quest for an Inductive Proof) [24]
is such an engine. It extends PDR with the ability to detect so-called “good” and “bad” clauses
during the execution of the algorithm. Good clauses are those that will end up in the inductive
invariant PDR ultimately returns, while bad clauses are those that have no chance of being
part of the inductive invariant. As the algorithms in this thesis leverage PDR incrementally, it
is expected that some of the clauses from earlier runs of PDR are not entirely relevant to later
runs. As such, a solver such as QUIP could be used to detect and purge these clauses so the
solver does not waste time processing them. In particular, the clause propagation step of PDR
could waste substantial time trying to propagate bad clauses.
Finally, a direction that has yet to be explored is alternative formulations of the debugging
problem that better leverage all of the capabilities of the PDR engine, particularly its ability to
compute inductive invariants. In the algorithms presented in this thesis, the inductive invariant
PDR returns is simply an additional feature that the user may analyze themselves. However,
inductive invariants have many useful properties that may allow additional information to be
extracted. By formulating the debug problem differently, it may be possible to obtain more
meaningful inductive invariants that provide extra information to the user or to additional
debugging and verification algorithms.
Bibliography
[1] F. V. Andrade, L. M. Silva, and A. O. Fernandes. Improving SAT-based combinational
equivalence checking through circuit preprocessing. In Computer Design, 2008. ICCD
2008. IEEE International Conference on, pages 40–45, Oct 2008.
[2] S. Asghar, E. Aubanel, and D. Bremner. A dynamic moldable job scheduling based parallel
SAT solver. In Parallel Processing (ICPP), 2013 42nd International Conference on, pages
110–119, Oct 2013.
[3] J. D. Backes and M. D. Riedel. Using cubes of non-state variables with property directed
reachability. In Design, Automation Test in Europe Conference Exhibition (DATE), 2013,
pages 807–810, March 2013.
[4] B. Benhamou, T. Nabhani, R. Ostrowski, and M. R. Saidi. Enhancing clause learning by
symmetry in SAT solvers. In Tools with Artificial Intelligence (ICTAI), 2010 22nd IEEE
International Conference on, volume 1, pages 329–335, Oct 2010.
[5] Ryan Berryhill and Andreas Veneris. Automated rectification methodologies to functional
state-space unreachability. In Proceedings of the 2015 Design, Automation & Test in Europe
Conference & Exhibition, DATE ’15, pages 1401–1406, 2015.
[6] Ryan Berryhill and Andreas Veneris. A complete approach to unreachable state diagnos-
ability via property directed reachability. In Proceedings of the 2016 Asia and South Pacific
Design Automation Conference, ASP-DAC ’16, 2016.
[7] Ryan Berryhill and Andreas Veneris. Efficient selection of suspect sets in unreachable
state diagnosis. In Proceedings of the 2016 Int’l Symposium on Artificial Intelligence and
Mathematics, ISAIM ’16, 2016.
[8] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, and Y. Zhu. Bounded model checking.
In Advances in Computers, volume 58, pages 118–149, 2003.
[9] A. R. Bradley. SAT-based model checking without unrolling. In Intl Conf. on Verification,
Model Checking, and Abstract Interpretation, pages 70–87, 2011.
[10] Robert Brummayer and Armin Biere. Local two-level and-inverter graph minimization
without blowup. In Proceedings of the 2nd Doctoral Workshop on Mathematical and En-
gineering Methods in Computer Science, MEMICS ’06, 2006.
[11] G. Cabodi, M. Palena, and P. Pasini. Interpolation with guided refinement: Revisiting
incrementality in SAT-based unbounded model checking. In Formal Methods in Computer-Aided
Design (FMCAD), 2014, pages 43–50, Oct 2014.
[12] Kai-Hui Chang, I.L. Markov, and V. Bertacco. Automating post-silicon debugging and
repair. In Computer-Aided Design, 2007. ICCAD 2007. IEEE/ACM International Con-
ference on, pages 91–98, Nov 2007.
[13] Hana Chockler, Alexander Ivrii, Arie Matsliah, Shiri Moran, and Ziv Nevo. Incremental
formal verification of hardware. In Proceedings of the International Conference on For-
mal Methods in Computer-Aided Design, FMCAD ’11, pages 135–143, Austin, TX, 2011.
FMCAD Inc.
[14] Hong-Zu Chou, Kai-Hui Chang, and Sy-Yen Kuo. Facilitating unreachable code diagnosis
and debugging. In Proceedings of the 16th Asia and South Pacific Design Automation
Conference, ASPDAC ’11, pages 485–490, Piscataway, NJ, USA, 2011. IEEE Press.
[15] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduc-
tion to Algorithms, Third Edition. The MIT Press, 3rd edition, 2009.
[16] S. Disch and C. Scholl. Combinational equivalence checking using incremental SAT solving,
output ordering, and resets. In Design Automation Conference, 2007. ASP-DAC ’07. Asia
and South Pacific, pages 938–943, Jan 2007.
[17] N. Een and N. Sorensson. An extensible SAT-solver. In SAT, pages 502–518, 2003.
[18] Niklas Een, Alan Mishchenko, and Robert Brayton. Efficient implementation of property
directed reachability. In Proceedings of the International Conference on Formal Methods
in Computer-Aided Design, FMCAD ’11, pages 125–134, Austin, TX, 2011. FMCAD Inc.
[19] G. Fey, S. Staber, R. Bloem, and R. Drechsler. Automatic fault localization for property
checking. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions
on, 27(6):1138–1149, June 2008.
[20] E. Goldberg, M. Prasad, and R. Brayton. Using SAT for combinational equivalence checking.
In Design, Automation and Test in Europe, DATE ’01, pages 114–121, 2001.
[21] Zyad Hassan, Aaron R. Bradley, and Fabio Somenzi. Better generalization in IC3. In
Formal Methods in Computer-Aided Design, FMCAD’13, pages 157–164. IEEE, 2013.
[22] Krystof Hoder and Nikolaj Bjørner. Generalized property directed reachability. In Pro-
ceedings of the 15th International Conference on Theory and Applications of Satisfiability
Testing, SAT’12, pages 157–171, Berlin, Heidelberg, 2012. Springer-Verlag.
[23] Shi-Yu Huang and Kwang-Ting Cheng. Formal Equivalence Checking and Design Debugging.
Kluwer Academic Publishers, Norwell, MA, USA, 1998.
[24] Alexander Ivrii and Arie Gurfinkel. Pushing to the top. In Formal Methods in Computer-
Aided Design, FMCAD ’15, 2015.
[25] Alexander Ivrii, Arie Gurfinkel, and Anton Belov. Small inductive safe invariants. In
Formal Methods in Computer-Aided Design, FMCAD ’14, pages 115–122, 2014.
[26] H. Mangassarian, A. Veneris, S. Safarpour, M. Benedetti, and D. Smith. A performance-driven
QBF-based iterative logic array representation with applications to verification,
debug and test. In Intl Conf. on CAD, 2007.
[27] Joao P. Marques-Silva and Karem A. Sakallah. Boolean satisfiability in electronic design
automation. In Design Automation Conference, DAC ’00, pages 675–680, 2000.
[28] K. McMillan. Interpolation and SAT-based model checking. In Computer Aided Verification,
2003.
[29] A. Mishchenko, S. Chatterjee, R. K. Brayton, and N. Een. Improvements to combinational
equivalence checking. In Intl Conf. on CAD (ICCAD), pages 836–843, 2006.
[30] Matthew W. Moskewicz, Conor F. Madigan, Ying Zhao, Lintao Zhang, and Sharad Malik.
Chaff: Engineering an efficient SAT solver. In Design Automation Conference, DAC ’01,
pages 530–535, 2001.
[31] OpenCores.org. http://www.opencores.org, 2007.
[32] V. Paruthi and A. Kuehlmann. Equivalence checking combining a structural SAT-solver,
BDDs, and simulation. In Computer Design, 2000. Proceedings. 2000 International
Conference on, pages 459–464, 2000.
[33] S. Safarpour, A. Veneris, and H. Mangassarian. Trace compaction using SAT-based
reachability analysis. In Design Automation Conference, 2007. ASP-DAC ’07. Asia and South
Pacific, pages 932–937, 2007.
[34] S. Safarpour, A. Veneris, and F. Najm. Managing verification error traces with bounded
model debugging. In Design Automation Conference (ASP-DAC), 2010 15th Asia and
South Pacific, pages 601–606, 2010.
[35] Sean Safarpour and Andreas Veneris. Abstraction and refinement techniques in automated
design debugging. In Proceedings of the Conference on Design, Automation and Test in
Europe, DATE ’07, pages 1182–1187, 2007.
[36] Sean Safarpour and Andreas Veneris. Automated design debugging with abstraction and
refinement. Trans. Comp.-Aided Des. Integ. Cir. Sys., 28(10):1597–1608, October 2009.
[37] Joao P. Marques Silva and Karem A. Sakallah. Grasp—a new search algorithm for
satisfiability. In International Conference on Computer-aided Design, ICCAD ’96, pages
220–227, 1996.
[38] A. Smith, A. Veneris, M. F. Ali, and A. Viglas. Fault diagnosis and logic debugging
using Boolean satisfiability. IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.,
24(10):1606–1621, Oct. 2005.
[39] G. S. Tseitin. On the complexity of derivations in the propositional calculus. Studies in
Mathematics and Mathematical Logic, Part II:115–125, 1968.
[40] Yakir Vizel, Orna Grumberg, and Sharon Shoham. Intertwined forward-backward
reachability analysis using interpolants. In Tools and Algorithms for the Construction
and Analysis of Systems: 19th International Conference, TACAS 2013, Held as Part of
the European Joint Conferences on Theory and Practice of Software, ETAPS 2013, Rome,
Italy, March 16-24, 2013. Proceedings, pages 308–323. Springer Berlin Heidelberg, Berlin,
Heidelberg, 2013.
[41] T. Welp and A. Kuehlmann. QF_BV model checking with property directed reachability. In
Design, Automation and Test in Europe Conference and Exhibition (DATE), 2013, pages 791–796,
March 2013.
[42] T. Welp and A. Kuehlmann. Property directed invariant refinement for program verifi-
cation. In Design, Automation and Test in Europe Conference and Exhibition (DATE),
2014, pages 1–6, March 2014.
[43] T. Welp and A. Kuehlmann. Property directed reachability for QF_BV with mixed type
atomic reasoning units. In Design Automation Conference (ASP-DAC), 2014 19th Asia
and South Pacific, pages 738–743, Jan 2014.