[IEEE 2012 IEEE 10th International New Circuits and Systems Conference (NEWCAS) - Montreal, QC,...



Page 1: [IEEE 2012 IEEE 10th International New Circuits and Systems Conference (NEWCAS) - Montreal, QC, Canada (2012.06.17-2012.06.20)] 10th IEEE International NEWCAS Conference - Probabilistic

Probabilistic Model Checking of Clock Domain Crossing Interfaces

Zaid Al-bayati, O. Ait Mohamed ECE Department,

Concordia University Montréal, QC, Canada

{z_albaya, ait}@ece.concordia.ca

Yvon Savaria EE Department,

École Polytechnique de Montréal, Montréal, QC, Canada

[email protected]

Mounir Boukadoum CS Department,

UQAM Montréal, QC, Canada

[email protected]

Abstract—Clock domain crossing (CDC) interfaces constitute an increasingly essential part of large digital systems and Systems on Chip (SoCs). These interfaces are inherently difficult to design and debug. In this paper, we demonstrate how probabilistic model checking can be employed in the verification of CDC protocols. Popular CDC interfaces are modeled as Markov Decision Processes and verified using the PRISM model checker.

I. INTRODUCTION

As designs grow in size and complexity, more and more designers are building chips that encompass several clock domains. Different clock domains communicate through Clock Domain Crossing interfaces (CDCs). A major concern with the CDC interface design is the inevitability of violating the timing constraints of the clock domains since data and control signals can be generated in one domain and used in another. This may cause metastability in the receiving clock domain. If metastability is not dealt with appropriately, it can cause problems in the system ranging from transient errors to system failure.

Several CDC interfaces have been developed to reduce the probability of errors in chips operating on multiple clocks. Some of the most common CDC interfaces used today are FIFO based interfaces [1], bundled data protocol based interfaces [2], and pausable clocking interfaces [3]. Most CDC interfaces use synchronizers, most commonly two-flop synchronizers, to synchronize signals moving from one domain to the other. These synchronizers need to be within the context of an appropriate protocol in order to operate robustly.

Formal verification is one of the prominent methods to check the correctness of complex systems. Model checking is one of the most important formal verification techniques used today. Model checking tools have two shortcomings when applied to CDC interfaces:

Model checkers are inherently built assuming a single clock. Commercial model checkers available today will either require the user to generate the clocks, or not support multiple clock domains at all [4].

CDC interfaces are subject to metastability failures. These failures are probabilistic and depend on the protocol used as well as on continuous-time circuit level issues that cannot be modeled at the high level of abstraction in which model checkers work.

In this paper, we provide a framework for verifying CDC interfaces that takes the above-mentioned challenges into account. We overcome the limitations of other approaches by modeling CDC interfaces as Markov Decision Processes (MDPs), which capture both stochastic and non-deterministic behavior [5] and are therefore well suited to CDC interfaces. Metastability is modeled as a state in the MDP, and properties are specified in PCTL (Probabilistic Computation Tree Logic) and some of its extensions as specified in [5]. The PRISM [6] model checker, which allows for quantitative analysis of stochastic systems, is used to verify our models of these interfaces.

The rest of this paper is organized as follows. In Section II, we discuss related work on CDC verification. In Section III, we give an overview of the PRISM model checker. In Section IV, we discuss our modeling of metastability, and in Section V we apply it to different CDC interfaces. Section VI presents our conclusions.

II. RELATED WORK

Several methodologies have been presented for the formal verification of CDC interfaces. One of the most important works on CDC verification is presented in [7]. In their paper, the authors discuss two methods for verifying CDC interfaces. In the first, verification rules are generated as PSL properties from the Signal Transition Graph (STG) representing the protocol specification and are applied to the RuleBase model checker. These rules verify that STG events occur in order and that each event happens only in the appropriate states. The second method checks the correctness of data transfers and checks for missing or duplicated data. The authors, however, explicitly mention that the probability of metastability is ignored in their approach.

978-1-4673-0859-5/12/$31.00 ©2012 IEEE

Another interesting work is presented in [4]. In this work, the authors use SAT based bounded model checking to verify CDC protocols. The focus is on modeling multiple clocks in the bounded model checker. They assume that setup and hold times are zero. Different clocks are modeled by assigning a state variable for each clock which can be either 0 or 1 at each verification tick.
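As a rough illustration of the clock-as-state-variable idea described above, the following Python sketch represents each clock as a 0/1 value per verification tick and enumerates the joint valuations a checker would have to consider. The function names and the example traces are hypothetical and are not taken from [4]; the sketch only shows the general idea, not the encoding used by that tool.

```python
from itertools import product

def rising_edges(trace):
    """Return the tick indices at which a 0/1 clock trace goes from 0 to 1."""
    return [i for i in range(1, len(trace)) if trace[i - 1] == 0 and trace[i] == 1]

def joint_clock_traces(n_ticks, n_clocks=2):
    """Enumerate every assignment of 0/1 to each clock at each verification tick."""
    per_tick = list(product((0, 1), repeat=n_clocks))
    return product(per_tick, repeat=n_ticks)

clk_a = [0, 1, 0, 1, 0, 1]   # hypothetical fast clock
clk_b = [0, 0, 1, 1, 0, 0]   # hypothetical slow clock
edges_a = rising_edges(clk_a)   # ticks at which domain A registers update
edges_b = rising_edges(clk_b)   # ticks at which domain B registers update
n_traces = sum(1 for _ in joint_clock_traces(3))
```

Even for two clocks and three ticks the unconstrained enumeration already has 4^3 joint traces, which hints at why the number of clock orderings grows quickly.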

In [8], a methodology for verifying multiple clock designs in the SMV model checker is presented. The work focuses on dealing with the zero-delay abstraction performed by formal verification tools. The output of each gate is set to a random value for a single verification step for all gates or components along the critical path between two domains. The authors acknowledge that this method might lead to under- or over-approximations if there is more than one critical path in the interface; however, they assume that these situations do not arise. In reality, hardware paths have significant variations in delay, leading to situations that are totally different from the cases predicted by the approach.

In [9], the authors use the SAL model checker to prove the correctness of a simple CDC interfacing circuit. The interface is modeled as three interacting processes and a proof is generated to check that the circuit satisfies a basic invariant.

Many approaches for verifying CDC interfaces using simulation have also been presented, such as in [10], where SystemVerilog assertions are generated to check CDC interfaces in a simulation-based flow. Several checks for common synchronizers and CDC interfaces are discussed. However, the huge number of possible clock orderings observed with multiple clock domain circuits makes formal verification the preferred choice for verifying CDC interfaces [7].

III. PROBABILISTIC MODEL CHECKING & PRISM

Probabilistic model checking is a formal verification technique that can be applied to systems that have stochastic behavior. It is a generalization of model checking that supports probabilistic models [5]. Probabilistic model checking does not only provide a Yes/No answer on whether a property is satisfied by the model. It can also provide quantitative measures on the minimum and maximum probability that a certain property holds. Probabilistic model checking has a wide range of applications in fields such as communication protocols and reliability analysis.
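To make the minimum/maximum distinction concrete, the following Python sketch computes both bounds for a reachability property on a toy MDP by value iteration. The toy model, the state names, and the function `reach_probability` are illustrative assumptions; this is not how PRISM is implemented, only a minimal instance of the general technique.

```python
from typing import Dict, List, Set, Tuple

# state -> list of actions; each action is a list of (probability, next_state)
MDP = Dict[str, List[List[Tuple[float, str]]]]

def reach_probability(mdp: MDP, targets: Set[str], maximize: bool,
                      iters: int = 1000) -> Dict[str, float]:
    """Value iteration for the min or max probability of eventually reaching targets."""
    choose = max if maximize else min
    value = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(iters):
        new = {}
        for s, actions in mdp.items():
            if s in targets:
                new[s] = 1.0
            else:
                # The scheduler resolves non-determinism by picking one action.
                new[s] = choose(sum(p * value[t] for p, t in a) for a in actions)
        value = new
    return value

# Hypothetical toy model: from "start" the scheduler picks a safe or a risky action.
toy: MDP = {
    "start": [
        [(1.0, "ok")],
        [(0.9, "ok"), (0.1, "fail")],
    ],
    "ok": [[(1.0, "ok")]],
    "fail": [[(1.0, "fail")]],
}
p_min = reach_probability(toy, {"fail"}, maximize=False)["start"]
p_max = reach_probability(toy, {"fail"}, maximize=True)["start"]
```

Here the minimum is obtained by the scheduler that always picks the safe action and the maximum by the one that picks the risky action, which is the sense in which an MDP yields a range rather than a single probability.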

PRISM [6, 11] is an open source probabilistic model checker. PRISM supports many different probabilistic models such as discrete-time and continuous-time Markov chains and Markov Decision Processes (MDPs). We will primarily focus on MDPs because they incorporate the ability to express non-determinism, which is essential for our models. The PRISM language is a state-based language described thoroughly in the PRISM manual available at [11]. Models in PRISM are written in the form of state-based modules, each composed of a set of guarded commands. PRISM supports a wide range of probabilistic temporal logics to specify the properties to be verified, such as PCTL, PCTL* and CSL.

IV. UNDERSTANDING & MODELING METASTABILITY

Metastability is a well-studied phenomenon in digital circuits. It occurs in bi-stable elements such as flip-flops when the input changes within the setup or hold time of the flip-flop, causing it to enter a state in which its output is neither 0 nor 1 for a non-deterministic amount of time. The flip-flop eventually resolves non-deterministically to either 0 or 1.

A synchronizer composed of two cascaded flip-flops is usually used to reduce metastability-related failures. However, non-determinism remains an issue as the first flip-flop might converge to either of the two values, and this value gets captured in the second flip-flop. Moreover, if the first flip-flop does not converge to a stable value by the time of the next clock edge, the second flip-flop might itself become metastable (although with a lower probability) and propagate this metastability to the receiving domain. Several designs have failed due to the lack of synchronizers or their inappropriate use in CDC protocols, or simply because the synchronizer itself failed. In [2], the authors give examples of some successful and unsuccessful design attempts.

In our verification framework, both metastable voltage outputs and non-determinism were modeled. Our model of metastability for a 0-to-1 transition of a flip-flop is shown in Figure 1. The transition is modeled as a non-deterministic choice between a normal operation mode in which it does not get into metastability and a metastability mode in which it enters metastability. When a flip-flop enters metastability, its output is set to one of three possible outputs {0, 1, M}. Probabilities of entering each state {p1, p2, p3} depend on the type and design of the flip-flop and can be obtained through simulation. The probabilities can then be plugged into PRISM for verifying the system as a whole in the presence of metastability. A designer might be especially interested in the overall behavior of the system for a given probability of metastability (p3). When in this state, different behaviors can be defined such as system failure or restart. The PRISM code representing the metastability model is shown in Figure 2.

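Figure 1. Metastability model for a 0->1 transition of a flip-flop. [State diagram: from S0 (X=0), the metastability branch leads to S1 (X=1) with probability p1, to S2 (X=0) with probability p2, and to S3 (X=M) with probability p3.]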

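Figure 2. PRISM code for 0->1 transition of a flip-flop

// normal operation: the output goes directly to 1
[] s=0 -> (s'=1)&(X'=1);
// metastability mode: X=2 encodes the metastable output M
[] s=0 -> p1:(s'=1)&(X'=1) + p2:(s'=2)&(X'=0) + p3:(s'=3)&(X'=2)&(metas'=true);
// a flip-flop that resolved to 0 captures the new value on the next edge
[] s=2 -> (s'=1)&(X'=1);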
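The guarded commands of Figure 2 can be mirrored in a few lines of Python to show how the non-deterministic choice at s=0 produces a range of probabilities. This is only a sketch: the values of P1 and P2 are assumptions (the paper does not list them), P3 reuses the sender figure quoted in Section V, and the helper names `enabled` and `metas_probability` are hypothetical.

```python
# Hypothetical probabilities: P3 reuses the 5x10^-4 sender figure from Section V;
# the even split between P1 and P2 is an assumption made for illustration.
P3 = 5e-4
P1 = P2 = (1.0 - P3) / 2.0

# Each command: (guard on s, [(probability, next s, next X, next metas), ...]).
# X = 2 encodes the metastable output M, as in Figure 2.
COMMANDS = [
    (0, [(1.0, 1, 1, False)]),                                       # normal operation
    (0, [(P1, 1, 1, False), (P2, 2, 0, False), (P3, 3, 2, True)]),   # metastability mode
    (2, [(1.0, 1, 1, False)]),                                       # late resolution to 1
]

def enabled(s):
    """Return the branch lists of all commands whose guard matches state s."""
    return [branches for guard, branches in COMMANDS if guard == s]

def metas_probability(branches):
    """Probability that one command sets metas to true."""
    return sum(p for p, _s, _x, metas in branches if metas)

choices = enabled(0)
p_min = min(metas_probability(b) for b in choices)   # scheduler picks normal operation
p_max = max(metas_probability(b) for b in choices)   # scheduler picks metastability mode
```

For a single transition the bounds are simply 0 and P3; the interest of model checking lies in composing many such transitions inside a protocol.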


V. MODELING CDC INTERFACES

In this section, we discuss the modeling of two common CDC interfaces in PRISM, namely bundled-data protocol based interfaces [2], and FIFO based interfaces [1].

A. Bundled-Data Protocol CDC Interface

The bundled-data protocol CDC interface [2] (Figure 3) is a simple interface that uses a request/ack mechanism to transfer data. The mechanism is defined as follows:

The sender places data in REGs and informs the interface by asserting the snd signal.

The sender part of the CDC interface sends a transfer request (R1) to the receiver side of the CDC interface which gets synchronized into the receiver domain through the two-flop synchronizer generating the R2 signal.

R2 latches the data into the receiver side. Then, the receiver acknowledges the reception through the A1 signal which gets synchronized into the sender using a two-flop synchronizer.
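The three steps above can be sketched as a small Python simulation. It is a simplification under stated assumptions: both domains are clocked in lockstep (receiver first, then sender), the synchronizers are ideal (no metastability), and only a single transfer is modeled. The class `TwoFlopSync` and the function `transfer` are hypothetical names; the signal names follow the text.

```python
class TwoFlopSync:
    """Ideal two-flop synchronizer: the input appears at the output on the second clock edge."""

    def __init__(self):
        self.ff1 = 0
        self.ff2 = 0

    def clock(self, d):
        # On a clock edge, ff2 captures the old ff1 and ff1 captures the input.
        self.ff2, self.ff1 = self.ff1, d
        return self.ff2

def transfer(word, max_ticks=20):
    """Move one word across the interface; return (received word, ticks until A2)."""
    reg_s = word          # sender places data in REGs and asserts snd
    reg_r = None
    r1, a1 = 1, 0         # request raised by the sender side, no acknowledge yet
    req_sync, ack_sync = TwoFlopSync(), TwoFlopSync()
    for tick in range(1, max_ticks + 1):
        r2 = req_sync.clock(r1)          # receiver clock edge
        if r2 and reg_r is None:
            reg_r = reg_s                # R2 latches the data into the receiver
            a1 = 1                       # receiver acknowledges through A1
        a2 = ack_sync.clock(a1)          # sender clock edge
        if a2:
            return reg_r, tick
    raise RuntimeError("no acknowledge within max_ticks")

result = transfer(0xAB)
```

Under these assumptions the data is latched two receiver edges after the request and the acknowledge reaches the sender one tick later; with unrelated clocks and metastability the latency would vary, which is what the MDP model captures.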

The interface is modeled as an MDP. Metastability is modeled as shown in Figure 1. The FSMs of the sender and receiver interact through global variables. The probability of the sender flip-flop entering metastability and staying metastable for more than one cycle was assumed to be 5×10^-4, and for the receiver flip-flop, it was assumed to be 2.5×10^-4. Several references such as [12] discuss computing the probability of failures for different types of flip-flops.

Three properties for this model were checked:

PP1: Pmin=? [(snd=true)=>(F (A2=1))]
Returns the minimum probability that a request to send data is acknowledged.

PP2: Pmax=? [(request=false)&(R2=1)]
Returns the maximum probability that data is latched to the receiver without being sent by the sender, where request is a variable that keeps track of ongoing requests that have not been acknowledged.

PP3: Pmax=? [ F<K (metas=true)]
Returns the maximum probability that metastability occurs within K cycles.

The PRISM model constructed consists of 90 states and 339 transitions. Memory requirements for the constructed model were 56 KB. Table I shows the returned results and the time required for model checking for each of the properties specified. The table shows that the verification time is quite low. The verification has been conducted on a machine with an Intel Core i5 CPU running at 2.27 GHz and with 4 GB of memory. It is natural that the probability that the system enters a metastable state increases with time as more requests are synchronized. Figure 4 shows the probability of entering a metastable state as a function of the number of cycles (K). The probability increases quickly at the start and, as the number of cycles exceeds 25000, it slowly converges to 1.

Figure 3. Bundled-data protocol CDC interface [2]
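The growth of the bounded probability with K can be illustrated with a deliberately tiny MDP. In the Python sketch below, the scheduler may let the synchronizer input toggle in every cycle, and each toggle enters metastability with probability p_meta. This is an assumption made for illustration: the paper's model spends several cycles per handshake, so the sketch is not meant to reproduce Table I or Figure 4. The function names are hypothetical.

```python
def max_prob_within(k_cycles, p_meta):
    """Pmax of reaching the metastable state within k_cycles steps of a toy MDP.

    Each cycle the scheduler either idles or lets the input toggle; a toggle
    enters metastability with probability p_meta (bounded value iteration).
    """
    value_ok, value_meta = 0.0, 1.0      # values with zero steps remaining
    for _ in range(k_cycles):
        idle = value_ok
        toggle = p_meta * value_meta + (1.0 - p_meta) * value_ok
        value_ok = max(idle, toggle)     # maximizing scheduler
    return value_ok

def closed_form(k_cycles, p_meta):
    """Probability of at least one metastable event in k independent toggles."""
    return 1.0 - (1.0 - p_meta) ** k_cycles

curve = [max_prob_within(k, 5e-4) for k in (0, 100, 1000, 25000)]
```

The iteration agrees with the closed form 1 - (1 - p)^K, which is monotonically increasing in K and tends to 1, matching the qualitative shape described in the text.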

B. FIFO-Based CDC Interface

FIFO based CDC interfaces [1] are used for high throughput applications. These interfaces use a dual-clock circular queue as shown in Figure 5 to transfer data between two different clock domains. The FIFO acts as intermediate storage for data. The sender places the data item in the queue by putting it in the data_put port and making a request through req_put. The receiver retrieves the item when it is ready by asserting req_get. The FIFO can be viewed as consisting of two interfaces: a put interface that communicates with the sender and a get interface that communicates with the receiver. The ptoken (gtoken) signal is internally used as a pointer to the end (start) of the queue. The sender (receiver) needs to be informed when the queue is full (empty). Since the full signal and the empty signal are asynchronous to the interfaces, a two-flop synchronizer is required at each end. The mechanisms for generating these two signals and for latching and releasing data are fairly involved; the details are explained in [1].
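A single-threaded Python sketch of the circular queue may help fix the roles of ptoken and gtoken. It ignores the two clock domains and the synchronizers on full and empty, and it derives those flags directly from per-cell full bits, which is an assumption and not the detection logic of [1]. The class name `TokenRingFifo` is hypothetical.

```python
class TokenRingFifo:
    """Circular queue with a put token and a get token (no clocks, no synchronizers)."""

    def __init__(self, size=4):
        self.cells = [None] * size
        self.cfull = [False] * size   # per-cell full flags
        self.ptoken = 0               # cell that receives the next put (end of queue)
        self.gtoken = 0               # cell that serves the next get (start of queue)

    @property
    def full(self):
        return self.cfull[self.ptoken]

    @property
    def empty(self):
        return not self.cfull[self.gtoken]

    def put(self, item):
        """Write item at ptoken; return False if the queue is full."""
        if self.full:
            return False
        self.cells[self.ptoken] = item
        self.cfull[self.ptoken] = True
        self.ptoken = (self.ptoken + 1) % len(self.cells)
        return True

    def get(self):
        """Read the item at gtoken; return None if the queue is empty."""
        if self.empty:
            return None
        item = self.cells[self.gtoken]
        self.cfull[self.gtoken] = False
        self.gtoken = (self.gtoken + 1) % len(self.cells)
        return item

fifo = TokenRingFifo(size=4)
accepted = [fifo.put(x) for x in "abcde"]   # the fifth put finds the queue full
drained = [fifo.get() for _ in range(5)]    # the fifth get finds the queue empty
```

In the real interface the full and empty flags cross clock domains, so each side may observe them late; that delay is where the synchronizers and the probabilistic behavior enter.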

TABLE I. VERIFICATION RESULTS

| Property | Model checking time (seconds) | Result (P=?) |
|---|---|---|
| PP1 | 0.037 | 0.99925 |
| PP2 | 0.031 | 0 |
| PP3 (K=1000) | 0.027 | 0.095 |

Figure 4. Maximum probability as a function of number of cycles

Figure 5. FIFO-based interface [1]

The FIFO interface was modeled in PRISM in a similar way to the bundled-data protocol interface. An MDP module for a put interface and a get interface was constructed. A FIFO size of 4 cells was modeled. The same metastability probabilities of 5×10^-4 and 2.5×10^-4 were used. The PRISM model constructed consists of 33128 states and 157680 transitions. This interface has several additional blocks as illustrated in [1], and is much more complicated than the bundled-data CDC interface, leading to a larger model with a larger number of state variables. Memory requirements for the model were 4.2 MB. Some properties verified are:

PP1: Pmin=? [((reqput)&(!full)&(ptoken=n))=>(F((reqget)&(!empty)&(gtoken=n)))]
Returns the minimum probability that a data item written to the FIFO is eventually read. n is the cell index. For this FIFO, n was given values in [0,3].

PP2: Pmax=? [ F<K (metas=true) ]
Returns the maximum probability that metastability occurs within K cycles.

PP3: Pmax=? [(enput)&(ptoken=n)&(cfulln=true)]
Returns the maximum probability of writing to a full cell, where enput is an internal signal in the FIFO that enables writing, and cfulln indicates whether cell n of the FIFO is full.

PP4: Pmax=? [(enget)&(gtoken=n)&(cfulln=false)]
Returns the maximum probability of reading from an empty cell, where enget is an internal signal in the FIFO that enables reading.
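If PP3 and PP4 are read as reachability queries, a maximum probability of 0 means that no sequence of moves reaches the bad situation at all, so a purely qualitative state-space search gives the same verdict. The Python sketch below explores an abstract 4-cell FIFO (tokens and per-cell full flags only, no clocks, synchronizers or probabilities) and counts writes to a full cell and reads from an empty cell. The abstraction, the `guarded` switch, and the function names are assumptions made for illustration.

```python
from collections import deque

SIZE = 4  # FIFO depth used in the case study

def successors(state, guarded):
    """Yield (next_state, violation) for every put or get move from state."""
    ptoken, gtoken, cfull = state
    # put: write the cell under ptoken (blocked on a full cell when guarded)
    if not guarded or not cfull[ptoken]:
        bad = cfull[ptoken]                       # writing to a full cell
        cells = cfull[:ptoken] + (True,) + cfull[ptoken + 1:]
        yield ((ptoken + 1) % SIZE, gtoken, cells), bad
    # get: read the cell under gtoken (blocked on an empty cell when guarded)
    if not guarded or cfull[gtoken]:
        bad = not cfull[gtoken]                   # reading from an empty cell
        cells = cfull[:gtoken] + (False,) + cfull[gtoken + 1:]
        yield (ptoken, (gtoken + 1) % SIZE, cells), bad

def explore(guarded=True):
    """Breadth-first search; return (number of reachable states, violating moves)."""
    start = (0, 0, (False,) * SIZE)
    seen, queue, violations = {start}, deque([start]), 0
    while queue:
        state = queue.popleft()
        for nxt, bad in successors(state, guarded):
            violations += bad
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen), violations

guarded_result = explore(guarded=True)
unguarded_result = explore(guarded=False)
```

With the guards in place the search finds no violating move, while removing them makes violations reachable, which is the kind of design error such properties are meant to expose.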

The verification results are shown in Table II. Verification of some properties took more time than similar properties in the first case study; however, verification times are still small. Figure 6 shows the probability of entering a metastable state as a function of the number of cycles. The probability is generally higher than for the bundled-data based interface since the interface is more complex and data can be transmitted at a higher rate, leading to a higher frequency of changes at the inputs of the flip-flops.

TABLE II. VERIFICATION RESULTS FOR FIFO

| Property | Model checking time (seconds) | Result (P=?) |
|---|---|---|
| PP1 | 6.393 | 1 |
| PP2 (K=1000) | 7.248 | 0.28 |
| PP3 (n=0) | 0.139 | 0 |
| PP4 (n=0) | 0.802 | 0 |

Figure 6. Maximum probability as a function of no. of cycles for FIFO

VI. CONCLUSION

In this paper, we have presented a method for verifying CDC interfaces using probabilistic model checking. The proposed method allows for modeling metastability. CDC interfaces are modeled as Markov Decision Processes integrating both probabilistic and non-deterministic behavior. Two different CDC interfaces, bundled-data based CDC and FIFO-based CDC, were modeled and verified using this approach. Properties were written in PCTL and verified using the PRISM model checker.

The approach presented here could form the basis of a larger framework that is used to formally verify Globally-Asynchronous Locally-Synchronous (GALS) systems. This would help designers detect errors resulting from signals crossing clock domains much earlier in the design cycle and enhance the designs’ reliability.

REFERENCES

[1] T. Chelcea and S. M. Nowick, “Robust interfaces for mixed-timing systems,” IEEE Trans. Very Large Scale Integration (VLSI) Systems, vol. 12, no. 8, pp. 857–873, Aug. 2004.

[2] R. Ginosar, “Fourteen ways to fool your synchronizer,” in Proc. 9th IEEE Int. Symp. Asynchronous Circuits and Systems (ASYNC’03), 2003, pp. 89–97.

[3] D. Chapiro, “Globally-Asynchronous Locally-Synchronous Systems,” Ph.D. dissertation, Stanford University, 1984.

[4] E. M. Clarke, D. Kroening, and K. Yorav, “Specifying and Verifying Systems with Multiple Clocks,” in Proc. 21st Int. Conf. on Computer Design (ICCD’03), 2003, pp. 48–55.

[5] V. Forejt, M. Kwiatkowska, G. Norman and D. Parker, “Automated Verification Techniques for Probabilistic Systems,” Formal Methods for Eternal Networked Software Systems (SFM'11), vol. 6659 of LNCS, pp. 53-113, Springer, June 2011.

[6] M. Kwiatkowska, G. Norman, and D. Parker, “PRISM 4.0: Verification of Probabilistic Real-time Systems,” In Proc. 23rd International Conference on Computer Aided Verification (CAV’11), vol. 6806 of LNCS, pp. 585-591, Springer, 2011.

[7] T. Kapschitz and R. Ginosar, “Formal verification of synchronizers,” CHARME’05, vol. 3725 of LNCS, pp. 359-362, 2005.

[8] D. Safranek, A. Smrcka, T. Vojnar, V. Rehak, Z. Rehak, and P. Matousek, “Verifying VHDL Design with Multiple Clocks in SMV,” FMICS 2006, vol. 4346 of LNCS, Springer.

[9] G. Brown, “Verification of a Data Synchronization Circuit For All Time,” in Proc. 6th Int. Conf. on Application of Concurrency to System Design (ACSD’06), pp. 217–228, 2006.

[10] M. Litterick, “Pragmatic Simulation-Based Verification of Clock Domain Crossing Signals and Jitter using SystemVerilog Assertions,” in Proc. DVCon 2006.

[11] http://www.prismmodelchecker.org.

[12] D. Li, P. Chuang and M. Sachdev, “Comparative analysis and study of metastability on high-performance flip-flops,” 11th Int. Symp. on Quality Electronic Design (ISQED’10), pp. 853-860, 2010.
