Sub-threshold synchronizer



Microelectronics Journal 42 (2011) 840–850

Contents lists available at ScienceDirect

Microelectronics Journal

journal homepage: www.elsevier.com/locate/mejo

Jun Zhou a,*, Maryam Ashouei b, David Kinniment c, Jos Huisken b, Gordon Russell c, Alex Yakovlev c

a Institute of Microelectronics, ICS Lab, 11 Science Park Road, Singapore 117685, Singapore
b IMEC/Holst Centre, Eindhoven, The Netherlands
c School of Electrical, Electronic and Computer Engineering, Newcastle University, Newcastle, UK

* Corresponding author. Tel.: +65 67705538; fax: +65 67745754. E-mail address: [email protected] (J. Zhou).

0026-2692/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.mejo.2011.04.008

Article info

Article history:
Received 29 December 2010
Received in revised form 10 April 2011
Accepted 12 April 2011
Available online 6 May 2011

Keywords: Synchronizer; Sub-threshold; MTBF; Forward body-bias

Abstract

Sub-threshold operation has been proven to be very effective in reducing the power consumption of circuits when high performance is not required. Future low power systems on chip are likely to consist of many sub-systems operating at different frequencies and VDDs from the super-threshold to the sub-threshold region. Synchronizers are therefore needed to interface between these sub-systems. However, VDD scaling rapidly degrades synchronizers' performance, making them unsuitable for sub-threshold operation. For the first time, we analyze the synchronizer performance at ultra low voltages and propose to apply forward body bias to extend the operation of synchronizers to the sub-threshold region and to make them resilient to process variation. We show that applying full-VDD bias significantly increases the transconductance of the bi-stable in synchronizers without adding capacitance to the switching nodes. As a result, all the circuit parameters (τ, the metastability time constant; Td, the normal propagation delay; and Tw, the metastability window) determining synchronizer performance or mean time between failures (MTBF) can be improved by more than 80% (i.e. by five times) in the sub-threshold region. We also study the impact of process variation on the synchronizer performance in the sub-threshold region and conclude that with full-VDD bias the synchronizer MTBF can be improved from seconds to years for the worst case corner. Finally, we propose an implementation scheme of a full-VDD body-biased synchronizer, which is able to work for a wide range of VDDs from the sub-threshold region to nominal VDD with nearly zero overhead.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Technology scaling and the integration of more functionality on a single die pose many challenges, including high power density. This in turn results in a high cost for the cooling system and packaging and, in portable devices, a low battery lifetime. There are many techniques proposed to reduce dynamic [1] and leakage [3,4] power. One of the effective techniques to reduce dynamic power is to divide the system into many sub-systems and operate them with different power supplies and clock frequencies, which may involve dynamic scaling of the power supplies to voltages lower than the nominal VDD [11,12]. A more aggressive power reduction technique, known as sub-threshold operation [4,6,7,9,10], operates the system at an energy-optimal VDD in the sub-threshold or near-threshold region, where the dynamic energy balances the leakage energy and the operation of circuits relies on sub-threshold leakage current. Due to aggressive VDD scaling the total power is significantly reduced. However, this technique is only suitable for low and medium performance applications, such as medical monitoring utilizing a wireless sensor network, because the circuit delay degrades exponentially when VDD is scaled to the sub-threshold region. Also, the circuit performance is heavily subject to process variation in the sub-threshold region, which has to be addressed. Due to the performance limitation and process variability challenges of sub-threshold operation, circuit blocks operating from low voltages to sub-threshold voltages may coexist in a system in order to minimize the power while providing the required performance. Synchronizers are therefore needed for interfacing between these blocks since they have different clock frequencies.

However, one problem of synchronizers is that their performance degrades rapidly with decreasing VDD because the synchronizer operation depends on small signal behavior rather than large signal behavior [13,14]. In the past, synchronizer-related research has focused on metastability theory, synchronizer measurement and high performance synchronizer designs [11–18,21]. Very little research has been conducted on synchronizers for low voltage operation. In [22] the performance of a commonly used Jamb latch synchronizer at low voltages was studied and an improved synchronizer circuit was proposed for dealing with the degradation of synchronizer performance with VDD scaling. However, the study was only done for VDD values well above the threshold voltage. The synchronizer performance in the sub-threshold or near-threshold region has not been studied. Due to totally different transistor behavior, synchronizers that work well in the super-threshold region may not achieve the required MTBF in the sub-threshold region. In addition, the impact of the normal propagation delay (the Clock-to-Q delay when the data arrives much earlier than the clock) and the setup time on synchronizer MTBF is usually not considered because they are relatively small in the super-threshold region. However, in the sub-threshold region they increase dramatically as VDD decreases, which may affect the synchronizer MTBF significantly. Furthermore, the impact of process variation on circuit performance is substantial in the sub-threshold region. It may affect the synchronizer performance more heavily since the synchronizer performance depends on small signal behavior rather than large signal behavior. This has not been studied. These tasks need to be done in order to extend synchronization from the super-threshold region to the sub-threshold region.

In this paper, we analyze how two existing synchronizers, namely the Jamb latch synchronizer [15] and its improved version for low voltage operation [22], perform in the sub/near-threshold region. Furthermore, we analyze how, by using forward body bias, the performance of the two designs can be significantly improved for sub/near-threshold operation [27].

The contributions of this paper can be summarized as follows:

• The mean time between failures (MTBF) equation of synchronizers is re-analyzed in the sub-threshold region by taking into account the effect of the normal propagation delay and the setup time of the subsequent flipflop.
• The performance of two existing synchronizers (the Jamb latch synchronizer and its improved version) is analyzed at ultra low voltages. Their disadvantages are discussed. It is shown that both synchronizers exhibit unacceptable MTBF in the sub/near-threshold region.
• Synchronizers with forward body bias are proposed, which greatly improve the circuit parameters (metastability time constant τ, normal propagation delay Td and metastability window Tw) that determine the synchronizer performance. Compared with the existing synchronizers, an average improvement of more than 80% is achieved for all three parameters in the sub-threshold region.
• The impact of process variation on synchronization time is analyzed and it is shown that with full-VDD body bias the synchronizer MTBF can be improved from several seconds to years for the worst case corner.
• An implementation scheme for a full-VDD body-biased synchronizer is proposed, which significantly improves the synchronizer performance in the sub-threshold region with nearly zero overhead. It can be configured to work for a full range of VDDs from the super-threshold to the sub-threshold region.

Fig. 1. Metastability and synchronization.

The rest of the paper is organized as follows. Section 2 provides the background on synchronizer performance and MTBF. Section 3 analyzes two existing synchronizers and their performance when the supply voltage is scaled from the super-threshold to the sub-threshold region. Section 4 analyzes the impact of normal propagation delay and setup time on synchronizer MTBF in the sub-threshold/near-threshold region. Section 5 presents the proposed improvement in synchronizer performance in the sub-threshold/near-threshold region by using forward body bias (FBB). Section 6 analyzes the impact of process variation on synchronization performance and shows the improvement with FBB. Section 7 discusses the impact of FBB on the metastability window Tw. Section 8 presents the proposed implementation of the full-VDD biased synchronizer, and Section 9 concludes the paper.

2. Synchronizer performance

2.1. MTBF

Metastability may happen during data transfer between different clock domains, as shown in Fig. 1. This is because data from a different clock region is seen as an asynchronous signal by the flip-flop at the interface between clock domains. It can arrive at any time. When it arrives very close to the rising edge of the local clock and violates the setup condition of the flip-flop, metastability may occur at the output of the flip-flop. Metastability is often seen as an indeterminate level between logic 0 and logic 1, which may cause failures in subsequent circuit blocks that are designed only for defined logic levels. When metastability occurs, it will resolve to a logic 0 or 1 at a certain speed, which is determined by the circuit parameters of the flip-flop. If the metastability cannot settle before the next rising edge of the read clock, the indeterminate logic level will be transferred to the subsequent circuits, which may lead to a system failure. Synchronizers are used to resolve the metastability so that it is not passed to subsequent circuits. The performance of synchronizers is usually evaluated through their MTBF. Here a failure means that the metastability is not successfully resolved within the synchronizer and is passed to subsequent circuits. The MTBF formula is given by [14,15]:

\[ \mathrm{MTBF} = \frac{e^{t/\tau}}{T_w f_c f_d} \qquad (1) \]

where fc is the frequency at which the synchronizer is clocked, fd is the frequency of the incoming data, and τ is the metastability time constant, which determines the metastability resolution speed. t is the time allowed for the metastability to resolve. Tw is the metastability window, which is defined as the time difference between tnormal and tbalance, where tnormal is the latest data change time that still gives the normal propagation delay and tbalance is the data change time that gives the longest (theoretically infinite) response time, usually called the balance point [16]. τ and Tw are circuit parameters, which can be obtained through simulation or measurement [15–17,19]. Normally the synchronization time is equal to one full clock cycle. So, given the clock and data frequencies, the MTBF can be computed by using the measured or simulated τ and Tw.
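As a minimal sketch of how Eq. (1) behaves, the Python snippet below evaluates the formula at a hypothetical super-threshold operating point. The function name `mtbf_seconds` and all numeric values are assumptions chosen for illustration and are not taken from the paper. It shows that Tw scales the MTBF linearly while τ acts through the exponent.

```python
import math


def mtbf_seconds(t_resolve, tau, t_w, f_clock, f_data):
    """Eq. (1): MTBF = exp(t / tau) / (Tw * fc * fd), all quantities in SI units."""
    return math.exp(t_resolve / tau) / (t_w * f_clock * f_data)


# Hypothetical operating point (illustrative values only, not from the paper).
base = mtbf_seconds(t_resolve=10e-9, tau=50e-12, t_w=100e-12,
                    f_clock=100e6, f_data=100e6)

# Doubling the metastability window Tw halves the MTBF (linear dependence).
wider_window = mtbf_seconds(t_resolve=10e-9, tau=50e-12, t_w=200e-12,
                            f_clock=100e6, f_data=100e6)

# Doubling tau halves the exponent, which reduces the MTBF by many orders of magnitude.
slower_tau = mtbf_seconds(t_resolve=10e-9, tau=100e-12, t_w=100e-12,
                          f_clock=100e6, f_data=100e6)

print(f"base MTBF:       {base:.3e} s")
print(f"Tw doubled:      {wider_window:.3e} s")
print(f"tau doubled:     {slower_tau:.3e} s")
```

The contrast between the last two cases is the reason the discussion that follows concentrates on τ rather than on Tw.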

A drawback of the given MTBF formula is that the normal propagation delay and the setup time of the subsequent flipflop are assumed to be zero, so the synchronization time is fully used for the resolution of the metastability. This is not the case, especially in the sub-threshold/near-threshold region, where the propagation delay and setup time increase exponentially and their impact on MTBF cannot be neglected. This will be discussed later in Section 4.


2.2. Metastability time constant τ

From Eq. (1), it can be seen that the synchronizer performance is mainly determined by the metastability window Tw and the metastability time constant τ, where τ plays a much more important role than Tw since it affects the MTBF exponentially. Previous work [13,14] has used the small signal model to analyze the resolution of metastability in synchronizers. It has shown that τ is determined by the transconductance and the node capacitance of the cross-coupled inverters or gates in synchronizers, as shown in Eq. (2), where gm and C are the transconductance and node capacitance of the cross-coupled gates or inverters in the synchronizer, respectively.

\[ \tau = \frac{C}{g_m} \qquad (2) \]

Larger transconductance and smaller node capacitance result in faster resolution of metastability. Therefore research work on improving synchronizer performance has focused on enlarging the transconductance and reducing the node capacitance of the cross-coupled gates in synchronizers [14,15,20,22,23].

Previous work also showed that the synchronizer performance is subject to VDD and process variation because τ degrades rapidly as VDD decreases and Vth increases [22,24,25]. This degradation in synchronizer speed is more severe than that of logic circuits because metastability resolution depends on small signal behavior when the cross-coupled gates are at the metastable level, which is around half VDD. In addition, the effect of random process variation can average out over a long path in logic circuits, while this is not the case for τ and the synchronization time [25]. These problems prevent synchronizers from working in the sub-threshold/near-threshold region and must be addressed.

Fig. 2. Jamb latch.

Fig. 3. Tau versus VDD for Jamb latch (Y axis: Tau (s), logarithmic scale; X axis: VDD (V)).

3. Existing synchronizers and their performance at low VDDs

3.1. Jamb latch synchronizer

In the past the two-flipflop synchronizer has been heavily used because it is easy to implement [13,14,17,23]. However, normal flipflops are not designed specially for synchronization, so their synchronization performance is low compared to dedicated synchronizers [26]. The Jamb latch is a circuit commonly used as a synchronizer because of its relatively good performance [14,15,22]. As shown in Fig. 2, the circuit is set by pulling Node A to ground when data is high and clock is low, and reset by pulling Node B to ground, which is usually done when data goes low. Metastability occurs when the data and clock arrive very close to each other so that Node A is pulled to around half VDD just before the lower transistor is switched off. After that, the metastability is resolved at a certain speed towards 1 or 0 in the cross-coupled inverters. The resolution speed of metastability is determined by the metastability time constant τ, which is a circuit parameter dependent on the transconductance in the cross-coupled inverters and the capacitance at nodes A and B. The simplicity of the set and reset structure in the Jamb latch allows a lighter load on nodes A and B, and therefore leads to a small τ and fast resolution of metastability. It should be noted that two Jamb latches (master and slave) are needed to make a Jamb latch synchronizer, where the metastability is mainly resolved in the master latch and the second latch is just for the propagation of the data.

The problem of the Jamb latch synchronizer is that its performance degrades rapidly with VDD scaling. This is because τ is determined by the transconductance or current in the cross-coupled inverters when the nodes (A and B) are at half VDD (metastability). In the super-threshold region this current decreases quadratically with scaling VDD, causing a rapid decrease in transconductance and increase in τ, according to Eqs. (3) and (4).

\[ I_D = \frac{\mu_n C_{ox}}{2}\,\frac{W}{L}\,(V_{GS} - V_{TH})^2\,(1 + \lambda V_{DS}) \qquad (3) \]

\[ g_m = \frac{2 I_D}{V_{GS} - V_{TH}} \qquad (4) \]


Fig. 4. Improved synchronizer: (a) Version 1, (b) Version 2.

Fig. 5. Tau versus VDD for improved synchronizer (Y axis: Tau (s), logarithmic scale; X axis: VDD (V)).

When VDD approaches the sum of the threshold voltages of the P and N transistors, the operation of the transistors starts to rely on the sub-threshold current. Eqs. (5) and (6) show the sub-threshold current and transconductance equations [5]. As can be seen from the equations, the sub-threshold current decreases exponentially with decreasing VDD and causes the transconductance to decrease exponentially. As a result, τ increases much more quickly than in the super-threshold region.

\[ I_{sub} = I_0\,\frac{W}{L}\,e^{(V_{GS} - V_{TH})/(n \nu_t)}\left(1 - e^{-V_{DS}/\nu_t}\right) \qquad (5) \]

\[ g_m = \frac{I_{sub}}{n \cdot \nu_t} \qquad (6) \]

SPICE simulations were conducted for τ of the Jamb latch at VDDs ranging from the super-threshold to the sub-threshold region on a 90 nm process where the threshold voltage is around 0.4 V (all of the simulations mentioned in the rest of this paper are also conducted on the 90 nm process). The simulation method is the same as introduced in [15]: initially a switch is used to hold nodes A and B at the same metastable level. After some time the switch is opened to let the voltages of the two nodes diverge, which emulates the resolution of the metastability. τ is calculated from the slope of the voltage difference between the two nodes over time. Fig. 3 shows the simulated τ versus VDD for the Jamb latch, where the Y axis is displayed in logarithmic scale.
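The extraction step described above can be sketched in a few lines of Python. This is an illustrative post-processing sketch, not the authors' script: it assumes the small-signal picture in which the node voltage difference grows as ΔV(t) = ΔV0·exp(t/τ), so τ is the reciprocal of the slope of ln|ΔV| versus time. The function name `fit_tau` and the synthetic waveform standing in for SPICE output are hypothetical.

```python
import math


def fit_tau(times, delta_v):
    """Least-squares slope of ln|dV| versus time; tau is the reciprocal of that slope."""
    logs = [math.log(abs(v)) for v in delta_v]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    den = sum((t - mean_t) ** 2 for t in times)
    return den / num  # 1 / slope


# Synthetic waveform used in place of simulator output (hypothetical values).
TAU_TRUE = 400e-9   # seconds
DV0 = 1e-6          # initial imbalance between nodes A and B, in volts
times = [i * 50e-9 for i in range(40)]
delta_v = [DV0 * math.exp(t / TAU_TRUE) for t in times]

tau_est = fit_tau(times, delta_v)
print(f"estimated tau = {tau_est:.3e} s")
```

In practice the fit would be restricted to the portion of the waveform where the divergence is still small-signal, since the exponential model no longer holds once the nodes approach the supply rails.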

It can be seen from Fig. 3 that τ increases rapidly with decreasing VDD and becomes enormously large when VDD approaches the sum of the threshold voltages of the P and N transistors (around 0.8 V in this case), where the resolution of metastability relies on the sub-threshold current.

3.2. Improved synchronizer

An improved synchronizer is proposed in [22] in order to deal with the performance degradation of the Jamb latch synchronizer with decreasing VDD. The circuit is shown in Fig. 4(a). It is based on the structure of the Jamb latch. A difference is that the gates of the P transistors are connected to ground rather than to nodes A or B. This structure keeps the circuit "alive" at low voltages by maintaining enough current in the cross-coupled inverters. This modification slows down the degradation of τ with decreasing VDD. However, the always-on P transistors consume a lot of power during standby mode. So in Fig. 4(b), a metastability detector is added to switch the P transistors ON only during metastability and OFF when the circuit is out of metastability. The sizes of the P transistors and of the transistors in the metastability detector are minimized to reduce the capacitance at nodes A and B and thereby minimize τ. In order for the circuit to switch normally when the P transistors are OFF, two additional P transistors are added. Their sizes are also minimized to reduce the node capacitance. Also, the set and reset transistors can now be made smaller since they do not need a strong pulling strength.

The improved synchronizer has a small area overhead compared to the Jamb latch but is effective in improving τ with scaling VDD, which can be seen from the simulation result shown in Fig. 5.

By comparing Figs. 3 and 5, it can be seen that the τ of the improved synchronizer is smaller than that of the Jamb latch for all VDDs and the degradation of τ with decreasing VDD is reduced.

4. The impact of normal propagation delay and setup time on MTBF

Fig. 6 shows a two-flipflop synchronization circuit, which includes a dedicated synchronizer followed by a normal flipflop. Metastability may happen at the synchronizer when the asynchronous data arrives very close to the rising edge of the local clock. There is no logic between the synchronizer and the flipflop. Ideally, the synchronization time is equal to one full clock cycle. However, as shown in Fig. 6, the effective synchronization time should not include the normal propagation delay of the synchronizer since it does not contribute to the metastability resolution. It is also noted that the metastability has to be resolved before the setup time of the subsequent flipflop; otherwise that flipflop will itself go metastable. So the effective synchronization time is equal to the given synchronization time minus the normal propagation delay and the setup time. In the super-threshold region, the delay increases nearly linearly as VDD decreases. Compared to τ, the impact of the normal propagation delay and setup time on MTBF is negligible, so they are usually ignored in the MTBF equation. However, in the sub-threshold or near-threshold region this is not the case. The operation of circuits relies on the sub-threshold leakage current, so the normal propagation delay and the setup time increase exponentially as VDD decreases. Their impact on MTBF becomes significant and cannot be neglected any more.

Fig. 6. Effective synchronization time.

Fig. 7. Normal propagation delay versus VDD for Jamb latch and improved synchronizer (Y axis: Normal propagation delay (s), logarithmic scale; X axis: VDD (V)).

When the impact of the normal propagation delay and the setup time is considered, the original MTBF formula should be rewritten as

\[ \mathrm{MTBF} = \frac{e^{(t_s - T_d - T_{setup})/\tau}}{T_w f_c f_d} \qquad (7) \]

where Td represents the normal propagation delay of the synchronizer, Tsetup represents the setup time of the subsequent flipflop and ts is the given synchronization time, which is usually the same as the clock period.

In order to investigate the impact of normal propagation delay on synchronizer MTBF, simulations are conducted for both the Jamb latch synchronizer and the improved synchronizer at different VDDs. The results are shown in Fig. 7.

As shown in Fig. 7, the normal propagation delay increases with decreasing VDD. In particular, when VDD approaches the sub-threshold region, the propagation delay increases nearly exponentially due to the exponentially decreasing sub-threshold current.

It should be noted that the improved synchronizer has a worse propagation delay than the Jamb latch synchronizer at all VDDs. This is because the improved synchronizer has stronger pull-up transistors and weaker set transistors, which improves τ but degrades the normal propagation delay.

Fig. 8 shows the relation between the setup time and VDD from the super-threshold to the sub-threshold region. The simulation was done on a D flipflop from a 90 nm commercial standard cell library, and the setup time is characterized as the D-to-Clock time at which the Clock-to-Q delay of the flipflop becomes 20% longer than the normal propagation delay. As shown in Fig. 8, the setup time increases nearly exponentially as VDD decreases in the sub-threshold region. According to Eq. (7), the increase in the setup time and the propagation delay shortens the effective resolution time and therefore results in an exponential decrease in the synchronizer MTBF.


After obtaining the above data, we can calculate the MTBF for the Jamb latch and the improved synchronizers in the sub-threshold region. At 0.3 V the metastability window Tw has a typical value of 50 ns, which is obtained by moving the data arrival time towards the clock and observing the output of the synchronizer in extensive simulation. We assume that the clock and data frequencies are both 300 kHz, which is quite typical for sub-threshold operation [6,7,9,10]. The setup time of a flipflop is around 50 ns, as shown in Fig. 8. For the Jamb latch synchronizer, the values of τ and propagation delay are 400 and 700 ns, respectively, as shown in Figs. 3 and 7. By calculation using Eq. (7), the MTBF of the Jamb latch synchronizer at 0.3 V is around 0.15 s. The same calculation conducted for the improved synchronizer by using the data from Figs. 5 and 7 results in an MTBF of 1 day. Obviously, neither of them is acceptable.

It should be noted that the MTBF calculation has not taken into account the number of synchronizers in the system. If a design contains more than one synchronizer, the overall MTBF should be equal to the single synchronizer MTBF divided by the number of synchronizers [25]. Also, the impact of process variation on the synchronizer MTBF is not taken into account in the calculation. In the sub-threshold region, the large process variation may degrade the MTBF significantly, which can be seen in later sections. After taking into account all of these factors, the MTBF will be much worse than previously calculated.
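The Jamb latch figure quoted above can be checked with a short calculation. The following sketch evaluates Eq. (7) with the 0.3 V values given in the text (τ = 400 ns, Td = 700 ns, Tsetup = 50 ns, Tw = 50 ns, fc = fd = 300 kHz, ts equal to one clock period). The function name `mtbf_eq7` and the synchronizer count `N_SYNC` are hypothetical and only serve the illustration.

```python
import math


def mtbf_eq7(t_s, t_d, t_setup, tau, t_w, f_c, f_d):
    """Eq. (7): only ts - Td - Tsetup is available for metastability resolution."""
    effective = t_s - t_d - t_setup
    return math.exp(effective / tau) / (t_w * f_c * f_d)


F = 300e3  # clock and data frequency in Hz, as assumed in the text

# Jamb latch synchronizer at VDD = 0.3 V, values quoted in the text.
jamb = mtbf_eq7(t_s=1.0 / F, t_d=700e-9, t_setup=50e-9,
                tau=400e-9, t_w=50e-9, f_c=F, f_d=F)

# Same circuit evaluated with Eq. (1), i.e. Td and Tsetup ignored.
ideal = mtbf_eq7(t_s=1.0 / F, t_d=0.0, t_setup=0.0,
                 tau=400e-9, t_w=50e-9, f_c=F, f_d=F)

# System-level MTBF for a hypothetical design with N_SYNC synchronizers.
N_SYNC = 10
system = jamb / N_SYNC

print(f"Jamb latch, Eq. (7): {jamb:.3f} s")
print(f"Jamb latch, Eq. (1): {ideal:.3f} s")
print(f"{N_SYNC} synchronizers:   {system:.4f} s")
```

The Eq. (7) result comes out at roughly 0.14 s, consistent with the "around 0.15 s" stated in the text, while ignoring Td and Tsetup would overstate the MTBF by a factor of about six at this operating point.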

Fig. 8. Setup time versus VDD.

Fig. 9. Tau versus transistor size (Y axis: Tau (s); X axis: Transistor size (m)).

5. Proposed synchronizers

The main problem for synchronizers working at ultra low voltages is that the metastability time constant τ, which depends on the transconductance of the cross-coupled gates in the synchronizer, degrades rapidly with decreasing VDD. Another problem is that the normal propagation delay increases exponentially with decreasing VDD in the sub-threshold/near-threshold regions and finally becomes comparable to τ in affecting MTBF or synchronization time. To overcome these problems, we have to improve the transconductance in the cross-coupled inverters and the drive strength of the set transistors while reducing the node capacitance. It is difficult to address this problem by simply increasing the transistor size because doing so increases the transconductance and the node capacitance at the same time. Simulation (Fig. 9) shows that τ initially improves with increasing transistor size but the improvement finally saturates. In addition, increasing transistor sizes also increases the power and area.

We propose to address the problems by applying forward bodybias to the transistors in the synchronizer. This has twoadvantages:

a. Applying forward body bias (FBB) to the cross-coupled transistors decreases the threshold voltage and thus increases transconductance without increasing the node capacitance. Eq. (8) [2] shows the effect of the source-body voltage Vsb on the transistor threshold voltage VT. FBB reduces Vsb and thus VT. As a result, gm is increased while the capacitance remains the same, which leads to a reduction in the metastability time constant τ in both the super-threshold and the sub-threshold regions according to Eqs. (4) and (6). In the sub-threshold region the improvement is especially large because of the exponential relationship between the current and the threshold voltage.

$$V_T = V_{T0} + K\sqrt{V_{sb} + 2\phi_F} \qquad (8)$$

b. Applying forward body bias to the set/reset transistors increases their drive strength and thus improves the normal propagation delay without adding capacitance to the nodes.

Fig. 10. Jamb latch with body bias.

J. Zhou et al. / Microelectronics Journal 42 (2011) 840–850
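As a rough illustration of advantage (a), the sketch below evaluates Eq. (8) for a few forward body bias values and estimates the resulting gain in sub-threshold current, assuming the textbook dependence I ∝ exp(-VT/(n·vT)). The values chosen for VT0, K, φF, the slope factor n and the thermal voltage are hypothetical placeholders, not parameters of the process used in the paper.

```python
import math


def threshold_voltage(v_sb, v_t0=0.10, k=0.4, phi_f=0.35):
    """Eq. (8): VT = VT0 + K * sqrt(Vsb + 2 * phiF).

    v_t0, k and phi_f are assumed values for illustration only.
    """
    return v_t0 + k * math.sqrt(v_sb + 2.0 * phi_f)


def relative_subthreshold_current(v_sb, n=1.5, thermal_voltage=0.026):
    """Sub-threshold current relative to zero body bias, I ~ exp(-VT / (n * vT))."""
    delta_vt = threshold_voltage(v_sb) - threshold_voltage(0.0)
    return math.exp(-delta_vt / (n * thermal_voltage))


# Forward body bias of an N transistor makes Vsb negative.
for fbb in (0.0, 0.1, 0.2, 0.3):
    v_sb = -fbb
    print(f"FBB = {fbb:.1f} V  VT = {threshold_voltage(v_sb):.3f} V  "
          f"I/I0 = {relative_subthreshold_current(v_sb):.2f}")
```

Because gm in the sub-threshold region scales with the current, a modest reduction in VT translates into a large relative gain, which is the effect the simulations quantify.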

To investigate the effect of forward body bias on synchronizer performance, we used the Jamb latch synchronizer as an example. Fig. 10 shows the body-biased Jamb latch. It differs from the original version in that the bulk of each transistor is not connected to VDD/ground but is controlled by a forward bias voltage, which is possible in a triple-well process. Here the P and N transistors are body-biased separately to achieve the optimum effect. Biasing each transistor individually might give a better result, but it is difficult to implement, which is why we apply the same P bias to all P transistors and the same N bias to all N transistors in the synchronizer. The implementation details are discussed in Section 8.

In our simulations NBias is swept and PBias always equals VDD minus NBias. Note that there is a limit on the applied forward body bias voltage: it should not exceed the PN junction conducting voltage (around 0.8 V in our case), otherwise the junction current will degrade the synchronizer performance and cause functional and power problems, as discussed later.
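The sweep rule described above could be written as the following sketch. The function name, the 0.1 V step and the choice to cap the sweep at the smaller of VDD and the junction limit are illustrative assumptions, not a description of the actual simulation scripts.

```python
PN_JUNCTION_LIMIT_V = 0.8  # approximate PN junction conducting voltage quoted in the text


def bias_sweep(vdd, step=0.1):
    """Return (NBias, PBias) pairs for forward body bias from 0 up to min(VDD, limit)."""
    max_bias = min(vdd, PN_JUNCTION_LIMIT_V)
    n_steps = int(round(max_bias / step))
    points = []
    for i in range(n_steps + 1):
        n_bias = round(i * step, 10)
        p_bias = round(vdd - n_bias, 10)  # PBias always equals VDD minus NBias
        points.append((n_bias, p_bias))
    return points


print(bias_sweep(0.3))
```

With NBias = 0 the P bulk sits at VDD (zero body bias), and with NBias = VDD the P bulk sits at ground (full-VDD forward body bias).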

Fig. 11 shows τ of the Jamb latch synchronizer under different bias voltages and VDDs. For readability the Y axis is displayed on a logarithmic scale. For each VDD value, the NBias voltage was only increased up to VDD.

As can be seen from Fig. 11, τ decreases nearly exponentially as the forward body bias voltage increases. At a VDD of 0.3 V, τ with zero body bias is around 396 ns, and with 0.3 V body bias τ quickly decreases to around 49 ns. The reduction is about 88%, showing that applying forward body bias is an effective way to reduce τ.

Fig. 11. Tau versus body bias for Jamb latch synchronizer. (Plot: Tau (s) on a logarithmic axis versus Body-Bias (V) from 0 to 0.8 V, one curve per VDD from 0.3 V to 1.2 V; labelled points 3.96E-07 and 4.88E-08.)

Fig. 12 shows the normal propagation delay of the Jamb latch synchronizer under different forward body bias voltages and VDDs. As shown in Fig. 12, the normal propagation delay decreases as the forward body bias voltage increases. In the super-threshold region, the normal propagation delay decreases linearly as the body bias voltage increases, while in the sub-threshold/near-threshold regions it decreases exponentially with increasing body bias voltage. At a VDD of 0.3 V, the delay with zero body bias is around 706 ns, and with 0.3 V body bias it quickly decreases to around 143 ns. The reduction is about 80%.

As mentioned before, there is a limit on the applied forward body bias voltage: it should not exceed the PN junction conducting voltage (around 0.8 V in our case). When the bias voltage exceeds this value, the large bulk junction current reduces the channel current and thus degrades the performance. As shown in Fig. 13, when the applied forward body bias is larger than 0.8 V, τ of the Jamb latch synchronizer increases rather than decreases with increasing body bias voltage. A similar effect is observed for the normal propagation delay, as expected. Another problem with a body bias larger than the PN junction conducting voltage is that the output of the synchronizer cannot reach full VDD or ground because of the large current injected from the bulk into the channel.

The same simulations were conducted on the improved synchronizer for comparison. Figs. 14 and 15 show τ and the normal propagation delay versus body bias for the improved synchronizer. At a VDD of 0.3 V with a body bias of 0.3 V, τ of the improved synchronizer is only 13 ns, 86% better than without body bias (92 ns). The normal propagation delay is 315 ns, 78% better than without body bias (1454 ns). The results are better than those of the body-biased Jamb latch synchronizer, as expected (shown in Figs. 11 and 12).
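The percentage improvements quoted for both synchronizers at a VDD of 0.3 V follow directly from the before/after values in the text. The short check below recomputes them; the helper name and the dictionary layout are illustrative only.

```python
def reduction_percent(before, after):
    """Relative reduction of a quantity, in percent."""
    return 100.0 * (before - after) / before


# (zero body bias, 0.3 V body bias) in ns, as quoted in the text for VDD = 0.3 V
values_ns = {
    "Jamb latch tau": (396, 49),
    "Jamb latch propagation delay": (706, 143),
    "Improved synchronizer tau": (92, 13),
    "Improved synchronizer propagation delay": (1454, 315),
}

for name, (before, after) in values_ns.items():
    print(f"{name}: {reduction_percent(before, after):.1f}% reduction")
```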

We can re-calculate the MTBF for the Jamb latch synchronizer and the improved synchronizer with FBB at 0.3 V. Assuming that the clock and data frequencies are still 300 kHz and Tw is still 50 ns (the effect of FBB on Tw is discussed later), for the Jamb latch synchronizer with 0.3 V FBB, τ and the normal propagation delay are 49 and 143 ns, respectively. This results in an MTBF of 4.87 × 10^16 years. For the improved synchronizer with 0.3 V FBB, the MTBF is 1.04 × 10^88 years. These values are much better than the MTBF without FBB (0.15 s for the Jamb latch synchronizer and 1 day for the improved synchronizer) and will meet the MTBF requirement of any application.

Fig. 12. Normal propagation delay versus body bias for Jamb latch synchronizer. (Plot: Normal Propagation Delay (s) on a logarithmic axis versus Body-Bias (V) from 0 to 0.8 V, one curve per VDD from 0.3 V to 1.2 V; labelled points 7.06E-07 and 1.43E-07.)

Fig. 13. Over-biased Jamb latch synchronizer. (Plot: Tau (s) versus Body-Bias (V) from 0 to 1.4 V at VDD = 1.2 V.)
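The way τ, the propagation delay and Tw enter the MTBF can be sketched with the standard synchronizer expression MTBF = exp(t_res/τ)/(Tw·fc·fd), taking the resolution time t_res as one clock period minus the normal propagation delay and a setup time. The setup time is not stated in this section, so the 60 ns used below is a placeholder, as is the assumption of one full clock period. With these assumptions the results land in the same order of magnitude as the values quoted above, not exactly on them.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def mtbf_seconds(tau, t_prop, t_w, f_clk, f_data, t_setup=60e-9):
    """MTBF = exp(t_res / tau) / (Tw * fc * fd), with t_res = 1/fc - Td - t_setup.

    t_setup and the one-clock-period budget are assumptions for illustration.
    """
    t_res = 1.0 / f_clk - t_prop - t_setup
    return math.exp(t_res / tau) / (t_w * f_clk * f_data)


F = 300e3   # clock and data frequency (Hz), from the text
TW = 50e-9  # metastability window (s), from the text

# Jamb latch at VDD = 0.3 V: tau and delay without FBB and with 0.3 V FBB (Figs. 11 and 12)
no_fbb = mtbf_seconds(396e-9, 706e-9, TW, F, F)
with_fbb = mtbf_seconds(49e-9, 143e-9, TW, F, F)

print(f"Jamb latch, no FBB:    {no_fbb:.2f} s")
print(f"Jamb latch, 0.3 V FBB: {with_fbb / SECONDS_PER_YEAR:.2e} years")
```

The exponential term dominates: reducing τ by a factor of about eight moves the MTBF from a fraction of a second to far beyond any practical lifetime, while Tw only scales the result linearly.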

6. Process variation

As mentioned before, the synchronizer performance is also subject to process variation, since variation in Vth affects the metastability time constant τ in a similar way to VDD. Figs. 16 and 17 show the probability distribution of τ under process variation for the non-biased and the full-VDD-biased Jamb latch synchronizer at 0.3 V, respectively. The results were obtained from Monte Carlo simulations with 1000 samples. Comparing Figs. 16 and 17, it can be seen that with FBB the mean value of τ improves from 440 to 54 ns and the standard deviation improves from 230 to 25 ns. Considering a 3σ variation of τ (μ+3σ), the MTBF of the non-biased Jamb latch synchronizer degrades from 0.15 s to 0.2 ms. For the biased Jamb latch synchronizer, the 3σ variation causes the MTBF to degrade from 4.87 × 10^16 years to 3.2 months, which is still sufficient for most applications, and only about one out of 1000 synchronizers will have an MTBF worse than this.
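The worst-case τ values behind this estimate can be reproduced from the quoted means and standard deviations. The sketch below also evaluates the fraction of samples expected beyond the 3σ point under a Gaussian assumption, which is made here purely for illustration; the simulated histograms need not be Gaussian. Function names are hypothetical.

```python
import math


def worst_case_tau(mean_ns, std_ns, n_sigma=3):
    """Worst-case tau taken as mean + n_sigma * std (all in ns)."""
    return mean_ns + n_sigma * std_ns


def upper_tail_probability(n_sigma):
    """One-sided probability of exceeding mean + n_sigma * std for a Gaussian."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


print("non-biased tau (mean + 3 sigma):", worst_case_tau(440, 230), "ns")
print("full-VDD-biased tau (mean + 3 sigma):", worst_case_tau(54, 25), "ns")
print(f"expected samples beyond 3 sigma out of 1000: {1000 * upper_tail_probability(3):.2f}")
```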

7. Metastability window Tw

As mentioned in Section 2, the metastability time constant τ plays a much more important role than the metastability window Tw because it affects the MTBF exponentially. The same holds for the normal propagation delay and the setup time, as can be seen in Eq. (7).

Fig. 14. Tau versus body bias for the improved synchronizer.

Fig. 15. Normal propagation delay versus body bias for the improved synchronizer.

Fig. 16. Distribution of τ for the Jamb latch synchronizer at 0.3 V without body bias (mean: 440 ns, std: 230 ns).

Fig. 17. Distribution of τ for the Jamb latch synchronizer at 0.3 V with full-VDD body bias (mean: 54 ns, std: 25 ns).

Fig. 18. Metastability window Tw versus VDD with/without FBB.

Fig. 19. Full-VDD body-biased synchronizer.

However, it is still interesting to see how Tw varies with VDD and FBB to complete the MTBF analysis. By extensive simulation, we obtained the values of Tw for the Jamb latch at different voltages in the sub-threshold and near-threshold regions, with and without full-VDD forward body bias; they are shown in Fig. 18.

As shown in Fig. 18, Tw increases nearly exponentially with decreasing VDD in the near-threshold/sub-threshold region. With FBB, Tw improves significantly. This can be explained by the fact that Tw depends on the strength of the stacked pull-down N transistors in the latch of the synchronizer. Applying forward body bias to the transistors lowers their threshold voltage and thus improves their strength. Although the strength of the pull-up P transistors is also improved, the N transistors have higher mobility, leading to a stronger current. As a result Tw is reduced. In the previous calculation of the 3σ MTBF, Tw was assumed to be 50 ns, while it is actually improved from 50 to 7 ns with FBB, as shown in Fig. 18. The improvement is around 86%. Using this improved value of Tw in the calculation of the 3σ MTBF results in a much better MTBF of around 23 months, or 1.9 years.
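Since Tw enters the MTBF only as a linear factor in the denominator, the corrected 3σ figure can be checked by simple rescaling of the 3.2 months obtained with Tw = 50 ns. The helper name below is illustrative.

```python
def rescale_mtbf_for_tw(mtbf, tw_old, tw_new):
    """MTBF is inversely proportional to Tw, so scale by tw_old / tw_new."""
    return mtbf * tw_old / tw_new


# 3-sigma MTBF of the biased Jamb latch: 3.2 months with Tw = 50 ns, rescaled to Tw = 7 ns
months = rescale_mtbf_for_tw(3.2, 50.0, 7.0)
print(f"{months:.1f} months = {months / 12:.1f} years")
```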

8. Implementation

As previously discussed, the synchronizer performance improves as the forward body bias increases until the PN junction starts conducting. Since full-VDD bias significantly improves the synchronizer performance with minimum overhead, it may be a cost-effective way of implementing FBB. We therefore propose the implementation shown in Fig. 19, where, instead of connecting the bulk directly to VDD or ground, NBias is connected to the Bias Control, which could be an external control pin or an on-chip control register, and an inverter is used to generate PBias from NBias. In this way, when the synchronizer works at voltages below the PN junction conducting voltage (0.8 V in our case), the Bias Control is set high so that a full-VDD forward body bias is applied to improve the synchronizer performance. When the synchronizer works at a higher VDD, the Bias Control is set low so that zero body bias is applied.
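The behaviour of this bias control can be summarized in a small behavioural sketch. It models only the bulk voltages, not the circuit itself; the function names and the way the control bit is chosen from VDD are assumptions made for illustration.

```python
def body_bulk_voltages(vdd, bias_control):
    """Return (NBias, PBias) bulk voltages for the scheme of Fig. 19.

    NBias follows the Bias Control signal; an inverter derives PBias from NBias.
    """
    n_bias = vdd if bias_control else 0.0  # N bulk at VDD gives full-VDD forward bias
    p_bias = 0.0 if bias_control else vdd  # inverter output: P bulk at ground when enabled
    return n_bias, p_bias


def select_bias_control(vdd, junction_limit=0.8):
    """Enable FBB only while VDD stays below the PN junction conducting voltage."""
    return vdd < junction_limit


for vdd in (0.3, 0.6, 1.2):
    enabled = select_bias_control(vdd)
    print(vdd, enabled, body_bulk_voltages(vdd, enabled))
```

With the control low, the N bulk is at ground and the P bulk at VDD, which is the conventional zero-body-bias connection.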

The advantage of the full-VDD body bias scheme is that the bias control circuit is very simple. It gives nearly zero overhead while improving the synchronizer performance significantly. Also, with this scheme the synchronizer is able to work over a wide range of VDDs, from the sub-threshold region to nominal VDD, using only one extra control signal (i.e. Bias Control). For lower values of VDD, it is still beneficial to have bias values greater than VDD, but this requires on-chip voltage generation, which costs area and power.

The disadvantage of this scheme is that for higher VDD (above the PN junction conducting voltage) the forward body bias has to be disabled through the Bias Control, so its benefits are lost. However, the main problem of synchronizers is their performance degradation at low/ultra-low voltages. For high-voltage applications, fast synchronizers such as those proposed in [18,22] should be sufficient. Moreover, the technique proposed in this paper can be used with any synchronizer to maintain its performance at low/ultra-low voltages. Another disadvantage of the body-biased synchronizer is that, as expected, the leakage power increases when FBB is applied. However, synchronizers are small circuits, so the impact on system power is negligible. For example, our simulation shows that the leakage power of the non-biased Jamb latch synchronizer at 0.3 V is 7.56 pW, while it is 29.64 pW for the full-VDD-biased Jamb latch synchronizer. This increase does not noticeably affect the overall leakage power of the system (usually in the μW range for ultra-low-power systems [6–8,10]). In addition, implementing well-biased synchronizers incurs area overhead due to well separation rules and extra wire connections. For example, in the target 90 nm process the well separation rule is around 2–3 μm. Given the area of the investigated synchronizers, we estimate that the area overhead for the full-VDD-biased synchronizer is about 4.8–6.2%. These increases in wire length and area are far outweighed by the improvements in synchronizer performance through FBB.

9. Conclusion

In many low-VDD systems on chip, synchronizers are needed for interfacing voltage/clock domains operating from super-threshold to sub-threshold voltages. However, the circuit parameters that determine the synchronizer performance degrade rapidly with VDD scaling because the operation of synchronizers relies on small-signal rather than large-signal behavior. This situation becomes much worse in the sub/near-threshold region than in the super-threshold region because the current decreases exponentially with decreasing VDD. As a result, synchronizers that work well in the super-threshold region exhibit unacceptable MTBF in the sub/near-threshold region. To solve this problem, we analyzed the performance of two existing synchronizers (the Jamb latch synchronizer and its improved version for low-voltage operation) from nominal VDD all the way down to the sub-threshold region. We found that their performance degrades dramatically in the sub/near-threshold region, which causes the MTBF to decrease quickly to several seconds or days. To extend synchronization to the sub/near-threshold region, we proposed applying forward body bias to the synchronizers to improve their performance. The forward body bias lowers the threshold voltage and thus increases the transconductance of the cross-coupled elements in the synchronizer without adding capacitance to the switching nodes. As a result, all the circuit parameters (τ, Td and Tw) determining the synchronizer performance improve significantly in the sub-threshold/near-threshold region. Compared with the non-biased synchronizers, the average improvement across these parameters is more than 80%, which leads to an exponential improvement in MTBF. We also analyzed the impact of process variation on the synchronizer MTBF and found that with full-VDD body bias the MTBF can be improved from several seconds to years for the worst-case corner. Finally, we proposed a simple implementation scheme for a full-VDD-biased synchronizer, which gives nearly zero overhead and can be configured to work over a wide range of VDDs, from the sub-threshold region to nominal VDD.

Acknowledgment

This work was partly supported by EPSRC grants EP/G066361/1 (Project "Reliable cell design methods for variable processes (RelCel)") and EP/C007298/1 (Project "Synchronizer Reliability in the Next Generation of SoC with Multiple Clocks").

References

[1] M. Pedram, J. Rabaey, Power Aware Design Methodologies, Kluwer, 2002.

[2] H. Veendrick, Nanometer CMOS ICs: from Basics to ASICs, Springer, 2008.

[3] F. Fallah, M. Pedram, Standby and active leakage current control and minimization in CMOS VLSI circuits, IEICE Trans. on Electronics, Special Section on Low-Power LSI and Low Power IP 88 (4) (2005) 509–519.

[4] K. Roy, S. Mukhopadhyay, H. Mahmoodi-Meimand, Leakage current mechanisms and leakage reduction techniques in deep-sub micrometer, Proceedings of the IEEE 91 (2) (2003) 305–327.

[5] M. Alioto, Understanding DC behavior of subthreshold CMOS logic through closed-form analysis, IEEE Trans. on Circuits and Systems – part I 56 (12) (2010) 1–11.

[6] Bo Zhai, L. Nazhandali, J. Olson, A. Reeves, M. Minuth, R. Helfand, Sanjay Pant, D. Blaauw, T. Austin, A 2.60 pJ/Inst subthreshold sensor processor for optimal energy efficiency, in: Proceedings of the IEEE Symposium on VLSI Circuits, June 2006, pp. 154–155.

[7] Pu Yu, J.P. de Gyvez, H. Corporaal, Ha Yajun, An ultra-low-energy/frame multi-standard JPEG co-processor in 65 nm CMOS with sub/near-threshold power supply, in: Proceedings of the Solid-State Circuits Conference, Digest of Technical Papers, IEEE International 2009, ISSCC, 8–12 February 2009, pp. 146–147, 147a.

[8] Lennart Yseboodt, Michael De Nil, Jos Huisken, Mladen Berekovic, Qin Zhao, Frank Bouwens, Jef L. van Meerbergen, Design of 100 μW wireless sensor nodes on energy scavengers for biomedical monitoring, in: Proceedings of SAMOS 2007, pp. 385–395.

[9] J. Kwong, Y. Ramadass, N. Verma, M. Koesler, K. Huber, H. Moormann, A. Chandrakasan, A 65 nm Sub-Vt microcontroller with integrated SRAM and switched-capacitor DC–DC converter, in: Proceedings of the Solid-State Circuits Conference, ISSCC 2008, Digest of Technical Papers, IEEE International, 3–7 February 2008, pp. 318–616.

[10] S.C. Jocke, J.F. Bolus, S.N. Wooters, A.D. Jurik, A.C. Weaver, T.N. Blalock, B.H. Calhoun, A 2.6 μW sub-threshold mixed-signal ECG SoC, in: Proceedings of the 2009 Symposium on VLSI Circuits, 16–18 June 2009, pp. 60–61.

[11] C. Isci, A. Buyuktosunogly, C. Cher, P. Bose, M. Martonosi, An analysis of efficient multi-core global power management policies: maximizing performance for a given power budget, in: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, 2006, pp. 347–358.

[12] W. Jang, D. Ding, D.Z. Pan, A Voltage-Frequency Island aware energy optimization framework for networks-on-chip, in: Proceedings of ICCAD, 2008.

[13] H.J.M. Veendrick, The behavior of flip-flops used as synchronizers and prediction of their failure rate, IEEE Journal of Solid-State Circuits 15 (2) (1980) 169–176.

[14] D.J. Kinniment, A. Bystrov, A.V. Yakovlev, Synchronization circuit performance, IEEE Journal of Solid-State Circuits 37 (2) (2002) 202–209.

[15] C. Dike, E. Burton, Miller and noise effects in a synchronizing flip-flop, IEEE Journal of Solid-State Circuits 34 (6) (1999) 849–855.

[16] D. Kinniment, K. Heron, G. Russell, Measuring deep metastability, in: Proceedings of the IEEE Symposium on Asynchronous Circuits and Systems (ASYNC), March 2006, pp. 2–11.

[17] J. Zhou, D.J. Kinniment, C.E. Dike, G. Russell, A. Yakovlev, On-chip measurement of deep metastability in synchronizers, IEEE Journal of Solid-State Circuits 43 (2) (2008) 550–557.

[18] W.J. Dally, S.G. Tell, The even/odd synchronizer: a fast, all-digital, periodic synchronizer, in: Proceedings of the IEEE Symposium on Asynchronous Circuits and Systems (ASYNC), 3–6 May 2010, pp. 75–84.

[19] I.W. Jones, Suwen Yang, M. Greenstreet, Synchronizer Behavior and Analysis, in: Proceedings of the 15th IEEE Symposium on Asynchronous Circuits and Systems, 2009, ASYNC '09, 17–20 May 2009, pp. 117–126.

[20] J. Jex, C. Dike, A fast resolving BiNMOS synchronizer for parallel processor interconnect, IEEE Journal of Solid-State Circuits 30 (2) (1995) 133–139.

[21] S. Yang, M. Greenstreet, Computing synchronizer failure probabilities, in: Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2007, 16–20 April 2007, pp. 1–6.

[22] J. Zhou, D.J. Kinniment, G. Russell, A. Yakovlev, A Robust Synchronizer Circuit, in: Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2006, pp. 442–443.

[23] D.J. Kinniment, Synchronization and Arbitration in Digital Systems, Wiley, 2007.

[24] S. Beer, R. Ginosar, M. Priel, R. Dobkin, A. Kolodny, The Devolution of Synchronizers, in: Proceedings of the IEEE Symposium on Asynchronous Circuits and Systems (ASYNC), May 2010, pp. 94–103.

[25] J. Zhou, D.J. Kinniment, G. Russell, A. Yakovlev, Adapting Synchronizers to the Effects of On Chip Variability, in: Proceedings of the IEEE Symposium on Asynchronous Circuits and Systems (ASYNC), 2008, pp. 39–47.

[26] Mohammed Alshaikh, David Kinniment, Alex Yakovlev, On the trade-off between resolution time and delay times in bistable circuits, in: Proceedings of the 2009 IEEE International Conference on Electronics Circuits and Systems (ICECS 2009), December 2009.

[27] Jun Zhou, M. Ashouei, D. Kinniment, J. Huisken, G. Russell, Extending synchronization from super-threshold to sub-threshold region, in: Proceedings of the IEEE Symposium on Asynchronous Circuits and Systems (ASYNC), 3–6 May 2010, pp. 85–93.