ISSCC SESSION ANALOG TECHNIQUES AND PLLs...

3
ISSCC 2007 / SESSION 17 / ANALOG TECHNIQUES AND PLLs /173 17.7 A Double-Tail Latch-Type Voltage Sense Amplifier input and output pads (for probe station measurement) was with 18ps Setup+Hold Time placed on the same die. The layout of the double-tail SA is shown in the inset of the chip micrograph in Fig. 17.7.7. An SR-latch is Daniel Schinkel, Eisse Mensink, Eric Klumperink, Ed van Tuijl, connected to the output of the SA to create static output signals Bram Nauta without loss of timing information from the core of the SA. When required, more advanced 'slave' stages could be used in applica- University of Twente, Enschede, The Netherlands tions [3]. Latch-type sense amplifiers, or sense amplifier based flip-flops, Figure 17.7.4 shows the measured relative delay under different are very effective comparators. They achieve fast decisions due to conditions (the absolute delay is not measurable due to addition- a strong positive feedback and their differential input enables a al delay from the output drivers). As intended, the minimal delay low offset. Sense amplifiers (SA) are hence widely applied in, for is found at V.,, = l.lV At a Vc, of 0.6V, there is still only a 20ps example, memories, A/D converters, data receivers and lately also increase in delay. The delay versus AV,, is 44ps/decade under in on-chip transceivers [1,2]. Voltage-mode SA's, as shown in Fig. nominal conditions. In comparison, measurements in [4] on a con- 17.7.1, have become especially popular [3-5] because of their high ventional topology in 0.13Rtm CMOS with VDD= 1.5V show a delay input impedance, full-swing output and absence of static power versus AVi7, of 100 to 170ps/dec and a 250ps increase in delay consumption. when V17< is lowered to 0.6V However, the stack of transistors in Fig. 17.7.1 requires a large The offset in [4] is also very dependent on the V17>. and rises from voltage headroom, which is problematic in low-voltage deep-sub- 8.5 to l9mV when the V>,, changes from 1.05 to 1.5V. For our micron CMOS technologies. Furthermore, the speed and offset of design, measurements on 20 samples gave an offset of a,,= 8mV this circuit are very dependent on the common-mode voltage of at both V., = 1.1V and Vc,, = 0.75V. If desired, area upscaling could the input V,,,, [4], which is a problem in applications with wide further reduce the offset at the expense of power (P 2) common-mode ranges, for example A/D converters. Offset compensation schemes [5] are a good alternative if the As an alternative, a double-tail sense amplifier is presented here, application allows for the added complexity. The power consumed which uses one tail for the input stage and another for the latch- by the SA is 113fJ/decision when AVi is 5OmV (f[1k = 1GHz, VDD = ing stage, as shown in Fig. 17.7.2. This topology has less stack- 1.2V, P = 113gtW f 1GHz, or 225gW @ 2 GHz), which drops to ing and can therefore operate at lower supply voltages. The dou- 92fJ/decision for full-swing inputs. ble tail enables both a large current in the latching stage (wide The SAs equivalent input noise was extracted, by measuring the M12), for fast latching independent of the V,,,, and a small cur- average number of positive decisions versus AV,, as shown in Fig. rent in the input stage (small M9), for low offset. 17.7.5. Fitting the measurements to a Gaussian cumulative dis- The signal behavior of the double-tail SA is also shown in Fig. tribution gives an RMS noise voltage of V,,,,= 1.5mV. 17.7.2. During the reset phase (Clk = 0V), transistors M7 and M8 Setup and hold times are extracted from BER measurements pre-charge the Di nodes to VDD, which in turn causes M10 and around the zero crossings of the full-swing input patterns, as Mll to discharge the output nodes to ground (so there is no need shown in Fig. 17.7.6. No bit errors are measured outside an inter- for dedicated reset transistors at the output nodes), After the val of 18ps, so the required setup+hold time is smaller than 18ps reset phase, the tail transistors M9 and M12 turn on (Clk = VDD). (as input jitter is part of the 18ps). A conventional circuit in At the Di nodes, the common-mode voltage then drops monotoni- 0.18gm CMOS [3] achieves 80ps, which would still be 40ps in cally with a rate defined by ITMICDi and on top of this, an input 9Onm CMOS according to scaling theory. In the double-tail topol- dependent differential voltage AVDi will build up. The intermedi- ogy, the setup+hold time could be further reduced with a wider ate stage formed by Mi10 and Mll passes AVDi to the cross-coupled tail transistor M9, but at the expense of increased offset and inverters and also provides additional shielding between the noise due to a shortening of the time that M5/M6 operate in sat- input and output, with less kickback noise as a result. The invert- uration. Simulations predict that the current aperture time is ers start to regenerate the voltage difference as soon as the com- already fast enough to sample data patterns of 4OGb/s, provided mon-mode voltage at the Di nodes is no longer high enough for that interleaving is used to enable a suitably long regeneration M10 and Mll to clamp the outputs to ground. The ideal operat- phase. ing point (V'>,) and the timing of the various phases can be tuned with the transistor sizes. In conclusion, the double-tail topology has an added degree of freedom that enables better optimization of the balance between To compare the conventional and double-tail SAs, both circuits speed, offset, power and common-mode voltage. The circuit also are simulated, with transistor dimensions scaled to get an offset has better isolation between input and output and is well suited standard deviation of a,, = 13mV. The operating conditions are to operate at low supply voltages. VDD = 1.2V and f,, = 3GHz, and the input has VC11 = 1.1V. At this high V>,, (found, for example, in memories), the conventional Acknowledgements: topology ne rettniosthDThe authors thank Philips Research for chip fabrication, the Dutch topology needs reset tranLsistors at the Di nodes [1,5] to ensure Technology Foundation (STW, project TCS.5791) for funding and Gerard that M5 and M6 at least start in saturation. The power consumed Wienk for assistance. by both circuits is similar, about 40fJ/bit. Figure 17.7.3 shows the delays of both circuits versus the differential input voltage. The References: feedback gives a logarithmic relation between the delay [1] I. Zhang, V. George and J. Rabaey, "Low-Swing On-Chip Signaling posAitivee r e Techniques: Effectiveness and Robustness," IEEE T VLSI Systems, pp. and Al">: 37ps per decade for the double-tail SA anLd 37 to . '264-272, June, 2000. 43ps/dec for the conventional SA. The double-tail SA is both [2] D. Schinkel et al., "A 3Gb/s/ch Transceiver for 10-mm Uninterrupted faster in general and the delay increases by only 7ps when V,,. RC-Limited Global On-Chip Interconnects," IEEE J. Solid-State Circuits, drops to 0.7V, instead of the 25 to 60ps increase for the conven- pp. 297-306, Jan,, 2006. tional topology. When simulated at VDD= 1Vthe delay of thedou [31 B. Nikolic et al., "Improved Sense-Amplifier-Based Flip-Flop: Design tional topology. When st,1imulatedt at V DD = 1 V, the delay of the doU- and Measurements," IEEE J. Solid-State Circuits, pp. 876-884, June, ble-tail SA is lps larger versus 29ps for the conventional SA. 2000 [41 B. Wicht at at., 'Yield anld Speed Optimnization of a Latch-Type Voltage Thne dLouble-tail SA was implementedt in a 1.2 V CMOSlU OOnm tech- Sense Amplifier," IEEE J. Solid-Stnte circuits, pp. 1148-1158, JulLy, 2004. nology, as part of a low-swing on-chip data transceiver that oper- [51 K.-Lid. Wong and C.-K.K. Yang, "Offset CompensatDion inl Comparators atesU aound1 V. = 1.1V. The 1, can hae, large variations due to, with Minimum Input-Referred Supply Noise," IEEE J. Solid-State for examnple, crosstalk effects. A double-tail SA with dedicated circuits, pp. 837- 840, May, 2004. 314 200 IEEE11 Internaionlll Soid-lState Circuits Conference 1Il4244l0852lO/07/$2500 ©2007 IEEE. Authorized licensed use limited to: Ehsan Zhian Tabasy. Downloaded on November 15, 2009 at 04:17 from IEEE Xplore. Restrictions apply.

Transcript of ISSCC SESSION ANALOG TECHNIQUES AND PLLs...

ISSCC 2007 / SESSION 17 / ANALOG TECHNIQUES AND PLLs /173

17.7 A Double-Tail Latch-Type Voltage Sense Amplifier input and output pads (for probe station measurement) waswith 18ps Setup+Hold Time placed on the same die. The layout of the double-tail SA is shown

in the inset of the chip micrograph in Fig. 17.7.7. An SR-latch isDaniel Schinkel, Eisse Mensink, Eric Klumperink, Ed van Tuijl, connected to the output of the SA to create static output signalsBram Nauta without loss of timing information from the core of the SA. When

required, more advanced 'slave' stages could be used in applica-University of Twente, Enschede, The Netherlands tions [3].

Latch-type sense amplifiers, or sense amplifier based flip-flops, Figure 17.7.4 shows the measured relative delay under differentare very effective comparators. They achieve fast decisions due to conditions (the absolute delay is not measurable due to addition-a strong positive feedback and their differential input enables a al delay from the output drivers). As intended, the minimal delaylow offset. Sense amplifiers (SA) are hence widely applied in, for is found at V.,, = l.lV At a Vc, of 0.6V, there is still only a 20psexample, memories, A/D converters, data receivers and lately also increase in delay. The delay versus AV,, is 44ps/decade underin on-chip transceivers [1,2]. Voltage-mode SA's, as shown in Fig. nominal conditions. In comparison, measurements in [4] on a con-17.7.1, have become especially popular [3-5] because of their high ventional topology in 0.13Rtm CMOS with VDD= 1.5V show a delayinput impedance, full-swing output and absence of static power versus AVi7, of 100 to 170ps/dec and a 250ps increase in delayconsumption. when V17< is lowered to 0.6V

However, the stack of transistors in Fig. 17.7.1 requires a large The offset in [4] is also very dependent on the V17>. and rises fromvoltage headroom, which is problematic in low-voltage deep-sub- 8.5 to l9mV when the V>,, changes from 1.05 to 1.5V. For ourmicron CMOS technologies. Furthermore, the speed and offset of design, measurements on 20 samples gave an offset of a,,= 8mVthis circuit are very dependent on the common-mode voltage of at both V., = 1.1V and Vc,, = 0.75V. If desired, area upscaling couldthe input V,,,, [4], which is a problem in applications with wide further reduce the offset at the expense of power (P 2)common-mode ranges, for example A/D converters. Offset compensation schemes [5] are a good alternative if the

As an alternative, a double-tail sense amplifier is presented here, application allows for the added complexity. The power consumedwhich uses one tail for the input stage and another for the latch- by the SA is 113fJ/decision when AVi is 5OmV (f[1k = 1GHz, VDD =ing stage, as shown in Fig. 17.7.2. This topology has less stack- 1.2V, P = 113gtW f 1GHz, or 225gW @ 2 GHz), which drops toing and can therefore operate at lower supply voltages. The dou- 92fJ/decision for full-swing inputs.ble tail enables both a large current in the latching stage (wide The SAs equivalent input noise was extracted, by measuring theM12), for fast latching independent of the V,,,, and a small cur- average number of positive decisions versus AV,, as shown in Fig.rent in the input stage (small M9), for low offset. 17.7.5. Fitting the measurements to a Gaussian cumulative dis-The signal behavior of the double-tail SA is also shown in Fig. tribution gives an RMS noise voltage of V,,,,= 1.5mV.17.7.2. During the reset phase (Clk = 0V), transistors M7 and M8 Setup and hold times are extracted from BER measurementspre-charge the Di nodes to VDD, which in turn causes M10 and around the zero crossings of the full-swing input patterns, asMll to discharge the output nodes to ground (so there is no need shown in Fig. 17.7.6. No bit errors are measured outside an inter-for dedicated reset transistors at the output nodes), After the val of 18ps, so the required setup+hold time is smaller than 18psreset phase, the tail transistors M9 and M12 turn on (Clk = VDD). (as input jitter is part of the 18ps). A conventional circuit inAt the Di nodes, the common-mode voltage then drops monotoni- 0.18gm CMOS [3] achieves 80ps, which would still be 40ps incally with a rate defined by ITMICDi and on top of this, an input 9Onm CMOS according to scaling theory. In the double-tail topol-dependent differential voltage AVDi will build up. The intermedi- ogy, the setup+hold time could be further reduced with a widerate stage formed by Mi10 and Mll passes AVDi to the cross-coupled tail transistor M9, but at the expense of increased offset andinverters and also provides additional shielding between the noise due to a shortening of the time that M5/M6 operate in sat-input and output, with less kickback noise as a result. The invert- uration. Simulations predict that the current aperture time isers start to regenerate the voltage difference as soon as the com- already fast enough to sample data patterns of 4OGb/s, providedmon-mode voltage at the Di nodes is no longer high enough for that interleaving is used to enable a suitably long regenerationM10 and Mll to clamp the outputs to ground. The ideal operat- phase.ing point (V'>,) and the timing of the various phases can be tunedwith the transistor sizes. In conclusion, the double-tail topology has an added degree of

freedom that enables better optimization of the balance betweenTo compare the conventional and double-tail SAs, both circuits speed, offset, power and common-mode voltage. The circuit alsoare simulated, with transistor dimensions scaled to get an offset has better isolation between input and output and is well suitedstandard deviation of a,, = 13mV. The operating conditions are to operate at low supply voltages.VDD = 1.2V and f,, = 3GHz, and the input has VC11= 1.1V. At thishigh V>,, (found, for example, in memories), the conventional Acknowledgements:

topology ne rettniosthDThe authors thank Philips Research for chip fabrication, the Dutchtopology needs reset tranLsistors at the Di nodes [1,5] to ensure Technology Foundation (STW, project TCS.5791) for funding and Gerardthat M5 and M6 at least start in saturation. The power consumed Wienk for assistance.by both circuits is similar, about 40fJ/bit. Figure 17.7.3 shows thedelays of both circuits versus the differential input voltage. The References:

feedback gives a logarithmic relation between the delay [1] I. Zhang, V. George and J. Rabaey, "Low-Swing On-Chip SignalingposAitivee r e

Techniques: Effectiveness and Robustness," IEEE T VLSI Systems, pp.and Al">: 37ps per decade for the double-tail SA anLd 37 to .'264-272, June, 2000.43ps/dec for the conventional SA. The double-tail SA is both [2] D. Schinkel et al., "A 3Gb/s/ch Transceiver for 10-mm Uninterruptedfaster in general and the delay increases by only 7ps when V,,. RC-Limited Global On-Chip Interconnects," IEEE J. Solid-State Circuits,drops to 0.7V, instead of the 25 to 60ps increase for the conven- pp. 297-306, Jan,, 2006.

tional topology. When simulated at VDD= 1Vthe delay of thedou [31 B. Nikolic et al., "Improved Sense-Amplifier-Based Flip-Flop: Designtional topology. Whenst,1imulatedt at VDD = 1 V, the delay of the doU- and Measurements," IEEE J. Solid-State Circuits, pp. 876-884, June,ble-tail SA is lps larger versus 29ps for the conventional SA. 2000

[41 B. Wicht at at., 'Yield anld Speed Optimnization of a Latch-Type VoltageThne dLouble-tail SA was implementedt in a 1.2V CMOSlU OOnm tech- Sense Amplifier," IEEE J. Solid-Stnte circuits, pp. 1148-1158, JulLy, 2004.nology, as part of a low-swing on-chip data transceiver that oper- [51 K.-Lid. Wong and C.-K.K. Yang, "Offset CompensatDion inl ComparatorsatesU aound1 V. = 1.1V. The 1, can hae, large variations due to, with Minimum Input-Referred Supply Noise," IEEE J. Solid-Statefor examnple, crosstalk effects. A double-tail SA with dedicated circuits, pp. 837- 840, May, 2004.

314 200 IEEE11 Internaionlll Soid-lState Circuits Conference 1Il4244l0852lO/07/$2500 ©2007 IEEE.Authorized licensed use limited to: Ehsan Zhian Tabasy. Downloaded on November 15, 2009 at 04:17 from IEEE Xplore. Restrictions apply.

ISSCC 2007 1 February 13, 2007 /4:15 PM

VDD VDD

Cl- M12M7 L I\A8AVIN: 200.OmV 1O.OmV

M. 1.4

Clk~~~~~~ Clk-.-------- ------------

Out+u+ Ot -Out- ~~~~~~~~~~~~~~~~~-0.2 -__

MIO Mlr M3 Mu 1.2L ~~~~~~~~~~~~~~~~~~~~~~~~~~Di+

--l Covntoa SAVcDD7

M7 VMM8 Di 0':'D

0-&-ConvetionalSA,Vcm=1.1V1k0 il 18011.4

Dela.OtIn+-I 5 M6 n0

M5 M6 I Ou_t-In+- ________n__....._............-0.2

800.Op I n1n .2n t (s)

Figure 17.71: Conventional latch-type voltage sense amplitier. The dotted transistorsare examples ot common variations. Figure 17.72: Double-tail latch-type voltage sense amplitier and signal behavior.

250Conventional SA, Vcm=0.7VConventional SA, Vcm=l1 lV 180 180

-+-Double-Tail SA, Vcm=0.7V IF- Vcm=0.6V, Vdd=1 V AVin=5OmV, Vdd=1 .2V]200 -------E Double-Tail SA, Vcm=l .1 V --Vcm=0.75V, Vdd=1.2V 16

140 Vcm=0.75V, Vdd0 ----------------2V-----cn 27 12 80 120 -------------------- 120

CZ

Xia ~ ~ ~ ~~~~~ 1,00 1>f> T0 001 1CZ,

CD~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~C

CZ -6

-- -4

20 -- -20 -

50 00 0 0t18 2 010 103 102 io 100 101 102 0.5 1 1.5AVin (V) AVin-Voffset (mV) Vcm (V)E ~~~~~Vn (V)

Figure 17.7.3: Simulated sense amplifier delays versus differential input voltage. The Figure 17.7.4: Measured relative delay versus differential and common-mode inputdelay is the time between the clock edge and the instant when AOut crosses 1/2 VDD. voltages.

I measured ' 0 1 0 P attern0.8 Gaussian, cy =1 .5mV 10.-2 / PRBS Pattern

0.8 ,1 10 f ;1I~~ ~ ~ ~~~~ ~~~~~~~~~~~~~ /11

I 0.6EE -6

0~~~~~~~~~~~~~~~i11,0.4r- 081

0.2I / 1 10 ;+,t l,~~~~~~~~~~1

1 8ps setup + hold time

-6 -4 -2 0 2 4 6 -610 -605 -600 -595 -590 -585AVin - Voffset (mV) Clock skew (ps)

Figure 17.7.5: Measured cumulative noise distribution and tit to Gaussian distribution. Figure 17.7.6: Bit error rate versus clock skew, at tcIk= 1GHiz.Conztinued onz Patge 605

DIGEST OFTECHNICAL PAPERS * 315Authorized licensed use limited to: Ehsan Zhian Tabasy. Downloaded on November 15, 2009 at 04:17 from IEEE Xplore. Restrictions apply.

60 * 200 IEE Inentoa Solid3-Stt Cicut Cofrec 1-4m44 0852-O/.0X7/$25_0 ©20 IEEEl.

Authorized licensed use limited to: Ehsan Zhian Tabasy. Downloaded on November 15, 2009 at 04:17 from IEEE Xplore. Restrictions apply.