Lecture 19 Carry-Lookahead...

20
EE241 1 UC Berkeley EE241 B. Nikolic, J. Rabaey EE241 - Spring 2003 Advanced Digital Integrated Circuits Lecture 19 Carry-Lookahead Adders UC Berkeley EE241 B. Nikolic, J. Rabaey Carry-Lookahead Adders l Adder trees » Radix of a tree » Minimum depth trees » Sparse trees l Logic manipulations » Conventional vs. Ling » Stack height limiting

Transcript of Lecture 19 Carry-Lookahead...

EE241

1

UC Berkeley EE241 B. Nikolic, J. Rabaey

EE241 - Spring 2003Advanced Digital Integrated Circuits

Lecture 19Carry-Lookahead Adders

UC Berkeley EE241 B. Nikolic, J. Rabaey

Carry-Lookahead Addersl Adder trees

» Radix of a tree» Minimum depth trees» Sparse trees

l Logic manipulations» Conventional vs. Ling» Stack height limiting

EE241

2

UC Berkeley EE241 B. Nikolic, J. Rabaey

Propagate and Generate Signals

Define 3 new variable which ONLY depend on ai, bi

Generate (gi) = aibi

Propagate (pi) = ai + bi (could be XOR as well)

Delete = ai bi

Can also derive expressions for s and cout based on di

and pi

( )iniii

iniiiiout

cgpgs

cpgpgc

⊕=

+=

),(

,

UC Berkeley EE241 B. Nikolic, J. Rabaey

A0,B0 A1,B1 AN-1,BN-1...

Ci,0 P0 Ci,1 P1Ci,N-1 PN-1

...

Carry Lookahead Adder

Weinberger, Smith, 1958.

EE241

3

UC Berkeley EE241 B. Nikolic, J. Rabaey

Lookahead Adder

1−+= iiii cpgc

Looakahead Equations

( )1111

111

111

−+++

−++

+++

++=++=

+=

iiiiii

iiiii

iiii

cppgpg

cpgpg

cpgcPosition i:

Position i + 1:

Carry exists if:- generated in stage i + 1- generated in stage i and propagated through i + 1- propagated through both i and i + 1

UC Berkeley EE241 B. Nikolic, J. Rabaey

Lookahead Adder

• Unrolling of carry recurrence can be continued• If unrolled to level k, resulting in two-level AND-OR

structure• AND Fan-In = k + 1, OR Fan-In = k + 1• k + 1 transistors in the MOS stack• Limits k to 2 – 4 • Later referred to as a radix of an adder

EE241

4

UC Berkeley EE241 B. Nikolic, J. Rabaey

Lookahead AdderVDD

P3

P2

P1

P0

G3

G2

G1

G0

Ci,0

Co,3

Mirror Implementation

UC Berkeley EE241 B. Nikolic, J. Rabaey

Block Lookahead

1123123

1232334

−++++++

+++++++

++

++=

iiiiiiiii

iiiiiii

cppppgppp

gppgpgcFourth bit carry:

iiiiiiiiiiii gpppgppgpgG 1231232333, ++++++++++ +++=

iiiiii ppppP 1233, ++++ =

13,3,4 −+++ += iiiiii cPGc

Block generate and block propagate:

EE241

5

UC Berkeley EE241 B. Nikolic, J. Rabaey

Block Lookahead

Can create groups of groups, or ‘super-groups’:

jjjjjjjjjjjj GPPPGPPGPGG 123123233*

:3 ++++++++++ +++=

jjjjjj pPPPP 123*

:3 ++++ =

Delay is Nctd log1=

UC Berkeley EE241 B. Nikolic, J. Rabaey

Block LookaheadFrom Oklobdzija

EE241

6

UC Berkeley EE241 B. Nikolic, J. Rabaey

Carry Lookahead Trees

Co 0, G0 P0Ci 0,+=

Co 1, G1 P1 G0 P1P0 Ci 0,+ +=

Co 2, G2 P2G1 P2 P1G0 P+ 2 P1P0C i 0,+ +=

G2 P2G1+( )= P2P1( ) G0 P0Ci 0,+( )+ G 2:1 P2:1Co 0,+=

Can continue building the tree hierarchically.

UC Berkeley EE241 B. Nikolic, J. Rabaey

Tree Adders

lmG ppP ⋅=

lmmG gpgG ⋅+=

m – more significantl – less significant

Start from the input P, G, and continue up the tree2-bit groups, then 4-bit groups, …

( ) ( ) ( )lmlmmllmm ppgpgpgpgpg ⋅⋅+=•= ,,,),(

Kogge, Stone, Trans on Comp,’73 Radix 2

EE241

7

UC Berkeley EE241 B. Nikolic, J. Rabaey

Tree Adders: Radix 2

16-bit radix-2 Kogge-Stone Tree

(A0,

B0)

(A1,

B1)

(A2,

B2)

(A3,

B3)

(A4,

B4)

(A5,

B5)

(A6,

B6)

(A7,

B7)

(A8,

B8)

(A9,

B9)

(A10

, B10

)

(A11

, B11

)

(A12

, B12

)

(A13

, B13

)

(A14

, B14

)

(A15

, B15

)

S0

S1

S2

S3

S4

S5

S6

S7

S8

S9

S10

S11

S12

S13

S14

S15

UC Berkeley EE241 B. Nikolic, J. Rabaey

Tree Adders: Radix 4

(a0,

b0)

(a1,

b1)

(a2,

b2)

(a3,

b3)

(a4,

b4)

(a5,

b5)

(a6,

b6)

(a7,

b7)

(a8,

b8)

(a9,

b9)

(a10

, b10

)

(a11

, b11

)

(a12

, b12

)

(a13

, b13

)

(a14

, b14

)

(a15

, b15

)

S0

S1

S2

S3

S4

S5

S6

S7

S8

S9

S10

S11

S12

S13

S14

S15

16-bit radix-4 Kogge-Stone Tree

EE241

8

UC Berkeley EE241 B. Nikolic, J. Rabaey

Sparse Trees

(a0,

b0)

(a1,

b1)

(a2,

b2)

(a3,

b3)

(a4,

b4)

(a5,

b5)

(a6,

b6)

(a7,

b7)

(a8,

b8)

(a9,

b9)

(a10

, b1

0)

(a11

, b1

1)

(a12

, b1

2)

(a13

, b1

3)

(a14

, b1

4)

(a15

, b1

5)

S1

S3

S5

S7

S9

S11

S13

S15

S0

S2

S4

S6

S8

S1

0

S1

2

S1

4

16-bit radix-2 sparse tree with sparseness of 2 (Han-Carlson)

UC Berkeley EE241 B. Nikolic, J. Rabaey

Full vs. Sparse Treesl Sparse trees have

less transistors, wires» Less power

l Less input loadingl Recovering missing

carries» Ripple (extra gate

delay)» Precompute (extra

fanout)

l Complex precompute can get into the critical path

Adder Delay [FO4]

Tot

al T

rans

isto

r Wid

th [u

nit w

idth

/bit]

300

400

500

600

700

800

900

1000

7 9 11 13 15 17

Radix-4 Kogge-Stone

Radix-4 2-Sparse

Radix-4 4-Sparse

-23.3%

EE241

9

UC Berkeley EE241 B. Nikolic, J. Rabaey

Tree Adders: Other Treesl Ladner-Fischer

(A0,

B0)

(A1,

B1)

(A2,

B2)

(A3,

B3)

(A4,

B4)

(A5,

B5)

(A6,

B6)

(A7,

B7)

(A8,

B8)

(A9,

B9)

(A10

, B10

)

(A11

, B11

)

(A12

, B12

)

(A13

, B13

)

(A14

, B14

)

(A15

, B15

)

S0

S1

S2

S3

S4

S5

S6

S7

S8

S9

S10

S11

S12

S13

S14

S15

UC Berkeley EE241 B. Nikolic, J. Rabaey

Other Sparse Trees

Mathew, VLSI’02

EE241

10

UC Berkeley EE241 B. Nikolic, J. Rabaey

Ling Adder

Variation of CLA

Ling, IBM J. Res. Dev, 5/81

1−⋅+= iiii GpgG

1−⊕= iii GpS

iii bap ⊕=

iii bag ⋅=

11 −− ⋅+= iiii HtgH

11 −−+⊕= iiiiii HtgHtS

iii bat +=

iii bag ⋅=

Ling’s equations

UC Berkeley EE241 B. Nikolic, J. Rabaey

Ling Adder

1−⋅+= iiii GpgG

1−⋅+= iiii GtgG 11 −− ⋅+= iiii GtgH

Ling’s equation shifts the index ofpseudo carry

Doran, Trans on Comp 9/88

Propagates informationon two bits

Conventional CLA:

Also:

EE241

11

UC Berkeley EE241 B. Nikolic, J. Rabaey

Ling Adder

01231232333 gtttgttgtgG +++=

0121223

00121122233

gttgtgg

gtttgttgtgH

+++=+++=

Conventional

Ling

Reduces the stack height (or width)Reduces input loading

UC Berkeley EE241 B. Nikolic, J. Rabaey

Stack Height Limitingl Transform conventional G, P

EE241

12

UC Berkeley EE241 B. Nikolic, J. Rabaey

HP Adder

Naffziger, ISSCC’96

01234 ppppi =

UC Berkeley EE241 B. Nikolic, J. Rabaey

HP Adder – Differential Domino

Carry rippleSum select

EE241

13

UC Berkeley EE241 B. Nikolic, J. Rabaey

Hybrid Adders

Dobberpuhl, JSSC 11/92 DEC Aplha 21064

UC Berkeley EE241 B. Nikolic, J. Rabaey

DEC Adderl Combination:

» 8-bit tapered pre-discharged Manchester carry chains, with Cin = 0 and Cin = 1

» 32-bit LSB carry-lookahead» 32-bit MSB conditional sum adder» Carry-select on most significant bits» Latch-based timing

EE241

14

UC Berkeley EE241 B. Nikolic, J. Rabaey

Z X·· Y× Zk2k

k 0=

M N 1–+

∑= =

Xi2i

i 0=

M 1–

Yj2j

j 0=

N 1–

=

XiYj2i j+

j 0=

N 1–

i 0=

M 1–

∑=

X Xi2i

i 0=

M 1–

∑=

Y Yj2j

j 0=

N 1–

∑=

with

Binary Multiplication

UC Berkeley EE241 B. Nikolic, J. Rabaey

1 0 1 1

1 0 1 0 1 0

0 0 0 0 0 0

1 0 1 0 1 0

1 0 1 0 1 0

1 0 1 0 1 0

×

1 1 1 0 0 1 1 1 0

+

Partial Products

AND operation

Binary Multiplication

N+ M bits in the final sum

N bits

M bits

EE241

15

UC Berkeley EE241 B. Nikolic, J. Rabaey

Shift-and-Add Multiplierl Standard adder and shift-in the

multiplicandl Shift the result as well and addl N cyclesl Parallel adders add more hardware

(adders) instead.

UC Berkeley EE241 B. Nikolic, J. Rabaey

HA FA FA HA

FA FA FA HA

FA FA FA HA

X0X1X2X3 Y1

X0X1X2X3 Y2

X0X1X2X3 Y3

Z1

Z2

Z3Z4Z5Z6

Z0

Z7

Array Multiplier

EE241

16

UC Berkeley EE241 B. Nikolic, J. Rabaey

HA FA FA HA

HAFAFAFA

FAFA FA HA

Critical Path 1

Critical Path 2

Critical Path 1 & 2

MxN Array Multiplier— Critical Path

UC Berkeley EE241 B. Nikolic, J. Rabaey

A

B

P

Ci

VDD A

A A

VDD

Ci

A

P

AB

VDD

VDD

Ci

Ci

Co

S

Ci

P

P

P

P

P

Identical Delays for Carry and Sum

Adder Cells in Array Multiplier

EE241

17

UC Berkeley EE241 B. Nikolic, J. Rabaey

HA HA HA HA

FAFAFAHA

FAHA FA FA

FAHA FA HA

Vector Merging Adder

Carry-Save Multiplier

UC Berkeley EE241 B. Nikolic, J. Rabaey

SCSCSCSC

SCSCSCSC

SCSCSCSC

SC

SC

SC

SC

Z0

Z1

Z2

Z3Z4Z5Z6Z7

X0X1X2X3

Y1

Y2

Y3

Y0

Vector Merging Cell

HA Multiplier Cell

FA Multiplier Cell

X and Y signals are broadcastedthrough the complete array.( )

Multiplier Floorplan

EE241

18

UC Berkeley EE241 B. Nikolic, J. Rabaey

Multipliersl Partial product generationl Partial product accumulationl Final summation

UC Berkeley EE241 B. Nikolic, J. Rabaey

Generating Partial Productsl All partial products: AND

l Booth’s recoding – reduction of partial product count

X7

PP7

X6

PP6

X5

PP5

X4

PP4

X3

PP3

X2

PP2

X1

PP1

X0

PP0

EE241

19

UC Berkeley EE241 B. Nikolic, J. Rabaey

Booth Recodingl Instead of generating all the partial products

0 * x = 0 1 * x = x x={0,1}l Reduce the number of partial products

by grouping

0 0 00 1 1*1 0 2* (shift)1 1 3* (or 4* -1)

Booth’51

UC Berkeley EE241 B. Nikolic, J. Rabaey

Booth Recodingl Instead of using set {0, 1*Y, 2*Y, 3*Y}l Use {0, 1*Y, 2*Y, 4*Y, -Y}l Shifting and complementingl 3*Y = 4*Y – Yl Can be simplified by looking into three bits –

modified Booth recoding

EE241

20

UC Berkeley EE241 B. Nikolic, J. Rabaey

Modified Booth Recoding

Two bits and the MSB of previous two

UC Berkeley EE241 B. Nikolic, J. Rabaey

Accumulating Partial Products

Partial product matrix Reorganized matrix