UC Berkeley EE241 B. Nikolic, J. Rabaey
EE241 - Spring 2003
Advanced Digital Integrated Circuits
Lecture 19
Carry-Lookahead Adders
Carry-Lookahead Adders
• Adder trees
  » Radix of a tree
  » Minimum depth trees
  » Sparse trees
• Logic manipulations
  » Conventional vs. Ling
  » Stack height limiting
Propagate and Generate Signals
Define 3 new variables which ONLY depend on $a_i$, $b_i$:
Generate: $g_i = a_i b_i$
Propagate: $p_i = a_i + b_i$ (could be XOR as well)
Delete: $d_i = \bar{a}_i \bar{b}_i$
Can also derive expressions for $s$ and $c_{out}$ based on $g_i$ and $p_i$:
$c_{out}(g_i, p_i) = g_i + p_i c_{in}$
$s(g_i, p_i) = p_i \oplus c_{in}$ (using the XOR form of $p_i$)
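The generate/propagate form of the full adder can be checked exhaustively. The sketch below is illustrative only; the function name `full_adder_gp` is an assumption, and it uses the XOR form of propagate so that the same signal can also form the sum.

```python
from itertools import product

def full_adder_gp(a, b, c_in):
    """Return (sum, carry_out) computed from generate/propagate signals."""
    g = a & b                 # generate: this bit creates a carry by itself
    p = a ^ b                 # propagate (XOR form): this bit passes c_in along
    s = p ^ c_in
    c_out = g | (p & c_in)
    return s, c_out

# Compare against ordinary integer addition for all 8 input combinations.
for a, b, c_in in product((0, 1), repeat=3):
    s, c_out = full_adder_gp(a, b, c_in)
    assert 2 * c_out + s == a + b + c_in
```

With the OR form of propagate, the carry expression is unchanged, but the sum can no longer be taken directly as p XOR c_in.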
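Carry Lookahead Adder
[Figure: N-bit carry-lookahead adder; bit slices with inputs (A0, B0), (A1, B1), ..., (AN-1, BN-1), carry inputs Ci,0 ... Ci,N-1, and propagate signals P0 ... PN-1.]
Weinberger, Smith, 1958.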
Lookahead Adder
Lookahead Equations
Position i:
$c_i = g_i + p_i c_{i-1}$
Position i + 1:
$c_{i+1} = g_{i+1} + p_{i+1} c_i = g_{i+1} + p_{i+1} (g_i + p_i c_{i-1}) = g_{i+1} + p_{i+1} g_i + p_{i+1} p_i c_{i-1}$
Carry exists if:
- generated in stage i + 1
- generated in stage i and propagated through i + 1
- propagated through both i and i + 1
Lookahead Adder
• Unrolling of carry recurrence can be continued
• If unrolled to level k, the result is a two-level AND-OR structure
• AND Fan-In = k + 1, OR Fan-In = k + 1
• k + 1 transistors in the MOS stack
• Limits k to 2 – 4
• Later referred to as the radix of an adder
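The fan-in growth can be made concrete by generating the unrolled AND-OR expression. This is a sketch under stated assumptions: `unrolled_carry_terms`, `ripple_carry`, and `eval_and_or` are hypothetical helpers, k counts the bit positions spanned, index j stands for position i + j, and "c" stands for the incoming carry.

```python
from itertools import product

def unrolled_carry_terms(k):
    """Product terms of the carry out of k positions, as lists of signal names.

    Signals inside a term are ANDed; the terms themselves are ORed.
    """
    terms = []
    for j in range(k - 1, -1, -1):
        props = [f"p{m}" for m in range(k - 1, j, -1)]
        terms.append(props + [f"g{j}"])          # generated at j, propagated above j
    terms.append([f"p{m}" for m in range(k - 1, -1, -1)] + ["c"])
    return terms

def ripple_carry(k, v):
    """Reference: apply c = g + p*c one position at a time."""
    c = v["c"]
    for j in range(k):
        c = v[f"g{j}"] | (v[f"p{j}"] & c)
    return c

def eval_and_or(terms, v):
    """Evaluate the two-level AND-OR structure."""
    return int(any(all(v[name] for name in term) for term in terms))

for k in range(1, 5):
    terms = unrolled_carry_terms(k)
    names = [f"g{j}" for j in range(k)] + [f"p{j}" for j in range(k)] + ["c"]
    for bits in product((0, 1), repeat=len(names)):
        v = dict(zip(names, bits))
        assert eval_and_or(terms, v) == ripple_carry(k, v)
    # Columns: k, OR fan-in (number of terms), largest AND fan-in.
    print(k, len(terms), max(len(t) for t in terms))
```

Both fan-ins come out as k + 1, which is the stack height that limits k in practice.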
Lookahead Adder
Mirror Implementation
[Figure: mirror implementation of the 4-bit lookahead carry gate; inputs P0–P3, G0–G3 and Ci,0, output Co,3.]
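Block Lookahead
Fourth bit carry:
$c_{i+3} = g_{i+3} + p_{i+3} g_{i+2} + p_{i+3} p_{i+2} g_{i+1} + p_{i+3} p_{i+2} p_{i+1} g_i + p_{i+3} p_{i+2} p_{i+1} p_i c_{i-1}$
Block generate and block propagate:
$G_{i,i+3} = g_{i+3} + p_{i+3} g_{i+2} + p_{i+3} p_{i+2} g_{i+1} + p_{i+3} p_{i+2} p_{i+1} g_i$
$P_{i,i+3} = p_{i+3} p_{i+2} p_{i+1} p_i$
$c_{i+3} = G_{i,i+3} + P_{i,i+3} c_{i-1}$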
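A quick way to check the block signals is to compute them for every pair of 4-bit operands and compare with the true carry out of the block. This is a minimal sketch; `bits` and `block_gp` are hypothetical helper names, and the OR form of propagate is assumed.

```python
def bits(x, n):
    """Bits of x, least significant first."""
    return [(x >> k) & 1 for k in range(n)]

def block_gp(g, p):
    """Return block (G, P); g[0] and p[0] belong to the least significant bit."""
    G, P = 0, 1
    for gi, pi in zip(g, p):
        G = gi | (pi & G)     # generated here, or generated below and propagated
        P = P & pi            # the block propagates only if every bit propagates
    return G, P

for a in range(16):
    for b in range(16):
        g = [x & y for x, y in zip(bits(a, 4), bits(b, 4))]
        p = [x | y for x, y in zip(bits(a, 4), bits(b, 4))]
        G, P = block_gp(g, p)
        for c_in in (0, 1):
            # Carry out of the 4-bit block is G + P * c_in.
            assert (G | (P & c_in)) == (a + b + c_in) >> 4
```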
Block Lookahead
Can create groups of groups, or ‘super-groups’:
$G^*_{j:j+3} = G_{j+3} + P_{j+3} G_{j+2} + P_{j+3} P_{j+2} G_{j+1} + P_{j+3} P_{j+2} P_{j+1} G_j$
$P^*_{j:j+3} = P_{j+3} P_{j+2} P_{j+1} P_j$
Delay is $t_d = c_1 \log N$
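The logarithmic growth can be illustrated by grouping (G, P) pairs four at a time until a single pair remains and counting the levels. This is a sketch, not a timing model: `combine`, `lookahead_tree`, and `carry_out` are hypothetical names, and the group size of 4 is an assumption taken from the 4-bit blocks above.

```python
import random

def combine(groups):
    """Merge (G, P) pairs, least significant first, into one (G, P)."""
    G, P = 0, 1
    for g, p in groups:
        G, P = g | (p & G), P & p
    return G, P

def lookahead_tree(gp, radix=4):
    """Group pairs `radix` at a time; return the final pair and the level count."""
    levels = 0
    while len(gp) > 1:
        gp = [combine(gp[k:k + radix]) for k in range(0, len(gp), radix)]
        levels += 1
    return gp[0], levels

def carry_out(a, b, c_in, n):
    """Carry out of an n-bit addition and the number of lookahead levels used."""
    gp = [((a >> k) & (b >> k) & 1, ((a >> k) | (b >> k)) & 1) for k in range(n)]
    (G, P), levels = lookahead_tree(gp)
    return G | (P & c_in), levels

random.seed(0)
for _ in range(1000):
    a, b, c = random.getrandbits(16), random.getrandbits(16), random.getrandbits(1)
    assert carry_out(a, b, c, 16)[0] == (a + b + c) >> 16

for n in (4, 16, 64, 256):
    print(n, carry_out(0, 0, 0, n)[1])   # number of levels grows as log4(n)
```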
Block Lookahead
[Figure: block lookahead adder. From Oklobdzija.]
Carry Lookahead Trees
$C_{o,0} = G_0 + P_0 C_{i,0}$
$C_{o,1} = G_1 + P_1 G_0 + P_1 P_0 C_{i,0}$
$C_{o,2} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{i,0}$
$= (G_2 + P_2 G_1) + (P_2 P_1)(G_0 + P_0 C_{i,0}) = G_{2:1} + P_{2:1} C_{o,0}$
Can continue building the tree hierarchically.
Tree Adders
$P_G = p_m \cdot p_l$
$G_G = g_m + p_m \cdot g_l$
m – more significant
l – less significant
Start from the input P, G, and continue up the tree: 2-bit groups, then 4-bit groups, …
$(g, p) = (g_m, p_m) \bullet (g_l, p_l) = (g_m + p_m \cdot g_l,\ p_m \cdot p_l)$
Kogge, Stone, Trans on Comp, ’73
Radix 2
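The dot operator is enough to build a complete prefix adder. The following is a minimal behavioral sketch of a radix-2 Kogge-Stone network, not a gate-level description; the names `dot` and `kogge_stone_add` are assumptions, the carry-in is taken as 0, and the XOR form of propagate is used so that it can also form the sum bits.

```python
def dot(hi, lo):
    """(g, p) = (g_m, p_m) . (g_l, p_l); hi is the more significant group."""
    g_m, p_m = hi
    g_l, p_l = lo
    return g_m | (p_m & g_l), p_m & p_l

def kogge_stone_add(a, b, n=16):
    """Add two n-bit integers with a radix-2 Kogge-Stone prefix network."""
    gp = [((a >> k) & (b >> k) & 1, ((a >> k) ^ (b >> k)) & 1) for k in range(n)]
    p_bit = [p for _, p in gp]           # per-bit propagate, kept for the sum
    span = 1
    while span < n:                      # log2(n) levels: spans 1, 2, 4, ...
        gp = [dot(gp[k], gp[k - span]) if k >= span else gp[k] for k in range(n)]
        span *= 2
    carries = [0] + [g for g, _ in gp]   # carries[k] is the carry into bit k
    total = 0
    for k in range(n):
        total |= (p_bit[k] ^ carries[k]) << k
    return total | (carries[n] << n)     # include the final carry out

assert kogge_stone_add(0xFFFF, 1) == 0x10000
```

Each pass of the loop corresponds to one level of the tree: every position combines its current group with the group `span` positions below it, so 16 bits need four levels.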
Tree Adders: Radix 2
[Figure: 16-bit radix-2 Kogge-Stone tree; inputs (A0, B0) … (A15, B15), outputs S0 … S15.]
Tree Adders: Radix 4
[Figure: 16-bit radix-4 Kogge-Stone tree; inputs (a0, b0) … (a15, b15), outputs S0 … S15.]
Sparse Trees
[Figure: 16-bit radix-2 sparse tree with sparseness of 2 (Han-Carlson); inputs (a0, b0) … (a15, b15), outputs S0 … S15.]
Full vs. Sparse Trees
• Sparse trees have fewer transistors, wires
  » Less power
• Less input loading
• Recovering missing carries
  » Ripple (extra gate delay)
  » Precompute (extra fanout)
• Complex precompute can get into the critical path
[Figure: total transistor width [unit width/bit] (300–1000) vs. adder delay [FO4] (7–17) for Radix-4 Kogge-Stone, Radix-4 2-Sparse, and Radix-4 4-Sparse trees; annotated difference of -23.3%.]
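The ripple option for recovering missing carries can be sketched as follows. This is an illustrative behavioral model with sparseness 2, not the exact Han-Carlson wiring; `dot` and `sparse2_add` are hypothetical names, n is assumed even, and the carry-in is taken as 0.

```python
def dot(hi, lo):
    """Combine a more significant (g, p) group with a less significant one."""
    g_m, p_m = hi
    g_l, p_l = lo
    return g_m | (p_m & g_l), p_m & p_l

def sparse2_add(a, b, n=16):
    """Add two n-bit integers; the prefix tree supplies every second carry only."""
    g = [(a >> k) & (b >> k) & 1 for k in range(n)]
    p = [((a >> k) ^ (b >> k)) & 1 for k in range(n)]
    # First level: merge each pair of adjacent bits into a 2-bit group.
    pairs = [dot((g[k + 1], p[k + 1]), (g[k], p[k])) for k in range(0, n, 2)]
    # Prefix tree over n/2 groups instead of n bits.
    span = 1
    while span < len(pairs):
        pairs = [dot(pairs[k], pairs[k - span]) if k >= span else pairs[k]
                 for k in range(len(pairs))]
        span *= 2
    carry = [0] * (n + 1)                 # carry[k] is the carry into bit k
    for j, (G, _) in enumerate(pairs):
        carry[2 * j + 2] = G              # carries produced by the sparse tree
    for k in range(0, n, 2):
        carry[k + 1] = g[k] | (p[k] & carry[k])   # missing carry: one ripple step
    total = 0
    for k in range(n):
        total |= (p[k] ^ carry[k]) << k
    return total | (carry[n] << n)

assert sparse2_add(0xFFFF, 1) == 0x10000
```

The last ripple step is the extra gate delay mentioned above; the precompute alternative would instead prepare both candidate results and select between them with the tree carry.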
Tree Adders: Other Trees
• Ladner-Fischer
[Figure: Ladner-Fischer tree; inputs (A0, B0) … (A15, B15), outputs S0 … S15.]
Other Sparse Trees
Mathew, VLSI’02
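Ling Adder
Variation of CLA
Ling, IBM J. Res. Dev, 5/81
Conventional CLA:
$G_i = g_i + p_i \cdot G_{i-1}$
$S_i = p_i \oplus G_{i-1}$
$p_i = a_i \oplus b_i$
$g_i = a_i \cdot b_i$
Ling’s equations:
$H_i = g_i + t_{i-1} \cdot H_{i-1}$
$S_i = (t_i \oplus H_i) + g_i t_{i-1} H_{i-1}$
$t_i = a_i + b_i$
$g_i = a_i \cdot b_i$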
Ling Adder
Conventional CLA: $G_i = g_i + p_i \cdot G_{i-1}$
Also: $G_i = g_i + t_i \cdot G_{i-1}$
Ling’s equation shifts the index of the pseudo carry: $H_i = g_i + t_{i-1} \cdot G_{i-1}$
Propagates information on two bits
Doran, Trans on Comp 9/88
Ling Adder
Conventional:
$G_3 = g_3 + t_3 g_2 + t_3 t_2 g_1 + t_3 t_2 t_1 g_0$
Ling:
$H_3 = g_3 + t_2 g_2 + t_2 t_1 g_1 + t_2 t_1 t_0 g_0 = g_3 + g_2 + t_2 g_1 + t_2 t_1 g_0$
Reduces the stack height (or width)
Reduces input loading
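Ling's recurrences can be checked against ordinary addition. This is a bit-serial sketch for verification only; `ling_add` is a hypothetical name, the carry-in is assumed to be 0, and the carry out is recovered with the identity G_i = t_i H_i, which follows from H_i = G_i + G_{i-1} but is not stated on the slides.

```python
def ling_add(a, b, n=16):
    """Add two n-bit integers using Ling's pseudo-carry H; return (sum, H list)."""
    g = [(a >> k) & (b >> k) & 1 for k in range(n)]
    t = [((a >> k) | (b >> k)) & 1 for k in range(n)]
    H = [0] * n
    total = 0
    t_prev, h_prev = 0, 0                 # t_{-1} H_{-1} stands in for c_in = 0
    for i in range(n):
        H[i] = g[i] | (t_prev & h_prev)
        s_i = (t[i] ^ H[i]) | (g[i] & t_prev & h_prev)
        total |= s_i << i
        t_prev, h_prev = t[i], H[i]
    carry_out = t[n - 1] & H[n - 1]       # true carry G = t * H (assumed identity)
    return total | (carry_out << n), H

for a in range(16):
    for b in range(16):
        total, H = ling_add(a, b, 4)
        assert total == a + b
        g = [(a >> k) & (b >> k) & 1 for k in range(4)]
        t = [((a >> k) | (b >> k)) & 1 for k in range(4)]
        # Simplified unrolled form of H3: one literal fewer per term than G3.
        assert H[3] == g[3] | g[2] | (t[2] & g[1]) | (t[2] & t[1] & g[0])
```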
Stack Height Limiting
• Transform conventional G, P
HP Adder
Naffziger, ISSCC’96
$i_4 = p_3 p_2 p_1 p_0$
HP Adder – Differential Domino
Carry ripple
Sum select
Hybrid Adders
Dobberpuhl, JSSC 11/92
DEC Alpha 21064
DEC Adder
• Combination:
  » 8-bit tapered pre-discharged Manchester carry chains, with Cin = 0 and Cin = 1
  » 32-bit LSB carry-lookahead
  » 32-bit MSB conditional sum adder
  » Carry-select on most significant bits
  » Latch-based timing
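Binary Multiplication
$Z = X \times Y = \sum_{k=0}^{M+N-1} Z_k 2^k = \left( \sum_{i=0}^{M-1} X_i 2^i \right) \left( \sum_{j=0}^{N-1} Y_j 2^j \right) = \sum_{i=0}^{M-1} \left( \sum_{j=0}^{N-1} X_i Y_j 2^{i+j} \right)$
with
$X = \sum_{i=0}^{M-1} X_i 2^i$
$Y = \sum_{j=0}^{N-1} Y_j 2^j$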
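The double sum translates directly into code: every bit pair contributes X_i AND Y_j at weight 2^(i+j). The sketch below is illustrative; the names `bits` and `multiply_by_partial_products` are assumptions.

```python
def bits(value, width):
    """Bits of value, least significant first."""
    return [(value >> k) & 1 for k in range(width)]

def multiply_by_partial_products(x, y, m, n):
    """Multiply an m-bit x by an n-bit y as a sum of single-bit AND terms."""
    X, Y = bits(x, m), bits(y, n)
    return sum((X[i] & Y[j]) << (i + j) for i in range(m) for j in range(n))

# 101010 x 1011 = 42 x 11 = 462
assert multiply_by_partial_products(0b101010, 0b1011, 6, 4) == 0b111001110
```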
Binary Multiplication
          1 0 1 0 1 0
  ×           1 0 1 1
          1 0 1 0 1 0
        1 0 1 0 1 0
      0 0 0 0 0 0
  + 1 0 1 0 1 0
    1 1 1 0 0 1 1 1 0
Partial Products: AND operation
Operands of N bits and M bits; N + M bits in the final sum
Shift-and-Add Multiplier
• Standard adder and shift-in the multiplicand
• Shift the result as well and add
• N cycles
• Parallel adders add more hardware (adders) instead.
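One common arrangement of the shift-and-add scheme keeps the multiplier in the low half of a product register and shifts the register right once per cycle. The sketch below assumes that arrangement; the slides do not fix the register organization, and `shift_and_add` is a hypothetical name.

```python
def shift_and_add(multiplicand, multiplier, n):
    """Multiply with a single adder in n cycles (n = multiplier width in bits)."""
    product = multiplier                     # low n bits start as the multiplier
    for _ in range(n):
        if product & 1:                      # current multiplier bit
            product += multiplicand << n     # add into the upper half
        product >>= 1                        # shift result and multiplier together
    return product

assert shift_and_add(0b101010, 0b1011, 4) == 0b111001110
```

Each loop iteration models one clock cycle, so an N-bit multiplier takes N cycles with one adder.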
Array Multiplier
[Figure: array multiplier with three rows of adder cells (HA FA FA HA; FA FA FA HA; FA FA FA HA), each row fed by X0–X3 and one of Y1, Y2, Y3; outputs Z0–Z7.]
MxN Array Multiplier: Critical Path
[Figure: array of HA and FA cells with Critical Path 1 and Critical Path 2 marked.]
Adder Cells in Array Multiplier
[Figure: full-adder cell schematic; inputs A, B, Ci, internal signal P, outputs S and Co.]
Identical Delays for Carry and Sum
Carry-Save Multiplier
[Figure: carry-save array of HA and FA cells followed by a Vector Merging Adder.]
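The carry-save idea can be modeled with word-wide logic: partial products are accumulated as separate sum and carry vectors, and a carry-propagate addition happens only once, in the vector merging adder. This is a behavioral sketch under that assumption; `carry_save_multiply` is a hypothetical name and the code does not mirror the cell layout of the array.

```python
def carry_save_multiply(x, y, n):
    """Multiply x by an n-bit y, keeping the running total in carry-save form."""
    s, c = 0, 0                              # sum vector and carry vector
    for j in range(n):
        pp = (x << j) if (y >> j) & 1 else 0  # partial product for bit j of y
        # 3:2 compression: s + c + pp == (s ^ c ^ pp) + 2 * majority(s, c, pp)
        s, c = s ^ c ^ pp, ((s & c) | (s & pp) | (c & pp)) << 1
    return s + c                             # vector merging: the only carry-propagate add

assert carry_save_multiply(0b101010, 0b1011, 4) == 0b111001110
```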
Multiplier Floorplan
[Figure: floorplan of the multiplier array; inputs X0–X3 and Y0–Y3, outputs Z0–Z7. Legend: HA Multiplier Cell, FA Multiplier Cell, Vector Merging Cell.]
X and Y signals are broadcast through the complete array.
Multipliers
• Partial product generation
• Partial product accumulation
• Final summation
Generating Partial Products
• All partial products: AND
• Booth’s recoding – reduction of partial product count
[Figure: multiplier bits X0–X7 generating partial products PP0–PP7.]
Booth Recoding
• Instead of generating all the partial products
    0 * x = 0, 1 * x = x, x = {0,1}
• Reduce the number of partial products by grouping
    0 0 → 0
    0 1 → 1*
    1 0 → 2* (shift)
    1 1 → 3* (or 4* - 1)
Booth ’51
Booth Recoding
• Instead of using set {0, 1*Y, 2*Y, 3*Y}
• Use {0, 1*Y, 2*Y, 4*Y, -Y}
• Shifting and complementing
• 3*Y = 4*Y - Y
• Can be simplified by looking into three bits – modified Booth recoding
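The three-bit lookup can be sketched as below. This follows the standard radix-4 modified Booth scheme, in which overlapping groups of three multiplier bits select a digit from {-2, -1, 0, 1, 2}; that digit set and the names `booth_radix4_digits` and `booth_multiply` are assumptions, since the slides do not spell the recoding out. The multiplier is treated as an n-bit two's-complement number with n even.

```python
def booth_radix4_digits(x, n):
    """Recode an n-bit two's-complement multiplier into n/2 digits in {-2..2}."""
    assert n % 2 == 0
    x &= (1 << n) - 1                   # keep the n-bit two's-complement pattern
    digits = []
    prev = 0                            # implicit bit x_{-1} = 0
    for k in range(0, n, 2):
        b0 = (x >> k) & 1
        b1 = (x >> (k + 1)) & 1
        digits.append(b0 + prev - 2 * b1)   # digit from bits (x_{k+1}, x_k, x_{k-1})
        prev = b1
    return digits                       # least significant digit first

def booth_multiply(y, x, n):
    """Multiply y by the n-bit multiplier x using n/2 partial products."""
    return sum(d * y * 4 ** k for k, d in enumerate(booth_radix4_digits(x, n)))

assert booth_radix4_digits(6, 4) == [-2, 2]     # 6 = -2 + 2 * 4
assert booth_multiply(7, -3, 4) == -21
```

Each digit needs only a shift and an optional complement of Y, and the number of partial products drops from n to n/2. An unsigned multiplier can be handled by choosing n large enough that its top bit is 0.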