Eyecharts: Constructive Benchmarking of
Gate Sizing Heuristics
Puneet Gupta, University of California, Los Angeles
Andrew B. Kahng, University of California, San Diego
Amarnath Kasibhatla, University of California, Los Angeles
Puneet Sharma, Freescale Semiconductor, TX
Research funded in part by NSF

Outline
Motivation
Solving Eyechart Topologies
Experiments for Suboptimality Study
Results

Why is Sizing Important?
Sizing is an effective way of optimizing for power, speed, and area
Tunable parameters: gate width, threshold voltage, gate length, supply voltage, etc.
The sizing problem is seen at all stages of the RTL-to-GDS flow
Power recovery is crucial during the post-layout phase

Why Study Suboptimality?
Literally hundreds of gate sizing methods exist
Common heuristics/algorithms: linear programming, Lagrangian relaxation, convex optimization, dynamic programming, geometric programming, sensitivity-based gradient descent, simulated annealing, etc.
Which heuristic is better?
No systematic way to compare, so far
How suboptimal are these heuristics?
Does a heuristic's performance depend on circuit topology? On characteristics of the cell library?
No prior work focuses on these aspects

Sizing Problem Formulation
The sizing problem can be discrete or continuous
The discrete sizing problem is NP-hard
Common designs are standard-cell based and therefore discrete
We focus only on the discrete sizing problem
Problem: leakage power minimization under timing constraints
Circuits are purely combinational
Only gate sizing capability is tested, not logic optimization capability

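The problem statement above can be written compactly as follows. This is only a sketch of one possible notation; the symbols are assumptions introduced here, not taken from the slides: s_i is the size chosen for gate i from its discrete set S_i, P_i is its leakage power, d_i is its delay, C_i is the total load capacitance it drives, and D_max is the timing constraint.

```latex
\begin{aligned}
\min_{s_1,\dots,s_n} \quad & \sum_{i=1}^{n} P_i(s_i) \\
\text{subject to} \quad & \sum_{i \in \pi} d_i\bigl(s_i, C_i\bigr) \le D_{\max}
  \quad \text{for every PI-to-PO path } \pi, \\
& s_i \in S_i \quad \text{for every gate } i.
\end{aligned}
```

Because C_i depends on the sizes chosen for the fanout gates of gate i, the choices at neighboring gates are coupled, which is what makes the discrete problem hard in general.
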
Our Contributions
We generate artificial combinational circuits called eyecharts
A gate's delay depends only on gate size and total load capacitance
Eyecharts can be solved optimally using dynamic programming (DP)
A variety of eyecharts are generated by varying circuit topology and the power-size and delay-size characteristics of the library
Suboptimalities of existing heuristics are studied under these variations
Leakage optimization details are presented
Extensions to dynamic power optimization are easy

What are Eyecharts?
[Figure: chain, mesh, and star topologies, with gates arranged in Stage 1 to Stage 5 between the PI and the PO]
Chain, mesh, and star topologies are proposed
Each can be solved optimally for the assumed delay model
The stage of a gate is its logic level from the PI
The levelized nature of eyecharts enables optimal sizing by DP

Solving a Chain Optimally
Chain: PI, INV1 (Stage 1), INV2 (Stage 2), INV3 (Stage 3), PO; INV3 drives a load of 6 fF; Dmax = 8

Cell library:
          Input cap   Leakage power   Delay (load = 3)   Delay (load = 6)
Size 1    3           5               3                  4
Size 2    6           10              1                  2

For each stage, the DP tabulates the minimum power and the size that achieves it for every delay budget and every possible load. A load of 3 or 6 is the input cap of a size-1 or size-2 gate in the next stage. A "-" marks a budget that no size can meet.

Stage 1 (INV1):
          Load = 3         Load = 6
Budget    Power   Size     Power   Size
1         10      2        -       -
2         10      2        10      2
3         5       1        10      2
4         5       1        5       1
5         5       1        5       1
6         5       1        5       1
7         5       1        5       1
8         5       1        5       1

Stage 2 (INV2), power accumulated over Stages 1 and 2:
          Load = 3         Load = 6
Budget    Power   Size     Power   Size
3         20      2        -       -
4         15      1        20      2
5         15      2        15      1
6         10      1        15      1
7         10      1        10      1
8         10      1        10      1

Example, Stage 2 entry for load = 3 and budget = 6:
INV2      Delay   Excess budget   Total power
size 1    3       3               10
size 2    1       5               15

Stage 3 (INV3), load = 6 and budget = Dmax = 8:
INV3      Delay   Excess budget   Total power
size 1    4       4               20
size 2    2       6               25

Stage 3 result:
Budget    Power   Size
8         20      1

Optimized chain: minimum total leakage power of 20 at Dmax = 8

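The table-filling walk-through above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' released code: the function name `solve_chain` and the `LIBRARY` layout are assumptions made here, the library values are copied from the slide, and the recursion memoizes on (stage, load, budget) exactly as the Stage 1 to Stage 3 tables do.

```python
from functools import lru_cache

# Library from the slide: size -> (input_cap, leakage_power, {load: delay}).
# The dictionary layout is a hypothetical choice for this sketch.
LIBRARY = {
    1: (3, 5, {3: 3, 6: 4}),
    2: (6, 10, {3: 1, 6: 2}),
}


def solve_chain(num_stages, output_load, d_max, library=LIBRARY):
    """Return (min_power, sizes from stage 1 to the last stage), or None."""

    @lru_cache(maxsize=None)
    def best(stage, load, budget):
        # Cheapest sizing of stages 1..stage when the last of them drives
        # `load` and their total delay must not exceed `budget`.
        if stage == 0:
            return (0, ())
        candidates = []
        for size, (in_cap, power, delay) in library.items():
            remaining = budget - delay[load]
            if remaining < 0:
                continue  # this size alone already violates the budget
            # The upstream stage sees this gate's input cap as its load.
            upstream = best(stage - 1, in_cap, remaining)
            if upstream is not None:
                candidates.append((upstream[0] + power, upstream[1] + (size,)))
        return min(candidates) if candidates else None

    result = best(num_stages, output_load, d_max)
    return None if result is None else (result[0], list(result[1]))


if __name__ == "__main__":
    print(solve_chain(3, 6, 8))
```

For the three-inverter example with an output load of 6 and Dmax = 8, the sketch reports a total power of 20, matching the Stage 3 table, with INV3 at size 1. Ties between equal-power options are broken arbitrarily here, so intermediate choices may differ from the slide's tables where two sizes cost the same.
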
Solving Mesh Optimally
A stage with multiple gates is represented by a composite cell
The composite cell holds delay and power numbers for all size combinations and all output load combinations
Mesh to chain conversion
[Figure: five-stage mesh built from gates A1, A2, B1, and B2, and its chain equivalent with composite cells C2, C3, and C4]

Delay and power table of B1:
          Input cap   Power   Delay (load = 12)   Delay (load = 24)
Size 1    6           10      6                   8
Size 2    12          20      2                   4

Composite cell for Stage 4:
                Load = 12         Load = 24
Size (B1,B2)    Power   Delay     Power   Delay
(1,1)           30      12        30      16
(1,2)           50      6         50      8
(2,1)           40      12        40      16
(2,2)           60      4         60      8

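The power column of the composite table above can be reproduced by enumerating size combinations. The sketch below is an illustration under stated assumptions: the gate multiplicities in `COUNTS` (one B1, two B2) are not given in the transcript and are chosen only because they match the power column, and the function name is hypothetical. The delay column is left out because the transcript does not state how stage delay is composed.

```python
from itertools import product

# Per-size leakage power from the slide's library table (size -> power).
POWER = {1: 10, 2: 20}

# Assumed number of copies of each gate type in the stage (hypothetical).
COUNTS = {"B1": 1, "B2": 2}


def composite_power(counts=COUNTS, power=POWER):
    """Map each size combination to the summed leakage power of the stage."""
    names = list(counts)
    table = {}
    for sizes in product(sorted(power), repeat=len(names)):
        table[sizes] = sum(counts[n] * power[s] for n, s in zip(names, sizes))
    return table


if __name__ == "__main__":
    for sizes, total in composite_power().items():
        print(sizes, total)
```

Once every multi-gate stage is collapsed this way, the mesh can be handed to the same chain DP, with the composite cell's size combinations playing the role of sizes.
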
Solving Star Optimally
A star is solved by converting it to a chain
Composite cells are formed for stages with multiple gates
[Figure: star topology with gates A1, A2, and B across Stage 1 to Stage 3, and its chain equivalent C1, B, C3]
C1 and C3 are the composite cells for Stages 1 and 3

Hybrid Eyecharts
[Figure: sample hybrid eyechart with primary inputs PI1 and PI2, primary outputs PO1 and PO2, and Chain 1 to Chain 4 built from gates A, B, and C]
Chain, mesh, and star are daisy-chained to form arbitrarily large hybrid eyecharts
Meshes/chains are inserted arbitrarily along each PI/PO chain
A hybrid eyechart is solved optimally by ultimately reducing it to a chain

Arbitrary Extensions to Eyecharts
Arbitrary extensions potentially add more realism to eyecharts
Such topologies are solved using partial enumeration
No levelization restriction
One example is the multi-output mesh
[Figure: a mesh with one PI and one PO, and a multi-output mesh with one PI and outputs PO 1, PO 2, and PO 3]

Experimental Setup
Heuristics compared:
Comm1, Comm2: two different commercial gate-sizing/leakage-optimization tools
GS: sensitivity-based sizing tool [sensitivity metric formula garbled; its surviving terms are power and delay]
LP: linear programming tool [Nguyen et al., ISLPED '03]
SBS: sensitivity-based sizing tool [sensitivity metric formula garbled; its surviving terms are slack and power] [Gupta et al., IEEE Trans. on CAD '06]
Power and delay tradeoffs with size explored:
LP-LD: linear increase in power, linear increase in delay
LP-NLD: linear increase in power, nonlinear increase in delay
EP-LD: exponential increase in power, linear increase in delay
EP-NLD: exponential increase in power, nonlinear increase in delay
Experiments explore the dependence of suboptimality on:
Circuit size and circuit topology
Delay-size and power-size tradeoffs
Granularity of the cell library

Library Characteristics
Library model   RMS fitting error (delay)   Optimization context   Default # sizes/variants
LP-LD           8.43%                       Gate sizing            8
LP-NLD          0.3%                        Gate sizing            8
EP-LD           8.43%                       Vt, gate-length        3, 3
EP-NLD          0.3%                        Vt, gate-length        3, 3
The sizing choices for the EP-LD and EP-NLD models are Vt variants and gate-length variants
Capacitance does not vary across Vt variants
Capacitance increases with gate length for gate-length variants
Delay values are fitted to a 65 nm industrial library
Suboptimalities are calculated as
$$\text{Suboptimality} = \frac{power_{method} - power_{optimal}}{power_{optimal}} \times 100\%$$

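As a quick illustration of the suboptimality metric defined above, the helper below computes the percentage for a pair of power values. The function name and the sample numbers are hypothetical and are not results from the experiments.

```python
def suboptimality_percent(method_power: float, optimal_power: float) -> float:
    """Percentage by which a heuristic's power exceeds the optimal power."""
    if optimal_power <= 0:
        raise ValueError("optimal_power must be positive")
    return (method_power - optimal_power) / optimal_power * 100.0


if __name__ == "__main__":
    # Hypothetical case: a heuristic reaches power 25 where the optimum is 20.
    print(suboptimality_percent(25.0, 20.0))  # 25.0
```
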
Circuit Topology Impact
[Figure: suboptimality (%) for chain-only, mesh-only, and star-only eyecharts]
Mesh is the toughest topology

Circuit Size Impact
Suboptimality (%) and runtime (RT) per tool:
#Gates   Comm1 (%)   RT    Comm2 (%)   RT    LP (%)   RT     GS (%)   RT
1796     21.31       14    15.81       16    15.7     56     23.1     24
10026    21.29       261   16.73       365   15.5     450    23.5     309
25993    20.98       540   15.75       539   15.4     1617   23.2     512
51015    21.3        721   16.21       722   15.1     2458   23.5     921
Suboptimality is relatively constant with circuit size
10K-gate benchmarks are used for the rest of the experiments
LP runtime does not scale well

Impact of Timing Constraints
LP-LD delay model: linear increase in delay with size
Suboptimalities are close to zero for very tight or very relaxed constraints

Impact of Nonlinearity in Delay
LP-NLD delay model: nonlinear increase in delay with size
The gap between the sensitivity-based methods and the LP tool becomes narrow

Impact of Power Tradeoff
EP-NLD delay model: exponential increase in power
LP suffers significantly due to snapping error

Gate-length Biasing Scenario
Tools are in general slightly worse than in the Vt assignment scenario
Capacitance varies across gate-length variants

Effect of Granularity
Higher granularity has much larger benefits for the exponential power tradeoff than for the linear one
[Figure: two panels, LP-NLD and EP-NLD]

Extensions to Slew Dependent Delay
The delay of a gate depends on input slew
The output slew of a gate depends on gate size and output load capacitance
Slew propagation is not considered
An optimal solution is not guaranteed due to the need to maintain slew consistency
Experiments show suboptimality is still significant (5% to 35%)

Conclusion
Eyecharts can help diagnose and benchmark gate sizing heuristics
Suboptimality depends on circuit topology and on power-size and delay-size tradeoffs
Existing heuristics can be highly suboptimal: 2% to 46% (gate sizing), 6% to 54% (Vt assignment), 14% to 49% (gate-length biasing)
Ongoing work includes benchmarks based on fanout distribution and benchmarks for joint sizing, multi-Vt, and gate-length variant optimization
Benchmarks and code can be downloaded at http://nanocad.ee.ucla.edu/Main/DownloadForm