University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart...

21
UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein Lee VLSI CAD LABORATORY, UC San Diego 50 th Design Automation Conference June 5, 2013

Transcript of University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart...

Page 1: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

UC San Diego / VLSI CAD Laboratory

Smart Non-Default Routing for Clock Power Reduction

Andrew B. Kahng, Seokhyeong Kang, Hyein Lee

VLSI CAD LABORATORY, UC San Diego

50th Design Automation Conference

June 5, 2013

Page 2: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-2-

Outline

Motivation

Our Wire Sizing Algorithm

Experimental Results

Conclusions and Future Works

Page 3: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-3-

Non-Default Routing Rule

Non-Default Routing Rules (NDRs) are used to increase wire widths and spacings – Reduce wire parasitic and delay variability

– Reduce coupling capacitance

– Avoid Electromigration (EM) violation

Default Rule

width

Non-Default Rule (NDR)

spacing

Page 4: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-4-

Electromigration

EM causes unwanted opens or shorts in wires

EM reliability ↔ reduced current density – Widen wires (use NDR)

Can NDRs always cure the EM problems?

Page 5: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-5-

Motivation Larger Cap.

Larger/More Buffers

More EM Violations

Fixed NDR (Wider wires)

More power

Fixed NDR

Smart NDR (SNDR)

Smaller/Fewer Buffers

Smaller Cap.

Less EM Violations

Smart NDR (Tapering)

Less power

No need for segment to be wide if small # of downstream sinks

Page 6: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-6-

Related Works

Wire sizing in clock trees – [Tsai04] buffer insertion and wire sizing to optimize

delay and power using dynamic programming

– [Guthaus06] clock buffer/wire sizing to minimize skew using sequential linear programming

– Related to timing; EM reliability not considered

EM-constrained wire sizing – [Pullela95] low-power clock tree with EM constraints

– Inserts buffers to reduce wire width

Our work considers EM reliability without buffer insertion

Page 7: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-7-

Outline

Motivation

Our Wire Sizing Algorithm

Experimental Results

Conclusions and Future Works

Page 8: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-8-

SNDR Wire Sizing Algorithm

Objective: Minimize the total capacitance of clock network

While maintaining – Clock latency

– Maximum transition time

– Clock skew

– Without EM violations

Solution: NDR for each wire segment of a given clock network

Page 9: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-9-

Wire Delay, Slew, RC and EM Limit Models

Wire delay and slew model

– Wire delay: Elmore delay model [Elmore48]

– Wire slew: PERI model [HuAHK07]

Wire RC and EM limits: f(w)

𝑅 ∝ 1/𝑤 𝐶 ∝ 𝑤

R, C

= linear functions of log(w) 𝐸𝑀 ∝ 𝑤

y = 0.6681x + 0.5475

R² = 0.9723

0.00

0.50

1.00

1.50

2.00

0.0 0.5 1.0 1.5E

M L

imit

(Ir

ms,m

A)

SNDR variable xe = ln(we)

y = -0.9814x + 1.7098

R² = 0.9674

0.00

0.50

1.00

1.50

2.00

0.0 0.5 1.0 1.5

Wir

e R

es.

pe

r u

m

(oh

m/u

m)

SNDR variable xe = ln(we)

y = 0.0319x + 0.0866

R² = 0.9439

0.00

0.05

0.10

0.15

0.0 0.5 1.0 1.5

Wir

e C

ap

. p

er

um

(Ff/

um

)

SNDR variable xe = ln(we)

w : wire width

<Resistance> <Capacitance> <EM limit>

, EMlimit

Page 10: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-10-

SNDR for a Clock Subnet

Problem formulation

NDR solutions are obtained for wire segments

Minimize total capacitance Subject to

Delayi < MaxDelay Skewi,j < MaxClockSkew Ie < MaxEMCurrentLimite

we ≥ wdesc(e) (desc(e) ≡ edges downstream from e)

...

<subnet>

e

i j

Page 11: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-11-

SNDR for Entire Clock Tree

Solve SNDR subnet problems from the bottom to top of clock tree

Skew propagation : Maximum/minimum clock latencies of downstream subnets are propagated to upstream subnets

level N

level 1

level N-1

<Entire Clock Tree>

... Max/min clock latencies at level N

Skew propagation

Skew constraints at level N-1

Page 12: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-12-

Iterative Linear Programming

Sizing problem is a quadratic program due to RC delay

Separate sizing problem into two linear programs

5X-30X runtime reduction by avoiding quadratic formulation

170 minutes vs. 30 minutes for dma testcase

Elmore delay = R·C

fR(xe)·fC(K)

fR(K)·fC(xe) fR(xe)·fC(xe)

Quadratic program Two linear programs

fR(xe)·fC(K)

fR(K)·fC(xe)

Iterative linear program

Iteratively solve the problems until the solution (xe) converges

and solve them iteratively

Page 13: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-13-

Outline

Motivation

Our Wire Sizing Algorithm

Experimental Results

Conclusions and Future Works

Page 14: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-14-

Implementation Flow

Practical and automated flow

Clock Tree Synthesis with NDR

DEFOut

DEFIn

Meet skew/slew/delay/EM?

Tapered Clock Tree

No

Yes

Clock Tree with SNDR

Solve SNDR Problems

Recovery with greedy algorithm

Solution

Original Clock Tree

Page 15: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-15-

Experimental Environments

Cadence SOC Encounter, Matlab, Synopsys 32/28nm PDK

Various NDRs are tried

NDR Width Space Norm. Cap.

1W2S 1 2 1.21

1W5S 1 5 0.73

2W4S 2 4 0.89

1W8S 1 8 0.66

2W7S 2 7 0.76

3W6S 3 6 0.88

4W5S 4 5 1.00

Iso area : W+S is fixed

Experiments

Iso area NDRs

Non-iso (less) area NDRs

Results essentially satisfy all timing and EM constraints

Runtime: 10 seconds ∼100 minutes per subnet

Matlab R2012b and 2.5GHz Intel Xeon processor

Page 16: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-16-

Results: Proportion of NDRs

Smaller-width NDRs replace ≥ 80% of the wiring

0% 20% 40% 60% 80% 100%

aes

eth

jpeg_enc

mc

mpeg2

tv80s

usbf

conmax

dma

1W8S

2W7S

3W6S

4W5S

Page 17: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-17-

Results: Less Capacitance

Clock switching power and wire capacitance reduction

0.0%

5.0%

10.0%

15.0%

Re

du

cti

on

[%

]

Wire Cap.

Clock Switching Pwr

0.0%2.0%4.0%6.0%8.0%

Re

du

cti

on

[%

]

Wire Cap.

Clock Switching Pwr

Orig. NDR = 4W5S

Orig. NDR = 2W4S

9% reduction in wire capacitance 5% reduction in clock switching power

5% reduction in wire capacitance 2% reduction in clock switching power

Page 18: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-18-

Results: Less Area

50% track cost reduction can be achieved by non-iso area NDRs

0

10

20

30

40

50

60

Tra

ck

co

st

red

ucti

on

[%

]

NDR Track cost

1W2S 3

2W4S 5

3W6S 7

4W5S 7

Track cost : total amount of track length occupied

Page 19: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-19-

Conclusion and Future Works Smart NDRs for clock networks reduce the

clock tree wire capacitance under timing and EM constraints

Less capacitance: 9% reduction of wire cap, 5% reduction of clock switching power

Less area: 50% track cost reduction by using non-iso area NDRs

Future work – Control local skews for improved chip timing and

robustness

– Use better delay models for problem formulation

– Noise and variability consideration

Page 20: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

Thank You!

Page 21: University of California, San Diego - Smart Non …...UC San Diego / VLSI CAD Laboratory Smart Non-Default Routing for Clock Power Reduction Andrew B. Kahng, Seokhyeong Kang, Hyein

-21-

References

[Elmore48] W. C. Elmore, “The Transient Response of Damped Linear Networks with Particular Regard to Wideband Amplifiers”, J. Applied Physics 19(1) (1948), pp. 55-63

[HuAHK07] S. Hu, C. J. Alpert, J. Hu, S. Karandikar, Z. Li, W. Shi and C. N. Sze, “Fast Algorithms For Slew Constrained Minimum Cost Buffering”, IEEE TCAD 26(11) (2007), pp. 2009-2022.

[PullelaMP95] S. Pullela, N. Menezes and L. T. Pillage, “Low Power IC Clock Tree Design”, Proc. CICC, 1995, pp. 263-266.