Keith So University of New South Wales, Sydney, Australia Feb 25 @ FPGA’08

Page 1

Enforcing Long-Path Timing Closure for FPGA Routing

with Path Searches on Clamped Lexicographic Spirals

Keith So
University of New South Wales, Sydney, Australia
Feb 25 @ FPGA’08

Page 2

Outline

• Problem Statement
• Related Work
1. SpiralRoute Overview
   a) Budget Generation
   b) Clamped Lexicographic Search
2. Some Performance Optimizations
3. Experiments
• Conclusions and Future Work

Page 3

Problem Statement & Assumptions

Long-Path Timing-Driven Detailed Routing
• Given: Placed circuit mapped into routing-resource (RR) graph + Timing Requirement D
• Find: Mutually RR-vertex-disjoint routing trees s.t. Max. Long-Path Comb. Delay <= D

Assumptions
• D is achievable under given placement
• Buffered switching (delays summable)

Page 4

Related Work

• [F’92] Iterative slack allocation
• [AR’95] Criticality bin + Steiner/Arbor.
• [ME’95] Negotiated Congestion
• [BR’97] VPR
• [LW’03] Lagrangian Rel. Weighting
• [ANC+’04] Auto. Constraint Gen.
• [FBC’04] RCV

Page 5

SpiralRoute Overview

• Negotiated Congestion Routing over A*
• Paths are lexicographic-costed [S’07,ISPD07]

Major Deltas
• Optimal delay upper bound generation for FPGA routing domain
• Minimum-congestion bounded-delay searching (vs tradeoff using weights)
• Provable timing closure at completion

Page 6

Connection Budget Generation – Optimization Component

Weighted Budget Distribution Problem [Ghiasi et al., ICCAD’04]

Given: DAG G=(V,A), min. delays dij, weights wij, long-path constraint T

Find: delay budgets bij such that:
1. (dij+bij) summed along all paths satisfies T
2. Sum of (wij·bij) over all edges is maximised

Transforms into min-cost flow problem; budgets recovered from dual of flow solution.
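
The two conditions of the budget distribution problem can be checked directly on a small DAG. The sketch below is not the min-cost-flow solver from the slide. It only verifies a candidate budget assignment: it computes the longest path under dij+bij and evaluates the weighted objective. The function names, the toy graph, and the requirement that budgets be non-negative are assumptions made for illustration.

```python
from collections import defaultdict

def longest_path(num_nodes, edges, delay):
    """Longest path delay in a DAG with nodes 0..num_nodes-1.

    `edges` is a list of (u, v) pairs and `delay[(u, v)]` is the delay
    assigned to that edge. Uses a topological sweep (Kahn's algorithm).
    """
    succ = defaultdict(list)
    indeg = [0] * num_nodes
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    arrival = [0] * num_nodes
    ready = [n for n in range(num_nodes) if indeg[n] == 0]
    while ready:
        u = ready.pop()
        for v in succ[u]:
            arrival[v] = max(arrival[v], arrival[u] + delay[(u, v)])
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return max(arrival)

def check_budgets(num_nodes, edges, d, b, w, T):
    """Return (feasible, objective) for a candidate budget assignment b.

    Assumption: budgets must be non-negative.
    """
    if any(b[e] < 0 for e in edges):
        return False, None
    total = {e: d[e] + b[e] for e in edges}            # per-edge delay bound
    feasible = longest_path(num_nodes, edges, total) <= T
    objective = sum(w[e] * b[e] for e in edges)        # sum of wij * bij
    return feasible, objective

# Hypothetical toy instance: two edge-disjoint paths from node 0 to node 3.
EDGES = [(0, 1), (1, 3), (0, 2), (2, 3)]
D_MIN = {(0, 1): 2, (1, 3): 3, (0, 2): 1, (2, 3): 1}
W = {e: 1 for e in EDGES}
GOOD = {(0, 1): 1, (1, 3): 0, (0, 2): 2, (2, 3): 2}
BAD = {(0, 1): 2, (1, 3): 0, (0, 2): 2, (2, 3): 2}

print(check_budgets(4, EDGES, D_MIN, GOOD, W, T=6))
print(check_budgets(4, EDGES, D_MIN, BAD, W, T=6))
```

In this toy instance the first assignment uses up the slack on both paths exactly, while the second pushes the path through node 1 past T and is rejected.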

Page 7

Connection Budget Generation – Mapping to FPGA Routing

1. Represent LE’s and pads as edges (split clocked LE’s)
2. Form super-DAG
3. dij = min connection delay (from congestion-oblivious routing)
4. Set T = D
5. Set wij = 1 for real edges, 0 for virtuals
6. Solved (dij+bij) is the maximum delay for each edge in our routing

Page 8

Comparison with Iterative Minimax PERT (clma runtime ~20 mins)

Page 9

Search Design – n-Lex. Search

• [1-Line A* search: f(v)=g(v)+h(v), expand v with minimum f(v) until t]
• 2-component lexicographic search used for routability router (Conceptually a·∞ + b)
• Need n-components and custom comparison functions (proofs needed to avoid ∞^k values!)

Theorem: A* of n-lexicographic search is admissible if all components are totally-ordered monoids with order-preserving addition

• Monoids helpful to avoid clutter from max()
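
A minimal sketch of the idea, assuming each cost component is a plain non-negative number under ordinary addition (one simple totally ordered monoid). Cost vectors add componentwise and compare lexicographically, so no ∞ multiplier is needed. `LexCost`, `lex_astar`, and the toy graph are hypothetical names for illustration, not the SpiralRoute implementation.

```python
import heapq
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True, order=True)
class LexCost:
    """Cost vector: lexicographic comparison, componentwise addition."""
    parts: Tuple[int, ...]

    def __add__(self, other):
        return LexCost(tuple(a + b for a, b in zip(self.parts, other.parts)))

def lex_astar(graph, source, target, h, zero):
    """A* where f(v) = g(v) + h(v) are LexCost vectors.

    graph[u] is a list of (v, edge_cost); h(v) must not overestimate the
    remaining cost in the lexicographic order (assumed, not checked).
    """
    best = {source: zero}
    heap = [(zero + h(source), source)]
    while heap:
        f, u = heapq.heappop(heap)
        if u == target:
            return best[u]
        if f != best[u] + h(u):      # stale heap entry, a better g(u) exists
            continue
        for v, cost in graph.get(u, []):
            g = best[u] + cost
            if v not in best or g < best[v]:
                best[v] = g
                heapq.heappush(heap, (g + h(v), v))
    return None

# Hypothetical 2-component costs: (congestion, delay).
ZERO = LexCost((0, 0))
GRAPH = {
    "s": [("a", LexCost((0, 2))), ("b", LexCost((1, 1)))],
    "a": [("t", LexCost((0, 2)))],
    "b": [("t", LexCost((0, 1)))],
}
RESULT = lex_astar(GRAPH, "s", "t", h=lambda v: ZERO, zero=ZERO)
print(RESULT)
```

With the zero heuristic used here the search reduces to Dijkstra. The congestion-free route through `a` wins even though the route through `b` has lower delay, which is the behaviour the a·∞ + b picture describes.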

Page 10

Search Design – Clamping Component

• 3-component vector
1. Delay, with pivot (x < y iff x <= T & y > T)
2. Congestion, regular <
3. Delay, regular <

Ex: f(w2)=[0,2,2]; f(x1)=[1,0,4]; f(w3)=[0,1,3]

Assumption: h(v) is at least close to h*(v) for clamping component
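
The 3-component ordering can be sketched as a sort key. The pivot comparison only distinguishes "within T" from "over T", so it maps onto a 0/1 flag. The slide does not state T or the raw delay and congestion values behind its example, so T = 3 and the (delay, congestion) pairs below are assumptions chosen to reproduce the three vectors shown. `clamped_key` is a hypothetical name.

```python
def clamped_key(delay, congestion, T):
    """Key for the clamped lexicographic order.

    Component 1: pivoted delay (0 if delay <= T, 1 if delay > T).
    Component 2: congestion, regular <.
    Component 3: delay, regular <.
    """
    return (int(delay > T), congestion, delay)

T = 3  # assumed budget, not given on the slide
# Assumed (delay, congestion) values for the three nodes in the example.
CANDIDATES = {"w2": (2, 2), "x1": (4, 0), "w3": (3, 1)}

KEYS = {name: clamped_key(d, c, T) for name, (d, c) in CANDIDATES.items()}
ORDER = sorted(CANDIDATES, key=lambda name: KEYS[name])
print(KEYS)
print(ORDER)
```

Under these assumptions `x1` has the lowest congestion but sorts last because it is over budget, and among the budget-compliant nodes the less congested `w3` is preferred over the faster `w2`.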

Page 11

Search Design – Timing Closure

• Delay pivot element splits congestion-identical paths by budget

• Will always choose a budget-compliant path (a sum of finite congestion costs is finite)

• Over all connections => successful routing always yields timing closure!

Page 12

Performance – [Low-Hanging] Optimizations

• Original implementation was ~2-2.5x slower than the current runtime

• Introduced some low-hanging speed & quality optimizations
  – Index structure for lexicographic costs
  – Greedy tree mgmt. to ameliorate pin-ordering

• A high-hanging optimization in future work is congestion schedule handling (but many promising leads from global routers in ICCAD’07)

Page 13

Trie-of-Stacks Index Structure

• Replaces f(v) index structure

• Exploits FPGA routing symmetry

• Index operations independent of size

• Reduces runtimes by ~15%

Page 14

Tree Topology Maintenance

Page 15

Experiments - Setup

• Run against VPR4.30 on architecture similar to single-segment “challenge” arch.
  – (Researcher timing constraints)
  – routability comparison with unclamped lex-search

• Route at the placement-allowed Fmax

• VPR pres_fac=1.5/1.1

Page 16

Routed Solution Timing Quality

Page 17

Runtime Comparison

Page 18

Effects of Budget Quality

Page 19

Future Work

• Runtime improvements
  – Schedule improvement
  – Performance tuning
• Multi-CLB segments (see backup slide)
• Multi-objective routing
• Other domains (e.g. standard cell global)

Page 20

Conclusions

• Extended lexicographic search to timing-driven routing
  – New budgeting component
  – Clamped search design
  – Supporting techniques for runtime

• Timing closure is guaranteed on routing success

• Solution quality is good but needs more runtime improvement to be viable

Page 21

Acknowledgements

• J. Rose, V. Betz, A. Marquardt (Toronto) – VPR4.30 source & benchmarks

• Australian Centre for Advanced Computing and Communications (ac3) – High Performance Computing Support

• Advisor*: Dr. Aleks Ignjatovic

Page 22

Question Time…

To Backup Slides

Page 23

Issues with h(v) ≉ h*(v)

• “Node locking” occurs when g(v)+h(v) <= D but really g(v)+h*(v) > D
  – Expansion downstream will be truncated
  – But a subpath with less delay but more congestion cannot expand into it
  – But if reexpand on shorter delay then backtrace will ignore congestion – not locally decidable!
• Quick fix: precompute h*(v) (Only needed for sink pins t)
  – Only bounding components need the accuracy
• Fancy on-the-fly handling?
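
One way to realise the quick fix is a backward Dijkstra from the sink pin over edge delays, which yields the exact remaining delay h*(v) for every node that can reach the sink. This is a sketch of that standard technique under an assumed graph representation (`graph[u]` as a list of `(v, delay)` pairs). The slides do not say how the precomputation is actually done, and `exact_delay_to_sink` is a hypothetical name.

```python
import heapq

def exact_delay_to_sink(graph, sink):
    """Return {v: h*(v)}, the minimum delay from v to `sink`.

    graph[u] is a list of (v, delay) with non-negative delays. Nodes that
    cannot reach the sink are absent from the result.
    """
    # Build the reversed graph so we can search outward from the sink.
    reverse = {}
    for u, outs in graph.items():
        for v, delay in outs:
            reverse.setdefault(v, []).append((u, delay))

    h_star = {sink: 0}
    heap = [(0, sink)]
    while heap:
        dist, v = heapq.heappop(heap)
        if dist > h_star.get(v, float("inf")):
            continue                          # stale heap entry
        for u, delay in reverse.get(v, []):
            cand = dist + delay
            if cand < h_star.get(u, float("inf")):
                h_star[u] = cand
                heapq.heappush(heap, (cand, u))
    return h_star

# Hypothetical delay graph with sink pin "t".
DELAYS = {
    "s": [("a", 2), ("b", 1)],
    "a": [("t", 2), ("b", 5)],
    "b": [("t", 1)],
}
H_STAR = exact_delay_to_sink(DELAYS, "t")
print(H_STAR)
```

With exact values in hand, the clamping test g(v)+h*(v) <= D cannot admit a node that is really over budget, which is the condition the node-locking bullet describes.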