A Min-Cost Flow Based Detailed Router for FPGAs. Seokjin Lee*, Yongseok Cheon*, D. F. Wong+ (* The University of Texas at Austin, + University of Illinois at Urbana-Champaign)


1

A Min-Cost Flow Based Detailed Router for FPGAs

Seokjin Lee*, Yongseok Cheon*, D. F. Wong+

* The University of Texas at Austin
+ University of Illinois at Urbana-Champaign

2

Outline
- Overview
- Introduction
  - FPGA architecture, routing resources
- Problem Definitions
- Algorithm Description
  - Min-cost flow based router
  - Lagrangian relaxation
- Experimental Results
- Conclusion

3

Overview
- FlowRoute: a congestion-driven detailed router
- Finds a feasible routing with minimum total delay for a given placed netlist
- Routes all the nets connected to a LUT simultaneously by a min-cost flow algorithm
- Iterative refinement with Lagrangian relaxation

4

FPGA Architecture
- Logic modules
  - Implement logic functions
  - LUTs, flip-flops
- Routing resources
  - Wire segments
  - Programmable switches
- I/O modules

[Figure: A typical FPGA architecture. Labels: logic module (L), programmable switch (S), wire segments, I/O module.]

5

FPGA Routing Resources
- Prefabricated routing resources
- Congestion constraints
- Limited routability
- High RC delays and large area of switches

[Figure: routing resources example with labels a to h and L1 to L4.]

6

FPGA Routing Example

[Figure: FPGA routing example with labels a to h, 1 to 16, and L1 to L4.]

7

Graph Representation

- Routing resource graph G(V, E)
  - V: I/O pins of logic modules, wire segments
  - E: feasible connections between the nodes
- Routing problem: finding vertex-disjoint trees T = {T1, …, Tn}

[Figure: the routing example (labels a to h, 1 to 16, L1 to L4) shown next to its routing resource graph, whose nodes include a to h and 2, 3, 7, 8, 9, 10, 13, 16.]

8

Problem Definitions
- The Routing for One LUT (ROL) problem
  - Find routes for all the net segments connected to a LUT
  - Using equivalence of input pins of a LUT
- FPGA detailed routing problem
  - Find routes for all the nets
  - Solving the ROL problem for all LUTs in an FPGA

9

Flow Network for ROL
- Construct Gf(Vf, Ef) from G(V, E)
- Vf = V ∪ {s, s1, s2, …, sn, t}, si: subsource
- Ef = E ∪ Es ∪ Es' ∪ Et
  - Es = {(s, si) | i = 1, …, n}
  - Es' = {(si, v) | i = 1, …, n, v ∈ Ti}
  - Et = {(pi, t) | pi ∈ Sp}
- Edge capacity: rf(e) = 1 for all e ∈ Ef
- Node capacity: rf(v) = 1 for all v ∈ Vf - {s, t}
- Cost: cf(e) = c(e) for e ∈ E, cf(v) = c(v) for v ∈ V
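The construction of Gf can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_rol_network` and its argument layout are hypothetical, and unit node capacities are modeled by the standard node-splitting trick (each node v becomes an arc from (v, "in") to (v, "out") carrying c(v)), which the slides do not spell out.

```python
def build_rol_network(edges, node_cost, trees, lut_pins):
    """Return the arcs of Gf as (tail, head, capacity, cost) tuples.

    edges:     dict mapping a directed edge (u, v) of G to its cost c(e)
    node_cost: dict mapping each node v of G to its cost c(v)
    trees:     list of node collections; trees[i-1] holds the nodes of T_i
    lut_pins:  the LUT input pins (the set S_p on the slide)
    """
    arcs = []
    # Node capacity 1 and node cost c(v): one internal arc per split node.
    for v, cost in node_cost.items():
        arcs.append(((v, "in"), (v, "out"), 1, cost))
    # Original edges E, each with unit capacity and cost c(e).
    for (u, v), cost in edges.items():
        arcs.append(((u, "out"), (v, "in"), 1, cost))
    # Es: source to each subsource; Es': subsource to the nodes of its tree.
    for i, tree in enumerate(trees, start=1):
        sub = ("s", i)
        arcs.append(("s", sub, 1, 0))
        for v in tree:
            arcs.append((sub, (v, "in"), 1, 0))
    # Et: each LUT input pin to the sink.
    for p in lut_pins:
        arcs.append(((p, "out"), "t", 1, 0))
    return arcs


# Toy instance (hypothetical): one net at pin "a", one wire segment 1, one LUT pin "p".
edges = {("a", 1): 0, (1, "p"): 0}
node_cost = {"a": 0, 1: 2, "p": 0}
arcs = build_rol_network(edges, node_cost, trees=[["a"]], lut_pins=["p"])
```

Every arc has capacity 1, so an integral flow of value n corresponds to n node-disjoint paths from the existing trees to distinct LUT input pins.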

10

Flow Network (example)

[Figure: example flow network with nodes 1 to 16 in four rows, subsources s1 to s4, source s, and sink t; edges are labeled (1,0).]

11

ROL_NF (example)

[Figure: ROL_NF example on a network with nodes 1 to 16 in four rows, subsources s1 to s4, source s, and sink t; edges are labeled (1,0).]

12

ROL_NF (example)

[Figure: ROL_NF example, continued, showing nodes 1 to 16 in four rows.]

13

ROL_NF for ROL
- A min-cost max-flow f* in Gf corresponds to a solution to ROL with minimum total delay cost
- If |f*| = n, all the net segments connected to the LUT are routed
- ROL_NF solves the ROL problem exactly in polynomial time

14

ROL_NF
Algorithm ROL_NF
1. Construct Gf(Vf, Ef)
2. Assign costs and capacities
3. Run min-cost max-flow algorithm on Gf(Vf, Ef)
4. Derive routes for the nets from the computed flow
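Step 3 can be illustrated with a small min-cost max-flow routine. The slides do not say which min-cost flow algorithm FlowRoute uses, so the sketch below assumes one standard choice, successive shortest paths with Bellman-Ford; the function name and the integer node labeling are hypothetical.

```python
def min_cost_max_flow(n, arcs, s, t):
    """Successive shortest paths with Bellman-Ford.

    n: number of nodes, labeled 0..n-1; arcs: list of (u, v, capacity, cost).
    Returns (flow value, total cost, flow on each input arc).
    """
    # Residual graph in parallel arrays: arc 2*i is forward, 2*i+1 its reverse.
    to, cap, cost = [], [], []
    adj = [[] for _ in range(n)]
    for u, v, c, w in arcs:
        adj[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        adj[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    inf = float("inf")
    flow = total = 0
    while True:
        # Bellman-Ford: cheapest residual path from s to every node.
        dist = [inf] * n
        prev = [-1] * n              # index of the arc used to reach each node
        dist[s] = 0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == inf:
                    continue
                for e in adj[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[u] + cost[e]
                        prev[to[e]] = e
                        changed = True
            if not changed:
                break
        if dist[t] == inf:
            break                    # no augmenting path left: the flow is maximum
        # Find the bottleneck along the path, then push flow along it.
        push, v = inf, t
        while v != s:
            push = min(push, cap[prev[v]])
            v = to[prev[v] ^ 1]      # tail of the arc that reached v
        v = t
        while v != s:
            e = prev[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = to[e ^ 1]
        flow += push
        total += push * dist[t]
    used = [cap[2 * i + 1] for i in range(len(arcs))]
    return flow, total, used


# Toy instance: 0 = s, 1 and 2 = subsources, 3 and 4 = resources, 5 = t.
toy_arcs = [(0, 1, 1, 0), (0, 2, 1, 0), (1, 3, 1, 1), (1, 4, 1, 3),
            (2, 3, 1, 1), (3, 5, 1, 0), (4, 5, 1, 0)]
flow, total, used = min_cost_max_flow(6, toy_arcs, 0, 5)
```

In the toy instance both subsources prefer resource 3, but only one can have it. The second augmentation reroutes subsource 1 to resource 4 through a reverse arc, which is how the flow formulation routes all nets of a LUT simultaneously rather than one at a time.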

15

Lagrangian Relaxation
- General technique for solving optimization problems with difficult constraints
- Original optimization problem is divided into subproblems
- Each subproblem is solved by repetitive application of ROL_NF
- Lagrangian multipliers guide the router

16

Lagrangian Relaxation

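Original problem:

$$\min f(x) \quad \text{s.t.} \quad g_1(x) \le b_1,\; g_2(x) \le b_2,\; \ldots,\; g_k(x) \le b_k$$

Lagrangian subproblem:

$$\min\; f(x) + \lambda_1 (g_1(x) - b_1) + \lambda_2 (g_2(x) - b_2) + \cdots + \lambda_k (g_k(x) - b_k)$$

Update $\lambda_1, \lambda_2, \ldots, \lambda_k$:

$$\max_{\lambda \ge 0} \Bigl[ \min \{ f(x) + \lambda (g(x) - b) \} \Bigr]$$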

17

LR for FPGA detailed routing

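Original problem:

$$\min \sum_{i,k} c_i x_{ik} \quad \text{s.t.} \quad \sum_{k} x_{ik} \le 1, \;\; \forall i \in V$$

Lagrangian subproblem:

$$\min L(x) = \min \Bigl\{ \sum_{i,k} c_i x_{ik} + \sum_{i} \lambda_i \Bigl( \sum_{k} x_{ik} - 1 \Bigr) \Bigr\}$$

Update $\lambda$: $\max \{ \min L(x) \}$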

18

Solving Lagrangian Subproblem

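- By rearranging terms, $L(x) = \sum_{k}\sum_{i} (c_i + \lambda_i) x_{ik} - \sum_{i} \lambda_i$
- $LS' = \min \{ \sum_{k}\sum_{i} (c_i + \lambda_i) x_{ik} \}$
- ROL_NF solves LS'
  - Set $(c_i + \lambda_i)$ as the cost of node $i$
  - $c_i = d_i$ (delay term) $\times\, q_i$ (congestion term)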
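The rearrangement can be written out step by step. The block below starts from the Lagrangian of the capacity-constrained problem (at most one net per node $i$, with multiplier $\lambda_i$) and only regroups terms; it assumes the same symbols as the slides.

```latex
\begin{aligned}
L(x) &= \sum_{i}\sum_{k} c_i x_{ik} + \sum_{i} \lambda_i \Bigl( \sum_{k} x_{ik} - 1 \Bigr) \\
     &= \sum_{i}\sum_{k} c_i x_{ik} + \sum_{i}\sum_{k} \lambda_i x_{ik} - \sum_{i} \lambda_i \\
     &= \sum_{k}\sum_{i} (c_i + \lambda_i)\, x_{ik} - \sum_{i} \lambda_i
\end{aligned}
```

For fixed multipliers the last sum is a constant, so minimizing $L(x)$ is the same as minimizing $\sum_{k}\sum_{i} (c_i + \lambda_i) x_{ik}$, which is a routing problem with node cost $c_i + \lambda_i$.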

19

Updating Lagrangian Multipliers

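Subgradient method:

$$\lambda_i^{r+1} = \max \Bigl\{ 0,\; \lambda_i^{r} + \theta_r \Bigl( \sum_{k} x_{ik} - 1 \Bigr) \Bigr\}$$

$\theta_r$: step size

Convergence conditions:

$$\lim_{r \to \infty} \theta_r = 0, \qquad \lim_{r \to \infty} \sum_{j=1}^{r} \theta_j = \infty$$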
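The multiplier update is a projected subgradient step, and a few lines of Python make the arithmetic concrete. The function names, the dictionary layout, and the 1/r step-size schedule are assumptions for illustration; the slides only require a step size that tends to zero while its partial sums diverge.

```python
def step_size(r, theta0=1.0):
    """Step size for iteration r >= 1: theta0 / r tends to 0, and its sum diverges."""
    return theta0 / r


def update_multipliers(lam, usage, step):
    """One subgradient step.

    lam:   dict mapping resource i to its current multiplier
    usage: dict mapping resource i to the number of nets using it (sum over k of x_ik)
    step:  step size for this iteration
    """
    return {i: max(0.0, lam[i] + step * (usage.get(i, 0) - 1)) for i in lam}


# Hypothetical state: w1 is shared by three nets, w2 is used once, w3 is unused.
lam = {"w1": 0.0, "w2": 0.5, "w3": 0.2}
usage = {"w1": 3, "w2": 1, "w3": 0}
new_lam = update_multipliers(lam, usage, step=0.5)
```

The overused resource w1 becomes more expensive, w2 is unchanged because its constraint is tight, and the multiplier of the unused w3 is clamped at zero.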

20

FlowRoute

1. Initialize λ
2. For each lk in L do
3.   Rip up nets connected to lk
4.   Call ROL_NF
5.   Update costs and reset capacities
6.   Update λ
7. Repeat Steps 2 to 6 until no shared resource exists
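The control flow of the loop can be sketched as follows. This is a toy skeleton under stated assumptions, not the authors' implementation: `flow_route`, `cheapest`, and `candidates` are hypothetical names, ROL_NF is replaced by a stand-in that picks the cheapest candidate resource, and the step size is assumed to be theta0 / r.

```python
from collections import Counter


def flow_route(luts, rol_nf, base_cost, max_rounds=50, theta0=1.0):
    """Return a dict LUT -> set of resources, or None if sharing persists."""
    lam = {v: 0.0 for v in base_cost}                    # Step 1
    routes = {}
    for r in range(1, max_rounds + 1):                   # Step 7: repeat
        for lut in luts:                                 # Step 2
            routes.pop(lut, None)                        # Step 3: rip up
            # Cost handed to the router is c_i + lambda_i.
            cost = {v: base_cost[v] + lam[v] for v in base_cost}
            routes[lut] = set(rol_nf(lut, cost))         # Step 4
            # Step 5: recount how many LUTs use each resource.
            usage = Counter(v for used in routes.values() for v in used)
            for v in lam:                                # Step 6
                lam[v] = max(0.0, lam[v] + theta0 / r * (usage[v] - 1))
        usage = Counter(v for used in routes.values() for v in used)
        if all(count <= 1 for count in usage.values()):
            return routes                                # no shared resource
    return None


# Stand-in for ROL_NF: each LUT picks its cheapest candidate resource.
candidates = {"A": ["w1", "w2"], "B": ["w1", "w3"]}


def cheapest(lut, cost):
    return [min(candidates[lut], key=cost.get)]


result = flow_route(["A", "B"], cheapest, {"w1": 1.0, "w2": 2.0, "w3": 2.0})
```

With these toy inputs both LUTs first choose w1; the growing multiplier on w1 then pushes B onto w3, and the loop stops in the second round with A on w1 and B on w3.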

21

Experimental Results
- FPGA model used
  - Symmetrical-array-based FPGA
  - Each logic block contains four 4-input LUTs and flip-flops
  - Switch connections: Fs = 3, Fc = W
    - Fs: number of connections per wire entering the switch box
    - Fc: number of tracks to which each logic block pin can connect
    - W: number of tracks in a channel

22

Experimental Results
- Tested on MCNC benchmark circuits
- Results compared with VPR router
- Used a smaller number of routing tracks
- Improvement on critical path delay up to 28.9% (average 14.1%)
- Total wire length reduced (average 8.3%)

23

Experimental Results
Channel width and delay comparison (FR = FlowRoute)

Circuit    LUTs/FFs   Number of tracks   Critical path delay
                      VPR     FR         VPR      FR (improvement)
9symml     104        10      9          26.7     25.1 (6.0%)
term1      128        13      12         25.3     23.3 (7.9%)
apex7      252        13      13         26.1     21.3 (18.4%)
example2   404        17      16         29.6     23.2 (21.6%)
alu2       224        17      17         54.7     49.2 (10.1%)
Too-lrg    208        19      19         31.2     30.2 (3.2%)
vda        456        23      23         46.5     38.9 (16.3%)
alu4       1560       33      33         143.4    122.5 (14.6%)
s298       1960       27      27         274.0    194.7 (28.9%)
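The improvement percentages and the reported average can be recomputed from the two delay columns. The short check below uses only the numbers in the table; the variable names are illustrative.

```python
# (VPR delay, FlowRoute delay) per circuit, copied from the table.
delays = {
    "9symml": (26.7, 25.1),
    "term1": (25.3, 23.3),
    "apex7": (26.1, 21.3),
    "example2": (29.6, 23.2),
    "alu2": (54.7, 49.2),
    "Too-lrg": (31.2, 30.2),
    "vda": (46.5, 38.9),
    "alu4": (143.4, 122.5),
    "s298": (274.0, 194.7),
}

# Relative reduction in critical path delay, in percent.
improvement = {name: 100 * (vpr - fr) / vpr for name, (vpr, fr) in delays.items()}
average = sum(improvement.values()) / len(improvement)
best = max(improvement, key=improvement.get)

print(round(average, 1), best, round(improvement[best], 1))  # prints: 14.1 s298 28.9
```

The result matches the summary slide: an average improvement of 14.1% and a maximum of 28.9%, reached on s298.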

24

Conclusion
- A new congestion-driven routing algorithm for FPGAs
- Finds a feasible routing with minimum total delay, which is expected to reduce critical path delay
- Can be used in a multiple-stage routing scheme

25

Thank You!