CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Transcript of CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Page 1: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

CSE 494: Electronic Design Automation

Lecture 4

Partitioning

Page 2: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Organization

- Partitioning
- Kernighan-Lin (KL) Heuristic
- Fiduccia-Mattheyses (FM) Heuristic
- Simulated annealing

Page 3: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Partitioning

Division of a graph (or hypergraph) into multiple sub-graphs is known as partitioning.

Partitioning should:
- Maintain functionality
- Minimize interconnections between sub-graphs
- Have low run-time complexity

Page 4: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Problem Formulation

Given:
- A hypergraph G(V,E)
  - V = {v_1, v_2, …, v_n}, the set of vertices
  - E = {e_1, e_2, …, e_m}, the set of hyperedges, where e_i = {v_i, v_j, …, v_k}
- Area of each vertex, a(v_i)

Partition V into {V_1, V_2, V_3, …, V_k} where:
- V_i ∩ V_j = ∅ for i ≠ j
- The union of all V_i = V
- Size of each partition < Constraint
- Cut-set is minimized

Partitioning is an NP-complete problem.

Page 5: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Objective and Constraints

Objective
- Obj1: Minimize interconnection between various partitions
- Obj2: Minimize delay due to partition

Constraints
- Const1: Number of terminals or pins
- Const2: Area of each partition
- Const3: Number of partitions

Page 6: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Partitioning and Design Styles

- Full Custom
  - Area and terminal count constraints
  - Minimize nets crossing a partition, delay
- Standard Cell
  - At RTL, circuit
  - Partition RTL specification into disjoint sub-circuits, such that each sub-circuit corresponds to a standard cell
  - Minimize nets, delay
- Gate array
  - At RTL
  - Partition RTL specification recursively such that each partition corresponds to a gate
  - Minimize delay

Page 7: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Classification of Partitioning Algorithms

- Constructive algorithms versus iterative improvement algorithms
- Deterministic versus probabilistic algorithms

Page 8: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Bi-partitioning problem

- Also known as min-cut partitioning
- Number of partitions = 2
- Minimize the nets crossing the partitions
- Size of the two partitions is equal
- Given a graph with N nodes, calculate the number of different bi-partitions!
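One way to approach the counting exercise, assuming N is even, the nodes are labelled, and the two blocks are interchangeable: choose the N/2 nodes of one block, then halve the count so that (A, B) and (B, A) are not counted twice. The function name `count_bipartitions` is made up for this sketch.

```python
from math import comb

def count_bipartitions(n: int) -> int:
    """Ways to split n labelled nodes into two unlabelled blocks of n/2 nodes."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even number")
    # comb(n, n/2) picks one block; dividing by 2 removes the (A, B)/(B, A) symmetry.
    return comb(n, n // 2) // 2

for n in (4, 8, 20, 100):
    print(n, count_bipartitions(n))
```

The count grows exponentially with N, which is why exhaustive search is impractical and heuristics are used instead.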

Page 9: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Kernighan-Lin (KL) Heuristic

- Bi-partitioning algorithm
  - Input specified as a graph G(V,E)
  - Obj: Divide V into two equal halves
  - Minimize cut-set
- Iterative improvement
  - Starts with a random initial partition.

Page 10: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

KL: Input and Output

[Figure: example graph with nodes 1 to 8, shown as the KL input and as the partitioned output.]

Page 11: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

KL: Gain Calculation

For each vertex a:
- I(a) = number of edges that do not cross the cut
- E(a) = number of edges that cross the cut
- Gain(a) = E(a) - I(a)

If two vertices a in A and b in B are exchanged:
- Gain(a,b) = Gain(a) + Gain(b) - 2c(a,b)
- Cutcost' = Cutcost - Gain(a,b)

For the remaining vertices x in A and y in B:
- Gain'(x) = Gain(x) + 2c(x,a) - 2c(x,b)
- Gain'(y) = Gain(y) + 2c(y,b) - 2c(y,a)
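The gain formulas can be checked numerically on a small graph. The sketch below uses a made-up 6-node graph with unit edge costs (not the graph from the slides); the helper names `c`, `gain`, `swap_gain`, and `cutcost` are illustrative.

```python
def c(w, u, v):
    """Edge cost between u and v (0 if there is no edge)."""
    return w.get(frozenset((u, v)), 0)

def gain(w, side, a):
    """Gain(a) = E(a) - I(a): crossing cost minus non-crossing cost."""
    external = sum(c(w, a, v) for v in side if side[v] != side[a])
    internal = sum(c(w, a, v) for v in side if v != a and side[v] == side[a])
    return external - internal

def swap_gain(w, side, a, b):
    """Gain(a,b) = Gain(a) + Gain(b) - 2c(a,b)."""
    return gain(w, side, a) + gain(w, side, b) - 2 * c(w, a, b)

def cutcost(w, side):
    """Total cost of edges whose endpoints lie in different blocks."""
    return sum(cost for e, cost in w.items() if len({side[v] for v in e}) == 2)

# Hypothetical example graph: A = {1, 2, 3}, B = {4, 5, 6}.
edges = [(1, 4), (1, 5), (1, 2), (2, 3), (4, 6), (3, 6)]
w = {frozenset(e): 1 for e in edges}
side = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}

a, b = 1, 5
before = cutcost(w, side)
g_ab = swap_gain(w, side, a, b)
# Gain'(x) for x = 2, predicted without recomputing from scratch.
predicted_x = gain(w, side, 2) + 2 * c(w, 2, a) - 2 * c(w, 2, b)

swapped = dict(side)
swapped[a], swapped[b] = "B", "A"
print("cut before:", before, "predicted after:", before - g_ab,
      "actual after:", cutcost(w, swapped))
print("Gain'(2) predicted:", predicted_x, "actual:", gain(w, swapped, 2))
```

The predicted and recomputed values agree, which is what lets KL update gains incrementally instead of recounting edges after every exchange.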

Page 12: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

KL: Strategy

- Find a node from each partition such that exchanging the two results in the largest gain.
- Exchange the nodes, and lock them in the new partitions.
- Maintain a table that records and updates the cumulative gain after every move.
- Continue exchanging nodes until all nodes are locked.
- Based on the table, implement the first "k" moves that result in the largest cumulative gain.

Page 13: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

KL: Table

Iteration   Vertex pair   Gain(i,j)   Sum of Gain(i,j)   Cutsize
0           -             -           -                  9
1           (3,5)         3           3                  6
2           (4,6)         5           8                  1
3           (1,7)         -6          2                  7
4           (2,8)         -2          0                  9
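Choosing "k" from the table amounts to finding the prefix of moves with the largest cumulative gain. A minimal sketch using the Gain(i,j) column above; the function name `best_prefix` is made up for illustration.

```python
from itertools import accumulate

def best_prefix(pair_gains):
    """Return (k, total_gain): the number of leading moves to keep and their summed gain."""
    cumulative = list(accumulate(pair_gains))
    best = max(cumulative, default=0)
    if best <= 0:
        return 0, 0          # no prefix improves the cut, so keep none of the moves
    return cumulative.index(best) + 1, best

pair_gains = [3, 5, -6, -2]  # Gain(i,j) column of the table
initial_cutsize = 9
k, total_gain = best_prefix(pair_gains)
print("keep first", k, "moves; cutsize becomes", initial_cutsize - total_gain)
```

For this table the first two exchanges, (3,5) and (4,6), are kept, and the cutsize drops from 9 to 1.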

Page 14: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

KL: Algorithm

begin
    initialize();
    while (improve == TRUE)
        while (UNLOCK(A) == TRUE)
            min = infinity;
            for all unlocked (a) in A
                for all unlocked (b) in B
                    if (cutcost - gain(a,b) < min)
                        min = cutcost - gain(a,b);
                        sel_a = a, sel_b = b;
            cutcost = min, lock(sel_a), lock(sel_b), update(T);
        implement first k moves that achieve the lowest cutset;
        set improve;
end

Complexity = O(n^3) per pass of the outer loop
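The pseudocode can be turned into a short runnable sketch. This is one possible reading, not a reference implementation: the graph is assumed to be a dict mapping `frozenset({u, v})` to an edge cost, the names `kl_pass` and `kernighan_lin` are made up, and the example graph (two triangles joined by one edge) is hypothetical.

```python
from itertools import product

def cut_cost(w, A, B):
    return sum(w.get(frozenset((a, b)), 0) for a in A for b in B)

def kl_pass(w, A, B):
    """One pass: tentatively exchange every pair, then keep the best prefix."""
    A, B = set(A), set(B)
    c = lambda u, v: w.get(frozenset((u, v)), 0)
    D = {}  # D[v] = Gain(v) = external cost - internal cost
    for v in A | B:
        own, other = (A, B) if v in A else (B, A)
        D[v] = sum(c(v, u) for u in other) - sum(c(v, u) for u in own if u != v)
    free_a, free_b = set(A), set(B)
    swaps, total, best_total, best_k = [], 0, 0, 0
    while free_a and free_b:
        # O(n^2) search for the unlocked pair with the largest Gain(a,b).
        a, b = max(product(sorted(free_a), sorted(free_b)),
                   key=lambda p: D[p[0]] + D[p[1]] - 2 * c(*p))
        total += D[a] + D[b] - 2 * c(a, b)
        free_a.remove(a)  # lock both nodes
        free_b.remove(b)
        for x in free_a:
            D[x] += 2 * c(x, a) - 2 * c(x, b)
        for y in free_b:
            D[y] += 2 * c(y, b) - 2 * c(y, a)
        swaps.append((a, b))
        if total > best_total:
            best_total, best_k = total, len(swaps)
    for a, b in swaps[:best_k]:  # implement only the first k moves
        A.remove(a); A.add(b)
        B.remove(b); B.add(a)
    return A, B, best_total

def kernighan_lin(w, A, B):
    while True:
        A, B, improvement = kl_pass(w, A, B)
        if improvement <= 0:
            return A, B

# Hypothetical example: triangles {1,2,3} and {4,5,6} joined by edge (3,4).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
w = {frozenset(e): 1 for e in edges}
A, B = kernighan_lin(w, {1, 2, 4}, {3, 5, 6})
print(sorted(A), sorted(B), cut_cost(w, A, B))
```

Each pass performs n/2 pair selections, each scanning O(n^2) candidate pairs, which is where the O(n^3) bound per pass comes from.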

Page 15: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

KL Drawbacks

- Handles only unit-size vertices.
- Addresses only exact bisections.
- Cannot handle hypergraphs.
- Time complexity is high.

Page 16: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Fiduccia-Mattheyses (FM) Problem Definition

Given:
- A hypergraph G(C, N) where C is the set of cells, and N is the set of nets.
- Each cell i has a size s(i).
- A fraction r = |A|/(|A| + |B|)

Partition G into two blocks A and B such that:
- the resulting cutset is minimized, and
- the fraction r is satisfied.

Page 17: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

FM Definitions

- Total number of nets: N
- Total number of cells: C
- Size of each cell: s(i)
- Number of cells in a net: n(i)
- Number of pins in a cell: p(i)
- Total number of pins: p(1) + p(2) + … + p(C) = n(1) + n(2) + … + n(N) = P

Page 18: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

FM Definition

- The cut state of a net is '1' if the net has cells in both partitions.
- A net is considered critical if it has a cell which, if moved, will change the net's cut state:
  - The net has no cell in one partition (that is, all of its cells are in the other partition), or
  - The net has only one cell in partition A, and the remaining cells are in partition B (or vice versa).

Page 19: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

FM Strategy

- Overall strategy is similar to KL.
  - Iterative improvement.
- However, some modifications.
  - Support for hypergraphs.
  - Only one cell is moved at a time: the cell with maximum gain.
  - Maintains the ratio within a tolerance: r*W - smax <= |A| <= r*W + smax, where W = |A| + |B| is the total cell size and smax is the size of the largest cell.
- Efficient data structures for:
  - Accessing cells and nets
  - Obtaining cells with max gain
  - Calculating and updating gain

Page 20: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Cell and Net Data Structures

- An array of cell nodes
  - Each node has a linked list of nets
- An array of nets
  - Each position has a linked list of cells
- Constructed in O(P).
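A minimal sketch of the two arrays, assuming the netlist arrives as a list of nets, each a list of cell indices. Python lists stand in for the linked lists, and the name `build_structures` is made up for illustration.

```python
def build_structures(nets, num_cells):
    """Build the cell array (cell -> nets) and the net array (net -> cells)."""
    cell_nets = [[] for _ in range(num_cells)]
    net_cells = [list(cells) for cells in nets]
    for j, cells in enumerate(net_cells):
        for i in cells:
            cell_nets[i].append(j)  # one step per pin, so O(P) overall
    return cell_nets, net_cells

# Hypothetical netlist: 4 cells, 3 nets, P = 7 pins.
nets = [[0, 1], [1, 2, 3], [0, 3]]
cell_nets, net_cells = build_structures(nets, num_cells=4)
print(cell_nets)
print(sum(len(n) for n in cell_nets), sum(len(c) for c in net_cells))
```

Both arrays hold one entry per pin, so the pin totals counted from the cell side and from the net side are the same value P.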

Page 21: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Bucket Structure

- The gain when a cell is moved can vary from pmax to -pmax, where pmax is the largest number of pins on any cell.
- Each partition has an array of pointers called the bucket array.
- Size of the array is given by 2*pmax + 1.
- Each array location "i" has a linked list of cells with gain "-pmax + i".
- The bucket structure is utilized for bucket sort.
- A pointer MAXGAIN points to the location with the max-gain cell.
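The bucket array for one partition might be sketched as follows. This is an illustration under assumptions: Python sets replace the linked lists of the slides, and the class name `GainBuckets` and its method names are invented here.

```python
class GainBuckets:
    """Bucket array for one partition: index i holds the cells with gain i - pmax."""

    def __init__(self, pmax):
        self.pmax = pmax
        self.buckets = [set() for _ in range(2 * pmax + 1)]
        self.gain = {}       # cell -> current gain
        self.maxgain = -1    # index of the highest non-empty bucket, -1 if empty

    def insert(self, cell, gain):
        idx = gain + self.pmax
        self.buckets[idx].add(cell)
        self.gain[cell] = gain
        self.maxgain = max(self.maxgain, idx)

    def remove(self, cell):
        idx = self.gain.pop(cell) + self.pmax
        self.buckets[idx].discard(cell)
        # Walk MAXGAIN down past any buckets that have become empty.
        while self.maxgain >= 0 and not self.buckets[self.maxgain]:
            self.maxgain -= 1

    def update(self, cell, delta):
        new_gain = self.gain[cell] + delta
        self.remove(cell)
        self.insert(cell, new_gain)

    def max_gain_cells(self):
        return set(self.buckets[self.maxgain]) if self.maxgain >= 0 else set()

b = GainBuckets(pmax=3)
b.insert("a", 2)
b.insert("b", -1)
b.insert("c", 2)
print(b.max_gain_cells())
b.update("a", +1)
print(b.max_gain_cells())
```

Because gains are small integers in a known range, insertion and gain updates are constant-time index operations rather than comparisons in a sorted structure.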

Page 22: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Free List

- Once a cell has been moved and locked, it is:
  - Removed from the bucket structure.
  - Placed in the free cell list.
- Reduces the number of entries in the bucket structure.

Page 23: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Selection of base cell

- Consider the cell of the highest gain from each bucket structure.
  - The move must satisfy the r "inequality".
- Break ties by selecting the one that gives the best r.
- The selected cell is called the base cell.
- Remove it from the bucket structure, lock it, and place it in the free list.

Page 24: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Initial Computation of Cell Gains

- F => current or "from" block of cell i.
- T => target or "to" block of cell i.
- Gain is determined only by critical nets.
- FS(i) => number of nets that have cell i as their only F cell.
- TE(i) => number of nets that contain cell i and have an empty T.
- G(i) = FS(i) - TE(i)
- Can be calculated in O(P).
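The gain definition G(i) = FS(i) - TE(i) can be sketched directly. The netlist below is hypothetical, blocks are encoded as 0 and 1, and the function names are made up; each net is assumed to list a cell at most once.

```python
def initial_gains(nets, side):
    """Return G(i) = FS(i) - TE(i) for every cell; side[i] is 0 or 1."""
    gains = [0] * len(side)
    for cells in nets:
        count = [0, 0]                 # cells of this net in block 0 and block 1
        for i in cells:
            count[side[i]] += 1
        for i in cells:
            f, t = side[i], 1 - side[i]
            if count[f] == 1:          # i is the only F cell: moving it uncuts the net
                gains[i] += 1
            if count[t] == 0:          # T side is empty: moving i cuts the net
                gains[i] -= 1
    return gains

def cut_size(nets, side):
    """Number of nets with cells in both blocks."""
    return sum(1 for cells in nets if len({side[i] for i in cells}) == 2)

# Hypothetical example: cells 0 and 1 in block 0, cells 2 and 3 in block 1.
nets = [[0, 2], [0, 1], [1, 2, 3], [2, 3]]
side = [0, 0, 1, 1]
print(initial_gains(nets, side))
```

Every pin is visited a constant number of times, which matches the O(P) bound stated above.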

Page 25: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Updating Cell Gains

- The base cell is moved from one partition to the other.
- Only nets that are critical before or after the move need to be considered.
- Cells that are not locked and belong to such critical nets are updated.

Page 26: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Updating Cell Gains

[Figure: the four critical-net cases (Case 1 to Case 4), showing how a net's cells are distributed between the F and T blocks.]

Page 27: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Updating Cell Gains

[Figure: Case 1, with gain changes (+1, 0, -1) marked on the cells of a net in the F and T blocks.]

Page 28: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Updating Cell Gains

[Figure: Case 2, with gain changes (0, +1) marked on the cells of a net in the F and T blocks.]

Page 29: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Updating Cell Gains

[Figure: Case 3, with gain changes (0, +1) marked on the cells of a net in the F and T blocks.]

Page 30: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Updating Cell Gains

[Figure: Case 4, with gain changes (-1, 0, +1) marked on the cells of a net in the F and T blocks.]

Page 31: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Update Algorithm

F(n) and T(n) are the number of cells of net n in the F block and the T block.

For each net n on the base cell
    /* critical before move */
    If T(n) = 0 then increment the gain of all free cells on n
    If T(n) = 1 then decrement the gain of the only T cell on n
    /* change net distribution */
    decrement F(n), increment T(n)
    /* critical after move */
    If F(n) = 0 then decrement the gain of all free cells on n
    If F(n) = 1 then increment the gain of the only F cell on n
End

Complexity is O(P)
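A runnable sketch of the update rules, on a hypothetical 4-cell netlist. For clarity it recounts F(n) and T(n) for each net instead of keeping running counts per net, so it does not reach the O(P) bound; the function name and argument layout are assumptions of this sketch.

```python
def move_and_update(base, nets_of, cells_of, side, gains, locked):
    """Move the base cell to the other block and update gains of free cells."""
    f, t = side[base], 1 - side[base]
    locked.add(base)
    for n in nets_of[base]:
        cells = cells_of[n]
        # Net distribution before the move (base cell still counted in F).
        T = sum(1 for c in cells if side[c] == t)
        F = len(cells) - T
        free = [c for c in cells if c not in locked]
        # Critical before the move.
        if T == 0:
            for c in free:
                gains[c] += 1
        elif T == 1:
            for c in free:
                if side[c] == t:
                    gains[c] -= 1
        # Change the net distribution.
        F, T = F - 1, T + 1
        # Critical after the move.
        if F == 0:
            for c in free:
                gains[c] -= 1
        elif F == 1:
            for c in free:
                if side[c] == f:
                    gains[c] += 1
    side[base] = t

# Hypothetical example: cells 0, 1 in block 0; cells 2, 3 in block 1.
cells_of = [[0, 2], [0, 1], [1, 2, 3], [2, 3]]   # net -> cells
nets_of = [[0, 1], [1, 2], [0, 2, 3], [2, 3]]    # cell -> nets
side = [0, 0, 1, 1]
gains = [0, 0, 0, -1]                            # initial G(i) = FS(i) - TE(i)
locked = set()
move_and_update(0, nets_of, cells_of, side, gains, locked)
print(side, gains)
```

After cell 0 moves, the gains of the free cells 1, 2, and 3 equal the cut reduction each would give if it were moved next; the stored gain of the locked base cell is no longer used.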

Page 32: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

KL and FM are Deterministic algorithms

- Every invocation of the algorithm with identical inputs generates the same solution (hence, deterministic).
- Fast, but inherently greedy in nature.

[Figure: cost plotted against successive solutions; the search settles in a local minimum.]

Page 33: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Non-deterministic algorithms

- Also known as probabilistic or stochastic algorithms.
- Invocations of the algorithm with identical inputs can generate different solutions.
- Slower than deterministic algorithms, but demonstrate non-greedy behavior.

[Figure: cost plotted against successive solutions, showing hill-climbing behavior.]

Page 34: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Simulated Annealing

- Simulated annealing is a generic optimization technique.
- In PDA (physical design automation), it has been applied to partitioning and placement.
- Maintains a temperature variable that is reduced from a high value to a low value.
- A number of solutions are explored at each temperature by modifying the existing solution.
- A solution that decreases cost is always accepted.
- At high temperatures, solutions that increase cost are accepted with greater probability.
- At low temperatures, solutions that increase cost are accepted with very low probability.

Page 35: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Partitioning by Simulated Annealing

Algorithm SA
Begin
    T = T_initial; P = initial partition; C = cutsize(P);
    repeat
        repeat
            P' = neighbourhood(P); C' = cutsize(P');
            D = C' - C; r = random(0,1);
            if (D < 0 OR r < exp(-D/T)) accept P' (P = P'; C = C');
        until (equilibrium at T is reached)
        T = alpha * T; /* 0 < alpha < 1 */
    until (T <= T_final);
End.
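A runnable sketch of the SA loop for balanced bi-partitioning. The parameter values, the fixed number of moves per temperature, the seeded generator, and the tracking of the best partition seen are all choices made for this illustration rather than parts of the pseudocode; the neighbourhood move exchanges two nodes.

```python
import math
import random

def cutsize(edges, side):
    return sum(1 for u, v in edges if side[u] != side[v])

def anneal_partition(edges, nodes, t_initial=10.0, t_final=0.01,
                     alpha=0.9, moves_per_t=50, seed=0):
    rng = random.Random(seed)          # seeded so that runs are repeatable
    nodes = list(nodes)
    half = len(nodes) // 2
    side = {v: (0 if i < half else 1) for i, v in enumerate(nodes)}
    cost = cutsize(edges, side)
    best_side, best_cost = dict(side), cost
    t = t_initial
    while t > t_final:
        for _ in range(moves_per_t):   # "equilibrium": a fixed number of moves
            a = rng.choice([v for v in nodes if side[v] == 0])
            b = rng.choice([v for v in nodes if side[v] == 1])
            side[a], side[b] = 1, 0    # neighbour: exchange two nodes
            new_cost = cutsize(edges, side)
            d = new_cost - cost
            if d < 0 or rng.random() < math.exp(-d / t):
                cost = new_cost        # accept the neighbour
                if cost < best_cost:
                    best_side, best_cost = dict(side), cost
            else:
                side[a], side[b] = 0, 1  # reject: undo the exchange
        t *= alpha
    return best_side, best_cost

# Hypothetical example: triangles {1,2,3} and {4,5,6} joined by edge (3,4).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
nodes = [1, 2, 4, 3, 5, 6]             # initial blocks: {1,2,4} and {3,5,6}
best_side, best_cost = anneal_partition(edges, nodes)
print(best_side, best_cost)
```

Exchanging one node from each block keeps the two blocks the same size throughout, so no separate balance check is needed.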

Page 36: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Partitioning by Simulated Annealing

- A neighbourhood solution could be generated by exchanging two nodes.
- Equilibrium at T
  - Apply a fixed number of moves.

Page 37: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Ratio Cut

- KL aims to generate equally sized bi-partitions.
- FM gives the possibility for unequal bi-partitions.
- Neither considers the structure of the graph itself.
- Ratio cut overcomes this limitation.

Page 38: CSE 494: Electronic Design Automation Lecture 4 Partitioning.

Ratio Cut

- Ratio cut is a cost function.
- Utilized instead of just the cut set.

R = C / (|A| · |B|)

where C is the size of the cut set between blocks A and B.
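A small numeric illustration, assuming the ratio-cut cost is R = C / (|A| · |B|) with C the number of cut edges. The graph is hypothetical: a 4-node clique attached to a 2-node tail. The function name `ratio_cut` is made up.

```python
def ratio_cut(edges, A, B):
    """R = C / (|A| * |B|), where C counts edges with one end in each block."""
    cut = sum(1 for u, v in edges if (u in A) != (v in A))
    return cut / (len(A) * len(B))

# Hypothetical graph: clique on {1,2,3,4}, plus edges (4,5) and (5,6).
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]

balanced = ratio_cut(edges, {1, 2, 3}, {4, 5, 6})   # exact bisection
natural = ratio_cut(edges, {1, 2, 3, 4}, {5, 6})    # follows the graph structure
print(balanced, natural)
```

The unequal split that follows the natural clustering scores lower than the exact bisection, which is the behaviour the ratio-cut cost is meant to reward.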