Physical Design Flow
Mohammad reza
Agenda
Introduction to design flow and Backend
Introduction to design planning
Floorplanning / Hierarchical design
Power planning
Summary
The Physical Design Task
Inputs from the front end: Verilog netlist, SDC constraints
Physical design flow (back end)
Output: GDSII
Example Physical Design Flow
Design/Constraints Import
Floorplanning
Placement
Clock Tree Synthesis
Routing
Post Route Optimization
Layout Verification / Finishing
Fullchip Design Overview
The location of the core, I/O areas, P/G pads and the P/G grid
[Figure: fullchip floorplan showing the core placement area, the periphery (I/O) area, P/G rings, straps and grid, and the IP, ROM and RAM macros]
Where Do We Start? Design Planning
Inputs to the physical design flow: Verilog netlist, SDC constraints
How do we handle?
  Die size
  IO / Hard-IP placement
  Global clock distribution
  Power planning
  Flat versus hierarchical design
Design Planning
Floorplanning
  Determine die size
  Shape and arrange hierarchical blocks
  Integrate hard-IP efficiently
  Predict and prevent congestion hotspots and critical timing paths
Power planning
  Create power distribution grid
  Consider IR drop and Electromigration
  Implement power saving techniques
    Power gating
    Multi-Voltage design / Voltage islands
Agenda
Introduction to design planning
Floorplanning
  Setup/configuration
  Die size, utilization, metallization scheme
  IO-ring and macro placement
  Flat versus hierarchical design
  Hierarchical design planning issues
Power planning
Summary
Setup/configuration
Read netlist
  Check netlist: high fanout, unique, unconnected inputs, standard cell area
  Check timing without wire load
Read SDC
Read .lib files
Read footprint for P&R
  LEF : SOC Encounter
  FRAM : Synopsys tools
Read technology file
  Metal width … (DRC rules)
Floorplanning – Die Size, Utilization & Metal Stack-up
Choosing the die size, initial standard cell utilization and metallization scheme involves several design tradeoffs (Schedule, Cost, Performance).
Larger die
  Easier to route, less congestion, lower cap (decrease signal/power integrity related problems), faster design cycle
  Higher cost, higher power
More dense power grid
  Reduce risk of power related failures
  Increase number of metal layer masks, reduce signal route tracks
Floorplanning – Utilization
[Figure: two placements compared, low standard-cell utilization versus high standard-cell utilization]
Floorplanning – Utilization
Utilization refers to the percentage of core area that is taken up by standard cells.
A typical starting utilization might be 70%.
This can vary a lot depending on the design.
High utilization can make it difficult to close a design:
  Routing congestion
  Negative impact during the optimization and legalization stages
Utilization changes should be examined after each stage of the flow.
  Avoid having large increases after placement optimization.
  Feedback should be given to front-end designers.
  Topographical synthesis is now possible.
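The utilization definition above can be turned into a quick sizing check. The sketch below is only an illustration: the function names and the area numbers are hypothetical, and it treats utilization as standard-cell area divided by core area, ignoring macros and blockages.

```python
def utilization(std_cell_area_um2: float, core_area_um2: float) -> float:
    """Fraction of the core area occupied by standard cells."""
    if core_area_um2 <= 0:
        raise ValueError("core area must be positive")
    return std_cell_area_um2 / core_area_um2


def core_area_for_target(std_cell_area_um2: float, target_utilization: float) -> float:
    """Core area needed so that the cells reach the target utilization."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target utilization must be in (0, 1]")
    return std_cell_area_um2 / target_utilization


# Hypothetical design: 350,000 um^2 of standard cells, 70% starting target.
core = core_area_for_target(350_000.0, 0.70)
print(round(core), round(utilization(350_000.0, core), 2))
```

Re-running `utilization` with the cell area reported after each flow stage gives the stage-by-stage tracking the slide recommends.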
Initialize Floorplan
Define globals (VDD1, VDD2, GND1, …)
Define core area : (cells + utilization factor)
  Shape can be implied by a macro
Place IO (fixed, equidistant, ...)
Take macros and power domains into account already
[Figure: two floorplan sketches, each with a core surrounded by IO, one including an analog macro]
IO Ring and Large Macro Placement
The IO ring is often decided by front-end designers, with input from physical design and packaging engineers.
When placing large macros we must consider impacts on routing, timing and power.
For wire-bond, place power-hungry macros away from the chip center.
[Figure: macro placement with possible routing congestion hotspots marked]
Flat Versus Hierarchical Design
What happens if the design is too big to be handled by the EDA tools?
Hierarchical Design
[Figure: fullchip design with I/O pads, an IP macro and blocks Blk 1, Blk 2 and Blk 3; each block / tile runs its own P&R flow, followed by fullchip timing & verification]
Flat Versus Hierarchical Design
Hierarchical Design
Advantages
  Faster runtime, less memory needed for EDA tools
  Faster ECO turn-around time
  Ability to do design re-use
Disadvantages
  Much more difficult for fullchip timing closure (ILMs)
  More intensive design planning needed: feedthrough generation, repeater insertion, timing constraint budgeting
Hierarchical Design : Specify Partitions / Plan Groups
Netlist must have partitions as top level modules.
Partitions are generally sized according to a target initial utilization: ~70% utilization, ~300k-700k instances.
Channels or abutment
Rectilinear block shapes are possible
[Figure: partition arrangements showing abutment, channels and rectilinear blocks]
Hierarchical Design : Pin Assignment
Pin constraints include parameters such as:
  Layers, spacing, size, overlap
  Net groups, pin guides
Pins can be assigned placement-based (flightlines) or route-based (trial route, boundary crossings).
Pin guides can be used to influence automatic pin placement of particular net groups.
Pins at partition corners can make routing difficult.
[Figure: a partition with Pin guide 1 and Pin guide 2 on its boundary]
Hierarchical Design : Timing Budgeting
Chip level constraints must be mapped correctly to block level constraints.
The design must be placed, trial routed and have pins assigned before running budgeting.
Block level constraints will be assigned input or output delays on I/O ports based on the estimated timing slack.
Example: the signal reaches port IN1 at the block boundary after 1.5 ns:
  set_input_delay 1.5 [get_ports IN1]
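As a rough illustration of how a budgeted input delay could be derived, the sketch below splits the slack of a boundary-crossing path between the outside and inside of the block in proportion to their estimated delays. The proportional rule, the function name and the numbers are assumptions chosen for illustration; commercial budgeting tools use their own algorithms.

```python
def budget_path(clock_period_ns, outside_delay_ns, inside_delay_ns):
    """Split the slack of a boundary-crossing path between the two sides.

    Returns (input_delay_ns, slack_ns). The input delay seen at the block
    port is the estimated outside delay plus the outside share of the
    slack, shared in proportion to each side's estimated delay.
    """
    total = outside_delay_ns + inside_delay_ns
    if total <= 0:
        raise ValueError("estimated delays must sum to a positive value")
    slack = clock_period_ns - total
    outside_share = slack * outside_delay_ns / total
    return outside_delay_ns + outside_share, slack


# Hypothetical path: 4.0 ns clock, 1.2 ns estimated outside the block,
# 2.0 ns estimated inside the block.
input_delay, slack = budget_path(4.0, 1.2, 2.0)
print(f"set_input_delay {input_delay:.2f} [get_ports IN1]  ;# slack {slack:.2f} ns")
```

Because the inputs are trial-route estimates, the resulting constraint is itself only an estimate, which is why interface paths need to be rechecked at fullchip level.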
Hierarchical Design : Timing Budgeting & Fullchip Timing Closure
Fullchip timing closure is typically a bottleneck for design cycles.
Block-level P&R flow does not emphasize io-to-flop, flop-to-io, io-to-io timing paths, because budgeted constraints are only estimates.
Interface logic models (ILMs) can be used
  To speed up timing analysis runs when the fullchip design is too large.
  Required clock and datapaths are preserved, net/cell names are identical.
[Figure: original netlist and its interface logic model (ILM), both with inputs A, B, Clk and outputs X, Y]
Agenda
Introduction to design planning
Floorplanning
Power planning
  Intro to power issues in IC design
  Basic power grid creation
  Multi-voltage design & power gating
  Automated power grid design flows
Summary
Power Consumption and Reliability
[Figure: power issues and their sources. Dynamic power and static (leakage) power: average power problem. Floorplan power density plus design of the grid: IR-drop / voltage drop (fail) and electromigration (EM), a problem in the long run.]
1 out of 5 chips fail due to excessive power consumption
Power Consumption and Reliability : IR-Drop
The drop in supply voltage over the length of the supply line
A resistance matrix of the power grid is constructed.
The average current of each gate is considered.
The matrix is solved for the current at each node, to determine the IR-drop.
[Figure: supply line from the VDD pad to a gate's VDD connection]
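For intuition, the simplest case of this calculation is a single rail fed from one pad, where the resistance network can be solved by accumulating the drop segment by segment. The sketch below assumes equal segment resistances and made-up gate currents; a real grid needs the full matrix solve described above.

```python
def rail_voltages(vdd, segment_res_ohm, gate_currents_a):
    """Voltage at each gate tap along a single rail fed from one pad.

    Tap 0 is nearest the pad. Every rail segment carries the current of all
    gates downstream of it, so the drop accumulates away from the pad.
    """
    voltages = []
    v = vdd
    remaining = sum(gate_currents_a)
    for current in gate_currents_a:
        v -= segment_res_ohm * remaining   # I*R drop across this segment
        voltages.append(v)
        remaining -= current               # this gate's current has left the rail
    return voltages


# Hypothetical rail: 1.0 V supply, 0.5 ohm per segment, three gates at 10 mA each.
taps = rail_voltages(1.0, 0.5, [0.01, 0.01, 0.01])
worst_drop_mv = (1.0 - min(taps)) * 1000
print([round(v, 3) for v in taps], round(worst_drop_mv, 1))
```

The gate farthest from the pad sees the largest drop, which is the reason power-hungry blocks are placed close to the power pads.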
Where does all the power go to?
Total Power = Core + I/O
  I/O: separate supply ring, often higher voltage
  Core = Standard Cells + Macros
    Clock network
    Fixed, no optimization
Power Grid Creation : Macro Placement
Blocks with the highest performance and highest power consumption:
  Close to border power pads (IR drop)
  Away from each other (EM)
Automated Power Grid Design: PNS & PNA
Power grid creation has usually been done by hand, using rules of thumb for widths and number of straps.
  Analysis is often done late in the design flow.
  Grid is typically over-designed to prevent time-intensive power grid changes.
When incorporating advanced low-power strategies, there are too many variables to achieve an optimal result manually.
For more complex designs an automated strategy is preferred.
  e.g. Power Network Synthesis (PNS) and Power Network Analysis (PNA) from Synopsys
  Allows designers to anticipate effects of floorplanning
Power Network Analysis (PNA)
There are EDA tools that allow early power network analysis for designs in the early floorplanning stage.
  Not signoff quality, but good enough for initial design.
  e.g. Synopsys Power Network Analysis (PNA)
[Figure: VDD pad and VDD network]
Power Network Synthesis: PNS – What?
Goal is to QUICKLY find the minimum routing resource required to meet a specified IR drop target.
More power routing => easier to reach the IR-drop target, but harder to route clock and signals with the remaining tracks.
[Figure: synthesized power network with power pads, power rings, power trunks and power straps (in red)]
PNS : Running PNS Trials
Run PNS
PNS : Create Power Routing
After running trials, an optimal power grid can be chosen and the actual rails can be laid out.
Virtual rails => actual rails
Outside main PNS : memory footprint + CPU time
Many options : e.g. % via penetration, order of routing …
Check legal cell/pin placement (grid aligned?)
Depending on the design phase
  What cells, nets and layers
  e.g. first macros and pads, then high voltage areas, …
Secondary PG ports on level shifters, isolation cells, retention registers
  Later, after placement, during routing: same as the follow pins for the normal VDD and GND of the standard cells.
Summary
The goal of design planning is to arrange the chip so that the “Place and Route” flow can converge quickly and easily.
Design experience is needed.
Floorplan is driven by:
  Power
  Timing
  Congestion
  Minimum area
There is no single way to create a floorplan:
  Flat – hierarchical
  Regions, position of the macros
  Order of placement: IO versus macros versus core
This phase can take a significant portion of the complete backend design time.
Early analysis of the power grid is essential for avoiding major problems near the end of the design cycle.
Automated power grid tools may help reduce necessary safety margins.
Placement
Placement in the Flow
[Figure: design flow. Front-End: Design Specification, Logic Design and Verification, Logic Synthesis. Back-End (Physical Design): Floorplanning, Placement, Routing. Inputs to the physical design stage: Netlist, Libraries, Physical Libraries, Constraints.]
Definition of Placement
Placement: exact placement of the modules (modules can be gates, standard cells, macros…). The general goal is to minimize the total area and interconnect cost.
The quality of the attainable routing is highly determined by the placement.
Circuit placement becomes very critical in 90nm and below technologies.
Cost Function for Placement
Cost components              Methods of consideration
Area, Wire length, Overlap   Traditional methods of Placement
Timing                       Timing-driven Placement
Congestion                   Congestion-driven Placement
Clock                        Clock Gating
Power                        Multivoltage and Multisupply Placement
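A minimal sketch of how such cost components might be combined is shown below. It uses half-perimeter wire length (HPWL), a common wire-length estimate that the slides do not name, plus an overlap penalty; the weights and coordinates are hypothetical.

```python
def hpwl(pins):
    """Half-perimeter wire length of one net, given its (x, y) pin positions."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def placement_cost(nets, overlap_area, w_wire=1.0, w_overlap=10.0):
    """Weighted sum of total estimated wire length and cell overlap area."""
    return w_wire * sum(hpwl(pins) for pins in nets) + w_overlap * overlap_area


# Two hypothetical nets and a small amount of remaining cell overlap.
nets = [[(0, 0), (3, 4)], [(1, 1), (2, 5), (6, 2)]]
print(placement_cost(nets, overlap_area=0.5))
```

Timing-driven and congestion-driven placement can be pictured as adding further weighted terms to the same sum.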
Placement Steps
Input information:
  Netlist
  Mapped and floorplanned design
  Logical and physical libraries
  Design constraints
Steps:
  Reading gate-level netlists from synthesis
  Global placement
  Detailed placement
  Placement optimization
Output information:
  Physical layout information
  Cell placement locations
  Physical layout, timing, and technology information of reference libraries
Inputs for the Placement Tool
[Figure: inputs to the placement tool: gate-level netlist, design constraints, floorplanned design, technology file, and the design libraries (logical, physical and target; standard cell, macro cell and reference libraries)]
Inside A Physical Library
Example .lef

MACRO AN2D0
  CLASS CORE ;
  FOREIGN AN2D0 0.000 0.000 ;
  ORIGIN 0.000 0.000 ;
  SIZE 1.400 BY 2.520 ;
  SYMMETRY x y ;
  SITE core ;
  PIN Z
    ANTENNADIFFAREA 0.1680 ;
    DIRECTION OUTPUT ;
    PORT
      LAYER M1 ;
      RECT 1.300 0.640 1.330 1.675 ;
      RECT 1.190 0.640 1.300 1.780 ;
      RECT 1.140 0.640 1.190 0.900 ;
      RECT 1.140 1.520 1.190 1.780 ;
    END
  END Z
  PIN A2
    ANTENNAGATEAREA 0.0704 ;
    DIRECTION INPUT ;
    PORT
      LAYER M1 ;
      RECT 0.610 0.975 0.770 1.545 ;
    END
  …

[Figure: abstract view of a NAND_1 cell with pins A, B, Y, VDD and GND, annotated with: dimension (“bounding box”), pins (direction, layer and shape), blockage, symmetry (X, Y, or 90º) and reference point (typically 0,0)]
Technology Information
For each tool, a specific set of files is required to provide details about the metal layers for the chosen process technology…
  Number and name designations for each layer/via
  Physical and electrical characteristics for each layer
  Dielectric constant
  Design rules for each layer (min spacing, min width, etc…)
  Units and precision for numerical values
Example filetypes
  .lefhdr, .tf -> contain layer and design rule information
Also, there are files that enable improved RC estimation that can be read by the placement engines.
  .captable, .tluplus -> store RC coefficients.
Physical Technology Data
The technology files contain design rule information that can be read by the tools.
For example, the spacing table constrains the parallel runlength of adjacent wires on the same layer.
Wire width and pitch are also described, as well as any more complex design rules for routing.
Example .lefhdr

LAYER M1
  TYPE ROUTING ;
  DIRECTION HORIZONTAL ;
  OFFSET 0 ;
  PITCH 0.280 ;
  WIDTH 0.120 ;
  MAXWIDTH 12.000 ;
  AREA 0.058 ;
  MINENCLOSEDAREA 0.200 ;
  THICKNESS 0.240 ;
  HEIGHT 0.765 ;
  SPACINGTABLE
    PARALLELRUNLENGTH 0.00 0.52 1.50 4.50
    WIDTH 0.00 0.12 0.12 0.12 0.12
    WIDTH 0.30 0.12 0.17 0.17 0.17
    WIDTH 1.50 0.12 0.17 0.50 0.50
    WIDTH 4.50 0.12 0.17 0.50 1.50
  ;
  MINIMUMCUT 2 WIDTH 0.42 ;
  MINIMUMCUT 4 WIDTH 0.98 FROMABOVE ;
  MINIMUMCUT 2 WIDTH 0.70 LENGTH 0.70 WITHIN 1.001 ;
  MINIMUMCUT 2 WIDTH 2.00 LENGTH 2.00 WITHIN 2.001 ;
  MINIMUMCUT 2 WIDTH 3.00 LENGTH 10.0 WITHIN 5.001 ;
  MINIMUMDENSITY 15 ;
  MAXIMUMDENSITY 70 ;
  DENSITYCHECKWINDOW 50 50 ;
  DENSITYCHECKSTEP 50 ;
  FILLACTIVESPACING 0.60 ;
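To show how a tool might read the M1 spacing table from the example, here is a small lookup sketch. It assumes a simplified rule: pick the last width row and the last run-length column whose thresholds the wire exceeds. The exact comparison semantics are defined by the LEF specification and should be checked there; the function names are hypothetical.

```python
# Values copied from the M1 SPACINGTABLE in the example (microns).
PRL_THRESHOLDS = [0.00, 0.52, 1.50, 4.50]
WIDTH_ROWS = [
    (0.00, [0.12, 0.12, 0.12, 0.12]),
    (0.30, [0.12, 0.17, 0.17, 0.17]),
    (1.50, [0.12, 0.17, 0.50, 0.50]),
    (4.50, [0.12, 0.17, 0.50, 1.50]),
]


def last_exceeded(thresholds, value):
    """Index of the last threshold that value exceeds (0 if it exceeds none)."""
    idx = 0
    for i, threshold in enumerate(thresholds):
        if value > threshold:
            idx = i
    return idx


def min_spacing(wire_width, parallel_run_length):
    """Minimum spacing for a wire width and parallel run length (simplified)."""
    row = last_exceeded([width for width, _ in WIDTH_ROWS], wire_width)
    col = last_exceeded(PRL_THRESHOLDS, parallel_run_length)
    return WIDTH_ROWS[row][1][col]


print(min_spacing(0.12, 0.10), min_spacing(0.40, 1.00), min_spacing(2.00, 5.00))
```

Wider wires running in parallel for longer distances land further down and to the right in the table, so they need more spacing.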
Global and Detail Placement
Reading gate-level netlist from synthesis
Global placement
Detailed placement
Placement optimization
Global Placement
Standard cells are placed into groups such that the number of connections between groups is minimized.
This is solved through circuit partitioning.
[Figure: bad placement versus good placement]
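The quantity that partitioning tries to minimize can be sketched as a cut count: the number of nets whose cells end up in more than one group. The netlist and group assignments below are made up, and the partitioning algorithm itself is not shown.

```python
def cut_size(nets, group_of):
    """Count nets whose cells fall into more than one group."""
    return sum(1 for net in nets if len({group_of[cell] for cell in net}) > 1)


# Hypothetical four-cell netlist and two candidate groupings.
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
bad = {"a": 0, "b": 1, "c": 0, "d": 1}
good = {"a": 0, "b": 0, "c": 1, "d": 1}
print(cut_size(nets, bad), cut_size(nets, good))
```

A partitioner searches over groupings like these for one with a low cut count while keeping the groups balanced in size.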
Detail Placement : Coarse Placement
All the cells are placed in the approximate locations but they are not legally placed.
No logic optimization is done.
Detail Placement : Legalization
Legalization: ensures that the final placement is legal before saving the design.
Legal placement of cells is not required for analyzing routing congestion at an early stage.
Hard Macro Placement
Hard macros are placed during the floorplanning stage and then marked as FIXED for placement.
Typically, hard macros are placed near the sides of the core area.
Some Guidelines for Placement (2)
Avoid constrictive channels.
Avoid many pins in the narrow channel. Rotate for pin accessibility.
Use blockage to improve pin accessibility.
[Figure: floorplan with macros RAM 1 to RAM 8 illustrating these guidelines]
Review of Placement Cost Function
Cost components              Methods of consideration
Area, Wire length, Overlap   Traditional methods of Placement
Timing                       Timing-driven Placement
Congestion                   Congestion-driven Placement
Clock                        Clock Gating
Power                        Multivoltage and Multisupply Placement
Timing Driven Placement
Critical paths are determined using static timing analysis (STA).
The tool attempts to minimize the wire length of critical paths to meet setup timing.
Net RCs are based on Virtual Routing (VR) estimates.
Virtual Route / Trial Route
Manhattan geometry
Horizontal – Vertical
NO diagonal routing
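A virtual-route style net delay can be sketched from the Manhattan length alone. The example below assumes a uniform wire with hypothetical per-unit resistance and capacitance and uses the Elmore approximation for a distributed RC line driving a lumped sink load; real tools take these coefficients from the technology files.

```python
def manhattan_length(driver, sink):
    """Horizontal plus vertical distance between two (x, y) points."""
    return abs(driver[0] - sink[0]) + abs(driver[1] - sink[1])


def virtual_route_delay(driver, sink, r_per_unit, c_per_unit, sink_cap):
    """Elmore delay of a uniform wire of Manhattan length with a lumped sink load."""
    length = manhattan_length(driver, sink)
    wire_r = r_per_unit * length
    wire_c = c_per_unit * length
    # Distributed wire contributes R*C/2; the sink load sees the full wire R.
    return 0.5 * wire_r * wire_c + wire_r * sink_cap


# Hypothetical values in arbitrary but consistent units.
print(manhattan_length((0, 0), (3, 4)))
print(virtual_route_delay((0, 0), (3, 4), r_per_unit=2.0, c_per_unit=0.5, sink_cap=1.0))
```

Because the wire term grows with the square of the length, a detoured route can be noticeably slower than the virtual-route estimate.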
Congestion Driven Placement: Detouring Routes
Issues with Congestion
If congestion is not too severe, the actual route can be detoured around the congested area.
The detoured nets will have worse RC delay compared to the VR estimates.
In highly congested areas, delay estimates during placement will be optimistic.
[Figure: congestion map with a congestion hot spot and a detoured route; color legend ≥2 ≥3 ≥4 ≥5 ≥6 ≥7]
Congestion Map
By default, physical synthesis tools perform some congestion optimization, which has a reasonable chance of providing acceptable congestion.
Congestion driven placement increases the effort of the algorithm to fix congestion.
For better correlation to post-route, congestion-driven placement is enabled based on the GR congestion map.
No need to use -congestion unnecessarily.
On average the -congestion option increases runtime by 20%.
[Figure: two placement density maps, labeled “Causes high local utilization” and “Gives uniform density”]
Congestion Driven Placement: Options
Some congestion: use medium effort congestion-driven placement
  Max routing congestion > 90%
  Large hot spots
Bad congestion: use high effort congestion-driven placement
  Max routing congestion >> 90%
  Very large hot spots
Congestion-driven placement might affect timing negatively, but
  Post-route numbers will not create surprises
  Lower congestion will speed up the detailed router
Modifying Physical Constraints: Cell Density
Cell density can be up to 95% by default.
Density level can also be applied to a specific region.
Lower cell density in congested areas using the -coordinate option.
[Figure: region defined by corner coordinates (x1, y1) and (x2, y2)]
Modifying the Floorplan
Top-level ports
  Changing to a different metal layer
  Spreading them out, re-ordering or moving to other sides
Macro location or orientation
  Alignment of bus signal pins
  Increase of spacing between macros
Core aspect ratio and size
  Making block taller to add more horizontal routing resource
  Increase of the block size to reduce overall congestion
Power grid: fixing any routed or non-preferred layers
Congestion Driven vs. Timing Driven Placement
In general there is a direct trade-off between congestion and timing.
Timing-driven placement tries to shorten nets, whereas congestion-driven placement tries to spread cells, thus lengthening nets.
Iterative placement trials should be performed to find a balance between the different tool options/settings.
Timing and Congestion Optimization
Some things that can be done for timing optimization…
  Adding / deleting buffers
  Resizing gates
  Restructuring the netlist
  Swapping pins
  Moving instances
  Area recovery
Congestion optimization tries to reduce local congestion hotspots.
Generally, if congestion exists after placement, little more can be done if area recovery is not significant.
It is essential that sufficient area is available for any optimizations that are required.
Clock Tree Synthesis
CTS
General Concept of Clock Tree Synthesis
[Figure: an unbuffered clock tree from CLK next to a buffered/balanced clock tree from CLK]
Skew
Power
Area (#buffers)
Slew rates
Minimize total insertion delay (latency)
Sources of skew
Not perfectly balanced clock tree
  Different levels of buffering
  Different cells
  Different load due to routing
  Different RC delays
Setting a skew constraint = 0 ps
  Makes no sense
  Insertion delay (latency) will increase
  Power consumption will increase
  Area will increase
Rule of thumb : skew values : 100 – 150 ps for 90 nm
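Skew can be computed directly from the clock arrival time (insertion delay) at each sink. The sketch below uses made-up sink names and latencies and checks them against the 150 ps end of the rule of thumb above.

```python
def clock_skew_ps(sink_latency_ps):
    """Global skew: largest minus smallest clock arrival time over all sinks."""
    latencies = list(sink_latency_ps.values())
    return max(latencies) - min(latencies)


# Hypothetical insertion delays (ps) at three flip-flop clock pins.
sinks = {"ff_a/CK": 820, "ff_b/CK": 905, "ff_c/CK": 760}
skew = clock_skew_ps(sinks)
print(skew, "ps", "OK" if skew <= 150 else "violates 150 ps target")
```

Pushing this number toward zero means adding buffers to the fast branches, which is where the extra latency, power and area come from.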
Extra sources of clock skew : variability
Unwanted skew variations
  Process variations in clock buffers (gate length, gate width, L effective, tox, …)
  Power supply noise
  Temperature variations
Part of the OCV (lecture 15)
[Figure: wire cross-section over a ground plane with dimensions T, S, W and H]
CTS in a design flow
[Figure: VLSI design steps (RTL, Logic Synthesis, Physical Synthesis (Placement), CTS, Routing) next to a simplified CTS design flow (logical clock tree with sequentials at (x,y) locations and clock gating, clock buffering, sizing clock buffers, routing clock nets)]
Prepare the netlist for CTS
Analyze the clock trees
Check the clocks
Remove unwanted buffering
  Unnecessary pre-existing clock buffers/inverters
  remove_clock_tree
CTS : Goals
Meeting the clock tree design rule constraints
  Maximum transition delay
  Maximum load capacitance
  Maximum fanout
  [Maximum buffer levels]
  Defaults
Constraints are upper bound goals. If constraints are not met, violations will be reported.
Meeting the clock tree targets
  Maximum skew (highest priority)
  Min/Max insertion delay (latency)
Effect of Clock Tree Synthesis on placement
Clock buffers added
Congestion may increase
Non-clock cells may have been moved to less ideal locations
Inserting clock trees can introduce new timing and max tran/cap violations
“Real” skew taken into account
Summary
Clock tree synthesis is one of the most important steps of IC design and can have a significant impact on timing, power, area, etc.
The clocking strategy has to be discussed with the frontend people before CTS is started
  Clocks identification
  Clock dependencies
  Clock balancing
Routing
Overview
Routing fundamentals / Advanced issues intro
The routing flow
Special topics for 90nm and below
Additional routing considerations
Summary
Physical Design Flow
Design/Constraints Import
Floorplanning
Placement
Clock Tree Synthesis
Routing
Post Route Optimization
Finishing
Routing Fundamentals
Goal is to realize the metal/copper connections between the pins of standard cells and macros.
Input: placed design, fixed number of metal/copper layers
Goal: routed design that is DRC clean and meets setup/hold timing
Consists of two phases
  1. Global route
  2. Detail route
[Figure: standard cell pin on a grid of horizontal and vertical routing tracks]
Routing Fundamentals : Advanced Issues
Timing driven routing
  Timing budget for each net
  Minimize critical paths
Signal integrity aware : 90nm and below !!!!
  Minimize crosstalk
DFM / DFY
  DRC clean
  Rule based versus Model based
General Flow for Routing
Placement and CTS
Route Clock Nets
Global Route Signal Nets
Detail Route Signal Nets
Design for Manufacturing (DFM)
Geert Vanwijnsberghe - Affiliation 85
Global Route
[Figure: grid of global routing cells with X and Y axes; vertical routing capacity = 9 tracks, horizontal routing capacity = 9 tracks]
Global Route
Input:
  Cell and macro placement
  Routing channel capacity per layer / per direction
Goal:
  Perform fast, coarse grid routing through global routing cells (GCells) while considering the following:
    Wire length
    Congestion
    Timing
    Noise / SI
Often used by placement engines to predict congestion in the form of a “trial route” or “virtual route”
Global Route
Assigns nets to specific metal layers and global routing cells (GCells)
Tries to avoid congested GCells while minimizing detours
  Congestion exists when more tracks are needed than available
  Detours increase wire length (delay)
Also avoids P/G (rings/straps/rails) and routing blockages
[Figure: virtual route and global route of a net on the X-Y GCell grid, with the global route detouring around a congested area]
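The congestion test "more tracks needed than available" can be sketched per GCell as demand minus capacity. The grid below is hypothetical and assumes one uniform capacity of 9 tracks per GCell; a real router tracks capacity per layer and per direction.

```python
def congested_gcells(demand, capacity):
    """Return {(row, col): overflow} for GCells that need more tracks than they have."""
    overflow = {}
    for r, row in enumerate(demand):
        for c, needed in enumerate(row):
            if needed > capacity:
                overflow[(r, c)] = needed - capacity
    return overflow


# Hypothetical track demand for a 3x3 grid of GCells.
demand = [
    [4, 7, 9],
    [6, 11, 10],
    [3, 8, 5],
]
hot = congested_gcells(demand, capacity=9)
print(hot)
```

A global router would try to move nets out of the overflowing cells into neighbors with spare tracks, at the cost of longer, detoured routes.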
Global Route
[Figure: layout before routing (preroute) and after global route]
Detail Route
Using the global route plan, within each global route cell:
  Assign nets to tracks
  Lay down wires
  Connect pins to corresponding nets
  Solve DRC violations
  Reduce cross couple cap
  Apply special routing rules
Detail Route: Track Assignment
For nets that traverse multiple GCells
  Assigns each net to a specific track and lays down the actual metal traces
  Makes long, straight traces
  Reduces the number of vias
[Figure: preroute view and track-assigned (TA) metal traces; a jog reduces via count]
Detail route : Solve DRC Violations
[Figure: detail route boxes in which the router solves shorts and notch spacing, min spacing and thin & fat spacing violations]
Detail Route: Analysis of Routing DRC Errors
Timing Driven Routing
At 90 nm net delay becomes significant
Quality of route can affect timing
Optimize critical paths
  Route some nets first
    Most routing freedom at start
    Use shortest paths possible
  Net weights
    Order of routing (priorities: e.g. default: clocks 50, others 2)
  Wire widening
    Reduce resistance
What is Signal Integrity or SI? (1)
Signal delay caused by crosstalk noise
  Possible in 2 directions : push-out, pull-down
[Figure: aggressor (net 1) switching next to victim (net 2), causing either a delay or a speed-up on the victim]
What is SI? (2)
Glitch caused by crosstalk noise
  Extra clock cycle!
  Functional failure
[Figure: aggressor coupling onto a victim net at a flip-flop (D, Q, Clk), producing a glitch relative to Vdd]
Crosstalk Prevention : Design Optimization
Noise depends on
  Coupling capacitance
  Total net capacitance
  Strength of the driver (Rd of the victim net)
Design optimization
  Increase drive strength, often easier (only local effect)
  Buffer long nets
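The dependence on coupling and total capacitance can be illustrated with a simple capacitive-divider bound on the glitch. This model is an assumption made for illustration: it ignores the victim driver entirely, so it overestimates the noise and says nothing about the benefit of a stronger driver. The capacitance values are made up.

```python
def glitch_upper_bound_v(vdd, coupling_cap_ff, ground_cap_ff):
    """Charge-sharing bound on the victim glitch when the aggressor swings by vdd.

    Assumes the victim holds no driver current, so the bound is pessimistic.
    """
    total = coupling_cap_ff + ground_cap_ff
    if total <= 0:
        raise ValueError("total capacitance must be positive")
    return vdd * coupling_cap_ff / total


# Hypothetical victim net: 30 fF to ground, 20 fF coupling to the aggressor.
before = glitch_upper_bound_v(1.0, 20.0, 30.0)
# Extra spacing that halves the coupling capacitance.
after = glitch_upper_bound_v(1.0, 10.0, 30.0)
print(round(before, 2), round(after, 2))
```

A stronger victim driver (lower Rd) pulls the net back faster and reduces the real glitch below this bound, which is why upsizing the driver is often the easier fix.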
Crosstalk Prevention : Routing
Routing solution
  Limit length of parallel nets (H&V)
  Wire spreading (skip track - clocks)
  Shield special nets
  Coupling free routing
Crosstalk Prevention : Reduce Cross Coupling Cap
Spacing: extra space around critical nets
Shielding: grounded shields around critical nets
  Same layer (H)
  Adjacent layers (V)
Net ordering
Effect of Floorplanning on Routing Congestion
For hierarchical designs, good pin placement is essential to preventing routing congestion.
Can use pin guides during partitioning
Routing around blockages and over macros
By default the routing tool will:
  Route over macros
  Not route where there is a routing blockage
  Not route through a narrow channel in the non-preferred routing direction
[Figure: macros covered by M1-M3 and M1-M4 routing blockages, with a narrow channel between them. M4 has a horizontal routing channel but its preferred routing direction is vertical, so the preferred routing direction needs to be changed.]
Clock Tree Routing
For SI prevention we generally want to route our clocks with extra spacing.
Global H-trees are often routed manually before placement.
  H-tree nets may be routed with wide metal and shielding.
[Figure: wide-metal H-tree net with grounded shields]
Post Route Clock Tree Optimization (CTO)
Improve the skew on clock nets
[Figure: flow chart. Detail routed design -> Skew OK? If no: postroute CTO, then ECO route, then check skew again. If yes: done.]
[Figure: clock paths before and after CTO; the short path gets increased delay]
Options for CPU effort
# processors
  Routing in parallel on # processors
  Superthreading, multithreading
  Some routers are better at threading than others
# iterations for detail route
  # of iteration steps done to get a DRC free design
Summary
Starting from 90 nm technologies
  Timing Driven Route
    Net delay is becoming more of a factor
  SI Aware Route
    Small geometries make SI timing closure much more difficult
  DFM / DFY
    Now a crucial part of the routing flow
  DRC
    Number and complexity of DRC rules has increased dramatically