Mohit Gupta, Kwangok Jeong and Andrew B. Kahng UCSD VLSI CAD Laboratory [email protected]

UCSD VLSI CAD Laboratory - ICCAD, Nov. 3, 2009 Timing Yield-Aware Color Reassignment and Detailed Placement Perturbation for Double Patterning Lithography Mohit Gupta, Kwangok Jeong and Andrew B. Kahng UCSD VLSI CAD Laboratory [email protected] ECE Department University of California, San Diego


Page 1: Mohit Gupta,  Kwangok Jeong  and Andrew B. Kahng UCSD VLSI CAD Laboratory kjeong@vlsicad.ucsd.edu

UCSD VLSI CAD Laboratory - ICCAD, Nov. 3, 2009

Timing Yield-Aware Color Reassignment and Detailed Placement Perturbation for Double Patterning Lithography

Mohit Gupta, Kwangok Jeong and Andrew B. Kahng
UCSD VLSI CAD Laboratory
kjeong@vlsicad.ucsd.edu

ECE Department
University of California, San Diego


Outline
• Bimodal CD Distribution in DPL
  • Impact on design timing
• Mitigating Impact of Bimodal CD Distribution
  • Bimodal-Aware Timing Library
  • Optimization 1: Color Reassignment (Max Alternation)
  • Optimization 2: Placement Perturbation (DPL-Correctness)
• Experimental Framework and Results
  • Impact of Color Reassignment
  • Impact of Placement Perturbation
• Conclusion


Bimodal CD distribution in DPL

• Two patterning steps → Two different CDs
• Two different colorings → Two different timings

[Figure: a C12-type cell and a C21-type cell; legend: lines from 1st patterning, lines from 2nd patterning, gates from CD group1, gates from CD group2]

C12: ODD polys in BLUE, EVEN polys in GREEN
C21: ODD polys in GREEN, EVEN polys in BLUE

Jeong et al. ASPDAC’09


Impact of Bimodality on Guardband
• Comparison of design guardband (Min-Max delay)
• FACT 1: Unimodal representation is too pessimistic!

[Figure: design guardband comparison; labels: CD mean difference, large CD group, small CD group]

Jeong et al. ASPDAC’09


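Impact of Bimodality on Path Delay
• By definition, $\sigma^2(x+y) = \sigma^2(x) + \sigma^2(y) + 2\,\mathrm{cov}(x,y)$
• Delay variation of a timing path, with gates $g_i$ from CD group1 and gates $q_i$ from CD group2:

$$
\begin{aligned}
\sigma^2(d(\mathrm{path})) &= \sigma^2\Big(\sum_i d(g_i) + \sum_i d(q_i)\Big) \\
&= \sum_i \sigma^2(d(g_i)) + \sum_i \sigma^2(d(q_i)) \\
&\quad + 2\sum_{i<j} \mathrm{cov}(d(g_i), d(g_j)) + 2\sum_{i<j} \mathrm{cov}(d(q_i), d(q_j)) + 2\sum_{i,j} \mathrm{cov}(d(g_i), d(q_j))
\end{aligned}
$$

• Since $\mathrm{cov}(d(g_i), d(q_j)) < \mathrm{cov}(d(g_i), d(g_j))$ or $\mathrm{cov}(d(q_i), d(q_j))$, variation of bimodal distribution is smaller than unimodal distribution
• Simulation results validated
• FACT 2: Alternate (mixed) coloring has smaller delay variation!

[Figure: Sigma / Mean (%) of path delay]

Jeong et al. ASPDAC’09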
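The variance identity on this slide can be checked numerically. The following is a minimal sketch, assuming every gate delay has the same sigma and that two hypothetical correlation values (`rho_same` within one CD group, `rho_diff` across groups) describe the covariances; none of these numbers come from the slides.

```python
def path_delay_variance(groups, sigma=1.0, rho_same=0.8, rho_diff=0.1):
    """Variance of the sum of gate delays along one timing path.

    groups: CD-group label (1 or 2) of each gate on the path.
    sigma, rho_same, rho_diff: illustrative assumptions, not measured data.
    """
    n = len(groups)
    var = 0.0
    for a in range(n):
        for b in range(n):
            if a == b:
                var += sigma ** 2              # sigma^2(d(gate)) term
            elif groups[a] == groups[b]:
                var += rho_same * sigma ** 2   # covariance inside one CD group
            else:
                var += rho_diff * sigma ** 2   # covariance across CD groups
    return var

# Ten gates, all from group1, versus ten gates alternating between groups.
uniform = path_delay_variance([1] * 10)
alternate = path_delay_variance([1, 2] * 5)
print(uniform, alternate)
```

Summing over ordered pairs (a, b) with a ≠ b counts each covariance twice, which matches the factor of 2 in the identity. With any `rho_diff` below `rho_same`, the alternating path has the smaller variance.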


Impact of Bimodality on Clock Skew
• Different coloring sequences in a clock network → Clock skew
• FACT 3: Same color on all clock buffers is better!

Case  Launch               Capture
1     C12+C12+C12+…+C12    C12+C12+C12+…+C12
2     C12+C12+C12+…+C12    C21+C21+C21+…+C21

[Figure: clock skew (s) for Case1 and Case2, with launch and capture clock paths]

Jeong et al. ASPDAC’09


Bimodal CD Distribution: 3 Key Facts

1. Design requires bimodal-aware timing models

• Unimodal representation is too pessimistic

2. Data paths benefit from alternate (mixed) coloring

• Exploit existence of two uncorrelated CD populations

• Minimize correlated variations in a given path

3. Clock paths benefit from uniform coloring

• Correlated variation between launch and capture paths minimizes bimodality-induced clock skew


DPL Layout-to-Mask Flow

[Flow diagram: RTL-to-GDS → DPL Mask Coloring → Bimodal-Aware Timing Analysis → Optimization 1: ILP to Maximize Alternate Coloring (Datapaths) → Optimization 2: Placement Perturbation for Color Conflict Removal (Clock and Datapaths)]



Bimodality-Aware Timing Model and Analysis
• Timing model
  • Two timing libraries:
    • G1L-G2S: group1 has larger CD than group2
    • G1S-G2L: group1 has smaller CD than group2
  • Two coloring versions of a cell in each library
    • C12: leftmost poly is in group1
    • C21: leftmost poly is in group2
  • CD mean difference
    • Chosen from process information
    • E.g., 2nm, 4nm and 6nm
• Timing analysis
  • For each CD mean difference, check timing slack using each of timing libraries G1L-G2S and G1S-G2L
  • Worse timing between G1L-G2S and G1S-G2L libraries is regarded as the actual worst-case timing

[Figure: the two cell colorings, G1G2 and G2G1]
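The analysis rule above (run both libraries, keep the worse result) can be sketched as follows. The slack numbers and the `worst_case_slack` helper are hypothetical placeholders; in a real flow the slacks would come from a timing tool run once per library.

```python
def worst_case_slack(slack_by_library):
    """Return (library, slack) for the library with the smaller, i.e. worse, slack."""
    library = min(slack_by_library, key=slack_by_library.get)
    return library, slack_by_library[library]

# Hypothetical slack values in ns, one entry per CD mean difference.
analysis = {
    "2nm": {"G1L-G2S": -0.10, "G1S-G2L": -0.19},
    "4nm": {"G1L-G2S": -0.35, "G1S-G2L": -0.21},
    "6nm": {"G1L-G2S": -0.40, "G1S-G2L": -0.52},
}

report = {diff: worst_case_slack(slacks) for diff, slacks in analysis.items()}
for diff, (library, slack) in report.items():
    print(f"{diff}: worst case from {library}, slack {slack} ns")
```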


Optimization 1: Maximum Alternate Coloring
• Maximize alternate (mixed) coloring → Minimize delay variation
• How to quantify alternation of coloring sequence? → New metric: Coloring Sequence Cost (CSC)
  • Represents delay variation due to the coloring


Delay and Coloring
• Rise delay depends on PMOS tr. → ~10% variation
• Fall delay depends on both NMOS trs. → ~1% variation

A1 CD (nm)  A2 CD (nm)  Fall: A1 (ps)  Rise: A1 (ps)  Fall: A2 (ps)  Rise: A2 (ps)
51          51          49.79          97.30          54.65          113.1
49          49          48.23          88.25          51.48          102.1
50          50          48.30          92.90          53.08          107.6
51          49          49.05          97.26          52.32          102.2
49          51          48.89          88.28          53.79          113.0

[Figure: NAND2 schematic and layout with transistors MP1, MN1, MP2, MN2, inputs A1 and A2, output ZN, rails VDD and VSS; layout variants labeled G1L-G2S and G1S-G2L]


Coloring Sequence Cost (CSC) for NAND2
• Two observations
  • Activated transistors determine the delay
  • The impact on delay is averaged when more than one transistor is activated
• Assign CSC for single transistor
  • Group1: −1 (CSC_MP1 = CSC_MN1 = −1)
  • Group2: +1 (CSC_MP2 = CSC_MN2 = +1)
• CSC for NAND2 gate
  • A1→ZN rise (by MP1): −1
  • A2→ZN rise (by MP2): +1
  • A2→ZN fall (by MN1 and MN2): (1 + (−1)) / 2 = 0
  • A1→ZN fall (by MN1 and MN2): (−1 + 1) / 2 = 0

[Figure: NAND2 schematic and layout with transistors MP1, MN1, MP2, MN2, inputs A1 and A2, output ZN, rails VDD and VSS]


CSC Calculation for Cells - Examples

AND2 gate
• A1→Z fall: {−1} + (−1) = −2
• A1→Z rise: {(−1 + 1) / 2} + (−1) = −1
• A2→Z fall: {1} + (−1) = 0
• A2→Z rise: {(−1 + 1) / 2} + (−1) = −1

[Figure: AND2 schematic and layout with transistors MP1, MN1, MP2, MN2, MP3, MN3, inputs A1 and A2, output Z, rails VDD and VSS]

BUFFER gate
• A→Z fall: −1 + 1 = 0
• A→Z rise: −1 + 1 = 0

[Figure: BUFFER schematic and layout with transistors MP1, MN1, MP2, MN2, input A, output Z, rails VDD and VSS]

Topology         CSC calculation
Parallel         CSC of activated tr.
Series           Average of all series tr.
Fingered         Average of all fingered tr.
Multiple stages  Sum of CSC of each stage
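The topology rules can be expressed in a few lines. This is a sketch under stated assumptions: each stage is described only by the CD groups of the transistors an arc activates, and the inverter stage of AND2 is taken to be in group1 because that is what the −1 second term in the slide's arithmetic implies. The helper names `stage_csc` and `arc_csc` are illustrative.

```python
from statistics import mean

GROUP_CSC = {1: -1, 2: +1}  # CSC of a single transistor, by CD group

def stage_csc(activated_groups):
    """CSC of one stage: average over the transistors the arc activates.

    One activated (parallel) transistor contributes its own CSC;
    series or fingered transistors are averaged.
    """
    return mean(GROUP_CSC[g] for g in activated_groups)

def arc_csc(stages):
    """CSC of a multi-stage timing arc: sum of the per-stage CSC values."""
    return sum(stage_csc(stage) for stage in stages)

# AND2 = NAND2 stage followed by an inverter stage.
# Assumed grouping: MP1/MN1 in group1, MP2/MN2 in group2, MP3/MN3 in group1.
and2 = {
    "A1->Z fall": [[1], [1]],     # MP1 raises ZN, then MN3 pulls Z down
    "A1->Z rise": [[1, 2], [1]],  # MN1 and MN2 in series, then MP3
    "A2->Z fall": [[2], [1]],     # MP2 raises ZN, then MN3 pulls Z down
    "A2->Z rise": [[1, 2], [1]],
}
and2_csc = {arc: arc_csc(stages) for arc, stages in and2.items()}

# BUFFER: first inverter in group1, second inverter in group2.
buffer_csc = arc_csc([[1], [2]])
print(and2_csc, buffer_csc)
```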


Coloring Sequence Cost for Path (CSCP)

• CSCP = Sum of CSC values of stages in path, weighted by stage delay ($D_l$)
  • $CSCP_i = \sum_{l \in i} CSC_l \cdot D_l$, where $l$ is a timing arc in path $i$
• Correlation between CSCP and delay variation
  • 1,300 different colorings of a timing path
  • CSCP metric is strongly correlated with delay variation of timing paths
  • Correlation coefficient: 0.902
• CSCP reduction → Delay variation reduction
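As a small worked example of the delay-weighted sum, the sketch below compares a uniformly colored path with an alternating one. The CSC values and stage delays (in ps) are made up for illustration.

```python
def cscp(arcs):
    """CSCP of one path: sum of CSC * stage delay over its timing arcs.

    arcs: list of (csc, stage_delay) pairs.
    """
    return sum(csc * delay for csc, delay in arcs)

# Hypothetical four-stage path; delays in ps.
same_color = [(-1, 50.0), (-1, 60.0), (-1, 55.0), (-1, 45.0)]
alternating = [(-1, 50.0), (+1, 60.0), (-1, 55.0), (+1, 45.0)]

cscp_same = cscp(same_color)
cscp_alt = cscp(alternating)
print(cscp_same, cscp_alt)
```

The alternating sequence cancels to zero here only because the chosen delays balance; in general the magnitude shrinks without vanishing.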


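Maximization of Alternate Coloring
• Optimal timing path coloring problem:
  • Given a set of timing-critical paths: $P$
  • Color each cell in union of timing paths to minimize $M = \max_{i \in P} |CSCP_i|$
• ILP to minimize maximum CSCP
  • Objective: Minimize $M$
  • Subject to:

$$
\begin{aligned}
& M \ge CSCP_i, \qquad M \ge -CSCP_i \\
& CSCP_i = \sum_{l \in i,\; l \in C_j} \big( CSC^{12}_{l}(j)\, x_j + CSC^{21}_{l}(j)\, y_j \big) \\
& x_j + y_j = 1, \qquad x_j, y_j \in \{0, 1\}
\end{aligned}
$$

  where $x_j = 1$ selects coloring C12 and $y_j = 1$ selects coloring C21 for cell $j$, $C_j$ is the set of timing arcs of cell $j$, and each arc's CSC is weighted by its stage delay as in the CSCP definition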
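To make the min-max objective concrete, here is a sketch that replaces the ILP solver with exhaustive search over C12/C21 choices, which is only practical for a handful of cells. The cell names, CSC values and delays are hypothetical.

```python
from itertools import product

def best_coloring(cells, paths):
    """Pick C12 or C21 per cell to minimize the maximum |CSCP| over all paths.

    cells: list of cell names.
    paths: list of paths; each path is a list of
           (cell, csc_if_C12, csc_if_C21, stage_delay) timing arcs.
    Exhaustive search stands in for the ILP; it is exponential in len(cells).
    """
    best = None
    for colors in product(("C12", "C21"), repeat=len(cells)):
        choice = dict(zip(cells, colors))
        worst = max(
            abs(sum((c12 if choice[cell] == "C12" else c21) * delay
                    for cell, c12, c21, delay in path))
            for path in paths
        )
        if best is None or worst < best[0]:
            best = (worst, choice)
    return best

# Two hypothetical paths sharing cell u2; every arc has delay 10.
cells = ["u1", "u2", "u3"]
paths = [
    [("u1", -1, +1, 10.0), ("u2", -1, +1, 10.0)],
    [("u2", -1, +1, 10.0), ("u3", -1, +1, 10.0)],
]
max_cscp, coloring = best_coloring(cells, paths)
print(max_cscp, coloring)
```

In this toy instance the optimum alternates colors along both paths, so neighbors on a path end up with different colorings.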


Impact of Alternate Coloring Optimization
• Alternate coloring improves timing slack and reduces timing variation: JPEG 70% utilization case
• TNS improves by 11% ~ 27%

[Figure: TNS (ns) for initial coloring versus alternate coloring]


Optimization 2: Placement Perturbation
• DPL feasibility: distance between same-color polys must be larger than minimum resolution
  • 2·d_pb > Res_min
  • d_pb: distance from poly to cell boundary
  • Res_min: minimum resolution
• Coloring assignment from Optimization 1 can introduce additional coloring conflicts into an existing layout
• Placement perturbation for DPL-Correctness

[Figure: (a) Original placement, with logical connection; (b) Alternate coloring, with a coloring conflict; (c) Conflict removal, with spacing > Res_min]


DP Using Cost of Coloring Conflicts
• HCost: Horizontal placement cost under constraints
  • Cost of placing a cell “a” to a placement site “b”
  • Considers the spacing between poly lines in different cells
  • $\mathrm{spacing} = x_a + b + LPS_a - (x_{a-1} + i + w_{a-1} - RPS_{a-1})$ (b: displacement of cell a to site b)
• HCost is defined as:
  • If ((spacing < R_min) && (LPC_a == RPC_{a−1})): HCost(a, b, a − 1, i) = ∞
  • Otherwise: HCost(a, b, a − 1, i) = 0

[Figure: adjacent cells a−1 and a with positions x_{a−1}, x_a and widths w_{a−1}, w_a; RPS_{a−1} and LPS_a locate the rightmost poly of cell a−1 and the leftmost poly of cell a; example poly colors LPC_{a−1} = 1, RPC_{a−1} = 0, LPC_a = 0, RPC_a = 1]
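The spacing test behind HCost can be sketched as below. The `Cell` record and its field names are hypothetical, and the sketch assumes that a same-color pair closer than the minimum resolution is given infinite cost so that it can never be selected.

```python
import math
from dataclasses import dataclass

@dataclass
class Cell:
    x: float    # original x-coordinate of the left cell boundary
    w: float    # cell width
    lps: float  # distance from left boundary to leftmost poly (LPS)
    rps: float  # distance from rightmost poly to right boundary (RPS)
    lpc: int    # color of the leftmost poly (LPC)
    rpc: int    # color of the rightmost poly (RPC)

def hcost(left, i, right, b, r_min):
    """HCost for cell `right` displaced by b next to cell `left` displaced by i."""
    spacing = right.x + b + right.lps - (left.x + i + left.w - left.rps)
    if spacing < r_min and right.lpc == left.rpc:
        return math.inf  # assumed: same-color polys too close is forbidden
    return 0.0

# Two abutting cells whose facing polys share color 0 (illustrative numbers).
cell_prev = Cell(x=0.0, w=10.0, lps=1.0, rps=1.0, lpc=1, rpc=0)
cell_cur = Cell(x=10.0, w=10.0, lps=1.0, rps=1.0, lpc=0, rpc=1)

conflict = hcost(cell_prev, 0, cell_cur, 0, r_min=3.0)  # spacing is 2
resolved = hcost(cell_prev, 0, cell_cur, 1, r_min=3.0)  # spacing is 3
print(conflict, resolved)
```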


Two Dynamic Programming Approaches
• DP Algorithm 1: SHIFT
  • Minimize total displacement cost, considering HCost

$$
\begin{aligned}
\mathrm{Cost}(1, b) &= s_1 \cdot |b| \\
\mathrm{Cost}(a, b) &= s_a \cdot |b| + \min_{x_{a-1} - SRCH \,\le\, i \,\le\, x_{a-1} + SRCH} \big\{ \mathrm{Cost}(a-1, i) + \mathrm{HCost}(a, b, a-1, i) \big\} \\
s_a &= e^{-\mathrm{Slack}_a} \quad \text{(timing criticality weight for displacement)}
\end{aligned}
$$

• DP Algorithm 2: SHIFT+RECOLOR
  • Necessary when high utilization blocks Algorithm 1
  • Performs simultaneous recoloring of non-timing critical cells
  • Cost is defined for each color of cell instances, e.g., C12 and C21
• Other DP variants: MAX, FLIP
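A compact sketch of the SHIFT recurrence for one placement row follows. It treats b and i as integer displacements within ±SRCH, takes the per-cell weights as given inputs, and adds a no-overlap check that is an assumption of this sketch rather than something stated on the slide. All data values are hypothetical.

```python
import math

def shift_dp(cells, r_min, srch, weights):
    """Minimum weighted displacement that removes coloring conflicts in a row.

    cells: list of dicts with keys x, w, lps, rps, lpc, rpc, ordered left to right.
    weights: per-cell displacement weight (timing criticality), assumed given.
    Returns (total_cost, displacement_per_cell); total_cost is inf if infeasible.
    """
    shifts = range(-srch, srch + 1)

    def hcost(a, b, i):
        left, right = cells[a - 1], cells[a]
        left_edge = right["x"] + b
        prev_right_edge = left["x"] + i + left["w"]
        if left_edge < prev_right_edge:  # assumption: cells may not overlap
            return math.inf
        spacing = left_edge + right["lps"] - (prev_right_edge - left["rps"])
        if spacing < r_min and right["lpc"] == left["rpc"]:
            return math.inf              # same-color polys too close
        return 0.0

    cost = [{b: weights[0] * abs(b) for b in shifts}]
    back = [{}]
    for a in range(1, len(cells)):
        cost.append({})
        back.append({})
        for b in shifts:
            best_i = min(shifts, key=lambda i: cost[a - 1][i] + hcost(a, b, i))
            cost[a][b] = (weights[a] * abs(b)
                          + cost[a - 1][best_i] + hcost(a, b, best_i))
            back[a][b] = best_i

    b = min(shifts, key=lambda s: cost[-1][s])
    total = cost[-1][b]
    moves = [b]
    for a in range(len(cells) - 1, 0, -1):  # trace the chosen displacements back
        b = back[a][b]
        moves.append(b)
    return total, moves[::-1]

row = [
    {"x": 0, "w": 10, "lps": 1, "rps": 1, "lpc": 1, "rpc": 0},
    {"x": 10, "w": 10, "lps": 1, "rps": 1, "lpc": 0, "rpc": 1},
]
total_equal, moves_equal = shift_dp(row, r_min=3, srch=2, weights=[1.0, 1.0])
total_crit, moves_crit = shift_dp(row, r_min=3, srch=2, weights=[5.0, 1.0])
print(total_equal, moves_equal, total_crit, moves_crit)
```

Raising the weight of the first cell, as a timing-critical cell would have, pushes the whole displacement onto its neighbor.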



Experiment Framework

[Flow diagram]
• Placed and routed design (SOC Encounter): orig.def
• Initial Coloring: initial_colored.def
• Timing Analysis (PrimeTime - SI): slack.list, keep_color.list
• Optimization 1: ILP Instance → Optimal Coloring (Alternate Coloring maximization): opt_colored.def
• Optimization 2: Conflicts Removal (SHIFT, SHIFT+RECOLOR): opt.def


Optimization 1: Max Alternate Coloring
• Testcases with 45nm Nangate Open Cell Library

[Figure: two charts comparing initial (Init.) and optimized (Opt.) coloring at 2nm, 4nm and 6nm CD mean difference; annotations: 59% reduction and 85% reduction]


Optimization 2: Placement Perturbation

#CC (#coloring conflicts), SDT (sum of displacement of timing-critical cells), SDNT (sum of displacement of nontiming-critical cells), #RC (# recolored cells)

• All SHIFT runtimes for JPEG are 204-354 seconds
• All SHIFT+RECOLOR runtimes are 578-678 seconds


Overall Timing Improvement

• Bimodal timing model → Reduce pessimism
• Alternate coloring → Improve timing
• Placement perturbation → Remove conflicts

                                                        Mean CD Difference
Stage                        #Conflict  Timing Metric   2nm      4nm       6nm
Initial Coloring (Unimodal)  0          WNS (ns)        -1.113   -2.016    -2.902
                                        TNS (ns)        -671.1   -1776.3   -3348.5
Initial Coloring (Bimodal)   0          WNS (ns)        -0.191   -0.354    -0.527
                                        TNS (ns)        -8.17    -26.56    -64.64
Alternate Coloring           219        WNS (ns)        -0.090   -0.145    -0.267
                                        TNS (ns)        -1.48    -3.85     -22.40
DPL-Corr (+ECO Routing)      0          WNS (ns)        -0.104   -0.183    -0.295
                                        TNS (ns)        -3.43    -10.45    -28.42

The impact of bimodality can be effectively mitigated!
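The net gain of the full flow can be read off the table by subtracting the bimodal initial-coloring rows from the DPL-Corr rows. A short sketch of that arithmetic, with the dictionary layout chosen here only for illustration:

```python
# WNS/TNS in ns at 2nm, 4nm and 6nm mean CD difference, copied from the table.
table = {
    "initial_bimodal": {"WNS": (-0.191, -0.354, -0.527),
                        "TNS": (-8.17, -26.56, -64.64)},
    "dpl_corr":        {"WNS": (-0.104, -0.183, -0.295),
                        "TNS": (-3.43, -10.45, -28.42)},
}

def improvement(metric):
    """Slack gained (ns) from bimodal initial coloring to the DPL-correct result."""
    before = table["initial_bimodal"][metric]
    after = table["dpl_corr"][metric]
    return [round(a - b, 3) for b, a in zip(before, after)]

wns_gain = improvement("WNS")
tns_gain = improvement("TNS")
print(wns_gain, tns_gain)
```

The largest gains occur at the 6nm mean CD difference: 0.232 ns (232 ps) in WNS and 36.22 ns in TNS.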


Conclusion
• Contributions
  • New CSC metric to represent the timing variation in double patterning
  • ILP-based color reassignment to improve timing slack and variation
  • DP-based placement perturbation to remove coloring conflicts after color reassignment
• Results (45nm Nangate Open Library)
  • Up to 232ps WNS reduction and 36.22ns TNS reduction
  • WNS variation reduction from 380ps to 84ps
  • TNS variation reduction from 64ns to 22ns
• Ongoing work
  • More accurate metrics for timing path color balancing to enhance timing quality
  • Golden DPL timing and placement optimizer based on simultaneous timing-aware coloring and conflict removal


THANK YOU


BACKUP


Property(2): Clock Skew and Timing Slack
• Timing slack calculation
  • Timing slack: $T_{slack} = T_{clock\_path} + T_{cycle} - T_{data\_path}$
  • Timing slack variation: $\sigma^2_{T_{slack}} = \sigma^2_{T_{clock\_path}} + \sigma^2_{T_{data\_path}} - 2\,\mathrm{cov}(T_{clock\_path}, T_{data\_path})$
  • Large correlation is better for timing slack
• Clock skew
  • Especially, clock skew from uncorrelated launching and capturing clock paths is the major source of timing slack variation.
• Example
  • (a) Worst slack in DPL: small delay variation but large negative slack
    • Data (10 ± 2 = 8~12ns), Clock (10 ± 2 = 8~12ns)
    • Worst slack = min(clock) − max(data) = 8 − 12 = −4ns
  • (b) Worst slack in single exp.: large delay variation but zero slack
    • BC: Data (10 − 5 = 5ns), Clock (10 − 5 = 5ns), Worst slack = 5 − 5 = 0ns
    • WC: Data (10 + 5 = 15ns), Clock (10 + 5 = 15ns), Worst slack = 15 − 15 = 0ns
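The two cases in the example differ only in whether clock and data delays move together. The sketch below reproduces the arithmetic, assuming a nominal zero-slack design and treating "correlated" as fully correlated corners; the function name is illustrative.

```python
def worst_slack(clock_range, data_range, correlated):
    """Worst slack (ns) as clock arrival minus data arrival.

    clock_range, data_range: (min, max) delays in ns.
    correlated: True if clock and data shift to the same corner together.
    """
    clock_min, clock_max = clock_range
    data_min, data_max = data_range
    if correlated:
        # Both paths sit at the same corner: compare BC with BC and WC with WC.
        return min(clock_min - data_min, clock_max - data_max)
    # Independent variation: earliest clock can meet latest data.
    return clock_min - data_max

dpl = worst_slack((8, 12), (8, 12), correlated=False)         # case (a)
single_exp = worst_slack((5, 15), (5, 15), correlated=True)   # case (b)
print(dpl, single_exp)
```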


Simulation Setup: Skew and Slack
• Testcase
  • AES from Opencores, Nangate 45nm library, PTM 45nm
  • Extracted critical path
    • Clock launch: 14 stages
    • Clock capture: 14 stages
    • Data path: 30 stages
• Exhaustive tests (4 x 2^54) not feasible, so we fix the data path coloring.

Case  Launch      Capture
1     M12+M12…    M12+M12…
2     M21+M21…    M21+M21…
3     M12+M12…    M21+M21…
4     M21+M21…    M12+M12…
5     M12+M21…    M12+M21…

CD mean difference  Model      M1 Mean  M1 3σ  M2 Mean  M2 3σ
-                   Uni-modal  50.00    2.00   -        -
0nm                 Pooled     50.00    2.00   -        -
                    Bimodal    50.00    2.00   50.00    2.00
1nm                 Pooled     50.00    2.50   -        -
                    Bimodal    49.50    2.00   50.50    2.00
2nm                 Pooled     50.00    3.61   -        -
                    Bimodal    49.00    2.00   51.00    2.00
3nm                 Pooled     50.00    4.92   -        -
                    Bimodal    48.50    2.00   51.50    2.00
4nm                 Pooled     50.00    6.32   -        -
                    Bimodal    48.00    2.00   52.00    2.00
5nm                 Pooled     50.00    7.76   -        -
                    Bimodal    47.50    2.00   52.50    2.00
6nm                 Pooled     50.00    9.22   -        -
                    Bimodal    47.00    2.00   53.00    2.00
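The pooled 3σ column is consistent with treating the two CD populations as an equal-weight mixture, whose variance is the per-group variance plus the square of half the mean difference. A sketch that reproduces the column under that assumption:

```python
import math

def pooled_three_sigma(three_sigma, mean_diff):
    """3-sigma of an equal-weight mixture of two CD populations.

    three_sigma: 3-sigma of each population (same for both).
    mean_diff: difference between the two population means.
    Assumes a 50/50 mixture; that weighting is an assumption of this sketch.
    """
    sigma = three_sigma / 3.0
    pooled_sigma = math.sqrt(sigma ** 2 + (mean_diff / 2.0) ** 2)
    return 3.0 * pooled_sigma

# Mean differences 0nm..6nm with a per-group 3-sigma of 2.00.
pooled = {d: round(pooled_three_sigma(2.0, d), 2) for d in range(7)}
print(pooled)
```

The rapid growth of the pooled value with mean difference is what makes the pooled unimodal model pessimistic.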


Experiments on Clock Skew and Timing Slack
• Clock skew
  • Even for the zero mean difference case, clock skew exists and increases with mean difference
  • Pooled unimodal cannot distinguish this clock skew
• Timing slack
  • Originally zero slack turns out to be significant negative slack
  • Pooled unimodal shows very pessimistic slack

[Figure: Timing slack (s) for MAX-MAX combination versus mean difference (0nm to 6nm); series: Unimodal (Pooled), Bimodal (case1), Bimodal (case2), Bimodal (case3), Bimodal (case4), Bimodal (case5)]

[Figure: Clock skew (s) versus mean difference (0nm to 6nm); series: Launch (G12+G12...), Capture (G12+G12...); Launch (G21+G21...), Capture (G21+G21...); Launch (G12+G12...), Capture (G21+G21...); Launch (G21+G21...), Capture (G12+G12...); Launch (G12+G21...), Capture (G12+G21...); annotations: 22ps and 53ps]