
GAN-OPC: Mask Optimization with Lithography-guided Generative Adversarial Nets

Haoyu Yang, Shuhe Li, Yuzhe Ma, Bei Yu and Evangeline F. Y. Young

Department of Computer Science and Engineering, The Chinese University of Hong Kong


Background

Lithography Proximity Effects

[Figure: design target, mask, and wafer image, without OPC vs. with OPC]

- What you see ≠ what you get.

- Diffraction information loss.

- RET: Optical Proximity Correction (OPC), Scatter Bar, and Multiple Patterning Lithography.

Classic OPC

Requires iterative calls to a lithography simulation engine.

- Model/Rule-based OPC [Kuang+,DATE'15][Awad+,DAC'16][Su+,ICCAD'16]

1. Fragmentation of shape edges;
2. Move fragments for better printability.

- Inverse Lithography [Gao+,DAC'14][Poonawala+,TIP'07][Ma+,ICCAD'17]

1. Efficient model that maps a mask to its aerial image;
2. Continuously update the mask by descending the gradient of the contour error.

Machine Learning OPC

Masks are updated segment by segment, so optimization cannot be done in one inference step. [Matsunawa+,JM3'16][Choi+,SPIE'16][Xu+,ISPD'16][Shim+,APCCAS'16]

1. Edge fragmentation;

2. Feature extraction;

3. Model training.

Preliminaries

Lithography Model

- SVD Approximation of Partial Coherent System [Cobb,1998]

I = \sum_{k=1}^{N^2} w_k |M \otimes h_k|^2. (1)

- Reduced Model [Gao+,DAC'14]

I = \sum_{k=1}^{N_h} w_k |M \otimes h_k|^2. (2)

- Etch Model

Z(x, y) = \begin{cases} 1, & \text{if } I(x, y) \ge I_{th}, \\ 0, & \text{if } I(x, y) < I_{th}. \end{cases} (3)
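The reduced model (2) and the etch model (3) can be sketched in a few lines of numpy; the 3×3 kernel and unit weight below are toy placeholders, not calibrated lithography kernels:

```python
import numpy as np

def aerial_image(mask, kernels, weights):
    """Reduced lithography model (Eq. 2): I = sum_k w_k |M (x) h_k|^2,
    with (x) computed as circular convolution via the FFT."""
    M = np.fft.fft2(mask)
    I = np.zeros_like(mask, dtype=float)
    for h, w in zip(kernels, weights):
        H = np.fft.fft2(h, s=mask.shape)      # zero-pad kernel to mask size
        conv = np.fft.ifft2(M * H)            # M (x) h_k (complex in general)
        I += w * np.abs(conv) ** 2            # w_k |.|^2
    return I

def etch(I, I_th=0.5):
    """Constant-threshold etch model (Eq. 3)."""
    return (I >= I_th).astype(float)

# toy usage: an 8x8 random mask and a single blur-like kernel
rng = np.random.default_rng(0)
mask = (rng.random((8, 8)) > 0.5).astype(float)
h = np.array([[0.05, 0.1, 0.05], [0.1, 0.4, 0.1], [0.05, 0.1, 0.05]])
I = aerial_image(mask, [h], [1.0])
Z = etch(I, I_th=0.2)
```

In the real model the h_k are complex optical kernels obtained from the SVD of the partial coherent system; `np.abs` handles that case unchanged.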

Lithography Defects

[Figure: lithography defects — bad vs. good EPE, bridge, and neck]

- EPE measures horizontal or vertical distances from given points (i.e. OPC control points) on target edges to lithography contours.

- Neck detector checks the error of critical dimensions of lithography contours compared to target patterns.

- Bridge detector aims to find unexpected shorts between wires.
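As an illustration of the EPE measurement, here is a one-dimensional sketch: given one row of the binary wafer image and a target edge position, it returns the distance to the nearest printed edge. The horizontal-only search and unit pixel pitch are simplifying assumptions, not the contest's exact measurement rule:

```python
import numpy as np

def epe_at_point(target_edge_x, wafer_row, pitch=1.0):
    """EPE along one horizontal cut: distance from a target edge position
    to the nearest printed-contour edge (0/1 transition) in the row."""
    trans = np.flatnonzero(np.diff(wafer_row) != 0) + 1   # contour crossings
    if trans.size == 0:
        return np.inf                                     # nothing printed
    return pitch * np.min(np.abs(trans - target_edge_x))

row = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0])  # printed line spans x = 3..6
e_good = epe_at_point(3, row)   # edge prints exactly at the target
e_bad = epe_at_point(5, row)    # control point 2 pixels from nearest edge
```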

Inverse Lithography

The main objective of ILT is to minimize the lithography error through gradient descent,

E = ||Z_t - Z||_2^2, (4)

where Z_t is the target and Z is the wafer image of a given mask. Translated sigmoid functions are applied to push the pixel values close to either 0 or 1,

Z = \frac{1}{1 + \exp[-\alpha \times (I - I_{th})]}, (5)

M_b = \frac{1}{1 + \exp(-\beta \times M)}. (6)

Combining Equations (1)–(6) and the analysis in [Poonawala+,TIP'07],

\frac{\partial E}{\partial M} = 2\alpha\beta \times M_b \odot (1 - M_b) \odot \big( ((Z - Z_t) \odot Z \odot (1 - Z) \odot (M_b \otimes H^*)) \otimes H + ((Z - Z_t) \odot Z \odot (1 - Z) \odot (M_b \otimes H)) \otimes H^* \big). (7)

The mask can then be updated by descending the gradient derived in Equation (7),

M = M - \gamma \frac{\partial E}{\partial M}. (8)
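Putting Equations (4)–(8) together, one ILT update can be sketched as below. This assumes a single real lithography kernel H with unit weight, so H^* coincides with H; the multi-kernel complex case follows the same structure:

```python
import numpy as np

def conv2(a, k, conj=False):
    """Circular 2-D convolution via FFT; conj applies the conjugate kernel (H^*)."""
    K = np.fft.fft2(k, s=a.shape)
    if conj:
        K = np.conj(K)
    return np.fft.ifft2(np.fft.fft2(a) * K).real

def ilt_step(M, Zt, H, alpha=50.0, beta=4.0, Ith=0.5, gamma=1.0):
    """One ILT gradient-descent step (Eqs. 4-8) for a single real kernel H."""
    Mb = 1.0 / (1.0 + np.exp(-beta * M))            # Eq. (6): relaxed binary mask
    I = conv2(Mb, H) ** 2                           # Eq. (2) with one real kernel
    Z = 1.0 / (1.0 + np.exp(-alpha * (I - Ith)))    # Eq. (5): relaxed wafer image
    common = (Z - Zt) * Z * (1.0 - Z)               # shared factor in Eq. (7)
    grad = 2.0 * alpha * beta * Mb * (1.0 - Mb) * (
        conv2(common * conv2(Mb, H, conj=True), H)
        + conv2(common * conv2(Mb, H), H, conj=True))   # Eq. (7)
    return M - gamma * grad                         # Eq. (8)

# toy usage: a 16x16 square target and a 3x3 averaging kernel
Zt = np.zeros((16, 16)); Zt[5:11, 5:11] = 1.0
H = np.ones((3, 3)) / 9.0
M = ilt_step(2.0 * Zt - 1.0, Zt, H)   # start from a signed copy of the target
```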

Generative Adversarial Nets (GAN)

GAN Basis

- x: sample from the distribution of the target dataset; z: input of G.

- Generator G(z; θ_g): differentiable function represented by a multilayer perceptron with parameters θ_g.

- Discriminator D(x; θ_d): represents the probability that x came from the data rather than G.

1. Train D to maximize the probability of assigning the correct label to both training examples and samples from G.

2. Train G to minimize log(1 - D(G(z))), i.e. generate fake samples that are drawn from a distribution similar to p_data(x).

\min_G \max_D \; \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]. (9)
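The two players' losses in Equation (9) can be written out numerically; `d_real` and `d_fake` below stand for discriminator outputs in (0, 1) on real and generated samples:

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Discriminator's negated objective from Eq. (9):
    maximizing log D(x) + log(1 - D(G(z))) == minimizing this loss."""
    return -(np.log(d_real) + np.log(1.0 - d_fake)).mean()

def g_loss(d_fake):
    """Generator's objective from Eq. (9): minimize log(1 - D(G(z)))."""
    return np.log(1.0 - d_fake).mean()

d_real = np.array([0.9, 0.8])   # D is fairly sure these are real
d_fake = np.array([0.2, 0.3])   # D is fairly sure these are fake
ld = d_loss(d_real, d_fake)     # small when D is right on both
lg = g_loss(d_fake)             # generator wants d_fake -> 1
```

Note that the generator's loss drops as it fools the discriminator: the better D(G(z)) gets, the lower log(1 - D(G(z))).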

GAN Architecture

[Figure: GAN architecture — the generator maps an input vector to a generated sample; the discriminator outputs real/fake scores]

GAN-OPC

Generator Design
- Auto-encoder based generator consisting of encoder and decoder subnetworks.

- The encoder is a stacked convolutional architecture that performs hierarchical layout feature abstraction.

- The decoder operates in the opposite way, predicting the pixel-based mask correction with respect to the target.

Discriminator Design
- Takes a target–mask tuple as input: (Z_t, G(Z_t)) or (Z_t, M^*).

GAN-OPC Architecture

[Figure: GAN-OPC architecture — an encoder–decoder generator maps the target to a mask; the discriminator scores (target, mask) pairs as good or bad masks]

GAN-OPC Training

Based on the OPC-oriented GAN architecture in our framework, we tweak the objectives of G and D accordingly,

\max \; \mathbb{E}_{Z_t \sim \mathcal{Z}}[\log(D(Z_t, G(Z_t)))], (10)

\max \; \mathbb{E}_{Z_t \sim \mathcal{Z}}[\log(D(Z_t, M^*))] + \mathbb{E}_{Z_t \sim \mathcal{Z}}[1 - \log(D(Z_t, G(Z_t)))]. (11)

In addition, to facilitate the training procedure, we minimize the differences between generated masks and reference masks when updating the generator, as in Equation (12),

\min \; \mathbb{E}_{Z_t \sim \mathcal{Z}} ||M^* - G(Z_t)||_n, (12)

where ||\cdot||_n denotes the l_n norm. Combining (10), (11) and (12), the objective of our GAN model becomes

\min_G \max_D \; \mathbb{E}_{Z_t \sim \mathcal{Z}}[1 - \log(D(Z_t, G(Z_t))) + ||M^* - G(Z_t)||_n^n] + \mathbb{E}_{Z_t \sim \mathcal{Z}}[\log(D(Z_t, M^*))]. (13)

The generator and the discriminator are trained alternately as follows.

The GAN-OPC Training Algorithm

1: for number of training iterations do
2:   Sample m target clips Z ← {Z_{t,1}, Z_{t,2}, ..., Z_{t,m}};
3:   ΔW_g ← 0, ΔW_d ← 0;
4:   for each Z_t ∈ Z do
5:     M ← G(Z_t; W_g);
6:     M^* ← ground-truth mask of Z_t;
7:     l_g ← −log(D(Z_t, M)) + α||M^* − M||_2^2;
8:     l_d ← log(D(Z_t, M)) − log(D(Z_t, M^*));
9:     ΔW_g ← ΔW_g + ∂l_g/∂W_g; ΔW_d ← ΔW_d + ∂l_d/∂W_d;
10:   end for
11:   W_g ← W_g − (λ/m)ΔW_g; W_d ← W_d − (λ/m)ΔW_d;
12: end for
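The training loop above can be sketched end-to-end with a deliberately tiny stand-in model: a linear "generator" and a logistic "discriminator" on flattened clips, with the gradients of l_g and l_d derived by hand. The model sizes, constants, and the choice M^* = Z_t are toy assumptions for illustration only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
n = 4                                    # flattened clip size (toy)
Wg = rng.normal(0.0, 0.1, (n, n))        # linear "generator": M = Wg @ Zt
wd = rng.normal(0.0, 0.1, 2 * n)         # logistic "discriminator" on [Zt; M]
alpha, lam, m = 1.0, 0.1, 8

def D(zt, mask):
    return sigmoid(wd @ np.concatenate([zt, mask]))

probe = np.array([1.0, 0.0, 1.0, 1.0])
err0 = np.sum((Wg @ probe - probe) ** 2)          # generator error before training

for _ in range(300):                                  # line 1
    dWg = np.zeros_like(Wg); dwd = np.zeros_like(wd)  # line 3
    for _ in range(m):                                # lines 2 and 4
        Zt = rng.random(n).round()                    # sample a binary target clip
        M = Wg @ Zt                                   # line 5: generator forward
        Mstar = Zt                                    # line 6: toy ground truth
        p_fake, p_real = D(Zt, M), D(Zt, Mstar)
        # line 7: l_g = -log D(Zt, M) + alpha*||Mstar - M||^2, gradient by hand:
        dM = -(1.0 - p_fake) * wd[n:] + 2.0 * alpha * (M - Mstar)
        dWg += np.outer(dM, Zt)                       # line 9: dl_g/dWg
        # line 8: l_d = log D(Zt, M) - log D(Zt, Mstar), gradient by hand:
        dwd += (1.0 - p_fake) * np.concatenate([Zt, M]) \
             - (1.0 - p_real) * np.concatenate([Zt, Mstar])
    Wg -= lam / m * dWg; wd -= lam / m * dwd          # line 11

err1 = np.sum((Wg @ probe - probe) ** 2)          # error after training
```

The l_2 penalty from Equation (12) dominates here, so the toy generator learns to reproduce the reference mask while the adversarial term nudges it toward samples the discriminator accepts.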

ILT-guided Pretraining

Why Pretraining?
- ILT and neural network optimization share a similar gradient-descent procedure.

- Let the generator acquire prior knowledge from the lithography engine.

[Figure: (a) GAN training with feed-forward and back-propagation between generator and discriminator; (b) ILT-guided pretraining with the generator connected to a litho-simulator]

Equation (8) is naturally compatible with mini-batch gradient descent: if we create a link between the generator and the ILT engine, the wafer image error can be back-propagated directly to the generator, as illustrated above.

ILT-guided Pretraining

1: for number of pre-training iterations do
2:   Sample m target clips Z ← {Z_{t,1}, Z_{t,2}, ..., Z_{t,m}};
3:   ΔW_g ← 0;
4:   for each Z_t ∈ Z do
5:     M ← G(Z_t; W_g);
6:     Z ← LithoSim(M);  ▷ Equations (2)–(3)
7:     E ← ||Z − Z_t||_2^2;
8:     ΔW_g ← ΔW_g + (∂E/∂M)(∂M/∂W_g);  ▷ Equation (7)
9:   end for
10:  W_g ← W_g − (λ/m)ΔW_g;  ▷ Equation (8)
11: end for
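The pretraining loop can likewise be sketched with a toy per-pixel stand-in for LithoSim (a translated sigmoid, echoing Equation 5) so the chain rule of line 8 stays in closed form; the linear generator and all constants are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)
n, lam, m, a = 4, 0.05, 8, 10.0
Wg = rng.normal(0.0, 0.1, (n, n))        # linear "generator": M = Wg @ Zt

def litho(M):
    return sigmoid(a * (M - 0.5))        # per-pixel stand-in for LithoSim

eval_set = [rng.random(n).round() for _ in range(20)]
def mean_err(W):
    return np.mean([np.sum((litho(W @ z) - z) ** 2) for z in eval_set])

err0 = mean_err(Wg)                      # wafer-image error before pretraining
for _ in range(400):                                  # line 1
    dWg = np.zeros_like(Wg)                           # line 3
    for _ in range(m):                                # lines 2 and 4
        Zt = rng.random(n).round()                    # sample a binary target
        M = Wg @ Zt                                   # line 5: generator forward
        Z = litho(M)                                  # line 6: "lithography"
        dE_dM = 2.0 * (Z - Zt) * a * Z * (1.0 - Z)    # lines 7-8: dE/dZ * dZ/dM
        dWg += np.outer(dE_dM, Zt)                    # line 8: chain to Wg
    Wg -= lam / m * dWg                               # line 10
err1 = mean_err(Wg)                      # error after pretraining
```

The key point carried over from the real framework: the error of the simulated wafer image, not a label, drives the generator's weight update.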

Training Behavior:

[Plot: L2 loss versus training step (0–7,000) for GAN-OPC and PGAN-OPC]

Experimental Results

The Dataset

- The lithography engine is based on the lithosim v4 package from the ICCAD 2013 CAD Contest.

- Manually generated 4,000 instances based on the design specification of existing 32nm M1 layouts.

Mask Optimization Results

[Charts: (a) PV band (nm²) and (b) squared L2 error of masks produced by ILT, GAN-OPC, and PGAN-OPC]

Visualizing PGAN-OPC and ILT: (a) masks of [Gao+,DAC'14]; (b) masks of PGAN-OPC; (c) wafer images by masks of [Gao+,DAC'14]; (d) wafer images by masks of PGAN-OPC; (e) target patterns.

Haoyu Yang – CSE Department – The Chinese University of Hong Kong · E-Mail: [email protected] · http://appsrv.cse.cuhk.edu.hk/~hyyang/index.html