EDS Silicon Valley, October 2012. Copyright © 2012 SuVolta, Inc. All rights reserved.


Device Considerations for Low Power VLSI Circuits

EDS Silicon Valley Chapter

David Kidd

Senior Director of Digital Design

Overview

Transistor Variability Limits Chips
  Impact on Mobile System on Chip (SOC)
  Limited Low Power Design Techniques
  Where does Variability come from?

New Transistor Alternatives to Reduce Variability
  Deeply Depleted Channel (DDC) technology
  Silicon Impact

Outlook
  Taking advantage of Deeply Depleted Channel (DDC) technology in Mobile SOC

What is needed in Mobile System on Chip?

Multiple blocks with different performance requirements
  Integrated on the same die
  Different power modes – would like to run at different supplies
  Multiple VT transistors used to control leakage
  Single chip solution requires analog integration

Need co-design of architecture, circuits and transistor technology for best solution

[Figure: mobile SOC block diagram. Legend: High Frequency, Medium Frequency, Lower Frequency, Memory. Blocks: CPU (1.5GHz); L2 (1MB); GPU (500MHz); HD Video, H264, MPEG; Crypto, Security; MC (LPDDR); Flash; BT, WiFi, GPS, 3G, LTE, RF; PMU]

Variability Limits Design & Architecture

Limited benefit using voltage scaling (DVFS)
  Cannot overdrive much due to reliability and power restrictions
  Dynamically lowering voltage limited to 100-200mV
  Only lowering frequency leaves large leakage power
  “Run to hold” beats DVFS despite overhead

Finicky SRAM memories
  High SRAM VMIN leaves no room for memory voltage scaling
  Many circuit tricks to improve VMIN and noise margins
  Design teams moved to dedicated power rail for SRAM
    Works for CPU – difficult in GPU
    Impacts power network integrity – more fluctuations

Transistor variability limits chips

Transistor Variation Source of Chip Variation

Global/Systematic/Manufacturing Variation
  Shifts all the transistors similarly
  Longer/shorter transistor lengths
  More (or less) implant energy and dose
  Will result in speed/power distribution

Local/Random Variation
  Transistors next to each other vary widely
  Small number of dopants in transistor channel
  Random Dopant Fluctuation (RDF)
  Apparent in threshold voltage mismatch (σVT)
  Impacts speed, leakage, SRAM & Analog

Industry solution: Remove RDF using Undoped Channel

What is the right silicon roadmap going forward?

[Figure: distribution of # Transistors, with Useable Yield between the "Too slow" and "Too hot" tails]

Transistor Alternatives

FinFET or TriGate
  Promises high drive current
  Manufacturing, cost, and IP challenge
  Doped channel to enable multi VT

FDSOI
  Showing off undoped channel benefits
  Good body effect, but lack of multi VT capability
  Restricted supply chain

DDC – Deeply Depleted Channel transistor
  Straightforward insertion into Bulk Planar CMOS
  Undoped channel to reduce random variability
  Good body effect and multi VT transistors

[Figures: Textbook FinFET; Intel TriGate. Image sources: IMEC; Fujitsu; GSS, Chipworks]

Deeply Depleted Channel™ (DDC) Transistor

Undoped or very lightly doped region
  Significantly reduced transistor random variability σVT
    Lower leakage
    Better SRAM (IREAD, lower Vmin & Vret)
    Tighter corners
    Smaller area analog design
  Higher channel mobility (increased Ieff, lower DIBL)
    Higher speed, improved voltage scaling

VT setting offset region
  Enables multiple threshold voltages

Screening region
  Strong body coefficient
    Bias bodies to tighten manufacturing distribution
    Body biasing to compensate for temperature and aging

Benefits similar to FinFET in planar bulk CMOS

[Figure: DDC transistor cross-section with regions labeled 1, 2, 3]

*Example implementation

Bulk Planar Foundry-Compatible Process

No new materials / No new tools

[Figure: process flow comparison, Foundry Standard Flow versus SuVolta Flow (example). Steps shown: STI; Wells and VT; Gate; Poly; Spacers and LDDs; Salicide; Metal Layers. The SuVolta flow adds Blanket Epi, marked "New".]

TEM of DDC Transistor and STI

Presented at IEDM 2011

[Figure: TEM cross-section with source (S) and drain (D) labeled; dimension marker 43.1nm]

Lower Transistor Variability Reduces Leakage

Transistor variability is reflected in threshold voltage (VT) distribution
Leakage current is exponentially dependent on VT
Lower VT variability (σVT) reduces number of leaky low VT devices
Power dissipation is dominated by low VT edge of distribution
Smaller σVT: less leakage power for digital and memory/SRAM

[Figure: 65nm silicon SRAM VT distribution (# Transistors) with leakage power overlaid. Annotations: high leakage tail dominates power; 2.7x higher power (model using 85mV subVT slope); high VT tail slows down ICs]
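The exponential dependence of leakage on VT can be made concrete with a small model. The Python sketch below assumes leakage proportional to 10^(-VT/S), with S the subthreshold slope (85mV per decade, as in the slide's model), and a Gaussian VT distribution. The σVT values are hypothetical and chosen only for illustration; they are not taken from the measurement.

```python
import math

def mean_leakage_multiplier(sigma_vt_mv, slope_mv_per_decade=85.0):
    """Mean leakage of a Gaussian VT population, relative to a device at the mean VT.

    Assumes I_leak ~ 10**(-VT / slope), so leakage is lognormally distributed
    and its mean is exp((ln(10) * sigma / slope)**2 / 2) times the nominal value.
    """
    k = math.log(10.0) / slope_mv_per_decade
    return math.exp(0.5 * (k * sigma_vt_mv) ** 2)

# Hypothetical sigma values in mV (assumptions, not data from the slide).
wide = mean_leakage_multiplier(50.0)
tight = mean_leakage_multiplier(25.0)
print(f"wide: {wide:.2f}x nominal, tight: {tight:.2f}x nominal, "
      f"ratio: {wide / tight:.2f}")
```

Because σVT enters the exponent squared, halving it lowers the average leakage by more than a linear estimate would suggest, which is why the low VT tail dominates.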

Lower Transistor Variability Improves Speed

Nominal (TT) ring oscillator speed expected to be 400ps (A)
  Equivalent to having many similar critical paths in a chip
  VT variation will randomly affect paths within the same die, limiting speed to 470ps

Undoped channel reduces variability and increases mobility (B)
  25% faster mean, 30% faster tail due to tighter distribution

To match performance, lower VDD until tails have same speed (C)
  Large impact on power due to the square dependence: P = CV²f + IV

[Figure: 65nm silicon measurement; distributions (A), (B), (C)]
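The power expression P = CV²f + IV can be evaluated directly to see how much a supply reduction buys. The Python sketch below is illustrative only: the capacitance, frequency, and leakage current are hypothetical values, and leakage current is held constant across supplies for simplicity even though in silicon it also varies with VDD.

```python
def total_power(c_eff, vdd, freq, i_leak):
    """P = C*V^2*f + I*V: switching power plus leakage power, in watts."""
    return c_eff * vdd ** 2 * freq + i_leak * vdd

# Hypothetical block parameters (assumptions, not from the slides).
C_EFF = 1e-9    # effective switched capacitance, farads
FREQ = 500e6    # clock frequency, hertz
I_LEAK = 0.05   # leakage current, amperes

p_high = total_power(C_EFF, 1.2, FREQ, I_LEAK)
p_low = total_power(C_EFF, 0.9, FREQ, I_LEAK)
print(f"1.2V: {p_high:.3f} W, 0.9V: {p_low:.3f} W, ratio: {p_low / p_high:.2f}")
```

The switching term falls with the square of VDD while the leakage term falls only linearly, so the overall saving depends on how much of the budget is leakage.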

Sub 0.5V VMin is Possible

SRAM memories built using 6-T SRAM cell
  Smallest transistors on every chip, worst VT mismatch
  Higher VDD is required to avoid failures

SRAM blocks limit VDD scaling
  Vmin – lowest operating voltage limited by transistor mismatch

Demonstrated SRAM to Vmin of 0.425V
  Standard SRAM macros
  No circuit “tricks” for low voltage operation
  Demonstrates potential for 50% voltage scaling

[Figure: Vmin comparison of Industry Norm and DDC, annotated "300 mV" and "Tester limit"]

[Figure: Node 2 [V] versus Node 1 [V], both axes 0.0 to 1.0; panels (a) DDC and (b) Control]

Improved VT Matching Key for Low VMIN

40-60% improved matching for SRAM transistors

[Figure: Cumulative Probability [σ], -3 to 3, versus ΔVT [V], -0.2 to 0.2; panels for Pull-down, Pass-gate, and Pull-up transistors; Baseline versus DDC]

Presented at IEDM 2011

Design Examples Analog

In analog circuits, matching is key
  Large transistors used to improve relative variability in current mirrors, differential pairs, etc.

Better transistor matching allows for
  Area savings
  Higher performance
  Lower power

Undoped channel improves ROUT, giving higher gain

OpAmp stage
  Matched bandwidth, gain, slew rate
  Over 50% smaller area (84 vs 190μm²)
  45% better input noise (176 vs 327 μV)

Bandgap reference circuit
  Same accuracy achieved at half the size

[Figure: SuVolta sample layouts, Baseline versus DDC. OpAmp: 10.9x17.4um (Baseline), 6.9x12.2um (DDC). Bandgap: 450um x 420um (Baseline), 450um x 210um (DDC)]
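The link between matching and area can be sketched with the widely used Pelgrom mismatch relation, σ(ΔVT) = A_VT / √(W·L). This relation is a standard model brought in here for illustration; it does not appear on the slides, and the A_VT coefficients below are hypothetical.

```python
import math

def sigma_delta_vt_mv(a_vt_mv_um, width_um, length_um):
    """Pelgrom relation: pair mismatch sigma (mV) for a given gate area."""
    return a_vt_mv_um / math.sqrt(width_um * length_um)

def area_for_target_um2(a_vt_mv_um, target_sigma_mv):
    """Gate area W*L (um^2) needed to reach a target mismatch sigma."""
    return (a_vt_mv_um / target_sigma_mv) ** 2

# Hypothetical matching coefficients in mV*um (assumptions, not slide data).
A_VT_BASELINE = 4.0
A_VT_IMPROVED = 2.8   # 30% better matching coefficient

target = 1.0  # required sigma(delta VT) in mV
area_base = area_for_target_um2(A_VT_BASELINE, target)
area_new = area_for_target_um2(A_VT_IMPROVED, target)
print(f"baseline: {area_base:.2f} um^2, improved: {area_new:.2f} um^2, "
      f"ratio: {area_new / area_base:.2f}")
```

Under this model the required gate area scales with the square of the matching coefficient, so a moderate improvement in matching yields a larger relative area saving at the same accuracy.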

Better Chips with Body Biasing

Body Bias to fix systematic variation
  Speed-up (forward bias - FBB) slow parts
  Cool down (reverse bias - RBB) hot parts
  Increase manufacturing yield

Body bias enables multiple modes of operation
  Active: minimize power at every performance
  Standby: leakage reduction, power gating

DDC transistor provides 2-4x larger body factor

[Figure: TCAD prediction; distribution with Useable Yield between "Too slow" and "Too hot", FBB applied to the slow tail and RBB to the hot tail]
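The binning idea on this slide (FBB for slow parts, RBB for hot parts) can be written as a simple decision rule. The Python sketch below is a hypothetical policy for illustration: the function name, the speed and leakage limits, and the sample dies are all assumptions, not a SuVolta procedure.

```python
def choose_body_bias(freq_mhz, leakage_ua, min_freq_mhz, max_leakage_ua):
    """Pick a bias direction for one die from its measured speed and leakage."""
    too_slow = freq_mhz < min_freq_mhz
    too_hot = leakage_ua > max_leakage_ua
    if too_slow and too_hot:
        return "reject"   # this simple policy does not try to fix both at once
    if too_slow:
        return "FBB"      # forward body bias lowers VT to speed the part up
    if too_hot:
        return "RBB"      # reverse body bias raises VT to cut leakage
    return "none"

# Hypothetical dies: (name, ring-oscillator frequency in MHz, leakage in uA).
dies = [("slow", 340.0, 20.0), ("typical", 400.0, 40.0), ("fast", 460.0, 90.0)]
for name, freq, leak in dies:
    bias = choose_body_bias(freq, leak, min_freq_mhz=370.0, max_leakage_ua=60.0)
    print(f"{name}: {bias}")
```

A production flow would likely choose a bias magnitude as well as a direction; the sketch only shows how both tails of the distribution can be pulled back into the usable window.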

Half the Power at Matched Performance

Inverter ring-oscillators (RO) fabricated at process corners
  Baseline @ 1.2V VDD and DDC @ 0.9V VDD

For each corner, DDC transistor RO is faster and lower power
  Using strong body coefficient to pull in corners

Half the power (50% less power) while matching speed

[Figure: 65nm silicon measurement; SS, TT, and FF corners for baseline (100% power) and DDC (50% power) relative to baseline speed]

Tighter Manufacturing Corners w/ DDC

Better process control leads to tighter corners
  Manufacturing flow further reduces layout effects
  1 sigma tighter wafer to wafer and within wafer variation for DDC

Less overdesign as max paths and min (hold) paths are closer
  Faster design closure, earlier tapeout, shorter TTM

[Figure: 65nm silicon measurement; 1σ, 2σ, and 3σ contours at VDD=1.2V and VDD=0.9V; POR]

Voltage Scaling to 0.6V VDD

Achieve half the speed at 1/6 the power @0.6V VDD

Use body bias to compensate for temperature and aging
  Critical for low VDD operation
  Enable workable design window – avoid overdesign

[Figure: 65nm silicon measurement; Power (W) versus Frequency (MHz), 0 to 500; Baseline and DDC at FF, TT, and SS corners; annotation -83%]

This is HotChips – Go Faster!

Turbo Mode: DDC transistor achieves over 50% speedup @ 1.2V VDD
  All corners for DDC run at 580MHz vs 370MHz for baseline

DVFS     Baseline   DDC     DDC     DDC     DDC
VDD      1.2V       0.6V    0.9V    1.05V   1.2V
Speed    1          0.5     1       1.28    1.56
Power    1          0.17    0.52    1       1.51

[Figure: 65nm silicon measurement; Power (W) versus Frequency (MHz), 300 to 600; Baseline and DDC at SS, TT, and FF corners; annotation +56%]
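The DVFS table on this slide gives relative speed and power, and one useful derived figure is relative energy per operation, taken here as power divided by speed. The Python sketch below copies the table values as given; the energy metric is an added interpretation, not a number reported in the presentation.

```python
# (label, VDD in volts, relative speed, relative power), copied from the table.
OPERATING_POINTS = [
    ("Baseline", 1.2, 1.0, 1.0),
    ("DDC", 0.6, 0.5, 0.17),
    ("DDC", 0.9, 1.0, 0.52),
    ("DDC", 1.05, 1.28, 1.0),
    ("DDC", 1.2, 1.56, 1.51),
]

def relative_energy_per_op(speed, power):
    """Energy per operation relative to baseline: power divided by throughput."""
    return power / speed

energies = {
    (label, vdd): relative_energy_per_op(speed, power)
    for label, vdd, speed, power in OPERATING_POINTS
}
for (label, vdd), energy in energies.items():
    print(f"{label} @ {vdd}V: {energy:.2f}x baseline energy per operation")
```

On this metric every DDC operating point in the table is at or below the baseline, with the lowest energy per operation at the 0.6V point.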

28nm and Beyond

(silicon calibrated SPICE simulations)

Same performance at 0.75V VDD as baseline at 0.9V VDD
  30% lower power
  Alternatively 25% faster at same voltage

Even better when using body bias to pull in corners
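As a rough plausibility check on the 30% figure, the switching-power term scales with the square of VDD at fixed capacitance and frequency. The Python sketch below considers only that term and ignores leakage, which is a simplifying assumption rather than the method behind the SPICE result.

```python
def dynamic_power_saving(vdd_new, vdd_ref):
    """Fractional reduction in C*V^2*f power at fixed capacitance and frequency."""
    return 1.0 - (vdd_new / vdd_ref) ** 2

saving = dynamic_power_saving(0.75, 0.9)
print(f"switching power saving from 0.9V to 0.75V: {saving:.1%}")
```

The estimate comes out close to the quoted 30%, which suggests that the saving at matched performance is largely explained by the lower supply.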

Applying DDC to Lower Variability in Mobile SOC

CPU: Single thread performance critical
  Push frequency by temporarily raising voltage in turbo mode
  DVFS with body biasing becomes DVBFS

GPU: High number of cores using small transistors
  Less overdesign due to lower delay variability
  Increase parallelism, lower voltage, body bias dynamically for more pixels/Watt

Lower frequency blocks
  In addition to high VT transistors also run at lower voltage and optimal body bias

Whole chip:
  Use body bias to adjust for manufacturing variation
  Take advantage of improved memory and analog performance
  Lowering variability while compatible with existing bulk planar silicon IP

[Figure: mobile SOC block diagram, repeated from the "What is needed in Mobile System on Chip?" slide]

Conclusions

Variability limits chips
  DDC transistor reduces random variability through its undoped channel
  DDC transistor’s strong body factor can be used to fix systematic variation and compensate for temperature variation

DDC technology provides performance kicker from 90nm to 20nm
  Straightforward integration into existing nodes
  Compatible with existing bulk planar CMOS silicon IP
  Use existing CAD flow

DDC technology brings back low power tools
  Large range DVFS
  Body biasing
  Low voltage operation

Taking advantage of the reduced variability DDC transistor in design and architecture will lead to the next level in mobile SOC
