CMPEN 411VLSI Digital CircuitsVLSI Digital Circuits
Spring 2011
Lecture 13: Designing for Low Power
[Adapted from Rabaey’s Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]
Sp11 CMPEN 411 L13 S.1
Review: Designing Fast CMOS GatesTransistor sizingTransistor sizing
Progressive transistor sizingfet closest to the output is smallest of series fetsfet closest to the output is smallest of series fets
Transistor orderingput latest arriving signal closest to the outputput latest arriving signal closest to the output
Logic structure reorderingreplace large fan-in gates with smaller fan-in gate networkreplace large fan-in gates with smaller fan-in gate network
Logical effort
B ff (i t ) i tiBuffer (inverter) insertionseparate large fan-in from large CL with buffersuses buffers so that the path delay is minimized
Sp11 CMPEN 411 L13 S.2
uses bu e s so t at t e pat de ay s ed
Why Power Matters
Packaging costs
P l il d iPower supply rail design
Chip and system cooling costs
Noise immunity and system reliability
Battery life (in portable systems)
Environmental concernsOffice equipment accounted for 5% of total US commercial
i 1993energy usage in 1993Energy Star compliant systems
Sp11 CMPEN 411 L13 S.3
Why worry about power? – Power Dissipation
100
Lead microprocessors power continues to increase
P6Pentium ®
10atts
)
486386
2868086
808580801ow
er (W
a
808080084004
0 1
1P
0.11971 1974 1978 1985 1992 2000
Year
Sp11 CMPEN 411 L13 S.4
Power delivery and dissipation will be prohibitiveSource: Borkar, De Intel®
Why worry about power? – Chip Power Density
10000Sun’sSurface
1000
W/c
m2)
N l
RocketNozzle
hi mi ht b m h t
100
Den
sity
(W NuclearReactor
…chips might become hot…
40048008 8085
8086
286 386486
Pentium®P610
Pow
er D Hot Plate
8080 286 48611970 1980 1990 2000 2010
Y
Sp11 CMPEN 411 L13 S.5
Year
Source: Borkar, De Intel®
Chip Power Density Distribution => HeatPower Map O Di T
200
250
Power Map
100
110
On-Die Temperature
100
150
Heat
Flu
x (W
/cm
2)
60
70
80
90
Tem
pera
ture
(C)
0
50
40
50
Heat dissipation => temperature,
higher power density => higher temperature
Power density is not uniformly distributed across the chip
Intel Pentium 4: (0.18 um) 64 W @ 217 mm2
Sp11 CMPEN 411 L13 S.6
( ) @
Intel Pentium 4: (90 nm) 103W @ 112 mm2
Examples
IBM Power G4 die temperature profile
Sp11 CMPEN 411 L13 S.7
7/50
Apple Power G5
Sp11 CMPEN 411 L13 S.8
Problem Illustration
Sp11 CMPEN 411 L13 S.9
Power and temperature are BAD
and can be EVIL
Sp11 CMPEN 411 L13 S.10
Source: Tom’s Hardware Guidehttp://www6.tomshardware.com/cpu/01q3/010917/heatvideo-01.html
Power and Energy Figures of Merit
Power consumption in Wattsdetermines battery life in hours
Peak powerdetermines power ground wiring designssets packaging limitssets packaging limitsimpacts signal noise margin and reliability analysis
Energy efficiency in JoulesEnergy efficiency in Joulesrate at which power is consumed over time
Energy = power * delaygy p yJoules = Watts * secondslower energy number means less power to perform a computation at the same frequency
Sp11 CMPEN 411 L13 S.11
computation at the same frequency
Power versus Energy
WattsPower is height of curve
Lower power design could simply be slower
Approach 1
Approach 2
Lower power design could simply be slower
time
Approach 2
Energy is area under curveWatts
Approach 1
Two approaches require the same energy
time
Approach 2
Sp11 CMPEN 411 L13 S.12
PDP and EDPPower delay product (PDP) = P * t = (C V 2)/2Power-delay product (PDP) = Pav * tp = (CLVDD
2)/2PDP is the average energy consumed per switching event (Watts * sec = Joule)lower power design could simply be a slower design
15Energy-delay product (EDP) = PDP * tp = Pav * tp2
EDP is the average energy
10orm
aliz
ed)
energy-delayEDP is the average energyconsumed multiplied by the computation time requiredtakes into account that one
5-Del
ay(n
o
energy
takes into account that one can trade increased delay for lower energy/operation (e.g., via supply voltage scaling that increases delay
0Ener
gy delayscaling that increases delay, but decreases energy consumption)
Sp11 CMPEN 411 L13 S.13
allows one to understand tradeoffs better0.5 1 1.5 2 2.5
Vdd (V)
Understanding TradeoffsWhich design is the “best” (fastest coolest both) ?Which design is the best (fastest, coolest, both) ?
b
ac
d
1/Delay
Sp11 CMPEN 411 L13 S.14
y
better
Understanding TradeoffsWhich design is the “best” (fastest coolest both) ?Which design is the best (fastest, coolest, both) ?
b
Lower EDP
ac
d
1/Delay
Sp11 CMPEN 411 L13 S.15
y
better
CMOS Power Equations
P = CL VDD2 f + tscVDD Ipeak f + VDD Ileak
Dynamic power
Short-circuit power
Leakage powerpower power power
Sp11 CMPEN 411 L13 S.16
Dynamic Power ConsumptionVddVdd
Vin Vout
CL
Energy/transition = CL * VDD2 * P0 1
f0→1Energy/transition CL VDD P0→1
Pdyn = Energy/transition * f = CL * VDD2 * P0→1 * f
Pdyn = CEFF * VDD2 * f where CEFF = P0→1 CL
Not a function of transistor sizes!
Sp11 CMPEN 411 L13 S.17
Not a function of transistor sizes!Data dependent - a function of switching activity!
Lowering Dynamic Power
Capacitance:Function of fan out
Supply voltage:Has been droppingFunction of fan-out,
wire length, transistor sizes
Has been dropping with successive generations
Pdyn = CL VDD2 P0→1 f
Clock frequency:Activity factor: Clock frequency:Increasing…
Activity factor:How often, on average, do wires switch?
Sp11 CMPEN 411 L13 S.18
Short Circuit Power Consumption
Vin VoutIsc
CL
Finite slope of the input signal causes a direct current path between VDD and GND for a short period of time during switching when both the NMOS and PMOS transistors are conducting.
Sp11 CMPEN 411 L13 S.19
NMOS and PMOS transistors are conducting.
Short Circuit Currents Determinates
P V I fPsc = tsc VDD Ipeak f0→1
Duration and slope of the input signal, tsc
I d t i d bIpeak determined by the saturation current of the P and N transistors which depend on their sizes, process technology, temperature, etc.strong function of the ratio between input and output slopes
- a function of CL
Sp11 CMPEN 411 L13 S.20
Impact of CL on Psc
I 0 I I
Vin Vout
C
Isc ≈ 0
Vin Vout
C
Isc ≈ Imax
CL CL
Large capacitive load Small capacitive load
Output fall time significantly larger than input rise time
Output fall time substantially smaller than the input rise
Sp11 CMPEN 411 L13 S.21
larger than input rise time. smaller than the input rise time.
Ipeak as a Function of CL
2
2.5x 10-4
CL = 20 fFWhen load capacitance is small, Ipeak is large.
1.5
2 CL 20 fF s s a , peak s a ge
0.5
1 CL = 100 fF Short circuit dissipation is minimi ed b
00 2 4 6
x 10-10
CL = 500 fF
minimized by matching the rise/fall times of the input
-0.5time (sec)
x 10 10
and output signals -slope engineering.
Sp11 CMPEN 411 L13 S.22
500 psec input slope
Psc as a Function of Rise/Fall Times8
6
7
8
V = 3 3 V
When load capacitance is small (tsin/tsout > 2 for
4
5
6 VDD= 3.3 V s souVDD > 2V) the power is dominated by Psc
2
3
4
VDD = 2.5 VIf VDD < VTn + |VTp| then
0
1
2
VDD = 1.5V
pPsc is eliminated since both devices are never on at the same time.0
0 2 4tsin/tsout
normalized wrt zero input
on at the same time.
W/Lp = 1.125 μm/0.25 μm
Sp11 CMPEN 411 L13 S.23
normalized wrt zero input rise-time dissipation
W/Lp 1.125 μm/0.25 μmW/Ln = 0.375 μm/0.25 μmCL = 30 fF
Is Transistor a Good Switch?
I 0I ≠ 0
On
I = 0
I = ∞ I = 1ma/u
Off
I = 0 I ≠ 0
Off
I = 0 I ≠ 0S b th h ld L k
Sp11 CMPEN 411 L13 S.24
Sub-threshold Leakage
Leakage (Static) Power Consumption
VDD Ileakage
VoutDrain junctionDrain junction
leakage
Subthreshold currentG t l k Subthreshold currentGate leakage
Sub-threshold current is the dominant factor.
Sp11 CMPEN 411 L13 S.25
Leakage as a Function of VT
C fContinued scaling of supply voltage and the subsequent scaling of threshold voltage will make subthreshold conduction a dominate component of power dissipation.
10-2
An 90mV/decade VTroll off so each
(A)
10 7
roll-off - so each 270mV increase in VT gives 3 orders of magnitude reduction
ID (
VT=0.4V
10-7 magnitude reduction in leakage (but adversely affects performance)
0 0.2 0.4 0.6 0.8 1
VT=0.1V10-12
performance)
Sp11 CMPEN 411 L13 S.26
VGS (V)
TSMC Processes Leakage and VT
CL013 HS
CL015 HS
CL018 HS
CL018 ULP
CL018 LP
CL018 G
0 08 m24 Å1.2 V
0 11 m29 Å1.5 V
42 Å42 Å42 Å42 ÅTox (effective)0 13 m0 18 m0 16 m0 16 mL
2 V1.8 V1.8 V1.8 VVdd
13 000
920/4000.08 μm
1 800
860/3700.11 μm
3000 151 6020I (l k )
780/360320/130500/180600/260IDSat (n/p) (μA/μm)
0.13 μm 0.18 μm 0.16 μm 0.16 μm Lgate
0.25 V
13,000
0.29 V
1,800
0.40 V0.73 V0.63 V0.42 VVTn
3000.151.6020Ioff (leakage) (ρA/μm)
805243142230FET Perf. (GHz)
Sp11 CMPEN 411 L13 S.27
From MPR, 2000
Exponential Increase in Leakage Currents
10000
1000
0.25
μm)
100 0.180.130.1ak
age(
nA/μ
10I lea
130 40 50 60 70 80 90 100 110
Temp(C)
Sp11 CMPEN 411 L13 S.28
p( )
From De,1999
Itanium example
Itanium 2 (Intel)0.18 um, 1.5V, 1Ghz, 221M transistors
5% 5%
Dynamic
90%
DynamicI/OLeakage
90%Latest generation Itanium (Intel)
0.13 um, 1.3V, 1.5Ghz, 410M transistors21%
5%
21%DynamicIOLeakage
Sp11 CMPEN 411 L13 S.29example74%
Leakage
Next Lecture
Constant Throughput/Latency
Variable Throughput/Latency
Energy Design Time Non-active Modules Run Time
Logic designDFS DVS
Active
(Dynamic)
Reduced Vdd
TSizingClock Gating
DFS, DVS
(Dynamic Freq, Voltage Scaling)
Multi-VddScaling)
LeakageMulti-VT
Sleep Transistors
Multi VLeakage
(Standby)Stack effect
Pin ordering
Multi-Vdd
Variable VT
Input control
Variable VT
Sp11 CMPEN 411 L13 S.30
Input control
Top Related