The University of Texas at Austin EE382M VLSI-II Class Notes Page # 1/44
Gian Gerosa, Intel
Fall 2008
EE-382M
VLSI–II
A brief summary of trends, device limitations, scaling, device performance in CMOS technologies
The number of transistors on a chip doubles every ~2 years
[Figure labels: P4 in 90 nm, Core Duo in 65 nm, Core 2 Duo in 65 nm, Core 2 Duo in 45 nm]
Die Size Growth in Desktop/Mobile Processors (mm per side)
[Figure: die size per side (mm, log scale, 1 to 100) versus year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium, PPro, P4, P4 in 90 nm, Core Duo in 65 nm, Core 2 Duo in 65 nm, and Core 2 Duo in 45 nm]
~7% growth per year
~2X growth in 10 years
Die size used to grow ~14% every two years
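The three growth figures on this slide are the same compound rate stated over different intervals. A minimal sketch of that arithmetic follows; the function name is illustrative, and only the ~7% per year rate is taken from the slide.

```python
# Compound-growth check of the die-size figures quoted on the slide.
ANNUAL_GROWTH = 0.07  # ~7% growth in die edge length per year (from the slide)

def growth_factor(annual_rate: float, years: float) -> float:
    """Return the multiplicative growth after `years` at a fixed annual rate."""
    return (1.0 + annual_rate) ** years

two_year = growth_factor(ANNUAL_GROWTH, 2)    # compare with "~14% every two years"
ten_year = growth_factor(ANNUAL_GROWTH, 10)   # compare with "~2X in 10 years"

print(f"2-year growth:  {100 * (two_year - 1):.1f}%")
print(f"10-year growth: {ten_year:.2f}X")
```

Since the figure plots millimeters per side, die area would grow as the square of these factors.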
Logic Transistor Density
Shrinks & compactions meet density goals
New u-architectures drop density
Courtesy: Shekhar Borkar, Intel
[Figure: logic transistor density by generation (90n, 65n, 45n); plot labels as extracted: P4250K, P4450K, P41000]
Each processor has several revisions
[Figure: lead designs and proliferations plotted against time, by process generation]
Process generations: 1.5µ (P646), 1.0µ (P648), 0.8µ (P650), 0.6µ (P852), 0.35µ (P854), 0.25µ (P856), 0.18µ (P858), 0.13µ (P860)
Legend: lead designs, proliferations
Processors: 80386, 80486, Pentium, Pentium II/III, Pentium 4
[Figure: power (W, axis ticks 0 to 120) and frequency (MHz, log axis ticks 1 to 10000) by process generation: 1.0um, 0.8um, 0.6um, 0.35um, 0.25um, 0.18um, 0.13um. Labeled processors: Klamath, Deschutes, Katmai, CuMine, Banias, Northwood, Willamette]
Frequency used to double every ~2 years
POWER WALL at ~100W stopped this.
Power Dissipation of Compactions
[Figure: power (Watts, 0 to 70) versus process generation (1.5u, 1.0u, 0.7u, 0.5u, 0.35u, 0.25u, 0.18u, 130n, 90n) for the 386, 486, Pentium, P2&3, and P4]
Lead processor power increases
Compactions provide higher performance at lower power
Courtesy: Shekhar Borkar, Intel
Power Dissipation of Lead uP
[Figure: power (Watts, log scale, 0.1 to 100) versus year (axis ticks 1971 to 2000) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium, PPro, P4, P4 (90n), Core-duo 65n, Core2-duo 65n, and Core2-duo 45n, with the power wall marked]
Power increases exponentially
Courtesy: Shekhar Borkar, Intel
Source: Mark Bohr, Intel Corporation
MEROM Core 2 Duo in 65nm
~180 mm2, ~450 million transistors
PENRYN Core 2 Duo in 45nm
~105 mm2, ~510 million transistors
SILVERTHORNE (ATOM Processor) in 45nm
~25 mm2, ~47 million transistors
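The die area and transistor counts on the three die-photo slides can be turned into a rough density comparison. The sketch below uses only the approximate figures quoted on those slides; the dictionary layout and function name are illustrative.

```python
# Transistor density from the approximate figures quoted on the slides.
# name: (transistor count, die area in mm^2)
CHIPS = {
    "Merom, Core 2 Duo, 65 nm": (450e6, 180.0),
    "Penryn, Core 2 Duo, 45 nm": (510e6, 105.0),
    "Silverthorne, Atom, 45 nm": (47e6, 25.0),
}

def density_m_per_mm2(transistors: float, area_mm2: float) -> float:
    """Return density in millions of transistors per mm^2."""
    return transistors / area_mm2 / 1e6

for name, (count, area) in CHIPS.items():
    print(f"{name}: {density_m_per_mm2(count, area):.2f} M transistors/mm^2")
```

These are whole-die averages, so the mix of cache and logic on each die affects the result as much as the process node does.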
Deep Sub-micron CMOS device Cross Section
[Figure: device cross section; labels: P-Epi, shallow trench isolation, P-Well, N-Well, N+ and P+ regions, CoSi2, Si3N4, halo implant, S/D extension]
Deep Sub-Micron Transistors
• Characteristics in the linear, saturation, and sub-threshold regions
• Leakage
• Parasitic Elements
• Performance / Leakage tradeoff
[Figure: transistor cross section with Si3N4 and CoSi2 labeled and a 70 nm dimension marked. 130nm Generation. Courtesy: Mark Bohr, Intel]
Source: Mark Bohr, Intel Corporation
Source: Mark Bohr, Intel Corporation
Strained silicon increases electron/hole mobility.
Source: Mark Bohr, Intel Corporation
High-K, Metal Gate 45 nm CMOS (Intel)
K. Mistry, et al., “A 45nm Logic Technology with High-k+ Metal Gate Transistors, Strained Silicon, 9 Cu Interconnect Layers, 193nm Dry Patterning, and 100% Pb-free Packaging”, Tech. Digest IEDM, Dec 2007.
Source: Mark Bohr, Intel Corporation
SOURCES of TRANSISTOR LEAKAGE
[Figure: three transistor diagrams (gate, source, drain), one per leakage source]
Subthreshold Leakage: Ioff
Junction Leakage: Ijctn
Gate Leakage: Igate
Ilkg indicator = 0.5(ON state) + 0.5(OFF state)
               = 0.5(Igate(ON)) + 0.5(Ioff + Ijctn + Igate(OFF))
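The leakage indicator weights the ON and OFF states equally, with only gate leakage counted in the ON state. A minimal sketch of the formula follows; the function name and the sample currents are hypothetical and are not taken from the slides.

```python
def leakage_indicator(igate_on: float, ioff: float, ijctn: float, igate_off: float) -> float:
    """Ilkg = 0.5 * (ON-state leakage) + 0.5 * (OFF-state leakage).

    ON state:  gate leakage only.
    OFF state: subthreshold + junction + gate leakage.
    All currents must use the same unit (for example nA/um).
    """
    on_state = igate_on
    off_state = ioff + ijctn + igate_off
    return 0.5 * on_state + 0.5 * off_state

# Hypothetical sample values in nA/um, chosen only to exercise the formula.
example = leakage_indicator(igate_on=10.0, ioff=60.0, ijctn=2.0, igate_off=4.0)
print(f"Ilkg indicator = {example} nA/um")
```

The equal weighting corresponds to assuming a device spends half its time in each state.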
Transistor Leakage Components
Design of High-Performance Microprocessor Circuits, IEEE Press, New York, 2001
Subthreshold Leakage @ Vgs=0
Performance vs. Leakage Tradeoff
65 nm Transistors Ioff vs. Idsat
P. Bai et al., “A 65nm Logic Technology Featuring 35nm Gate Length, Enhanced Channel Strain, 8 Cu Interconnect Layers, Low-k ILD and 0.57um2 SRAM Cell”, IEDM 2004.
Deep Submicron Device PARASITICS
Figures from: Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
Source-Drain Resistance
Parasitic Capacitances
Gate Resistance
Design of High-Performance Microprocessor Circuits, IEEE Press, New York, 2001
Constant Field Scaling
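Some numbers: Constant Electric Field Scaling
$\text{Width } W = 0.7,\quad \text{Length } L = 0.7,\quad t_{ox} = 0.7$
Lateral and vertical dimensions reduce 30%
$\text{Area Cap } C_a \sim \dfrac{\varepsilon W L}{t_{ox}} = \dfrac{0.7 \times 0.7}{0.7} = 0.7$
$\text{Fringing Cap } C_f \sim W = 0.7 \;\Rightarrow\; \text{Total Cap } C = 0.7$
Capacitance (area and fringing) reduces 30%
$\text{Die Area} = X \times Y = 0.7 \times 0.7 = 0.7^2$
Die area reduces 50%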
Constant Electric Field Scaling –cnt’d-
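$\dfrac{\text{Cap}}{\text{Transistor}} = \dfrac{0.7}{1} = 0.7$
Capacitance per transistor reduces 30%
$\dfrac{\text{Cap}}{\text{Area}} = \dfrac{0.7}{0.7 \times 0.7} = \dfrac{1}{0.7}$
Capacitance per unit area increases 43%
$V_{dd} = 0.7,\quad V_t = 0.7,\quad I = \dfrac{W}{t_{ox}}(V_{dd} - V_t) = \dfrac{0.7}{0.7} \times 0.7 = 0.7$ (velocity-saturated device)
$T = \dfrac{C\,V_{dd}}{I} = \dfrac{0.7 \times 0.7}{0.7} = 0.7$
$\text{Power} = C V^2 f = 0.7 \times 0.7^2 \times \dfrac{1}{0.7} = 0.7^2$
Delay reduces 30%, power reduces 50%
Fmax scales by 1/0.7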
What About Constant Voltage Scaling?
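A comparison

Constant voltage scaling:
$C = 0.7,\quad V = 1$
$I = \dfrac{W}{t_{ox}}(V - V_t) = \dfrac{0.7}{0.7} \times 1 = 1$
$D = \dfrac{CV}{I} = \dfrac{0.7 \times 1}{1} = 0.7$
$\text{Power} = C V^2 F = 0.7 \times 1^2 \times \dfrac{1}{0.7} = 1$
$\text{Power Density} = 1/0.7^2 = 2$

Constant electric field scaling:
$C = 0.7,\quad V = 0.7,\quad E = \dfrac{V}{t_{ox}} = \dfrac{0.7}{0.7} = 1,\quad E = \dfrac{V}{L} = \dfrac{0.7}{0.7} = 1$
$I = \dfrac{W}{t_{ox}}(V - V_t) = \dfrac{0.7}{0.7} \times 0.7 = 0.7$
$D = \dfrac{CV}{I} = \dfrac{0.7 \times 0.7}{0.7} = 0.7$
$\text{Power} = C V^2 F = 0.7 \times 0.7^2 \times \dfrac{1}{0.7} = 0.5$
$\text{Power Density} = 0.5/0.7^2 = 1$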
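The two columns of this comparison differ only in how the supply voltage scales, so both can be produced by one small function. The sketch below follows the slide's first-order relations (C ~ 0.7, I ~ (W/tox)(V - Vt), D = CV/I, Power = CV^2F); the function and key names are illustrative, and it assumes Vt scales together with V.

```python
def scale(s: float, v: float) -> dict:
    """Relative device quantities after one generation.

    s: scale factor for all dimensions (W, L, tox), e.g. 0.7
    v: scale factor for voltages (1.0 = constant voltage, s = constant field)
    """
    cap = s                      # area cap ~ W*L/tox and fringing cap ~ W
    current = (s / s) * v        # I ~ (W/tox) * (V - Vt), velocity-saturated
    delay = cap * v / current    # D = C*V/I
    freq = 1.0 / delay           # F scales as 1/D
    power = cap * v**2 * freq    # Power = C*V^2*F
    area = s * s                 # die area ~ X*Y
    return {
        "field": v / s,          # E = V/tox = V/L
        "current": current,
        "delay": delay,
        "power": power,
        "power_density": power / area,
    }

constant_voltage = scale(0.7, 1.0)
constant_field = scale(0.7, 0.7)

for label, result in (("constant voltage", constant_voltage),
                      ("constant field", constant_field)):
    print(label, {k: round(x, 2) for k, x in result.items()})
```

Both cases give the same delay improvement; what changes is the electric field and the power density, which is the tradeoff the following slide discusses.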
Issues with Constant Voltage Scaling
• Practical from the systems integration point of view, since the power supply and signal voltages are unchanged, but:
• Electric field increases by a factor k (1/S). This can cause transistor failures such as oxide breakdown, punch-through, and hot electron charging of the oxide.
• Current density also increases in transistors as well as in metals, causing self-heating and metal migration in interconnects.
• Power density (P/area) increases, causing localized heating and heat dissipation problems.
• In reality, CMOS technology evolution has followed a mixture of both constant field and constant voltage scaling.
Non-Scaling Effects
• Subthreshold Current: since kT/q and Eg do not scale …..
• Polysilicon gate depletion contributes a capacitance in series with the oxide capacitance Cox. Thus the total gate capacitance does not scale exactly to 1/K unless metal gates are used.
• The full benefit of scaling cannot be realized unless process tolerances (Leff, Tox, Vt, etc.) scale along with 1/K.
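The polysilicon depletion point can be illustrated with a series-capacitance calculation. This is a rough sketch under stated assumptions: the depletion layer is modeled as a fixed oxide-equivalent thickness that does not shrink with the oxide, and both `TOX_NM` and `POLY_DEPLETION_NM` are hypothetical values chosen for illustration.

```python
EPS_OX = 3.9 * 8.854e-12      # permittivity of SiO2 in F/m
TOX_NM = 2.0                  # assumed physical oxide thickness, nm
POLY_DEPLETION_NM = 0.4       # hypothetical oxide-equivalent depletion thickness, nm
K = 1 / 0.7                   # scaling factor per generation

def cap_per_area(thickness_nm: float) -> float:
    """Parallel-plate capacitance per unit area (F/m^2) for a given thickness."""
    return EPS_OX / (thickness_nm * 1e-9)

def series(c1: float, c2: float) -> float:
    """Two capacitances in series."""
    return c1 * c2 / (c1 + c2)

c_poly = cap_per_area(POLY_DEPLETION_NM)            # assumed not to scale
c_gate_before = series(cap_per_area(TOX_NM), c_poly)
c_gate_after = series(cap_per_area(TOX_NM / K), c_poly)

ratio = c_gate_after / c_gate_before
print(f"ideal gain in Cox per area: {K:.3f}")
print(f"gain with poly depletion:   {ratio:.3f}")
```

With these assumed numbers the gate capacitance per unit area improves by about 1.33X instead of the ideal 1.43X. A metal gate removes the series term, which corresponds to setting the depletion thickness to zero.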
Transistor Performance Trends
CV/Idsat
Q = CV
dQ/dt = C dV/dt
i = C dV/dt
dt = C dV/i
Delay = CV/Idsat
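As a rough illustration of the CV/Idsat metric, the sketch below plugs in the per-micron gate capacitance and NFET drive current that appear in the 90 nm and 65 nm technology overview later in these notes. The 1.0 V supply for 65 nm is an assumption made here (the notes state 1.0 V only for the 90 nm technology), and the function name is illustrative.

```python
def cv_over_i_ps(c_ff_per_um: float, vdd_volts: float, idsat_ua_per_um: float) -> float:
    """Intrinsic delay metric CV/Idsat in picoseconds.

    fF * V / uA = 1e-15 / 1e-6 s = 1 ns, so multiply by 1e3 to get ps.
    """
    return c_ff_per_um * vdd_volts / idsat_ua_per_um * 1e3

# 90 nm: Cgate ~1.4 fF/um, NFET Idsat ~1000 uA/um, Vdd = 1.0 V
delay_90 = cv_over_i_ps(1.4, 1.0, 1000.0)
# 65 nm: Cgate ~1.8 fF/um, NFET Idsat ~1380 uA/um, Vdd = 1.0 V (assumed)
delay_65 = cv_over_i_ps(1.8, 1.0, 1380.0)

print(f"90 nm CV/Idsat ~ {delay_90:.2f} ps")
print(f"65 nm CV/Idsat ~ {delay_65:.2f} ps")
```

The width cancels because both capacitance and current are quoted per micron of gate width, so the metric compares technologies rather than particular gates.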
CMOS Inverter Delay
Figures from: Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
i = dQ/dt
i = d(CV)/dt
i = C dV/dt
Inverter Delay –cnt’d-
Wn = NFET width, Idsatn in mA/um
Wp = PFET width, Idsatp in mA/um
0.13 micron Cross Section (Copper)
Cu Interconnect, 130nm Generation. Courtesy: IBM
[Figure: cross section showing the substrate, SiO2, POLY, LI, and metal layers M1 through M6]
Source: Mark Bohr, Intel Corporation
Source: Mark Bohr, Intel Corporation
Interconnect
Figures from: Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
[Figure: wire geometry showing length, width, thickness, inter-dielectric thickness, and metal-to-metal space]
Assuming K = 1/0.7 ~ 1.43:
RC delay = [ (1/0.7)^2 * Rw * 0.7 ] * [ Cw * 0.7 ] = RwCw
Current density: I / (Ww*Tw) becomes [ 0.7 * I ] / [ (0.7 * Ww) * (0.7 * Tw) ] = I / (0.7 * Ww*Tw)
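The two wire-scaling results above follow from how resistance, capacitance, and cross-section change when every wire dimension shrinks by 0.7. A minimal sketch, assuming capacitance per unit length stays constant and the current scales by 0.7 as in constant field scaling; the function and key names are illustrative.

```python
def scaled_wire(s: float = 0.7) -> dict:
    """Relative wire quantities after scaling length, width, and thickness by s."""
    resistance = s / (s * s)            # R ~ length / (width * thickness) = 1/s
    capacitance = s                     # C ~ length, with constant fF/um
    rc_delay = resistance * capacitance
    current_density = s / (s * s)       # current scales by s, cross-section by s^2
    return {
        "resistance": resistance,
        "capacitance": capacitance,
        "rc_delay": rc_delay,
        "current_density": current_density,
    }

wire = scaled_wire(0.7)
for name, value in wire.items():
    print(f"{name}: {value:.3f}")
```

The RC delay of a scaled wire stays constant while gate delay improves by 0.7, so wires become relatively slower each generation, and current density rises by 1/0.7.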
Interconnect Delay Curves
[Figure: interconnect delay (picoseconds) versus length (microns) for M1 through M6; two panels, 0.18 um Aluminum and 0.13 um Copper, with M4 highlighted]
An M4 5 mm line in 0.18 um (1.8 ns un-repeated) would scale to 3.5 mm in 0.13 um. Assuming fF/um remains constant but ohms/um doubles, the same wire would take 3.6 ns. Copper takes this to 1.0 ns.
Repeated Interconnect
[Figure: repeated interconnect delay (picoseconds) versus length (microns) for M3 through M6 in 0.13 um Copper]
The 3.5 mm M4 line’s 1 ns can be further reduced to 0.52 ns by adding repeaters.
90 nm & 65nm Technology Overview
Parameter     90 nm      65 nm      units
Lphysical     60/65      38/44      nm
Wmin          90         65         nm
Tox N/P       2.0/2.0    1.4/1.4    nm
Xj N/P        32/32      24/24      nm

CONTACT       90         65         nm
VIA1          130        95         nm
VIA2          130        95         nm
VIA3          130        95         nm
VIA4          220        175        nm
VIA5          240        175        nm
VIA6          340        300        nm
VIA7          300 (single value in source)   nm

POLY w/s      90/130     65/90      nm
M1 w/s        140/140    105/105    nm
M2 w/s        170/170    130/130    nm
M3 w/s        170/170    130/130    nm
M4 w/s        240/240    180/180    nm
M5 w/s        360/360    180/180    nm
M6 w/s        540/540    400/400    nm
M7 w/s        810/810    400/400    nm
90 nm & 65nm Technology Overview –cnt’d-

Parameter     90 nm (Cu)     65 nm           units
Vt0 N/P       0.3/-0.3       .29/-.33        volts
Rdsw N/P      300/600        378/748         ohms-um
Cj N/P        2.9/2.9        4.1/4.2         fF/um^2
Cjsw N/P      0.4/0.4        .34/.36         fF/um
Cgate N/P     ~1.4           ~1.8            fF/um
Idsat N/P     ~1000/500      ~1380/630       uA/um
Ioff N/P      60/-60         190/-175        nA/um @100C

CONTACT       4.0            8.0             ohms/con @ 100C
VIA1          3.0            6.0             ohms/con @ 100C
VIA2          2.4            6.0             ohms/con @ 100C
VIA3          2.4            6.0             ohms/con @ 100C
VIA4          1.4            4.5             ohms/con @ 100C
VIA5          1.0            3.4             ohms/con @ 100C
VIA6          0.6            2.0             ohms/con @ 100C
VIA7          2.0 (single value in source)   ohms/con @ 100C

M1 R & C      700 & 0.23     1570 & 0.23     mohms/um & fF/um
M2 R & C      400 & 0.23     930 & 0.23      mohms/um & fF/um
M3 R & C      400 & 0.23     930 & 0.22      mohms/um & fF/um
M4 R & C      150 & 0.22     330 & 0.22      mohms/um & fF/um
M5 R & C      150 & 0.23     330 & 0.23      mohms/um & fF/um
M6 R & C      150 & 0.25     330 & 0.23      mohms/um & fF/um
M7 R & C      100 & 0.25 (single pair in source)   mohms/um & fF/um
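The wire rows of this table show resistance per micron roughly doubling from 90 nm to 65 nm while capacitance per micron stays nearly flat. The sketch below computes the per-layer RC ratio from the tabulated values, and then the ratio for a wire whose length also shrinks by an assumed factor of 0.7 (delay taken as proportional to R/um * C/um * length^2). M7 is omitted because the source gives only one value pair for it; the names are illustrative.

```python
# (R in mohms/um, C in fF/um) per layer: (90 nm value, 65 nm value),
# copied from the technology overview table.
WIRES = {
    "M1": ((700, 0.23), (1570, 0.23)),
    "M2": ((400, 0.23), (930, 0.23)),
    "M3": ((400, 0.23), (930, 0.22)),
    "M4": ((150, 0.22), (330, 0.22)),
    "M5": ((150, 0.23), (330, 0.23)),
    "M6": ((150, 0.25), (330, 0.23)),
}
SHRINK = 0.7  # assumed length scaling between generations

def rc_ratio(layer: str) -> float:
    """65 nm / 90 nm ratio of (R per um) * (C per um) for a fixed-length wire."""
    (r90, c90), (r65, c65) = WIRES[layer]
    return (r65 * c65) / (r90 * c90)

for layer in WIRES:
    fixed = rc_ratio(layer)
    shrunk = fixed * SHRINK**2   # delay ~ length^2 for a distributed RC line
    print(f"{layer}: fixed length {fixed:.2f}X, length scaled by 0.7 {shrunk:.2f}X")
```

On these numbers a fixed-length wire gets a little over 2X slower, and even a wire that shrinks with the design does not get faster.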
References
1. A. Chandrakasan, W.J. Bowhill, F. Fox, Design of High-Performance Microprocessor Circuits, IEEE Press, New York, 2001.
2. Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
3. K. Bernstein et al., High Speed CMOS Design Styles, Kluwer Academic Publishers, Boston, 1999.
4. R.J. Baker, H.W. Li, D.E. Boyce, CMOS Circuit Design, Layout, and Simulation, IEEE Press, New York, 1998.
5. J.M. Rabaey, Digital Integrated Circuits, Prentice-Hall, New Jersey, 1996.
6. S. Thompson, et al., “An Enhanced 130nm Generation Logic Technology Featuring 60nm Transistors Optimized for High Performance and Low Power at 0.7-1.4V”, 2001 IEDM Technical Digest, pp. 257-260.
7. S. Thompson, et al., 90nm Technology, 2002 IEDM Technical Digest, pp. 61-64.
8. P. Bai, et al., “A 65nm Logic Technology Featuring 35nm Gate Length, Enhanced Channel Strain, 8 Cu Interconnect Layers, Low-k ILD and 0.57um2 SRAM Cell”, 2004 IEDM.
9. Summary of a gazillion processors: http://www-vlsi.stanford.edu/group/chips_micropro.html
10. M. Bohr, Intel’s 90nm Process Starting High Volume Manufacturing, http://www.intel.com/research/silicon
11. For latest information on Intel’s silicon technology, please visit: http://www.intel.com/technology/silicon/
12. K. Mistry, et al., “A 45nm Logic Technology with High-k+ Metal Gate Transistors, Strained Silicon, 9 Cu Interconnect Layers, 193nm Dry Patterning, and 100% Pb-free Packaging”, Tech. Digest IEDM, Dec 2007.
BACKUP
Drain Current Models
Ids: Characteristics in the Linear Region
Figures from: Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
Ids: Characteristics in the Subthreshold Region
Figures from: Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
Characteristics in the Saturation Region (Long Channel)
Figures from: Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
Characteristics in the Saturation Region with Velocity Saturation
Figures from: Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
Interconnect Delay
Intrinsic wire delay ~ 0.5*L^2*(R/um)*(C/um)
Wire Delay Example
• In 0.18 um CMOS, assume a 2 mm M2 wire, minimum width (0.34 um), and a 120 fF load at the end.
• What is the intrinsic wire delay?
• What size driver should you use?
• Intrinsic Delay:
  – Using a worst case 0.18 ohms/um and 0.22 fF/um:
  – C = 2000*0.22 = 440 fF and R = 2000*0.18 = 360 ohms.
  – Intrinsic wire delay = 0.5*RC ~ 79 ps
• Driver size (use FO=4 and P:N ratio ~2):
  – input cap ~ (Cwire+Cload)/4 ~ 560 fF/4 = 140 fF
  – Using Cox ~1 fF/um, PFET is 93 um and NFET is 47 um.
• If the M2 wire is 4 mm long, intrinsic delay is ~317 ps, 4X longer.
[Figure: FO=4 driver (PFET 93 um, NFET 47 um) driving the wire, modeled as 220 fF, 360 ohms, 220 fF, into the 120 fF load]
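The numbers in this example can be reproduced with two small helper functions. This is a sketch of the slide's arithmetic only; the function names and default arguments are illustrative, and the 1 fF/um gate capacitance is the slide's own approximation.

```python
def intrinsic_delay_ps(length_um: float, r_ohm_per_um: float, c_ff_per_um: float) -> float:
    """Intrinsic wire delay ~ 0.5 * R * C, in picoseconds.

    ohm * fF = 1e-15 s = 1e-3 ps.
    """
    r_ohms = length_um * r_ohm_per_um
    c_ff = length_um * c_ff_per_um
    return 0.5 * r_ohms * c_ff * 1e-3

def driver_widths_um(c_wire_ff: float, c_load_ff: float, fanout: float = 4.0,
                     pn_ratio: float = 2.0, cox_ff_per_um: float = 1.0) -> tuple:
    """Return (PFET width, NFET width) in um for a given fanout and P:N ratio."""
    c_in_ff = (c_wire_ff + c_load_ff) / fanout
    total_width = c_in_ff / cox_ff_per_um
    nfet = total_width / (1.0 + pn_ratio)
    pfet = total_width - nfet
    return pfet, nfet

delay_2mm = intrinsic_delay_ps(2000, 0.18, 0.22)
delay_4mm = intrinsic_delay_ps(4000, 0.18, 0.22)
pfet_um, nfet_um = driver_widths_um(c_wire_ff=440.0, c_load_ff=120.0)

print(f"2 mm wire: {delay_2mm:.1f} ps")
print(f"4 mm wire: {delay_4mm:.1f} ps ({delay_4mm / delay_2mm:.0f}X)")
print(f"driver: PFET {pfet_um:.0f} um, NFET {nfet_um:.0f} um")
```

Doubling the length doubles both R and C, which is where the 4X increase in delay comes from.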
Homework Assignment #1.2 & #1.4
A 2-cycle static circuit will be analyzed in a copper, 1.0 V, 90 nm CMOS technology. The maximum frequency of operation (Fmax) and the power dissipation as a function of VDD, with ZERO clock skew, will be established via circuit simulation. In addition, the impact of realistic clock skew on Fmax will be determined. Finally, Fmax will be predicted for a 65 nm CMOS circuit using the constant field scaling law.
EE382M HMK #1.2, 90nm CMOS
All dimensions in microns; R in ohms, C in fF
[Figure: homework schematic with inputs A0_in and B0_in, output OUT, and four MSFFs clocked by clk; the device sizes and wire R/C values annotated on the schematic are not recoverable from the extracted text]
Clock-Skew Impact to Fmax (HMK #1.4)
τ1 > τ2
[Figure: a GLOBAL clock (GCLK) and GLOBAL enable feed local clock buffers (LCB; inputs in and en, output out, supply Vdd) with delays τ1 and τ2; the LCBs clock master/slave flip-flops placed around the logic blocks between A0_in, B0_in, and Out]
INVERTING MSFF
[Figure: inverting master-slave flip-flop schematic with Din, Dout, and clock]