Optimizing the integration and energy efficiency of through...
Transcript of Optimizing the integration and energy efficiency of through...
Microelectronics System Design Research Group
School of Electrical Electronic amp Computer Engineering
Optimizing the integration and energy efficiencyof through silicon via-based 3D interconnects
Panagiotis Asimakopoulos
Technical Report Series
NCL-EECE-MSD-TR-2011-173
November 2011
Contact panagiotisasimakopoulosn13la13uk
NCL-EECE-MSD-TR-2011-173
Copyright ccopy 2011 Newcastle University
Microelectronics System Design Research Group
School of Electrical Electronic amp Computer Engineering
Merz Court
Newcastle University
Newcastle upon Tyne NE1 7RU UKhttpasyn13orguk
O P T I M I Z I N G T H E I N T E G R AT I O N A N DE N E R G Y E F F I C I E N C Y O F T H R O U G H
S I L I C O N V I A - B A S E D 3 D I N T E R C O N N E C T S
panagiotis asimakopoulos
A Thesis Submitted for the Degree of Doctor of Philosophyat Newcastle University
School of Electrical Electronic and Computer Engineering
Newcastle upon Tyne
November 2011
Panagiotis Asimakopoulos Optimizing the integration and energy
efficiency of through silicon via-based 3D interconnects PhD Thesiscopy November 2011
A B S T R A C T
The aggressive scaling of CMOS process technology has been driv-ing the rapid growth of the semiconductor industry for more thanthree decades In recent years the performance gains enabledby CMOS scaling have been increasingly challenged by highly-parasitic on-chip interconnects as wire parasitics do not scale atthe same pace Emerging 3D integration technologies based onvertical through-silicon vias (TSVs) promise a solution to the inter-connect performance bottleneck along with reduced fabricationcost and heterogeneous integration
As TSVs are a relatively recent interconnect technology innov-ative test structures are required to evaluate and optimise theprocess as well as extract parameters for the generation of designrules and models From the circuit designerrsquos perspective criticalTSV characteristics are its parasitic capacitance and thermomech-anical stress distribution This work proposes new test structuresfor extracting these characteristics The structures were fabric-ated on a 65nm 3D process and used for the evaluation of thattechnology
Furthermore as TSVs are implemented in large densely inter-connected 3D-system-on-chips (SoCs) the TSV parasitic capacitancemay become an important source of energy dissipation Typicallow-power techniques based on voltage scaling can be usedthough this represents a technical challenge in modern techno-logy nodes In this work a novel TSV interconnection scheme isproposed based on reversible computing which shows frequency-dependent energy dissipation The scheme is analysed using the-oretical modelling while a demonstrator IC was designed basedon the developed theory and fabricated on a 130nm 3D process
iii
P U B L I C AT I O N S
Some ideas and figures have appeared previously in the followingpublications
1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper
award)
2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-
makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010
3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010
4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009
5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008
iv
In the end I hope therersquos a little note somewhere
that says I designed a good computer
mdash Steve Wozniak [79]
A C K N O W L E D G E M E N T S
First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts
As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm
Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec
Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools
v
C O N T E N T S
1 Introduction 1
11 Research goals and contribution 2
12 Thesis outline 3
2 Background 5
21 3D integrated circuits 5
211 3D integration approaches 6
212 TSV interconnects 9
22 Energy-recovery logic 13
221 Energy-recovery architectures 17
222 Power-clock generators 18
23 Full-custom design flow 20
24 Summary 21
3 Characterization of TSV capacitance 23
31 Introduction 23
32 TSV parasitic capacitance 23
33 Integrated capacitance measurement techniques 24
34 Electrical characterization 27
341 Physical implementation 28
342 Measurement setup 36
343 Experimental results 37
35 Summary and conclusions 44
4 TSV proximity impact on MOSFET performance 46
41 Introduction 46
42 TSV-induced thermomechanical stress 46
43 Test structure implementation 48
431 MOSFET arrays 51
432 Digital selection logic 55
44 Experimental measurements 61
441 Data processing methodology 61
442 Measurement setup 62
443 PMOS (long channel) 64
444 PMOS (short channel) 71
45 Summary and conclusions 73
5 Evaluation of energy-recovery for TSV interconnects 75
51 Introduction 75
52 Energy-recovery TSV interconnects 75
53 Analysis 79
531 Adiabatic driver 80
532 Inductorrsquos parasitic resistance 82
533 Switch M1 82
534 Timing Constraints 83
54 Simulations 83
541 Comparison to CMOS 86
vi
contents vii
55 Summary and conclusions 91
6 Energy-recovery 3D-IC demonstrator 92
61 Introduction 92
62 Physical implementation 92
621 Adiabatic driver 94
622 Pulse-to-level converter 95
623 Data generator 95
624 Pulse generator 95
625 Integrated inductor 97
626 Top level design 98
627 CMOS IO circuit 101
63 Post-layout simulation and comparison 101
64 Experimental measurements 106
641 Measurement setup 106
642 Results 106
65 Summary and conclusions 111
7 Conclusions 114
71 Main contributions 115
72 Future work 116
a Appendix A Photos 118
b Appendix B Statistics 120
bibliography 122
L I S T O F F I G U R E S
Figure 11 Scaling trend of logic and interconnect delay[30] 2
Figure 21 Comparison of 2D3D Integration 5
Figure 22 System-in-a-package 6
Figure 23 Package-on-package 6
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7
Figure 25 3D wafer-level-packaging 7
Figure 26 3D-stacked IC 8
Figure 27 TSV process steps 9
Figure 28 Before thinningafter thinning 10
Figure 29 TSV stacking 10
Figure 210 Conventional AND gate 14
Figure 211 CMOS switching 16
Figure 212 Adiabatic switching 16
Figure 213 Basic 2N-2P adiabatic circuit 17
Figure 214 2N-2P timing 17
Figure 215 Adiabatic driver 18
Figure 216 Stepwise charging generator 19
Figure 217 1N power clock generator 20
Figure 218 Design flow 22
Figure 219 Inverter in Virtuoso schematic 22
Figure 220 Inverter in Virtuoso layout 22
Figure 31 Isolated 2D wire capacitance 24
Figure 32 TSV parasitic capacitance 25
Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25
Figure 34 CBCM pseudo-inverters 27
Figure 35 Current as a function of frequency in CBCM 28
Figure 36 CBCM pseudo-inverter pair in layout 29
Figure 37 PRC1 test pad module (layout) 30
Figure 38 PRC1 test pad module (micrograph) 30
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34
Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35
viii
List of Figures ix
Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35
Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36
Figure 316 Measurement setup 36
Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39
Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40
Figure 319 TSV capacitance measurement results onwafer D11 40
Figure 320 TSV capacitance measurement results onwafer D18 41
Figure 321 Simulated oxide liner thickness variation 42
Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42
Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44
Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44
Figure 41 TSV thermomechanical stress simulation [40] 47
Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48
Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49
Figure 44 PTP1 test pad module (layout) 50
Figure 45 PTP1 test pad module (micrograph) 50
Figure 46 MOSFET dimensions and rotations 51
Figure 47 1 TSV MOSFET array configuration 52
Figure 48 4 diagonal TSVs MOSFET array configuration52
Figure 49 4 lateral TSVs MOSFET array configuration 53
Figure 410 9 TSVs MOSFET array configuration 53
Figure 411 MOSFET array detail 55
Figure 412 Digital selection logic 56
Figure 413 MOSFET array with digital selection logic 57
Figure 414 4-bit Gate-enableArray-enable decoder 58
Figure 415 Transmission gate for MOSFET Gate terminals59
Figure 416 Transmission gates for MOSFET Drain-Source terminals 59
Figure 417 2-bit Drain-select decoder 60
Figure 418 Processing of measurement results 62
Figure 419 Interpolation of measurement results 63
Figure 420 Measurement setup 63
Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65
List of Figures x
Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66
Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66
Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67
Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67
Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68
Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69
Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69
Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70
Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70
Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71
Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71
Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72
Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72
Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76
Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77
Figure 53 Adiabatic driver 78
Figure 54 A simple pulse-to-level converter (P2LC) 78
Figure 55 Resonant pulse generator with dual-rail dataencoding 79
Figure 56 Energy-recovery scheme for TSV interconnects79
Figure 57 Energy-recovery timing constraints 83
Figure 58 Test case for the evaluation of the theoret-ical model 84
Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85
Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85
Figure 511 Test case for comparison of energy-recoveryto CMOS 86
Figure 512 Improved P2LC design 87
List of Figures xi
Figure 513 Simulated energy dissipation of improvedP2LC design 87
Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88
Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88
Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89
Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =
80 f F f = 500MHz) 89
Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90
Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90
Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93
Figure 62 Energy-recovery IO circuit schematic dia-gram 94
Figure 63 Adiabatic driver (layoutschematic view) 94
Figure 64 P2LC (layoutschematic view) 95
Figure 65 Data generator (layoutschematic view) 96
Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96
Figure 67 Pulse generator (layoutschematic view) 97
Figure 68 Pulse generator InputOutput signals 97
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98
Figure 610 Planar square spiral inductor designed inASITIC 99
Figure 611 Resonant pulse distribution tree 100
Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101
Figure 613 Energy-recovery IO circuit in layout view 102
Figure 614 CMOS IO circuit schematic diagram 103
Figure 615 CMOS driver schematic 103
Figure 616 CMOS IO circuit in layout view 103
Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105
Figure 618 Energy-recovery IO circuit micrograph 107
List of Figures xii
Figure 619 CMOS IO circuit micrograph 107
Figure 620 Measurement setup 107
Figure 621 Oscilloscope output for signals S1B and OC-MOS 109
Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109
Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110
Figure 624 Oscilloscope output for signals S1A and OER110
Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111
Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111
Figure 627 Simulation of circuit with fault hypothesis 112
Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113
Figure A1 3D65 300mm wafer (on chuck) 118
Figure A2 Probe card 118
Figure A3 3D130 200mm wafer (on chuck) 119
Figure A4 Probe head 119
L I S T O F TA B L E S
Table 21 3D interconnect technologies comparison 8
Table 22 Technology parameters for Imec 3D130 process11
Table 23 Technology parameters for Imec 3D65 process11
Table 24 Comparison energy-recoveryconventionalCMOS 16
Table 31 Test structures on the PRC1 module 31
Table 32 PRC1 test pad instrument connections 37
Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38
Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39
Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43
Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45
Table 41 Coefficients of thermal expansion 47
Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54
Table 43 MOSFET array configurations selected forexperimental evaluation 61
Table 44 PTP1PTP2 test pad module instrumentconnections 64
Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73
Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84
Table 61 Energy-recovery IO circuit parameters 93
Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100
Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104
Table 64 Post-layout simulated energy dissipation Area requirements 106
Table 65 Energy-recovery circuit instrument connec-tions 108
Table 66 CMOS circuit instrument connections 108
xiii
A C R O N Y M S
2D two-dimentional
2N-2P 2 nMOS2 pMOS adiabatic logic
3D-IC three-dimentional integrated circuit
3D-SIC 3D stacked-IC
3D-SIP 3D system-in-package
3D three-dimentional
3D-WLP 3D wafer-level-packaging
BCB benzocyclobutene
BEOL back-end-of-line
CAL clocked adiabatic logic
CBCM charge-based capacitance measurement
CMOS complementary metal-oxide-semiconductor
CMP chemical-mechanical planarization
CTE coefficient of thermal expansion
Cu copper
CVD chemical vapor deposition
D2D die-to-die
D2W die-to-wafer
DCVSL differential cascode voltage switch logic
DRC design rule check
DUT device under test
ECRL efficient charge recovery logic
EDA electronic design automation
FEM finite element method
FEOL front-end-of-line
GPIB general purpose interface bus
IC integrated circuit
xiv
acronyms xv
IO inputoutput
KOZ keep-out-zone
LC inductance-capacitance
LVS layout versus schematic
MEMS microelectromechanical systems
MIM metal-insulator-metal
MOSFET metal-oxide-semiconductor field-effect transistor
MOS metal-oxide-semiconductor
MPU microprocessor unit
MuGFET multiple gate field-effect transistor
NMOS n-channel MOSFET
P2LC pulse-to-level converter
PAL pass-transistor adiabatic logic
PCB printed circuit board
PCG power-clock generator
PMD pre-metal dielectric
PMOS p-channel MOSFET
PVT process voltage temperature
QSERL quasi-static energy recovery logic
RC resistance-capacitance
RDL re-distribution layer
RLC resistance-inductance-capacitance
RMS root mean square
SIP system-in-a-package
Si silicon
SMU source measurement unit
SoC system-on-chip
TaN tantalum nitride
TSV through-silicon via
W tungsten
acronyms xvi
W2W wafer-to-wafer
WLP wafer-level-packaging
1I N T R O D U C T I O N
Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965
chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]
As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects
In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)
The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and
1 International Technology Roadmap for Semiconductors
1
11 research goals and contribution 2
Figure 11 Scaling trend of logic and interconnect delay [30]
innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]
Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D
interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density
11 research goals and contribution
As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-
ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]
The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future
12 thesis outline 3
3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection
The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution
The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs
are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process
The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements
12 thesis outline
This thesis is organised into seven Chapters and two Appendixesas follows
2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)
12 thesis outline 4
Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described
Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods
Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process
Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design
Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements
Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work
Appendix A includes photos used as reference in various partsof this work
Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data
2B A C K G R O U N D
The work presented in this thesis contributes in the fields of 3D
integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters
21 3d integrated circuits
CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package
There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy
Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of
ABCDEFBA ABCDEFBA
Figure 21 Comparison of 2D3D Integration
5
21 3d integrated circuits 6
ABCDEFBC
E
Figure 22 System-in-a-package
ABCDEFBC
E
Figure 23 Package-on-package
a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities
The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor
211 3D integration approaches
3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure
bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages
bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)
bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion
The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)
3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps
21 3d integrated circuits 7
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]
A
BCDEAFE
FFDC
CFE
Figure 25 3D wafer-level-packaging
take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]
3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]
The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-
SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost
A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21
21 3d integrated circuits 8
ABCD
EFFDB
F
Figure 26 3D-stacked IC
3D interconnect
technologyAdvantages Disadvantages
3D system-in-package (3D-SIP)
Simple existingtechnology bestheterogeneous
integration
Performance(speed power) low
densitymanufacturing costin high-quantities
3D wafer-level-packaging (3D-
WLP)
Existingtechnology good
compromisebetween
performance-manufacturing
cost
Averageperformance and
density
3D stacked-IC (3D-SIC)
Performance(speed power)
high densitymanufacturing costin high-quantities
Manufacturing costin low-quantities
not ready formanufacturing
Table 21 3D interconnect technologies comparison
21 3d integrated circuits 9
ABBCDE
FBACACDA
ACCD F
Figure 27 TSV process steps
212 TSV interconnects
Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection
In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie
Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively
As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]
21 3d integrated circuits 10
AAAB
CAAAB
ABC
DEF
Figure 28 Before thinningafter thinning
ABC ABC
DE DE
F
DEE
B
CBA
Figure 29 TSV stacking
21 3d integrated circuits 11
Minimum devicechannel length (nm)
130
Supply voltage (V) 12
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 22
TSV resistance (mΩ) ~20
TSV capacitance( f F)
~40
Wafer diameter(mm)
200
Table 22 Technology paramet-ers for Imec 3D130
process
Minimum devicechannel length (nm)
70
Supply voltage (V) 10
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 40
TSV resistance (mΩ) ~40
TSV capacitance( f F)
~80
Wafer diameter(mm)
300
Table 23 Technology paramet-ers for Imec 3D65 pro-cess
Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing
Design tools
Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs
Thermomechanical stress
TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur
21 3d integrated circuits 12
The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance
Thermal analysis
The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink
Electrical characteristics
The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]
Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV
structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process
22 energy-recovery logic 13
Testing
Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems
22 energy-recovery logic
TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology
1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections
2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them
Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that
22 energy-recovery logic 14
Figure 210 Conventional AND gate
were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6
Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded
Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output
Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to
22 energy-recovery logic 15
perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]
In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]
The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =
CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]
E = QVDD = CV2DD (21)
From the total transferred energy only half ( 12 CV2
DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C
The adiabatic charging of an equivalent node capacitance C
is modelled in Figure 212 In contrast to the conventional CMOS
circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to
EAdiabatic prop
(
RC
T
)
CV2DD (22)
The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters
22 energy-recovery logic 16
A
B
CC
D
E
D
Figure 211 CMOS switching
AA
B
C
B
Figure 212 Adiabatic switching
Conventional CMOS Energy-recovery CMOS
2 states True False 3 states True False Off
Energy of information bitdissipated as heat on the
pull-down network
Energy of information bitrecovered by the
power-supply
High speed Lowmedium speed
High energy dissipation Very low energy dissipation
Area efficient Some area overhead
Table 24 Comparison energy-recoveryconventional CMOS
Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]
A comparison between energy-recovery and conventional CMOSis summarised in Table 24
22 energy-recovery logic 17
ABCABC
Figure 213 Basic 2N-2P adiabaticcircuit
ABC
ABC
DEF
EE
DEF
Figure 214 2N-2P timing
221 Energy-recovery architectures
Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency
A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P
circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern
The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel
22 energy-recovery logic 18
Figure 215 Adiabatic driver
MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network
The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]
The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle
Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]
222 Power-clock generators
Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and
22 energy-recovery logic 19
A
A
B
C
C
Figure 216 Stepwise charging generator
recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators
Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow
Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at
f =1
2πradic
LCLOAD
(23)
As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is
23 full-custom design flow 20
A
BCD
E
DFF
CD
Figure 217 1N power clock generator
partially recovered and stored in the inductor for later use RLC
oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip
23 full-custom design flow
All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal
In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available
The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case
1 Cadence Design Systems Inc (httpwwwcadencecom)
24 summary 21
the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step
Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics
24 summary
In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process
2 Mentor Graphics Inc (httpwwwmentorcom)
24 summary 22
ABCDEA
FCCAAE
ABCDEAEC
DE
E
F
CEA
EEAEA
EDE
CAC
E
EC
CAC
Figure 218 Design flow
Figure 219 Inverter in Virtuososchematic
Figure 220 Inverter in Virtuosolayout
3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E
31 introduction
In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology
In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]
From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system
Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated
32 tsv parasitic capacitance
A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric
1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments
23
33 integrated capacitance measurement techniques 24
ABC
Figure 31 Isolated 2D wire capacitance
material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is
C =εox
hwl (31)
In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material
In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)
The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)
33 integrated capacitance measurement techniques
Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV
33 integrated capacitance measurement techniques 25
ABCDE
F
FD
F
F
FD
Figure 32 TSV parasitic capacitance
ABCDE FACDE
ECDE
BCFACDE
FCCDED DBCDE
D
DAB
Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor
33 integrated capacitance measurement techniques 26
characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories
1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]
2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]
The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer
On-chip techniques can achieve better accuracy when the DUT
capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]
In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows
CDUT =Im minus Ire f
VDD middot f(32)
The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of
34 electrical characterization 27
AB
AC
DE
FE
DE
FE
AB
AC
B
Figure 34 CBCM pseudo-inverters
Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot
The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET
capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz
should be adequate to avoid the MOS behaviour of the TSV
34 electrical characterization
A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented
34 electrical characterization 28
AB
CADE
F
AB
E
CAD
B
Figure 35 Current as a function of frequency in CBCM
on the same chip that went beyond the core objectives of thiswork
The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed
341 Physical implementation
In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result
Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics
34 electrical characterization 29
Poly dummy fills
Metal2Metal1PolysiliconActive
1μm 5μm
Figure 36 CBCM pseudo-inverter pair in layout
Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38
The CBCM pseudo-inverters were implemented on either TOP
or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below
Structure rsquoArsquo Single TSV capacitance (TOP die)
The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to
34 electrical characterization 30
DTOP DBTM
B10 A10
BREF AREF
B1 A1
R1 R2
SET VDD
RESET GND
C15 C10
C50 C20
SET VDD
ERC SET
EREF EC Metal2 (TOP)Metal1 (TOP)
Metal1 (BTM)Metal2 (BTM)
Figure 37 PRC1 test pad module(layout)
Figure 38 PRC1 test pad module(micrograph)
34 electrical characterization 31
Structure A B
TSVs - 1 2x5 - 1 2x5
Inverterlocation
TOP TOP TOP BTM BTM BTM
TSVconnection
TOP TOP TOP BTM BTM BTM
Multi-TSVconnection
- - TOP - - BTM
Multi-TSVpitch (microm)
- - 15 - - 15
Pad name AREF A1 A10 BREF B1 B10
Structure C D
TSVs 2x2 2x2 2x2 2x2 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP TOP
TSVconnection
TOP TOP TOP TOP TOP TOP
Multi-TSVconnection
TOP TOP TOP TOP TOP BTM
Multi-TSVpitch (microm)
10 15 20 50 20 20
Pad name C10 C15 C20 C50 DTOP DBTM
Structure R E
TSVs 1 1 - 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP
TSVconnection
TOP TOP - TOP TOP
Multi-TSVconnection
- - - TOP BTM
Multi-TSVpitch (microm)
- - - 20 20
Pad name R1 R2 EREF EC ERC
Table 31 Test structures on the PRC1 module
34 electrical characterization 32
Pseudo-inverters
2x5 TSVs
TSV
Reference
Metal2 (TOP)Metal1 (TOP)
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)
the single TSV DUT the structure also contains 10 parallel TSVs
arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude
Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)
The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side
Structure rsquoCrsquo Variable pitch TSV capacitance
The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the
34 electrical characterization 33
2x5 TSVs
Reference
TSV
Pseudo-inverters
Metal2 (BTM)Metal1 (BTM)
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)
literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process
The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative
Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-
TOM die
The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking
34 electrical characterization 34
2x2 TSVs - 10μm
2x2 TSVs - 15μm
2x2 TSVs - 20μm
2x2 TSVs - 50μm
Metal2 (TOP)Metal1 (TOP)
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die
34 electrical characterization 35
Metal2 (TOP)Metal1 (TOP)
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Driver
Reference
Driver
Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)
Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy
Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability
Structure rsquoErsquo TSV capacitance effect on signal propagation delay
Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult
34 electrical characterization 36
ABC
D
ECBFC
ABC
Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)
ABC
DEF
CD
A
CA
A
Figure 316 Measurement setup
342 Measurement setup
The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316
The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-
2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup
34 electrical characterization 37
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
SET InputPulse generator
(Non-overlapping with RESET)
RESET InputPulse generator
(Non-overlapping with SET)
A[REF 1 10] InputSMU (Inverter source - 1V
supply)
B[REF 1 10] InputSMU (Inverter source - 1V
supply)
C[10 15 2050]
InputSMU (Inverter source - 1V
supply)
D[TOP BTM] InputSMU (Inverter source - 1V
supply)
R[1 2] InputSMU (Inverter source - 1V
supply)
E[REF C RC] Output Oscilloscope (high-impedance)
Table 32 PRC1 test pad instrument connections
sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32
343 Experimental results
There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods
When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy
34 electrical characterization 38
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 454
Mean capacitance 709 f F
Standard deviation (σ) 19 f F
Coefficient of variation (3σmicro) 80
Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo
The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability
Die to die (D2D) variability
The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)
On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid
On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691
34 electrical characterization 39
-60 -40 -20 00 20 40 60 80
σ-σ
706
Deviation from mean (fF)
0
005
01
015
02
025
Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 414
Mean capacitance 895 f F
Standard deviation (σ) 22 f F
Coefficient of variation (3σmicro) 74
Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo
of the experimental data within plusmnσ of the mean capacitancevalue
Correlation between TSV capacitance and oxide liner thickness vari-
ation
The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers
Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication
34 electrical characterization 40
σ-σ
691
Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70
0
002
004
006
008
01
012
014
016
018
02
Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function
075
080
085
090
095
100
105
110
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
075
080
085
090
095
100
105
Wafer area
Figure 319 TSV capacitance measurement results on wafer D11
34 electrical characterization 41
086088090092094096098100102104106108
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
086
088
090
092
094
096
098
100
102
104
106
108
Wafer area
Figure 320 TSV capacitance measurement results on wafer D18
tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo
Wafer to wafer (W2W) variability
Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters
34 electrical characterization 42
109
100
095Nor
mal
ized
oxi
de th
ickn
ess
-5-10 -15 -20 -25 -30
510
1520
2530
0Die Y
Die X
Figure 321 Simulated oxide liner thickness variation
60 65 70 75 80 85 90 95 1000
005
01
015
02
025D11 D18
plusmnσ
plusmnσ
Capacitance (fF)
Figure 322 TSV capacitance W2W variability - Probability density func-tions
34 electrical characterization 43
Structure rsquoCrsquo (Fig 311)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 424
Experiment C10 C50
Number of TSVs 4 4
Mean capacitance 4073 f F 4125 f F
Standard deviation (σ) 84 f F 74 f F
Population within σ 695 653
Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo
Variable pitch TSV capacitance
The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm
were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires
As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50
are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability
To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]
Measurement accuracy of CBCM
The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we
3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)
35 summary and conclusions 44
376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320
001
002
003
004
005
006C10 C50
Capacitance (fF)
plusmnσC10
plusmnσC50
695 653
Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function
-80 -60 -40 -20 00 20 40 60 80 1000
002
004
006
008
01
012
014
016
018σ-σ
688
Deviation from mean (fF)
Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function
observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs
connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance
35 summary and conclusions
In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement
35 summary and conclusions 45
Method LCR CBCM
Total dies 472 472
Processed dies (excluding outliers) 393 414
Mean capacitance 820 f F 895 f F
Standard deviation (σ) 23 f F 22 f F
Coefficient of variation (3σmicro) 86 74
Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo
results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]
The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration
The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV
parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM
In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process
4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E
41 introduction
The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV
filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance
Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models
In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]
42 tsv-induced thermomechanical stress
Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or
46
42 tsv-induced thermomechanical stress 47
Material CTE at 20 ˚C (10minus6K)
Si 3
Cu 17
W 45
Table 41 Coefficients of thermal expansion
Figure 41 TSV thermomechanical stress simulation [40]
polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate
Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction
It is well known that stress can affect carrier mobility in MOS-
FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance
Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type
43 test structure implementation 48
Figure 42 TSV thermomechanical stress interaction between two TSVs[40]
proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative
The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections
43 test structure implementation
TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible
43 test structure implementation 49
Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]
The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET
devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types
Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP
die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45
43 test structure implementation 50
A0
B0
C0
D0
E0
F0
G0
H0
I0
J0
A90
B90
E90
F90
G90
J90
DS3 DF3
DF2 DS0
DS2 DF0
DF1 DS1
SS SF
DEC6 VDD
DEC7 GND
DEC8 VG
DEC4 DEC9
DEC5 GND
DEC2 DEC3
DEC0 DEC1
Arraydecoder
Draindecoder
Figure 44 PTP1 test pad mod-ule (layout)
Figure 45 PTP1 test pad mod-ule (micrograph)
43 test structure implementation 51
Metal2Metal1PolyActive
TSV
MOSFETs
07μm007μm 0deg
07μm007μm 90deg
05μm05μm 90deg
05μm05μm 0deg
Figure 46 MOSFET dimensions and rotations
431 MOSFET arrays
Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)
The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs
lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44
The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)
Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution
43 test structure implementation 52
TSV
MOSFETs Substrate taps
Figure 47 1 TSV MOSFET array configuration
MOSFETs Substrate taps
TSV TSV
TSV TSV
Figure 48 4 diagonal TSVs MOSFET array configuration
43 test structure implementation 53
MOSFETs Substrate taps
TSV
TSV TSV
TSV
Figure 49 4 lateral TSVs MOSFET array configuration
MOSFETs Substrate taps
TSV TSV TSV
TSV TSV TSV
TSV TSV TSV
Figure 410 9 TSVs MOSFET array configuration
43 test structure implementation 54
Configuration RotationFET
width(microm)
FETlength(microm)
Name
0 0deg 07 007 A0
1 0deg 07 007 B0
4diagonal 0deg 07 007 C0
4lateral 0deg 07 007 D0
9 0deg 07 007 E0
0 0deg 05 05 F0
1 0deg 05 05 G0
4diagonal 0deg 05 05 H0
4lateral 0deg 05 05 I0
9 0deg 05 05 J0
0 90deg 07 007 A90
1 90deg 07 007 B90
9 90deg 07 007 E90
0 90deg 05 05 F90
1 90deg 05 05 G90
9 90deg 05 05 J90
Table 42 MOSFET array configurations for test pad modulesPTP1PTP2
43 test structure implementation 55
2μm 1μm
Metal2Metal1PolyActive
FET
DUMMY
TAP
TSV
Figure 411 MOSFET array detail
432 Digital selection logic
Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement
The procedure for selecting a single MOSFET device for meas-urement was as follows
bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled
bull MOSFET Gate terminal (G0 to G15) was connected to IO
pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates
bull MOSFET Drain terminal (D0 to D15) was connected to IO
pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder
43 test structure implementation 56
D0
D15
DF[0-3]
DS[0-3]
Drain-select decoder (2-bit)
SSF
SS
Array-select decoder (4-bit)
VG
Gate-select decoder (4-bit)
G0 G15
MOSFET Array
Figure 412 Digital selection logic
and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit
Drain-select decoder was adequate for accessing all 16Drain terminals in the array
When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)
A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating
43 test structure implementation 57
Gatedecoder
Metal2Metal1PolyActive
Transmissiongates (DS)
Transmission gates (G)
MOSFETArray
Figure 413 MOSFET array with digital selection logic
43 test structure implementation 58
BLK
A[0] A[1] A[2] A[3]
DEC[0-15]
Figure 414 4-bit Gate-enableArray-enable decoder
43 test structure implementation 59
DEC
VGGVDD
Transmission gate
Pull-downtransistor
Figure 415 Transmission gate for MOSFET Gate terminals
Transmission gates
FORCE
SENSE
DRAINSOURCE
EN
VDD
Figure 416 Transmission gates for MOSFET DrainSource terminals
43 test structure implementation 60
A[0] A[1]
DEC[0-3]
Figure 417 2-bit Drain-select decoder
44 experimental measurements 61
Configuration Type Name Wafer
1 No-TSVPMOS
(05microm times 05microm)F0 D08
2 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D08
3 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D20
4 1-TSV 0ordm 60 ordmCPMOS
(05microm times 05microm)G0 D08
54-lateral TSV 0ordm 25
ordmCPMOS
(05microm times 05microm)I0 D08
6 1-TSV 90ordm 25 ordmCPMOS
(05microm times 05microm)G90 D08
7 1-TSV 0ordm 25 ordmCPMOS (07microm times
007microm)B0 D08
Table 43 MOSFET array configurations selected for experimental eval-uation
44 experimental measurements
Test modules PTP1 and PTP2 implementing the presented MOS-
FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43
441 Data processing methodology
For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress
Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing
44 experimental measurements 62
ABCDEFBBC
FCDDFBCFFC
FCB
A
BAACADE
CECBBC
CFFCDE
B$BCFFBDBCBF
amp(BFC)B
BC
FCDDFA
CAF
Figure 418 Processing of measurement results
the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (
radic98) The described procedure for
processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not
placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)
442 Measurement setup
For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO
signal generation and measurement was provided by standard
44 experimental measurements 63
ABCDECDBF
ABECBF
AB
AB
Figure 419 Interpolation of measurement results
laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420
The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44
ABCCDE
F
AA
A
DBCB
Figure 420 Measurement setup
44 experimental measurements 64
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
DEC[0-9] InputDigital Output
(GATEDRAINARRAY SelectDecoders Input)
DF[0-3] Input SMU (Drain Force terminal)
DS[0-3] Output SMU (Drain Sense terminal)
SF Input SMU (Source Force terminal)
SS Output SMU (Source Sense terminal)
VG Input Digital Output (Gate voltage)
Table 44 PTP1PTP2 test pad module instrument connections
443 PMOS (long channel)
No-TSV configuration
This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance
As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure
1 TSV configuration
In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo
44 experimental measurements 65
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
096
097
098
099
100
101
Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)
and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements
To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm
The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph
44 experimental measurements 66
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
105
110
115
120TSV
Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)
44 experimental measurements 67
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
13
TSV
Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)
we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV
even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after
increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
09
1
11
12
13
14
Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)
44 experimental measurements 68
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115TSV
Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)
at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)
4 lateral TSV configuration
In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430
respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)
90˚ rotation TSV configuration
In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has
44 experimental measurements 69
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
085
090
095
100
105
110
115
TSV
TSV TSV
TSV
Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)
44 experimental measurements 70
2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309
095
1
105
11
115
12
X16
Distance (μm)
Normalized
Ion
TSV TSV
Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)
3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098
1102
Y14
Distance (μm)
Normalized
Ion
TSV TSV
Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)
44 experimental measurements 71
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115
120TSV
Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)
been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm
444 PMOS (short channel)
In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
08509
0951
10511
11512
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)
44 experimental measurements 72
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
094
096
098
100
102
104
106
108
110
TSV
Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)
measurements for X = 16 and Y = 14 in Figure 434 At at 1microm
distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)
45 summary and conclusions 73
Size
Waf
er
Con
figu
rati
on
Rot
atio
n
Eff
ect
1micro
mla
t
Eff
ect
1micro
mlo
ng
KO
Z
L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm
L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm
L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm
L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm
L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm
S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm
Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices
45 summary and conclusions
In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET
device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors
The structure was implemented in Imecrsquos experimental 65nm
3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations
In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits
Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short
45 summary and conclusions 74
channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress
Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure
1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex
2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy
3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away
4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic
The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated
In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs
5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S
51 introduction
TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2
DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs
increases linearly with the number of tiers and interconnections(Figure 51)
The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm
node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)
In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]
52 energy-recovery tsv interconnects
As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-
75
52 energy-recovery tsv interconnects 76
1XTSV
Tier1
TSV
Tier0LOGIC 0
LOGIC 0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n Tier1
Tier0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n
Tier(m-1)
Tier(m)
mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-
tion density
recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]
EER prop
(
RCL
T
)
CLV2DD (51)
In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]
ECMOS =12
CLV2DD (52)
Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits
EER
ECMOSprop
2RCL
T(53)
From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small
In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC
52 energy-recovery tsv interconnects 77
Tier2Tier1TSV
LOGIC 1
LOGIC 2
driver
TSV
LOGIC 1
LOGIC 2
CMOS
driver
Convert
Energy-recovery
PCG
Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery
can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV
interconnectsThe differences between a typical CMOS driver in a TSV-based
3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form
Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values
52 energy-recovery tsv interconnects 78
IN IN
ININ
OUT OUT
IN
IN
OUT
OUT
Figure 53 Adiabatic driver
Figure 54 A simple pulse-to-level converter (P2LC)
An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded
The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)
For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with
53 analysis 79
VDD2
Rind
RTG
AdiabaticDriver
CL
M1
T=1f
L ind
RTG
CL
Figure 55 Resonant pulse generator with dual-rail data encoding
Tier2
Tier1
f-fPULSE
VDD2CLK IN
Delay1CLK1
CLK0
Delay2CLK2OUT1
Data In
Data Out
CLKRES
TSV TSV TSV
CLKBufferAdiabatic
Driver
P2LC
f-f
Data1
M1
Figure 56 Energy-recovery scheme for TSV interconnects
a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at
f =1
2πradic
LindCL(54)
A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56
53 analysis
The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy
53 analysis 80
losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme
The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be
EER = EAD + Eind + EM1 + EP2L (55)
Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs
architecture used and could be easily extracted from simulationdata as will be shown later in this chapter
531 Adiabatic driver
There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS
dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances
The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle
ETG = 2ξ
(
RTGCL12 T
)
CLV2DD =
π2
2(RTGCL f )CLV2
DD (56)
The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient
53 analysis 81
Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV
capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be
CL = CTSV + Cn + 6CD (57)
The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience
κTG = RTGCn rArr RTG =κTG
Cn(58)
Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2
DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver
EAD = CnV2DD +
π2
2κTG
Cnf [CTSV + Cn + 6CD]
2 V2DD (59)
The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2
2 κTG f to a variable (y = π2
2 κTG f ) Equation 59then becomes
EAD = D middot CnV2DD +
y
Cn[CTSV + (6b + 1)Cn]
2 V2DD
=
[
Cn
(
D
y+ (6b + 1)2
)
+1
CnC2
TSV
]
middot yV2DD
+(12b + 2)CTSV middot yV2DD (510)
Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when
53 analysis 82
they become equal Solving equation Cn
(
Dy + (6b + 1)2
)
= 1Cn
C2TSV
for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV
Cn(opt) =
radic
[D
y+ (6b + 1)2]minus1CTSV (511)
Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver
EAD(min) =
[radic
D
(6b + 1)2y+ 1 + 1
]
middot (12b+ 2)yCTSVV2DD (512)
532 Inductorrsquos parasitic resistance
The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as
Qind =1
Rind
radic
Lind
CLrArr Rind =
1QindCL2π f
(513)
Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513
Eind =π2
2(RindCL f )CLV2
DD =π
4CL
QindV2
DD (514)
533 Switch M1
Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1
Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as
EM1(min) = 2IM1(rms)VGM1
radic
mκM1
f(515)
54 simulations 83
Delay2
CLK1
CLK2Delay1
CLKW
PULSE
CLK0T
PD
OUT1
PULSE
P2L
RES
CLK
Figure 57 Energy-recovery timing constraints
534 Timing Constraints
The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57
The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)
54 simulations
To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the
54 simulations 84
M1IN IN
IN
OUT OUT
IN
Figure 58 Test case for the evaluation of the theoretical model
Q250MHz 500MHz 750MHz
T S e () T S e () T S e ()
6 360 409 -120 476 516 -78 578 613 -57
8 308 347 -114 417 449 -70 515 543 -52
10 276 312 -115 382 411 -70 477 498 -43
12 255 289 -119 358 386 -71 451 472 -44
Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)
adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58
The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51
The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations
In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency
54 simulations 85
5 6 7 8 9 10 11 12 130
10
20
30
40
50
60
70
T250 S250 T500 S500 T750 S750
Q factor
Energydissipation(fF)
Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)
5 6 7 8 9 10 11 12 13
-14
-12
-10
-8
-6
-4
-2
0
250MHz 500MHz 750MHz
Q factor
Error()
Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator
The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation
54 simulations 86
M1
TSV
TSV
TSV
Data in OutCMOS
OutER
Energy-recovery
CMOS
P2LC
Adiabaticdriver
Figure 511 Test case for comparison of energy-recovery to CMOS
541 Comparison to CMOS
When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F
for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is
described by Equation 52 Appending the data switching activity(D)
ECMOS = D middot 12
CLV2DD (516)
Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS
circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison
Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range
54 simulations 87
Figure 512 Improved P2LC design
0 100 200 300 400 500 600 700 800 9000
2
4
6
8
10
12
14
Frequency (MHz)
Energy
dissipation(fJ)
Figure 513 Simulated energy dissipation of improved P2LC design
Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current
The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor
54 simulations 88
4 6 8 10 12 14 16 18 20 22-20
-10
0
10
20
30
40
50
60
800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor
Energyred uction()
Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)
4 6 8 10 12 14 16 18 20 220050
100150200250300350400450
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)
values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors
In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies
In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q
factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total
In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS
circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in
54 simulations 89
4 6 8 10 12 14 16 18 20 2200
100
200
300
400
500
600
700
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)
20 40 60 80 100 120 140 160 180 200 220-100
00
100
200
300
400
500
Q6 Q10 Q14 Q18
TSV capacitance (fF)
Energy
red u
ction(
)
Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)
Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value
Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz
Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm
node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS
54 simulations 90
4 6 8 10 12 14 16 18 20 22-40
-20
0
20
40
60
80
D=05 D=07 D=09 D=1
Q factor
Energyred uction()
Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)
4 6 8 10 12 14 16 18 20 220
10
20
30
40
50
60
130nm 65nm
Q factor
Energyred uction()
Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)
55 summary and conclusions 91
55 summary and conclusions
The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process
The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement
Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme
The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process
6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R
61 introduction
In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance
Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements
62 physical implementation
The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5
The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61
The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz
were calculated using the methodology developed in Chapter 5
and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery
circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)
92
62 physical implementation 93
AB
AB
CDEFD
C
ABCD
EFD
C
Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)
Adiabatic driver 80-bit circuit
Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )
RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)
CL 207 f F (Eq57) Lind 613nH (Eq54)
Table 61 Energy-recovery IO circuit parameters
62 physical implementation 94
IntegratedInductor
VINDOER
PulseGenerator
S1AS2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
Figure 62 Energy-recovery IO circuit schematic diagram
Metal2Metal1PolyActiveVDD
GND
RES_CLK
DATA
OUT
OUT_B
DATA
OUT_B OUT
RES_CLK
GND
VDD
Figure 63 Adiabatic driver (layoutschematic view)
the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs
621 Adiabatic driver
The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63
62 physical implementation 95
Metal2Metal1PolyActive
IN
OUT_B
VDD
IN_B
OUT
GND
OUT_B
IN
GND
VDD
OUT
IN_B
Figure 64 P2LC (layoutschematic view)
622 Pulse-to-level converter
The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals
623 Data generator
A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1
As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE
624 Pulse generator
The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the
62 physical implementation 96
Metal2Metal1PolyActive EN
CLK
VDD
Q
GND
D
CLKGND
ENVDD
QF-F
D
GND
VDDCLK
D Q
Figure 65 Data generator (layoutschematic view)
Delay1
WPULSE
PULSE
CLKTCLK
DATA
CLK_RES
Figure 66 Timing constraints for the energy-recovery demonstratorcircuit
62 physical implementation 97
PULSES1A
S2
GND
VDD
Metal2Metal1PolyActive
VDD
S1
S2
GND
PULSE
Figure 67 Pulse generator (layoutschematic view)
S2ΔPhase
WPULSE
PULSE
S1A
Figure 68 Pulse generator InputOutput signals
gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control
For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68
625 Integrated inductor
The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-
1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)
62 physical implementation 98
100 200 300 400 500 600 700 8005
7
9
11
13
15
17
19
21
10
15
20
25
30
35
40
45
Q (Simulated) Energy Reduction ()
Frequency (MHz)
Qfactor
Energyreduction()
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS
lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)
As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited
For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610
626 Top level design
The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)
For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously
62 physical implementation 99
500μ
m
500μm
Figure 610 Planar square spiral inductor designed in ASITIC
shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance
The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG
80 = 16880 = 21Ω However as
the current flows to lower branch levels in the tree the perceivedresistance increases to RTG
16 RTG4 and on the last one to RTG Any
additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512
Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20
4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit
To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of
62 physical implementation 100
Adiabatic drivers
RTG
16
RTG
80
RTG
4
RTG
Level 1Level 2Level 3Level 4
Figure 611 Resonant pulse distribution tree
Branch level Wire resistance
1st 5 middot 16880 = 105mΩ
2nd 5 middot 16816 = 525mΩ
3rd 5 middot 1684 = 21Ω
4th 5 middot 168Ω = 84Ω
Total load 168+8480 + 21
20 + 05255 + 0105 = 252Ω
Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree
63 post-layout simulation and comparison 101
+-
+-Metal2 Metal1
180μ
m
100μm
Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)
the energy-recovery IO circuit in layout view is illustrated inFigure 613
627 CMOS IO circuit
The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance
The top level layout view of the CMOS IO circuit can be seenin Figure 616
63 post-layout simulation and comparison
The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed
63 post-layout simulation and comparison 102
22mm
12mm
GND GNDVBUF1 VDR
GND
VIND
S1A
S2
EN
OER
Figure 613 Energy-recovery IO circuit in layout view
63 post-layout simulation and comparison 103
OCMOS
S1B DataGenerator
TSV
CMOSDriver
CMOSDriver
TSV Buffer
1
80
EN
Figure 614 CMOS IO circuit schematic diagram
GND
VCMOS
DATA
078u
039u
105u
053u
316u
158u
950u
475u
OUT
Figure 615 CMOS driver schematic
12mm
12mm
OCMOS S1B GND VBUF2 VCMOSGND
GND GNDEN VCMOS
Figure 616 CMOS IO circuit in layout view
63 post-layout simulation and comparison 104
Energy dissipation (80-bits cycle)
Inductor 11pJ (Eq514)
Switch M1 092pJ [ ]
P2LCs 098pJ (Simulated)
Adiabatic drivers 315pJ (Eq512)
Total 615pJ
Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz
After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency
Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme
As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the
63 post-layout simulation and comparison 105
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14RES_CLK
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14DATA
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14OER
Time (ns)
V(V
)
Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit
64 experimental measurements 106
AreaEnergy
dissipation(Theory)
Energydissipation
(Post-layout)
CMOS 144mm2922pJ
(115 f Jbit)
(Eq52)
1265pJ
(158 f Jbit)
Energy recovery 264mm2 615pJ
(77 f Jbit)
768pJ
(96 f Jbit)
Difference +83 minus333 minus393
Table 64 Post-layout simulated energy dissipation Area require-ments
theoretically calculated 333 up to 393 in the post-layoutsimulations
64 experimental measurements
The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively
641 Measurement setup
For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620
The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66
642 Results
CMOS IO circuit
In Figure 621 the input signal S1B and the output signal OCMOS
are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency
64 experimental measurements 107
Figure 618 Energy-recovery IO circuit micrograph
Figure 619 CMOS IO circuit micrograph
AABC
DEFF
D
EEBA
EBBA
Figure 620 Measurement setup
64 experimental measurements 108
Test pad Direction Instrument connection
VIND Power Power supply - gt06V
GND Power Power supply - 0V
S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)
EN InputDigital Output (enable outputdata switching)
OER Output Oscilloscope
VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V
VBUF1 Power Power supply (Buffers) - 12V
Table 65 Energy-recovery circuit instrument connections
Test pad Direction Instrument connection
VCMOS PowerPower supply (CMOS drivers) -12V
GND Power Power supply - 0V
VBUF2 Power Power supply (Buffers) - 12V
S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
OCMOS Output Oscilloscope
EN InputDigital Output (enable outputdata switching)
Table 66 CMOS circuit instrument connections
64 experimental measurements 109
S1B
OCMOS
Figure 621 Oscilloscope output for signals S1B and OCMOS
100 200 300 400 500 600 700 800012345678910
Die1 Die2 Simulation
Frequency (MHz)
Powerdissipation(mW)
Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit
that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected
The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations
Energy-recovery IO circuit
Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER
64 experimental measurements 110
100 200 300 400 500 600 700 800024681012141618
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 623 Measuredsimulated energy dissipation for the CMOScircuit
S1A
OER
Figure 624 Oscilloscope output for signals S1A and OER
shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected
The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)
Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate
65 summary and conclusions 111
200 250 300 350 400 450 500 550 600 65002468101214161820
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit
IntegratedInductor
VINDOER
PulseGenerator
S1A
S2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
200pF
Figure 626 Energy-recovery IO circuit with fault hypothesis
properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER
switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation
65 summary and conclusions
In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them
The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-
65 summary and conclusions 112
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050
055
060
065
070
075
080RES_CLK
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500
02
04
06
08
10
12
14OER
Time (ns)
V(V
)
Figure 627 Simulation of circuit with fault hypothesis
65 summary and conclusions 113
200 250 300 350 400 450 500 550 600 6500
5
10
15
20
25
Die1 Die2 Simulation CL=200pF
Frequency (MHz)
Ene
rgy
diss
ipat
ion
(pJ)
Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis
ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS
circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated
CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
7C O N C L U S I O N S
Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density
As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV
devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs
Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-
FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process
Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process
The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work
114
71 main contributions 115
71 main contributions
From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data
The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations
In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design
72 future work 116
approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances
The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
72 future work
The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV
capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]
In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes
1 The transistors should be in a regular grid in the array forsimple processing of the measurement data
72 future work 117
2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy
3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV
4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues
Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility
In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory
AA P P E N D I X A P H O T O S
Figure A1 3D65 300mm wafer (on chuck)
Figure A2 Probe card
118
appendix a photos 119
Figure A3 3D130 200mm wafer (on chuck)
ProbeHead
Microscope
Figure A4 Probe head
BA P P E N D I X B S TAT I S T I C S
mean (micro)
In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as
micro =1n
n
sumi=1
xi
median (m)
The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean
standard deviation (σ)
Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation
A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability
standard deviation of the mean (σmicro )
The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated
Assuming there is a data set X with n values xi of mean micro
120
appendix b statistics 121
variance(X) = σ2X
variance(micro) = variance( 1n sum
ni=1 xi) =
1n2 variance(sumn
i=1 xi) =1n2 sum
ni=1 variance(xi) =
nn2 variance(X) =1n variance(X) =
σ2Xn rarr
σmicro = σXradicn
coefficient of variation
The coefficient of variation is the ratio of the standard deviationσ to the mean micro
CV =σ
micro
It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage
outlier
Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set
kruskalndash wallis [34]
KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal
B I B L I O G R A P H Y
[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac
currentsourcebroadpurposemn=2602
[2] Max-3d URL httpwwwmicromagiccom
[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet
[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet
[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001
[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI
Design Conf pages 171ndash174 2005 doi 101109ICVD200564
[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf
3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581
[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009
[9] C H Bennett Logical reversibility of computation IBM
Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525
[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038
scientificamerican0785-48
[11] E Beyne 3d system integration technologies In Proc Int
VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113
122
bibliography 123
[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs
22nm-Details_Presentationpdf
[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327
[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534
[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices
Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124
[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942
[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc
IEEE Int Interconnect Technology Conf and 2011 Materials for
Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326
[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf
(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823
[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS
rsquo04 volume 2 2004 doi 101109ISCAS20041329255
[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE
Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260
[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-
bibliography 124
asitics for 3-dimensional integrated circuits In Workshop
Notes Design Automation and Test in Europe (DATE) 2009
[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT
2008 pages 98ndash103 2008 doi 101109IDT20084802475
[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec
beScientificReportSR2009HTML1213307html
[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905
[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE
Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329
[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int
Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311
[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508
[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712
[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii
B9780750677295500446
[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455
bibliography 125
[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591
[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test
Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889
[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on
Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10
1145224081224115
[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical
Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779
[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom
pressroompdfkkuhnKuhn_IWCE_invited_textpdf
[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5
(3)183ndash191 1961 doi 101147rd530183
[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827
[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190
[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf
(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839
[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-
nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079
bibliography 126
[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System
level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm
org101145966747966750
[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453
[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf
PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725
[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE
Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793
[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629
[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp
Exhibition (DATE) pages 1689ndash1694 2010
[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573
[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190
[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi
bibliography 127
A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices
Meeting (IEDM) 2010 doi 101109IEDM20105703278
[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727
[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10
1109JPROC1998658762
[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test
Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103
[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc
56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676
[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443
[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88
(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom
sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on
[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303
bibliography 128
[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium
on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145
10576611057669
[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872
[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004
[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-
sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888
[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156
[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc
IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522
[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501
[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667
[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477
[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-
tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609
bibliography 129
[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology
Conf 2006 doi 101109ECTC20061645678
[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power
Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220
[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc
Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786
[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-
Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338
[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088
[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044
[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int
Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763
[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-
tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http
dxdoiorg101007BF01788562 101007BF01788562
bibliography 130
[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-
electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww
sciencedirectcomsciencearticleB6V44-4829XDJ-2N
2f5b2c221dcf240c38cd13b7b3432f913
[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182
[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541
[78] NHE Weste and DF Harris CMOS VLSI design a cir-
cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775
[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite
printsteve_wozniak_interview
[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-
tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716
[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect
Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710
[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746
[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764
[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915
bibliography 131
[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610
- Abstract
- Publications
- Acknowledgments
- Contents
- List of Figures
- List of Tables
- Acronyms
- 1 Introduction
-
- 11 Research goals and contribution
- 12 Thesis outline
-
- 2 Background
-
- 21 3D integrated circuits
-
- 211 3D integration approaches
- 212 TSV interconnects
-
- 22 Energy-recovery logic
-
- 221 Energy-recovery architectures
- 222 Power-clock generators
-
- 23 Full-custom design flow
- 24 Summary
-
- 3 Characterization of TSV capacitance
-
- 31 Introduction
- 32 TSV parasitic capacitance
- 33 Integrated capacitance measurement techniques
- 34 Electrical characterization
-
- 341 Physical implementation
- 342 Measurement setup
- 343 Experimental results
-
- 35 Summary and conclusions
-
- 4 TSV proximity impact on MOSFET performance
-
- 41 Introduction
- 42 TSV-induced thermomechanical stress
- 43 Test structure implementation
-
- 431 MOSFET arrays
- 432 Digital selection logic
-
- 44 Experimental measurements
-
- 441 Data processing methodology
- 442 Measurement setup
- 443 PMOS (long channel)
- 444 PMOS (short channel)
-
- 45 Summary and conclusions
-
- 5 Evaluation of energy-recovery for TSV interconnects
-
- 51 Introduction
- 52 Energy-recovery TSV interconnects
- 53 Analysis
-
- 531 Adiabatic driver
- 532 Inductors parasitic resistance
- 533 Switch M1
- 534 Timing Constraints
-
- 54 Simulations
-
- 541 Comparison to CMOS
-
- 55 Summary and conclusions
-
- 6 Energy-recovery 3D-IC demonstrator
-
- 61 Introduction
- 62 Physical implementation
-
- 621 Adiabatic driver
- 622 Pulse-to-level converter
- 623 Data generator
- 624 Pulse generator
- 625 Integrated inductor
- 626 Top level design
- 627 CMOS IO circuit
-
- 63 Post-layout simulation and comparison
- 64 Experimental measurements
-
- 641 Measurement setup
- 642 Results
-
- 65 Summary and conclusions
-
- 7 Conclusions
-
- 71 Main contributions
- 72 Future work
-
- A Appendix A Photos
- B Appendix B Statistics
- Bibliography
-
Contact panagiotisasimakopoulosn13la13uk
NCL-EECE-MSD-TR-2011-173
Copyright ccopy 2011 Newcastle University
Microelectronics System Design Research Group
School of Electrical Electronic amp Computer Engineering
Merz Court
Newcastle University
Newcastle upon Tyne NE1 7RU UKhttpasyn13orguk
O P T I M I Z I N G T H E I N T E G R AT I O N A N DE N E R G Y E F F I C I E N C Y O F T H R O U G H
S I L I C O N V I A - B A S E D 3 D I N T E R C O N N E C T S
panagiotis asimakopoulos
A Thesis Submitted for the Degree of Doctor of Philosophyat Newcastle University
School of Electrical Electronic and Computer Engineering
Newcastle upon Tyne
November 2011
Panagiotis Asimakopoulos Optimizing the integration and energy
efficiency of through silicon via-based 3D interconnects PhD Thesiscopy November 2011
A B S T R A C T
The aggressive scaling of CMOS process technology has been driv-ing the rapid growth of the semiconductor industry for more thanthree decades In recent years the performance gains enabledby CMOS scaling have been increasingly challenged by highly-parasitic on-chip interconnects as wire parasitics do not scale atthe same pace Emerging 3D integration technologies based onvertical through-silicon vias (TSVs) promise a solution to the inter-connect performance bottleneck along with reduced fabricationcost and heterogeneous integration
As TSVs are a relatively recent interconnect technology innov-ative test structures are required to evaluate and optimise theprocess as well as extract parameters for the generation of designrules and models From the circuit designerrsquos perspective criticalTSV characteristics are its parasitic capacitance and thermomech-anical stress distribution This work proposes new test structuresfor extracting these characteristics The structures were fabric-ated on a 65nm 3D process and used for the evaluation of thattechnology
Furthermore as TSVs are implemented in large densely inter-connected 3D-system-on-chips (SoCs) the TSV parasitic capacitancemay become an important source of energy dissipation Typicallow-power techniques based on voltage scaling can be usedthough this represents a technical challenge in modern techno-logy nodes In this work a novel TSV interconnection scheme isproposed based on reversible computing which shows frequency-dependent energy dissipation The scheme is analysed using the-oretical modelling while a demonstrator IC was designed basedon the developed theory and fabricated on a 130nm 3D process
iii
P U B L I C AT I O N S
Some ideas and figures have appeared previously in the followingpublications
1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper
award)
2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-
makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010
3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010
4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009
5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008
iv
In the end I hope therersquos a little note somewhere
that says I designed a good computer
mdash Steve Wozniak [79]
A C K N O W L E D G E M E N T S
First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts
As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm
Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec
Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools
v
C O N T E N T S
1 Introduction 1
11 Research goals and contribution 2
12 Thesis outline 3
2 Background 5
21 3D integrated circuits 5
211 3D integration approaches 6
212 TSV interconnects 9
22 Energy-recovery logic 13
221 Energy-recovery architectures 17
222 Power-clock generators 18
23 Full-custom design flow 20
24 Summary 21
3 Characterization of TSV capacitance 23
31 Introduction 23
32 TSV parasitic capacitance 23
33 Integrated capacitance measurement techniques 24
34 Electrical characterization 27
341 Physical implementation 28
342 Measurement setup 36
343 Experimental results 37
35 Summary and conclusions 44
4 TSV proximity impact on MOSFET performance 46
41 Introduction 46
42 TSV-induced thermomechanical stress 46
43 Test structure implementation 48
431 MOSFET arrays 51
432 Digital selection logic 55
44 Experimental measurements 61
441 Data processing methodology 61
442 Measurement setup 62
443 PMOS (long channel) 64
444 PMOS (short channel) 71
45 Summary and conclusions 73
5 Evaluation of energy-recovery for TSV interconnects 75
51 Introduction 75
52 Energy-recovery TSV interconnects 75
53 Analysis 79
531 Adiabatic driver 80
532 Inductorrsquos parasitic resistance 82
533 Switch M1 82
534 Timing Constraints 83
54 Simulations 83
541 Comparison to CMOS 86
vi
contents vii
55 Summary and conclusions 91
6 Energy-recovery 3D-IC demonstrator 92
61 Introduction 92
62 Physical implementation 92
621 Adiabatic driver 94
622 Pulse-to-level converter 95
623 Data generator 95
624 Pulse generator 95
625 Integrated inductor 97
626 Top level design 98
627 CMOS IO circuit 101
63 Post-layout simulation and comparison 101
64 Experimental measurements 106
641 Measurement setup 106
642 Results 106
65 Summary and conclusions 111
7 Conclusions 114
71 Main contributions 115
72 Future work 116
a Appendix A Photos 118
b Appendix B Statistics 120
bibliography 122
L I S T O F F I G U R E S
Figure 11 Scaling trend of logic and interconnect delay[30] 2
Figure 21 Comparison of 2D3D Integration 5
Figure 22 System-in-a-package 6
Figure 23 Package-on-package 6
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7
Figure 25 3D wafer-level-packaging 7
Figure 26 3D-stacked IC 8
Figure 27 TSV process steps 9
Figure 28 Before thinningafter thinning 10
Figure 29 TSV stacking 10
Figure 210 Conventional AND gate 14
Figure 211 CMOS switching 16
Figure 212 Adiabatic switching 16
Figure 213 Basic 2N-2P adiabatic circuit 17
Figure 214 2N-2P timing 17
Figure 215 Adiabatic driver 18
Figure 216 Stepwise charging generator 19
Figure 217 1N power clock generator 20
Figure 218 Design flow 22
Figure 219 Inverter in Virtuoso schematic 22
Figure 220 Inverter in Virtuoso layout 22
Figure 31 Isolated 2D wire capacitance 24
Figure 32 TSV parasitic capacitance 25
Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25
Figure 34 CBCM pseudo-inverters 27
Figure 35 Current as a function of frequency in CBCM 28
Figure 36 CBCM pseudo-inverter pair in layout 29
Figure 37 PRC1 test pad module (layout) 30
Figure 38 PRC1 test pad module (micrograph) 30
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34
Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35
viii
List of Figures ix
Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35
Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36
Figure 316 Measurement setup 36
Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39
Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40
Figure 319 TSV capacitance measurement results onwafer D11 40
Figure 320 TSV capacitance measurement results onwafer D18 41
Figure 321 Simulated oxide liner thickness variation 42
Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42
Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44
Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44
Figure 41 TSV thermomechanical stress simulation [40] 47
Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48
Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49
Figure 44 PTP1 test pad module (layout) 50
Figure 45 PTP1 test pad module (micrograph) 50
Figure 46 MOSFET dimensions and rotations 51
Figure 47 1 TSV MOSFET array configuration 52
Figure 48 4 diagonal TSVs MOSFET array configuration52
Figure 49 4 lateral TSVs MOSFET array configuration 53
Figure 410 9 TSVs MOSFET array configuration 53
Figure 411 MOSFET array detail 55
Figure 412 Digital selection logic 56
Figure 413 MOSFET array with digital selection logic 57
Figure 414 4-bit Gate-enableArray-enable decoder 58
Figure 415 Transmission gate for MOSFET Gate terminals59
Figure 416 Transmission gates for MOSFET Drain-Source terminals 59
Figure 417 2-bit Drain-select decoder 60
Figure 418 Processing of measurement results 62
Figure 419 Interpolation of measurement results 63
Figure 420 Measurement setup 63
Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65
List of Figures x
Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66
Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66
Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67
Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67
Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68
Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69
Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69
Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70
Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70
Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71
Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71
Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72
Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72
Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76
Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77
Figure 53 Adiabatic driver 78
Figure 54 A simple pulse-to-level converter (P2LC) 78
Figure 55 Resonant pulse generator with dual-rail dataencoding 79
Figure 56 Energy-recovery scheme for TSV interconnects79
Figure 57 Energy-recovery timing constraints 83
Figure 58 Test case for the evaluation of the theoret-ical model 84
Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85
Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85
Figure 511 Test case for comparison of energy-recoveryto CMOS 86
Figure 512 Improved P2LC design 87
List of Figures xi
Figure 513 Simulated energy dissipation of improvedP2LC design 87
Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88
Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88
Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89
Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =
80 f F f = 500MHz) 89
Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90
Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90
Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93
Figure 62 Energy-recovery IO circuit schematic dia-gram 94
Figure 63 Adiabatic driver (layoutschematic view) 94
Figure 64 P2LC (layoutschematic view) 95
Figure 65 Data generator (layoutschematic view) 96
Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96
Figure 67 Pulse generator (layoutschematic view) 97
Figure 68 Pulse generator InputOutput signals 97
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98
Figure 610 Planar square spiral inductor designed inASITIC 99
Figure 611 Resonant pulse distribution tree 100
Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101
Figure 613 Energy-recovery IO circuit in layout view 102
Figure 614 CMOS IO circuit schematic diagram 103
Figure 615 CMOS driver schematic 103
Figure 616 CMOS IO circuit in layout view 103
Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105
Figure 618 Energy-recovery IO circuit micrograph 107
List of Figures xii
Figure 619 CMOS IO circuit micrograph 107
Figure 620 Measurement setup 107
Figure 621 Oscilloscope output for signals S1B and OC-MOS 109
Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109
Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110
Figure 624 Oscilloscope output for signals S1A and OER110
Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111
Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111
Figure 627 Simulation of circuit with fault hypothesis 112
Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113
Figure A1 3D65 300mm wafer (on chuck) 118
Figure A2 Probe card 118
Figure A3 3D130 200mm wafer (on chuck) 119
Figure A4 Probe head 119
L I S T O F TA B L E S
Table 21 3D interconnect technologies comparison 8
Table 22 Technology parameters for Imec 3D130 process11
Table 23 Technology parameters for Imec 3D65 process11
Table 24 Comparison energy-recoveryconventionalCMOS 16
Table 31 Test structures on the PRC1 module 31
Table 32 PRC1 test pad instrument connections 37
Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38
Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39
Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43
Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45
Table 41 Coefficients of thermal expansion 47
Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54
Table 43 MOSFET array configurations selected forexperimental evaluation 61
Table 44 PTP1PTP2 test pad module instrumentconnections 64
Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73
Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84
Table 61 Energy-recovery IO circuit parameters 93
Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100
Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104
Table 64 Post-layout simulated energy dissipation Area requirements 106
Table 65 Energy-recovery circuit instrument connec-tions 108
Table 66 CMOS circuit instrument connections 108
xiii
A C R O N Y M S
2D two-dimentional
2N-2P 2 nMOS2 pMOS adiabatic logic
3D-IC three-dimentional integrated circuit
3D-SIC 3D stacked-IC
3D-SIP 3D system-in-package
3D three-dimentional
3D-WLP 3D wafer-level-packaging
BCB benzocyclobutene
BEOL back-end-of-line
CAL clocked adiabatic logic
CBCM charge-based capacitance measurement
CMOS complementary metal-oxide-semiconductor
CMP chemical-mechanical planarization
CTE coefficient of thermal expansion
Cu copper
CVD chemical vapor deposition
D2D die-to-die
D2W die-to-wafer
DCVSL differential cascode voltage switch logic
DRC design rule check
DUT device under test
ECRL efficient charge recovery logic
EDA electronic design automation
FEM finite element method
FEOL front-end-of-line
GPIB general purpose interface bus
IC integrated circuit
xiv
acronyms xv
IO inputoutput
KOZ keep-out-zone
LC inductance-capacitance
LVS layout versus schematic
MEMS microelectromechanical systems
MIM metal-insulator-metal
MOSFET metal-oxide-semiconductor field-effect transistor
MOS metal-oxide-semiconductor
MPU microprocessor unit
MuGFET multiple gate field-effect transistor
NMOS n-channel MOSFET
P2LC pulse-to-level converter
PAL pass-transistor adiabatic logic
PCB printed circuit board
PCG power-clock generator
PMD pre-metal dielectric
PMOS p-channel MOSFET
PVT process voltage temperature
QSERL quasi-static energy recovery logic
RC resistance-capacitance
RDL re-distribution layer
RLC resistance-inductance-capacitance
RMS root mean square
SIP system-in-a-package
Si silicon
SMU source measurement unit
SoC system-on-chip
TaN tantalum nitride
TSV through-silicon via
W tungsten
acronyms xvi
W2W wafer-to-wafer
WLP wafer-level-packaging
1I N T R O D U C T I O N
Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965
chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]
As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects
In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)
The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and
1 International Technology Roadmap for Semiconductors
1
11 research goals and contribution 2
Figure 11 Scaling trend of logic and interconnect delay [30]
innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]
Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D
interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density
11 research goals and contribution
As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-
ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]
The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future
12 thesis outline 3
3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection
The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution
The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs
are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process
The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements
12 thesis outline
This thesis is organised into seven Chapters and two Appendixesas follows
2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)
12 thesis outline 4
Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described
Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods
Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process
Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design
Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements
Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work
Appendix A includes photos used as reference in various partsof this work
Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data
2B A C K G R O U N D
The work presented in this thesis contributes in the fields of 3D
integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters
21 3d integrated circuits
CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package
There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy
Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of
ABCDEFBA ABCDEFBA
Figure 21 Comparison of 2D3D Integration
5
21 3d integrated circuits 6
ABCDEFBC
E
Figure 22 System-in-a-package
ABCDEFBC
E
Figure 23 Package-on-package
a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities
The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor
211 3D integration approaches
3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure
bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages
bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)
bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion
The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)
3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps
21 3d integrated circuits 7
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]
A
BCDEAFE
FFDC
CFE
Figure 25 3D wafer-level-packaging
take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]
3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]
The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-
SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost
A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21
21 3d integrated circuits 8
ABCD
EFFDB
F
Figure 26 3D-stacked IC
3D interconnect
technologyAdvantages Disadvantages
3D system-in-package (3D-SIP)
Simple existingtechnology bestheterogeneous
integration
Performance(speed power) low
densitymanufacturing costin high-quantities
3D wafer-level-packaging (3D-
WLP)
Existingtechnology good
compromisebetween
performance-manufacturing
cost
Averageperformance and
density
3D stacked-IC (3D-SIC)
Performance(speed power)
high densitymanufacturing costin high-quantities
Manufacturing costin low-quantities
not ready formanufacturing
Table 21 3D interconnect technologies comparison
21 3d integrated circuits 9
ABBCDE
FBACACDA
ACCD F
Figure 27 TSV process steps
212 TSV interconnects
Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection
In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie
Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively
As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]
21 3d integrated circuits 10
AAAB
CAAAB
ABC
DEF
Figure 28 Before thinningafter thinning
ABC ABC
DE DE
F
DEE
B
CBA
Figure 29 TSV stacking
21 3d integrated circuits 11
Minimum devicechannel length (nm)
130
Supply voltage (V) 12
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 22
TSV resistance (mΩ) ~20
TSV capacitance( f F)
~40
Wafer diameter(mm)
200
Table 22 Technology paramet-ers for Imec 3D130
process
Minimum devicechannel length (nm)
70
Supply voltage (V) 10
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 40
TSV resistance (mΩ) ~40
TSV capacitance( f F)
~80
Wafer diameter(mm)
300
Table 23 Technology paramet-ers for Imec 3D65 pro-cess
Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing
Design tools
Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs
Thermomechanical stress
TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur
21 3d integrated circuits 12
The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance
Thermal analysis
The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink
Electrical characteristics
The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]
Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV
structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process
22 energy-recovery logic 13
Testing
Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems
22 energy-recovery logic
TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology
1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections
2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them
Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that
22 energy-recovery logic 14
Figure 210 Conventional AND gate
were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6
Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded
Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output
Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to
22 energy-recovery logic 15
perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]
In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]
The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =
CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]
E = QVDD = CV2DD (21)
From the total transferred energy only half ( 12 CV2
DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C
The adiabatic charging of an equivalent node capacitance C
is modelled in Figure 212 In contrast to the conventional CMOS
circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to
EAdiabatic prop
(
RC
T
)
CV2DD (22)
The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters
22 energy-recovery logic 16
A
B
CC
D
E
D
Figure 211 CMOS switching
AA
B
C
B
Figure 212 Adiabatic switching
Conventional CMOS Energy-recovery CMOS
2 states True False 3 states True False Off
Energy of information bitdissipated as heat on the
pull-down network
Energy of information bitrecovered by the
power-supply
High speed Lowmedium speed
High energy dissipation Very low energy dissipation
Area efficient Some area overhead
Table 24 Comparison energy-recoveryconventional CMOS
Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]
A comparison between energy-recovery and conventional CMOSis summarised in Table 24
22 energy-recovery logic 17
ABCABC
Figure 213 Basic 2N-2P adiabaticcircuit
ABC
ABC
DEF
EE
DEF
Figure 214 2N-2P timing
221 Energy-recovery architectures
Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency
A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P
circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern
The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel
22 energy-recovery logic 18
Figure 215 Adiabatic driver
MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network
The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]
The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle
Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]
222 Power-clock generators
Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and
22 energy-recovery logic 19
A
A
B
C
C
Figure 216 Stepwise charging generator
recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators
Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow
Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at
f =1
2πradic
LCLOAD
(23)
As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is
23 full-custom design flow 20
A
BCD
E
DFF
CD
Figure 217 1N power clock generator
partially recovered and stored in the inductor for later use RLC
oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip
23 full-custom design flow
All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal
In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available
The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case
1 Cadence Design Systems Inc (httpwwwcadencecom)
24 summary 21
the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step
Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics
24 summary
In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process
2 Mentor Graphics Inc (httpwwwmentorcom)
24 summary 22
ABCDEA
FCCAAE
ABCDEAEC
DE
E
F
CEA
EEAEA
EDE
CAC
E
EC
CAC
Figure 218 Design flow
Figure 219 Inverter in Virtuososchematic
Figure 220 Inverter in Virtuosolayout
3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E
31 introduction
In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology
In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]
From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system
Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated
32 tsv parasitic capacitance
A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric
1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments
23
33 integrated capacitance measurement techniques 24
ABC
Figure 31 Isolated 2D wire capacitance
material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is
C =εox
hwl (31)
In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material
In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)
The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)
33 integrated capacitance measurement techniques
Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV
33 integrated capacitance measurement techniques 25
ABCDE
F
FD
F
F
FD
Figure 32 TSV parasitic capacitance
ABCDE FACDE
ECDE
BCFACDE
FCCDED DBCDE
D
DAB
Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor
33 integrated capacitance measurement techniques 26
characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories
1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]
2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]
The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer
On-chip techniques can achieve better accuracy when the DUT
capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]
In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows
CDUT =Im minus Ire f
VDD middot f(32)
The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of
34 electrical characterization 27
AB
AC
DE
FE
DE
FE
AB
AC
B
Figure 34 CBCM pseudo-inverters
Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot
The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET
capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz
should be adequate to avoid the MOS behaviour of the TSV
34 electrical characterization
A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented
34 electrical characterization 28
AB
CADE
F
AB
E
CAD
B
Figure 35 Current as a function of frequency in CBCM
on the same chip that went beyond the core objectives of thiswork
The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed
341 Physical implementation
In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result
Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics
34 electrical characterization 29
Poly dummy fills
Metal2Metal1PolysiliconActive
1μm 5μm
Figure 36 CBCM pseudo-inverter pair in layout
Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38
The CBCM pseudo-inverters were implemented on either TOP
or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below
Structure rsquoArsquo Single TSV capacitance (TOP die)
The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to
34 electrical characterization 30
DTOP DBTM
B10 A10
BREF AREF
B1 A1
R1 R2
SET VDD
RESET GND
C15 C10
C50 C20
SET VDD
ERC SET
EREF EC Metal2 (TOP)Metal1 (TOP)
Metal1 (BTM)Metal2 (BTM)
Figure 37 PRC1 test pad module(layout)
Figure 38 PRC1 test pad module(micrograph)
34 electrical characterization 31
Structure A B
TSVs - 1 2x5 - 1 2x5
Inverterlocation
TOP TOP TOP BTM BTM BTM
TSVconnection
TOP TOP TOP BTM BTM BTM
Multi-TSVconnection
- - TOP - - BTM
Multi-TSVpitch (microm)
- - 15 - - 15
Pad name AREF A1 A10 BREF B1 B10
Structure C D
TSVs 2x2 2x2 2x2 2x2 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP TOP
TSVconnection
TOP TOP TOP TOP TOP TOP
Multi-TSVconnection
TOP TOP TOP TOP TOP BTM
Multi-TSVpitch (microm)
10 15 20 50 20 20
Pad name C10 C15 C20 C50 DTOP DBTM
Structure R E
TSVs 1 1 - 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP
TSVconnection
TOP TOP - TOP TOP
Multi-TSVconnection
- - - TOP BTM
Multi-TSVpitch (microm)
- - - 20 20
Pad name R1 R2 EREF EC ERC
Table 31 Test structures on the PRC1 module
34 electrical characterization 32
Pseudo-inverters
2x5 TSVs
TSV
Reference
Metal2 (TOP)Metal1 (TOP)
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)
the single TSV DUT the structure also contains 10 parallel TSVs
arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude
Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)
The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side
Structure rsquoCrsquo Variable pitch TSV capacitance
The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the
34 electrical characterization 33
2x5 TSVs
Reference
TSV
Pseudo-inverters
Metal2 (BTM)Metal1 (BTM)
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)
literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process
The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative
Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-
TOM die
The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking
34 electrical characterization 34
2x2 TSVs - 10μm
2x2 TSVs - 15μm
2x2 TSVs - 20μm
2x2 TSVs - 50μm
Metal2 (TOP)Metal1 (TOP)
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die
34 electrical characterization 35
Metal2 (TOP)Metal1 (TOP)
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Driver
Reference
Driver
Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)
Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy
Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability
Structure rsquoErsquo TSV capacitance effect on signal propagation delay
Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult
34 electrical characterization 36
ABC
D
ECBFC
ABC
Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)
ABC
DEF
CD
A
CA
A
Figure 316 Measurement setup
342 Measurement setup
The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316
The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-
2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup
34 electrical characterization 37
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
SET InputPulse generator
(Non-overlapping with RESET)
RESET InputPulse generator
(Non-overlapping with SET)
A[REF 1 10] InputSMU (Inverter source - 1V
supply)
B[REF 1 10] InputSMU (Inverter source - 1V
supply)
C[10 15 2050]
InputSMU (Inverter source - 1V
supply)
D[TOP BTM] InputSMU (Inverter source - 1V
supply)
R[1 2] InputSMU (Inverter source - 1V
supply)
E[REF C RC] Output Oscilloscope (high-impedance)
Table 32 PRC1 test pad instrument connections
sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32
343 Experimental results
There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods
When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy
34 electrical characterization 38
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 454
Mean capacitance 709 f F
Standard deviation (σ) 19 f F
Coefficient of variation (3σmicro) 80
Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo
The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability
Die to die (D2D) variability
The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)
On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid
On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691
34 electrical characterization 39
-60 -40 -20 00 20 40 60 80
σ-σ
706
Deviation from mean (fF)
0
005
01
015
02
025
Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 414
Mean capacitance 895 f F
Standard deviation (σ) 22 f F
Coefficient of variation (3σmicro) 74
Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo
of the experimental data within plusmnσ of the mean capacitancevalue
Correlation between TSV capacitance and oxide liner thickness vari-
ation
The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers
Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication
34 electrical characterization 40
σ-σ
691
Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70
0
002
004
006
008
01
012
014
016
018
02
Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function
075
080
085
090
095
100
105
110
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
075
080
085
090
095
100
105
Wafer area
Figure 319 TSV capacitance measurement results on wafer D11
34 electrical characterization 41
086088090092094096098100102104106108
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
086
088
090
092
094
096
098
100
102
104
106
108
Wafer area
Figure 320 TSV capacitance measurement results on wafer D18
tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo
Wafer to wafer (W2W) variability
Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters
34 electrical characterization 42
109
100
095Nor
mal
ized
oxi
de th
ickn
ess
-5-10 -15 -20 -25 -30
510
1520
2530
0Die Y
Die X
Figure 321 Simulated oxide liner thickness variation
60 65 70 75 80 85 90 95 1000
005
01
015
02
025D11 D18
plusmnσ
plusmnσ
Capacitance (fF)
Figure 322 TSV capacitance W2W variability - Probability density func-tions
34 electrical characterization 43
Structure rsquoCrsquo (Fig 311)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 424
Experiment C10 C50
Number of TSVs 4 4
Mean capacitance 4073 f F 4125 f F
Standard deviation (σ) 84 f F 74 f F
Population within σ 695 653
Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo
Variable pitch TSV capacitance
The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm
were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires
As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50
are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability
To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]
Measurement accuracy of CBCM
The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we
3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)
35 summary and conclusions 44
376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320
001
002
003
004
005
006C10 C50
Capacitance (fF)
plusmnσC10
plusmnσC50
695 653
Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function
-80 -60 -40 -20 00 20 40 60 80 1000
002
004
006
008
01
012
014
016
018σ-σ
688
Deviation from mean (fF)
Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function
observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs
connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance
35 summary and conclusions
In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement
35 summary and conclusions 45
Method LCR CBCM
Total dies 472 472
Processed dies (excluding outliers) 393 414
Mean capacitance 820 f F 895 f F
Standard deviation (σ) 23 f F 22 f F
Coefficient of variation (3σmicro) 86 74
Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo
results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]
The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration
The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV
parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM
In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process
4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E
41 introduction
The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV
filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance
Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models
In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]
42 tsv-induced thermomechanical stress
Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or
46
42 tsv-induced thermomechanical stress 47
Material CTE at 20 ˚C (10minus6K)
Si 3
Cu 17
W 45
Table 41 Coefficients of thermal expansion
Figure 41 TSV thermomechanical stress simulation [40]
polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate
Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction
It is well known that stress can affect carrier mobility in MOS-
FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance
Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type
43 test structure implementation 48
Figure 42 TSV thermomechanical stress interaction between two TSVs[40]
proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative
The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections
43 test structure implementation
TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible
43 test structure implementation 49
Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]
The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET
devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types
Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP
die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45
43 test structure implementation 50
A0
B0
C0
D0
E0
F0
G0
H0
I0
J0
A90
B90
E90
F90
G90
J90
DS3 DF3
DF2 DS0
DS2 DF0
DF1 DS1
SS SF
DEC6 VDD
DEC7 GND
DEC8 VG
DEC4 DEC9
DEC5 GND
DEC2 DEC3
DEC0 DEC1
Arraydecoder
Draindecoder
Figure 44 PTP1 test pad mod-ule (layout)
Figure 45 PTP1 test pad mod-ule (micrograph)
43 test structure implementation 51
Metal2Metal1PolyActive
TSV
MOSFETs
07μm007μm 0deg
07μm007μm 90deg
05μm05μm 90deg
05μm05μm 0deg
Figure 46 MOSFET dimensions and rotations
431 MOSFET arrays
Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)
The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs
lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44
The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)
Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution
43 test structure implementation 52
TSV
MOSFETs Substrate taps
Figure 47 1 TSV MOSFET array configuration
MOSFETs Substrate taps
TSV TSV
TSV TSV
Figure 48 4 diagonal TSVs MOSFET array configuration
43 test structure implementation 53
MOSFETs Substrate taps
TSV
TSV TSV
TSV
Figure 49 4 lateral TSVs MOSFET array configuration
MOSFETs Substrate taps
TSV TSV TSV
TSV TSV TSV
TSV TSV TSV
Figure 410 9 TSVs MOSFET array configuration
43 test structure implementation 54
Configuration RotationFET
width(microm)
FETlength(microm)
Name
0 0deg 07 007 A0
1 0deg 07 007 B0
4diagonal 0deg 07 007 C0
4lateral 0deg 07 007 D0
9 0deg 07 007 E0
0 0deg 05 05 F0
1 0deg 05 05 G0
4diagonal 0deg 05 05 H0
4lateral 0deg 05 05 I0
9 0deg 05 05 J0
0 90deg 07 007 A90
1 90deg 07 007 B90
9 90deg 07 007 E90
0 90deg 05 05 F90
1 90deg 05 05 G90
9 90deg 05 05 J90
Table 42 MOSFET array configurations for test pad modulesPTP1PTP2
43 test structure implementation 55
2μm 1μm
Metal2Metal1PolyActive
FET
DUMMY
TAP
TSV
Figure 411 MOSFET array detail
432 Digital selection logic
Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement
The procedure for selecting a single MOSFET device for meas-urement was as follows
bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled
bull MOSFET Gate terminal (G0 to G15) was connected to IO
pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates
bull MOSFET Drain terminal (D0 to D15) was connected to IO
pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder
43 test structure implementation 56
D0
D15
DF[0-3]
DS[0-3]
Drain-select decoder (2-bit)
SSF
SS
Array-select decoder (4-bit)
VG
Gate-select decoder (4-bit)
G0 G15
MOSFET Array
Figure 412 Digital selection logic
and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit
Drain-select decoder was adequate for accessing all 16Drain terminals in the array
When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)
A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating
43 test structure implementation 57
Gatedecoder
Metal2Metal1PolyActive
Transmissiongates (DS)
Transmission gates (G)
MOSFETArray
Figure 413 MOSFET array with digital selection logic
43 test structure implementation 58
BLK
A[0] A[1] A[2] A[3]
DEC[0-15]
Figure 414 4-bit Gate-enableArray-enable decoder
43 test structure implementation 59
DEC
VGGVDD
Transmission gate
Pull-downtransistor
Figure 415 Transmission gate for MOSFET Gate terminals
Transmission gates
FORCE
SENSE
DRAINSOURCE
EN
VDD
Figure 416 Transmission gates for MOSFET DrainSource terminals
43 test structure implementation 60
A[0] A[1]
DEC[0-3]
Figure 417 2-bit Drain-select decoder
44 experimental measurements 61
Configuration Type Name Wafer
1 No-TSVPMOS
(05microm times 05microm)F0 D08
2 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D08
3 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D20
4 1-TSV 0ordm 60 ordmCPMOS
(05microm times 05microm)G0 D08
54-lateral TSV 0ordm 25
ordmCPMOS
(05microm times 05microm)I0 D08
6 1-TSV 90ordm 25 ordmCPMOS
(05microm times 05microm)G90 D08
7 1-TSV 0ordm 25 ordmCPMOS (07microm times
007microm)B0 D08
Table 43 MOSFET array configurations selected for experimental eval-uation
44 experimental measurements
Test modules PTP1 and PTP2 implementing the presented MOS-
FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43
441 Data processing methodology
For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress
Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing
44 experimental measurements 62
ABCDEFBBC
FCDDFBCFFC
FCB
A
BAACADE
CECBBC
CFFCDE
B$BCFFBDBCBF
amp(BFC)B
BC
FCDDFA
CAF
Figure 418 Processing of measurement results
the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (
radic98) The described procedure for
processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not
placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)
442 Measurement setup
For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO
signal generation and measurement was provided by standard
44 experimental measurements 63
ABCDECDBF
ABECBF
AB
AB
Figure 419 Interpolation of measurement results
laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420
The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44
ABCCDE
F
AA
A
DBCB
Figure 420 Measurement setup
44 experimental measurements 64
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
DEC[0-9] InputDigital Output
(GATEDRAINARRAY SelectDecoders Input)
DF[0-3] Input SMU (Drain Force terminal)
DS[0-3] Output SMU (Drain Sense terminal)
SF Input SMU (Source Force terminal)
SS Output SMU (Source Sense terminal)
VG Input Digital Output (Gate voltage)
Table 44 PTP1PTP2 test pad module instrument connections
443 PMOS (long channel)
No-TSV configuration
This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance
As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure
1 TSV configuration
In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo
44 experimental measurements 65
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
096
097
098
099
100
101
Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)
and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements
To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm
The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph
44 experimental measurements 66
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
105
110
115
120TSV
Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)
44 experimental measurements 67
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
13
TSV
Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)
we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV
even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after
increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
09
1
11
12
13
14
Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)
44 experimental measurements 68
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115TSV
Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)
at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)
4 lateral TSV configuration
In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430
respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)
90˚ rotation TSV configuration
In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has
44 experimental measurements 69
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
085
090
095
100
105
110
115
TSV
TSV TSV
TSV
Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)
44 experimental measurements 70
2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309
095
1
105
11
115
12
X16
Distance (μm)
Normalized
Ion
TSV TSV
Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)
3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098
1102
Y14
Distance (μm)
Normalized
Ion
TSV TSV
Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)
44 experimental measurements 71
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115
120TSV
Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)
been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm
444 PMOS (short channel)
In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
08509
0951
10511
11512
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)
44 experimental measurements 72
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
094
096
098
100
102
104
106
108
110
TSV
Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)
measurements for X = 16 and Y = 14 in Figure 434 At at 1microm
distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)
45 summary and conclusions 73
Size
Waf
er
Con
figu
rati
on
Rot
atio
n
Eff
ect
1micro
mla
t
Eff
ect
1micro
mlo
ng
KO
Z
L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm
L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm
L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm
L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm
L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm
S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm
Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices
45 summary and conclusions
In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET
device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors
The structure was implemented in Imecrsquos experimental 65nm
3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations
In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits
Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short
45 summary and conclusions 74
channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress
Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure
1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex
2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy
3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away
4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic
The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated
In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs
5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S
51 introduction
TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2
DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs
increases linearly with the number of tiers and interconnections(Figure 51)
The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm
node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)
In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]
52 energy-recovery tsv interconnects
As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-
75
52 energy-recovery tsv interconnects 76
1XTSV
Tier1
TSV
Tier0LOGIC 0
LOGIC 0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n Tier1
Tier0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n
Tier(m-1)
Tier(m)
mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-
tion density
recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]
EER prop
(
RCL
T
)
CLV2DD (51)
In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]
ECMOS =12
CLV2DD (52)
Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits
EER
ECMOSprop
2RCL
T(53)
From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small
In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC
52 energy-recovery tsv interconnects 77
Tier2Tier1TSV
LOGIC 1
LOGIC 2
driver
TSV
LOGIC 1
LOGIC 2
CMOS
driver
Convert
Energy-recovery
PCG
Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery
can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV
interconnectsThe differences between a typical CMOS driver in a TSV-based
3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form
Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values
52 energy-recovery tsv interconnects 78
IN IN
ININ
OUT OUT
IN
IN
OUT
OUT
Figure 53 Adiabatic driver
Figure 54 A simple pulse-to-level converter (P2LC)
An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded
The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)
For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with
53 analysis 79
VDD2
Rind
RTG
AdiabaticDriver
CL
M1
T=1f
L ind
RTG
CL
Figure 55 Resonant pulse generator with dual-rail data encoding
Tier2
Tier1
f-fPULSE
VDD2CLK IN
Delay1CLK1
CLK0
Delay2CLK2OUT1
Data In
Data Out
CLKRES
TSV TSV TSV
CLKBufferAdiabatic
Driver
P2LC
f-f
Data1
M1
Figure 56 Energy-recovery scheme for TSV interconnects
a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at
f =1
2πradic
LindCL(54)
A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56
53 analysis
The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy
53 analysis 80
losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme
The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be
EER = EAD + Eind + EM1 + EP2L (55)
Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs
architecture used and could be easily extracted from simulationdata as will be shown later in this chapter
531 Adiabatic driver
There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS
dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances
The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle
ETG = 2ξ
(
RTGCL12 T
)
CLV2DD =
π2
2(RTGCL f )CLV2
DD (56)
The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient
53 analysis 81
Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV
capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be
CL = CTSV + Cn + 6CD (57)
The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience
κTG = RTGCn rArr RTG =κTG
Cn(58)
Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2
DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver
EAD = CnV2DD +
π2
2κTG
Cnf [CTSV + Cn + 6CD]
2 V2DD (59)
The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2
2 κTG f to a variable (y = π2
2 κTG f ) Equation 59then becomes
EAD = D middot CnV2DD +
y
Cn[CTSV + (6b + 1)Cn]
2 V2DD
=
[
Cn
(
D
y+ (6b + 1)2
)
+1
CnC2
TSV
]
middot yV2DD
+(12b + 2)CTSV middot yV2DD (510)
Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when
53 analysis 82
they become equal Solving equation Cn
(
Dy + (6b + 1)2
)
= 1Cn
C2TSV
for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV
Cn(opt) =
radic
[D
y+ (6b + 1)2]minus1CTSV (511)
Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver
EAD(min) =
[radic
D
(6b + 1)2y+ 1 + 1
]
middot (12b+ 2)yCTSVV2DD (512)
532 Inductorrsquos parasitic resistance
The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as
Qind =1
Rind
radic
Lind
CLrArr Rind =
1QindCL2π f
(513)
Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513
Eind =π2
2(RindCL f )CLV2
DD =π
4CL
QindV2
DD (514)
533 Switch M1
Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1
Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as
EM1(min) = 2IM1(rms)VGM1
radic
mκM1
f(515)
54 simulations 83
Delay2
CLK1
CLK2Delay1
CLKW
PULSE
CLK0T
PD
OUT1
PULSE
P2L
RES
CLK
Figure 57 Energy-recovery timing constraints
534 Timing Constraints
The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57
The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)
54 simulations
To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the
54 simulations 84
M1IN IN
IN
OUT OUT
IN
Figure 58 Test case for the evaluation of the theoretical model
Q250MHz 500MHz 750MHz
T S e () T S e () T S e ()
6 360 409 -120 476 516 -78 578 613 -57
8 308 347 -114 417 449 -70 515 543 -52
10 276 312 -115 382 411 -70 477 498 -43
12 255 289 -119 358 386 -71 451 472 -44
Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)
adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58
The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51
The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations
In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency
54 simulations 85
5 6 7 8 9 10 11 12 130
10
20
30
40
50
60
70
T250 S250 T500 S500 T750 S750
Q factor
Energydissipation(fF)
Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)
5 6 7 8 9 10 11 12 13
-14
-12
-10
-8
-6
-4
-2
0
250MHz 500MHz 750MHz
Q factor
Error()
Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator
The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation
54 simulations 86
M1
TSV
TSV
TSV
Data in OutCMOS
OutER
Energy-recovery
CMOS
P2LC
Adiabaticdriver
Figure 511 Test case for comparison of energy-recovery to CMOS
541 Comparison to CMOS
When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F
for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is
described by Equation 52 Appending the data switching activity(D)
ECMOS = D middot 12
CLV2DD (516)
Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS
circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison
Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range
54 simulations 87
Figure 512 Improved P2LC design
0 100 200 300 400 500 600 700 800 9000
2
4
6
8
10
12
14
Frequency (MHz)
Energy
dissipation(fJ)
Figure 513 Simulated energy dissipation of improved P2LC design
Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current
The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor
54 simulations 88
4 6 8 10 12 14 16 18 20 22-20
-10
0
10
20
30
40
50
60
800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor
Energyred uction()
Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)
4 6 8 10 12 14 16 18 20 220050
100150200250300350400450
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)
values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors
In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies
In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q
factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total
In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS
circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in
54 simulations 89
4 6 8 10 12 14 16 18 20 2200
100
200
300
400
500
600
700
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)
20 40 60 80 100 120 140 160 180 200 220-100
00
100
200
300
400
500
Q6 Q10 Q14 Q18
TSV capacitance (fF)
Energy
red u
ction(
)
Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)
Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value
Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz
Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm
node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS
54 simulations 90
4 6 8 10 12 14 16 18 20 22-40
-20
0
20
40
60
80
D=05 D=07 D=09 D=1
Q factor
Energyred uction()
Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)
4 6 8 10 12 14 16 18 20 220
10
20
30
40
50
60
130nm 65nm
Q factor
Energyred uction()
Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)
55 summary and conclusions 91
55 summary and conclusions
The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process
The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement
Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme
The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process
6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R
61 introduction
In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance
Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements
62 physical implementation
The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5
The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61
The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz
were calculated using the methodology developed in Chapter 5
and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery
circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)
92
62 physical implementation 93
AB
AB
CDEFD
C
ABCD
EFD
C
Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)
Adiabatic driver 80-bit circuit
Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )
RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)
CL 207 f F (Eq57) Lind 613nH (Eq54)
Table 61 Energy-recovery IO circuit parameters
62 physical implementation 94
IntegratedInductor
VINDOER
PulseGenerator
S1AS2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
Figure 62 Energy-recovery IO circuit schematic diagram
Metal2Metal1PolyActiveVDD
GND
RES_CLK
DATA
OUT
OUT_B
DATA
OUT_B OUT
RES_CLK
GND
VDD
Figure 63 Adiabatic driver (layoutschematic view)
the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs
621 Adiabatic driver
The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63
62 physical implementation 95
Metal2Metal1PolyActive
IN
OUT_B
VDD
IN_B
OUT
GND
OUT_B
IN
GND
VDD
OUT
IN_B
Figure 64 P2LC (layoutschematic view)
622 Pulse-to-level converter
The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals
623 Data generator
A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1
As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE
624 Pulse generator
The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the
62 physical implementation 96
Metal2Metal1PolyActive EN
CLK
VDD
Q
GND
D
CLKGND
ENVDD
QF-F
D
GND
VDDCLK
D Q
Figure 65 Data generator (layoutschematic view)
Delay1
WPULSE
PULSE
CLKTCLK
DATA
CLK_RES
Figure 66 Timing constraints for the energy-recovery demonstratorcircuit
62 physical implementation 97
PULSES1A
S2
GND
VDD
Metal2Metal1PolyActive
VDD
S1
S2
GND
PULSE
Figure 67 Pulse generator (layoutschematic view)
S2ΔPhase
WPULSE
PULSE
S1A
Figure 68 Pulse generator InputOutput signals
gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control
For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68
625 Integrated inductor
The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-
1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)
62 physical implementation 98
100 200 300 400 500 600 700 8005
7
9
11
13
15
17
19
21
10
15
20
25
30
35
40
45
Q (Simulated) Energy Reduction ()
Frequency (MHz)
Qfactor
Energyreduction()
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS
lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)
As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited
For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610
626 Top level design
The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)
For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously
62 physical implementation 99
500μ
m
500μm
Figure 610 Planar square spiral inductor designed in ASITIC
shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance
The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG
80 = 16880 = 21Ω However as
the current flows to lower branch levels in the tree the perceivedresistance increases to RTG
16 RTG4 and on the last one to RTG Any
additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512
Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20
4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit
To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of
62 physical implementation 100
Adiabatic drivers
RTG
16
RTG
80
RTG
4
RTG
Level 1Level 2Level 3Level 4
Figure 611 Resonant pulse distribution tree
Branch level Wire resistance
1st 5 middot 16880 = 105mΩ
2nd 5 middot 16816 = 525mΩ
3rd 5 middot 1684 = 21Ω
4th 5 middot 168Ω = 84Ω
Total load 168+8480 + 21
20 + 05255 + 0105 = 252Ω
Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree
63 post-layout simulation and comparison 101
+-
+-Metal2 Metal1
180μ
m
100μm
Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)
the energy-recovery IO circuit in layout view is illustrated inFigure 613
627 CMOS IO circuit
The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance
The top level layout view of the CMOS IO circuit can be seenin Figure 616
63 post-layout simulation and comparison
The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed
63 post-layout simulation and comparison 102
22mm
12mm
GND GNDVBUF1 VDR
GND
VIND
S1A
S2
EN
OER
Figure 613 Energy-recovery IO circuit in layout view
63 post-layout simulation and comparison 103
OCMOS
S1B DataGenerator
TSV
CMOSDriver
CMOSDriver
TSV Buffer
1
80
EN
Figure 614 CMOS IO circuit schematic diagram
GND
VCMOS
DATA
078u
039u
105u
053u
316u
158u
950u
475u
OUT
Figure 615 CMOS driver schematic
12mm
12mm
OCMOS S1B GND VBUF2 VCMOSGND
GND GNDEN VCMOS
Figure 616 CMOS IO circuit in layout view
63 post-layout simulation and comparison 104
Energy dissipation (80-bits cycle)
Inductor 11pJ (Eq514)
Switch M1 092pJ [ ]
P2LCs 098pJ (Simulated)
Adiabatic drivers 315pJ (Eq512)
Total 615pJ
Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz
After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency
Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme
As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the
63 post-layout simulation and comparison 105
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14RES_CLK
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14DATA
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14OER
Time (ns)
V(V
)
Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit
64 experimental measurements 106
AreaEnergy
dissipation(Theory)
Energydissipation
(Post-layout)
CMOS 144mm2922pJ
(115 f Jbit)
(Eq52)
1265pJ
(158 f Jbit)
Energy recovery 264mm2 615pJ
(77 f Jbit)
768pJ
(96 f Jbit)
Difference +83 minus333 minus393
Table 64 Post-layout simulated energy dissipation Area require-ments
theoretically calculated 333 up to 393 in the post-layoutsimulations
64 experimental measurements
The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively
641 Measurement setup
For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620
The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66
642 Results
CMOS IO circuit
In Figure 621 the input signal S1B and the output signal OCMOS
are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency
64 experimental measurements 107
Figure 618 Energy-recovery IO circuit micrograph
Figure 619 CMOS IO circuit micrograph
AABC
DEFF
D
EEBA
EBBA
Figure 620 Measurement setup
64 experimental measurements 108
Test pad Direction Instrument connection
VIND Power Power supply - gt06V
GND Power Power supply - 0V
S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)
EN InputDigital Output (enable outputdata switching)
OER Output Oscilloscope
VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V
VBUF1 Power Power supply (Buffers) - 12V
Table 65 Energy-recovery circuit instrument connections
Test pad Direction Instrument connection
VCMOS PowerPower supply (CMOS drivers) -12V
GND Power Power supply - 0V
VBUF2 Power Power supply (Buffers) - 12V
S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
OCMOS Output Oscilloscope
EN InputDigital Output (enable outputdata switching)
Table 66 CMOS circuit instrument connections
64 experimental measurements 109
S1B
OCMOS
Figure 621 Oscilloscope output for signals S1B and OCMOS
100 200 300 400 500 600 700 800012345678910
Die1 Die2 Simulation
Frequency (MHz)
Powerdissipation(mW)
Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit
that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected
The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations
Energy-recovery IO circuit
Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER
64 experimental measurements 110
100 200 300 400 500 600 700 800024681012141618
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 623 Measuredsimulated energy dissipation for the CMOScircuit
S1A
OER
Figure 624 Oscilloscope output for signals S1A and OER
shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected
The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)
Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate
65 summary and conclusions 111
200 250 300 350 400 450 500 550 600 65002468101214161820
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit
IntegratedInductor
VINDOER
PulseGenerator
S1A
S2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
200pF
Figure 626 Energy-recovery IO circuit with fault hypothesis
properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER
switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation
65 summary and conclusions
In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them
The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-
65 summary and conclusions 112
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050
055
060
065
070
075
080RES_CLK
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500
02
04
06
08
10
12
14OER
Time (ns)
V(V
)
Figure 627 Simulation of circuit with fault hypothesis
65 summary and conclusions 113
200 250 300 350 400 450 500 550 600 6500
5
10
15
20
25
Die1 Die2 Simulation CL=200pF
Frequency (MHz)
Ene
rgy
diss
ipat
ion
(pJ)
Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis
ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS
circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated
CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
7C O N C L U S I O N S
Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density
As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV
devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs
Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-
FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process
Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process
The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work
114
71 main contributions 115
71 main contributions
From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data
The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations
In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design
72 future work 116
approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances
The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
72 future work
The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV
capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]
In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes
1 The transistors should be in a regular grid in the array forsimple processing of the measurement data
72 future work 117
2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy
3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV
4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues
Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility
In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory
AA P P E N D I X A P H O T O S
Figure A1 3D65 300mm wafer (on chuck)
Figure A2 Probe card
118
appendix a photos 119
Figure A3 3D130 200mm wafer (on chuck)
ProbeHead
Microscope
Figure A4 Probe head
BA P P E N D I X B S TAT I S T I C S
mean (micro)
In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as
micro =1n
n
sumi=1
xi
median (m)
The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean
standard deviation (σ)
Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation
A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability
standard deviation of the mean (σmicro )
The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated
Assuming there is a data set X with n values xi of mean micro
120
appendix b statistics 121
variance(X) = σ2X
variance(micro) = variance( 1n sum
ni=1 xi) =
1n2 variance(sumn
i=1 xi) =1n2 sum
ni=1 variance(xi) =
nn2 variance(X) =1n variance(X) =
σ2Xn rarr
σmicro = σXradicn
coefficient of variation
The coefficient of variation is the ratio of the standard deviationσ to the mean micro
CV =σ
micro
It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage
outlier
Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set
kruskalndash wallis [34]
KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal
B I B L I O G R A P H Y
[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac
currentsourcebroadpurposemn=2602
[2] Max-3d URL httpwwwmicromagiccom
[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet
[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet
[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001
[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI
Design Conf pages 171ndash174 2005 doi 101109ICVD200564
[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf
3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581
[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009
[9] C H Bennett Logical reversibility of computation IBM
Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525
[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038
scientificamerican0785-48
[11] E Beyne 3d system integration technologies In Proc Int
VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113
122
bibliography 123
[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs
22nm-Details_Presentationpdf
[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327
[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534
[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices
Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124
[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942
[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc
IEEE Int Interconnect Technology Conf and 2011 Materials for
Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326
[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf
(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823
[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS
rsquo04 volume 2 2004 doi 101109ISCAS20041329255
[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE
Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260
[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-
bibliography 124
asitics for 3-dimensional integrated circuits In Workshop
Notes Design Automation and Test in Europe (DATE) 2009
[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT
2008 pages 98ndash103 2008 doi 101109IDT20084802475
[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec
beScientificReportSR2009HTML1213307html
[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905
[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE
Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329
[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int
Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311
[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508
[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712
[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii
B9780750677295500446
[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455
bibliography 125
[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591
[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test
Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889
[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on
Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10
1145224081224115
[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical
Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779
[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom
pressroompdfkkuhnKuhn_IWCE_invited_textpdf
[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5
(3)183ndash191 1961 doi 101147rd530183
[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827
[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190
[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf
(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839
[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-
nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079
bibliography 126
[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System
level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm
org101145966747966750
[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453
[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf
PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725
[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE
Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793
[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629
[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp
Exhibition (DATE) pages 1689ndash1694 2010
[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573
[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190
[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi
bibliography 127
A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices
Meeting (IEDM) 2010 doi 101109IEDM20105703278
[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727
[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10
1109JPROC1998658762
[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test
Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103
[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc
56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676
[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443
[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88
(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom
sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on
[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303
bibliography 128
[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium
on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145
10576611057669
[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872
[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004
[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-
sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888
[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156
[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc
IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522
[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501
[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667
[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477
[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-
tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609
bibliography 129
[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology
Conf 2006 doi 101109ECTC20061645678
[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power
Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220
[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc
Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786
[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-
Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338
[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088
[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044
[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int
Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763
[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-
tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http
dxdoiorg101007BF01788562 101007BF01788562
bibliography 130
[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-
electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww
sciencedirectcomsciencearticleB6V44-4829XDJ-2N
2f5b2c221dcf240c38cd13b7b3432f913
[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182
[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541
[78] NHE Weste and DF Harris CMOS VLSI design a cir-
cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775
[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite
printsteve_wozniak_interview
[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-
tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716
[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect
Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710
[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746
[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764
[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915
bibliography 131
[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610
- Abstract
- Publications
- Acknowledgments
- Contents
- List of Figures
- List of Tables
- Acronyms
- 1 Introduction
-
- 11 Research goals and contribution
- 12 Thesis outline
-
- 2 Background
-
- 21 3D integrated circuits
-
- 211 3D integration approaches
- 212 TSV interconnects
-
- 22 Energy-recovery logic
-
- 221 Energy-recovery architectures
- 222 Power-clock generators
-
- 23 Full-custom design flow
- 24 Summary
-
- 3 Characterization of TSV capacitance
-
- 31 Introduction
- 32 TSV parasitic capacitance
- 33 Integrated capacitance measurement techniques
- 34 Electrical characterization
-
- 341 Physical implementation
- 342 Measurement setup
- 343 Experimental results
-
- 35 Summary and conclusions
-
- 4 TSV proximity impact on MOSFET performance
-
- 41 Introduction
- 42 TSV-induced thermomechanical stress
- 43 Test structure implementation
-
- 431 MOSFET arrays
- 432 Digital selection logic
-
- 44 Experimental measurements
-
- 441 Data processing methodology
- 442 Measurement setup
- 443 PMOS (long channel)
- 444 PMOS (short channel)
-
- 45 Summary and conclusions
-
- 5 Evaluation of energy-recovery for TSV interconnects
-
- 51 Introduction
- 52 Energy-recovery TSV interconnects
- 53 Analysis
-
- 531 Adiabatic driver
- 532 Inductors parasitic resistance
- 533 Switch M1
- 534 Timing Constraints
-
- 54 Simulations
-
- 541 Comparison to CMOS
-
- 55 Summary and conclusions
-
- 6 Energy-recovery 3D-IC demonstrator
-
- 61 Introduction
- 62 Physical implementation
-
- 621 Adiabatic driver
- 622 Pulse-to-level converter
- 623 Data generator
- 624 Pulse generator
- 625 Integrated inductor
- 626 Top level design
- 627 CMOS IO circuit
-
- 63 Post-layout simulation and comparison
- 64 Experimental measurements
-
- 641 Measurement setup
- 642 Results
-
- 65 Summary and conclusions
-
- 7 Conclusions
-
- 71 Main contributions
- 72 Future work
-
- A Appendix A Photos
- B Appendix B Statistics
- Bibliography
-
O P T I M I Z I N G T H E I N T E G R AT I O N A N DE N E R G Y E F F I C I E N C Y O F T H R O U G H
S I L I C O N V I A - B A S E D 3 D I N T E R C O N N E C T S
panagiotis asimakopoulos
A Thesis Submitted for the Degree of Doctor of Philosophyat Newcastle University
School of Electrical Electronic and Computer Engineering
Newcastle upon Tyne
November 2011
Panagiotis Asimakopoulos Optimizing the integration and energy
efficiency of through silicon via-based 3D interconnects PhD Thesiscopy November 2011
A B S T R A C T
The aggressive scaling of CMOS process technology has been driv-ing the rapid growth of the semiconductor industry for more thanthree decades In recent years the performance gains enabledby CMOS scaling have been increasingly challenged by highly-parasitic on-chip interconnects as wire parasitics do not scale atthe same pace Emerging 3D integration technologies based onvertical through-silicon vias (TSVs) promise a solution to the inter-connect performance bottleneck along with reduced fabricationcost and heterogeneous integration
As TSVs are a relatively recent interconnect technology innov-ative test structures are required to evaluate and optimise theprocess as well as extract parameters for the generation of designrules and models From the circuit designerrsquos perspective criticalTSV characteristics are its parasitic capacitance and thermomech-anical stress distribution This work proposes new test structuresfor extracting these characteristics The structures were fabric-ated on a 65nm 3D process and used for the evaluation of thattechnology
Furthermore as TSVs are implemented in large densely inter-connected 3D-system-on-chips (SoCs) the TSV parasitic capacitancemay become an important source of energy dissipation Typicallow-power techniques based on voltage scaling can be usedthough this represents a technical challenge in modern techno-logy nodes In this work a novel TSV interconnection scheme isproposed based on reversible computing which shows frequency-dependent energy dissipation The scheme is analysed using the-oretical modelling while a demonstrator IC was designed basedon the developed theory and fabricated on a 130nm 3D process
iii
P U B L I C AT I O N S
Some ideas and figures have appeared previously in the followingpublications
1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper
award)
2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-
makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010
3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010
4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009
5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008
iv
In the end I hope therersquos a little note somewhere
that says I designed a good computer
mdash Steve Wozniak [79]
A C K N O W L E D G E M E N T S
First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts
As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm
Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec
Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools
v
C O N T E N T S
1 Introduction 1
11 Research goals and contribution 2
12 Thesis outline 3
2 Background 5
21 3D integrated circuits 5
211 3D integration approaches 6
212 TSV interconnects 9
22 Energy-recovery logic 13
221 Energy-recovery architectures 17
222 Power-clock generators 18
23 Full-custom design flow 20
24 Summary 21
3 Characterization of TSV capacitance 23
31 Introduction 23
32 TSV parasitic capacitance 23
33 Integrated capacitance measurement techniques 24
34 Electrical characterization 27
341 Physical implementation 28
342 Measurement setup 36
343 Experimental results 37
35 Summary and conclusions 44
4 TSV proximity impact on MOSFET performance 46
41 Introduction 46
42 TSV-induced thermomechanical stress 46
43 Test structure implementation 48
431 MOSFET arrays 51
432 Digital selection logic 55
44 Experimental measurements 61
441 Data processing methodology 61
442 Measurement setup 62
443 PMOS (long channel) 64
444 PMOS (short channel) 71
45 Summary and conclusions 73
5 Evaluation of energy-recovery for TSV interconnects 75
51 Introduction 75
52 Energy-recovery TSV interconnects 75
53 Analysis 79
531 Adiabatic driver 80
532 Inductorrsquos parasitic resistance 82
533 Switch M1 82
534 Timing Constraints 83
54 Simulations 83
541 Comparison to CMOS 86
vi
contents vii
55 Summary and conclusions 91
6 Energy-recovery 3D-IC demonstrator 92
61 Introduction 92
62 Physical implementation 92
621 Adiabatic driver 94
622 Pulse-to-level converter 95
623 Data generator 95
624 Pulse generator 95
625 Integrated inductor 97
626 Top level design 98
627 CMOS IO circuit 101
63 Post-layout simulation and comparison 101
64 Experimental measurements 106
641 Measurement setup 106
642 Results 106
65 Summary and conclusions 111
7 Conclusions 114
71 Main contributions 115
72 Future work 116
a Appendix A Photos 118
b Appendix B Statistics 120
bibliography 122
L I S T O F F I G U R E S
Figure 11 Scaling trend of logic and interconnect delay[30] 2
Figure 21 Comparison of 2D3D Integration 5
Figure 22 System-in-a-package 6
Figure 23 Package-on-package 6
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7
Figure 25 3D wafer-level-packaging 7
Figure 26 3D-stacked IC 8
Figure 27 TSV process steps 9
Figure 28 Before thinningafter thinning 10
Figure 29 TSV stacking 10
Figure 210 Conventional AND gate 14
Figure 211 CMOS switching 16
Figure 212 Adiabatic switching 16
Figure 213 Basic 2N-2P adiabatic circuit 17
Figure 214 2N-2P timing 17
Figure 215 Adiabatic driver 18
Figure 216 Stepwise charging generator 19
Figure 217 1N power clock generator 20
Figure 218 Design flow 22
Figure 219 Inverter in Virtuoso schematic 22
Figure 220 Inverter in Virtuoso layout 22
Figure 31 Isolated 2D wire capacitance 24
Figure 32 TSV parasitic capacitance 25
Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25
Figure 34 CBCM pseudo-inverters 27
Figure 35 Current as a function of frequency in CBCM 28
Figure 36 CBCM pseudo-inverter pair in layout 29
Figure 37 PRC1 test pad module (layout) 30
Figure 38 PRC1 test pad module (micrograph) 30
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34
Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35
viii
List of Figures ix
Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35
Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36
Figure 316 Measurement setup 36
Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39
Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40
Figure 319 TSV capacitance measurement results onwafer D11 40
Figure 320 TSV capacitance measurement results onwafer D18 41
Figure 321 Simulated oxide liner thickness variation 42
Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42
Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44
Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44
Figure 41 TSV thermomechanical stress simulation [40] 47
Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48
Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49
Figure 44 PTP1 test pad module (layout) 50
Figure 45 PTP1 test pad module (micrograph) 50
Figure 46 MOSFET dimensions and rotations 51
Figure 47 1 TSV MOSFET array configuration 52
Figure 48 4 diagonal TSVs MOSFET array configuration52
Figure 49 4 lateral TSVs MOSFET array configuration 53
Figure 410 9 TSVs MOSFET array configuration 53
Figure 411 MOSFET array detail 55
Figure 412 Digital selection logic 56
Figure 413 MOSFET array with digital selection logic 57
Figure 414 4-bit Gate-enableArray-enable decoder 58
Figure 415 Transmission gate for MOSFET Gate terminals59
Figure 416 Transmission gates for MOSFET Drain-Source terminals 59
Figure 417 2-bit Drain-select decoder 60
Figure 418 Processing of measurement results 62
Figure 419 Interpolation of measurement results 63
Figure 420 Measurement setup 63
Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65
List of Figures x
Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66
Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66
Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67
Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67
Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68
Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69
Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69
Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70
Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70
Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71
Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71
Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72
Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72
Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76
Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77
Figure 53 Adiabatic driver 78
Figure 54 A simple pulse-to-level converter (P2LC) 78
Figure 55 Resonant pulse generator with dual-rail dataencoding 79
Figure 56 Energy-recovery scheme for TSV interconnects79
Figure 57 Energy-recovery timing constraints 83
Figure 58 Test case for the evaluation of the theoret-ical model 84
Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85
Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85
Figure 511 Test case for comparison of energy-recoveryto CMOS 86
Figure 512 Improved P2LC design 87
List of Figures xi
Figure 513 Simulated energy dissipation of improvedP2LC design 87
Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88
Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88
Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89
Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =
80 f F f = 500MHz) 89
Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90
Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90
Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93
Figure 62 Energy-recovery IO circuit schematic dia-gram 94
Figure 63 Adiabatic driver (layoutschematic view) 94
Figure 64 P2LC (layoutschematic view) 95
Figure 65 Data generator (layoutschematic view) 96
Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96
Figure 67 Pulse generator (layoutschematic view) 97
Figure 68 Pulse generator InputOutput signals 97
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98
Figure 610 Planar square spiral inductor designed inASITIC 99
Figure 611 Resonant pulse distribution tree 100
Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101
Figure 613 Energy-recovery IO circuit in layout view 102
Figure 614 CMOS IO circuit schematic diagram 103
Figure 615 CMOS driver schematic 103
Figure 616 CMOS IO circuit in layout view 103
Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105
Figure 618 Energy-recovery IO circuit micrograph 107
List of Figures xii
Figure 619 CMOS IO circuit micrograph 107
Figure 620 Measurement setup 107
Figure 621 Oscilloscope output for signals S1B and OC-MOS 109
Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109
Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110
Figure 624 Oscilloscope output for signals S1A and OER110
Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111
Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111
Figure 627 Simulation of circuit with fault hypothesis 112
Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113
Figure A1 3D65 300mm wafer (on chuck) 118
Figure A2 Probe card 118
Figure A3 3D130 200mm wafer (on chuck) 119
Figure A4 Probe head 119
L I S T O F TA B L E S
Table 21 3D interconnect technologies comparison 8
Table 22 Technology parameters for Imec 3D130 process11
Table 23 Technology parameters for Imec 3D65 process11
Table 24 Comparison energy-recoveryconventionalCMOS 16
Table 31 Test structures on the PRC1 module 31
Table 32 PRC1 test pad instrument connections 37
Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38
Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39
Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43
Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45
Table 41 Coefficients of thermal expansion 47
Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54
Table 43 MOSFET array configurations selected forexperimental evaluation 61
Table 44 PTP1PTP2 test pad module instrumentconnections 64
Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73
Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84
Table 61 Energy-recovery IO circuit parameters 93
Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100
Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104
Table 64 Post-layout simulated energy dissipation Area requirements 106
Table 65 Energy-recovery circuit instrument connec-tions 108
Table 66 CMOS circuit instrument connections 108
xiii
A C R O N Y M S
2D two-dimentional
2N-2P 2 nMOS2 pMOS adiabatic logic
3D-IC three-dimentional integrated circuit
3D-SIC 3D stacked-IC
3D-SIP 3D system-in-package
3D three-dimentional
3D-WLP 3D wafer-level-packaging
BCB benzocyclobutene
BEOL back-end-of-line
CAL clocked adiabatic logic
CBCM charge-based capacitance measurement
CMOS complementary metal-oxide-semiconductor
CMP chemical-mechanical planarization
CTE coefficient of thermal expansion
Cu copper
CVD chemical vapor deposition
D2D die-to-die
D2W die-to-wafer
DCVSL differential cascode voltage switch logic
DRC design rule check
DUT device under test
ECRL efficient charge recovery logic
EDA electronic design automation
FEM finite element method
FEOL front-end-of-line
GPIB general purpose interface bus
IC integrated circuit
xiv
acronyms xv
IO inputoutput
KOZ keep-out-zone
LC inductance-capacitance
LVS layout versus schematic
MEMS microelectromechanical systems
MIM metal-insulator-metal
MOSFET metal-oxide-semiconductor field-effect transistor
MOS metal-oxide-semiconductor
MPU microprocessor unit
MuGFET multiple gate field-effect transistor
NMOS n-channel MOSFET
P2LC pulse-to-level converter
PAL pass-transistor adiabatic logic
PCB printed circuit board
PCG power-clock generator
PMD pre-metal dielectric
PMOS p-channel MOSFET
PVT process voltage temperature
QSERL quasi-static energy recovery logic
RC resistance-capacitance
RDL re-distribution layer
RLC resistance-inductance-capacitance
RMS root mean square
SIP system-in-a-package
Si silicon
SMU source measurement unit
SoC system-on-chip
TaN tantalum nitride
TSV through-silicon via
W tungsten
acronyms xvi
W2W wafer-to-wafer
WLP wafer-level-packaging
1I N T R O D U C T I O N
Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965
chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]
As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects
In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)
The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and
1 International Technology Roadmap for Semiconductors
1
11 research goals and contribution 2
Figure 11 Scaling trend of logic and interconnect delay [30]
innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]
Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D
interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density
11 research goals and contribution
As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-
ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]
The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future
12 thesis outline 3
3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection
The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution
The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs
are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process
The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements
12 thesis outline
This thesis is organised into seven Chapters and two Appendixesas follows
2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)
12 thesis outline 4
Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described
Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods
Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process
Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design
Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements
Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work
Appendix A includes photos used as reference in various partsof this work
Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data
2B A C K G R O U N D
The work presented in this thesis contributes in the fields of 3D
integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters
21 3d integrated circuits
CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package
There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy
Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of
ABCDEFBA ABCDEFBA
Figure 21 Comparison of 2D3D Integration
5
21 3d integrated circuits 6
ABCDEFBC
E
Figure 22 System-in-a-package
ABCDEFBC
E
Figure 23 Package-on-package
a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities
The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor
211 3D integration approaches
3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure
bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages
bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)
bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion
The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)
3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps
21 3d integrated circuits 7
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]
A
BCDEAFE
FFDC
CFE
Figure 25 3D wafer-level-packaging
take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]
3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]
The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-
SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost
A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21
21 3d integrated circuits 8
ABCD
EFFDB
F
Figure 26 3D-stacked IC
3D interconnect
technologyAdvantages Disadvantages
3D system-in-package (3D-SIP)
Simple existingtechnology bestheterogeneous
integration
Performance(speed power) low
densitymanufacturing costin high-quantities
3D wafer-level-packaging (3D-
WLP)
Existingtechnology good
compromisebetween
performance-manufacturing
cost
Averageperformance and
density
3D stacked-IC (3D-SIC)
Performance(speed power)
high densitymanufacturing costin high-quantities
Manufacturing costin low-quantities
not ready formanufacturing
Table 21 3D interconnect technologies comparison
21 3d integrated circuits 9
ABBCDE
FBACACDA
ACCD F
Figure 27 TSV process steps
212 TSV interconnects
Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection
In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie
Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively
As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]
21 3d integrated circuits 10
AAAB
CAAAB
ABC
DEF
Figure 28 Before thinningafter thinning
ABC ABC
DE DE
F
DEE
B
CBA
Figure 29 TSV stacking
21 3d integrated circuits 11
Minimum devicechannel length (nm)
130
Supply voltage (V) 12
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 22
TSV resistance (mΩ) ~20
TSV capacitance( f F)
~40
Wafer diameter(mm)
200
Table 22 Technology paramet-ers for Imec 3D130
process
Minimum devicechannel length (nm)
70
Supply voltage (V) 10
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 40
TSV resistance (mΩ) ~40
TSV capacitance( f F)
~80
Wafer diameter(mm)
300
Table 23 Technology paramet-ers for Imec 3D65 pro-cess
Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing
Design tools
Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs
Thermomechanical stress
TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur
21 3d integrated circuits 12
The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance
Thermal analysis
The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink
Electrical characteristics
The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]
Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV
structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process
22 energy-recovery logic 13
Testing
Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems
22 energy-recovery logic
TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology
1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections
2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them
Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that
22 energy-recovery logic 14
Figure 210 Conventional AND gate
were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6
Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded
Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output
Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to
22 energy-recovery logic 15
perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]
In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]
The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =
CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]
E = QVDD = CV2DD (21)
From the total transferred energy only half ( 12 CV2
DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C
The adiabatic charging of an equivalent node capacitance C
is modelled in Figure 212 In contrast to the conventional CMOS
circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to
EAdiabatic prop
(
RC
T
)
CV2DD (22)
The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters
22 energy-recovery logic 16
A
B
CC
D
E
D
Figure 211 CMOS switching
AA
B
C
B
Figure 212 Adiabatic switching
Conventional CMOS Energy-recovery CMOS
2 states True False 3 states True False Off
Energy of information bitdissipated as heat on the
pull-down network
Energy of information bitrecovered by the
power-supply
High speed Lowmedium speed
High energy dissipation Very low energy dissipation
Area efficient Some area overhead
Table 24 Comparison energy-recoveryconventional CMOS
Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]
A comparison between energy-recovery and conventional CMOSis summarised in Table 24
22 energy-recovery logic 17
ABCABC
Figure 213 Basic 2N-2P adiabaticcircuit
ABC
ABC
DEF
EE
DEF
Figure 214 2N-2P timing
221 Energy-recovery architectures
Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency
A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P
circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern
The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel
22 energy-recovery logic 18
Figure 215 Adiabatic driver
MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network
The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]
The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle
Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]
222 Power-clock generators
Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and
22 energy-recovery logic 19
A
A
B
C
C
Figure 216 Stepwise charging generator
recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators
Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow
Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at
f =1
2πradic
LCLOAD
(23)
As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is
23 full-custom design flow 20
A
BCD
E
DFF
CD
Figure 217 1N power clock generator
partially recovered and stored in the inductor for later use RLC
oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip
23 full-custom design flow
All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal
In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available
The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case
1 Cadence Design Systems Inc (httpwwwcadencecom)
24 summary 21
the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step
Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics
24 summary
In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process
2 Mentor Graphics Inc (httpwwwmentorcom)
24 summary 22
ABCDEA
FCCAAE
ABCDEAEC
DE
E
F
CEA
EEAEA
EDE
CAC
E
EC
CAC
Figure 218 Design flow
Figure 219 Inverter in Virtuososchematic
Figure 220 Inverter in Virtuosolayout
3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E
31 introduction
In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology
In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]
From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system
Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated
32 tsv parasitic capacitance
A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric
1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments
23
33 integrated capacitance measurement techniques 24
ABC
Figure 31 Isolated 2D wire capacitance
material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is
C =εox
hwl (31)
In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material
In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)
The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)
33 integrated capacitance measurement techniques
Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV
33 integrated capacitance measurement techniques 25
ABCDE
F
FD
F
F
FD
Figure 32 TSV parasitic capacitance
ABCDE FACDE
ECDE
BCFACDE
FCCDED DBCDE
D
DAB
Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor
33 integrated capacitance measurement techniques 26
characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories
1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]
2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]
The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer
On-chip techniques can achieve better accuracy when the DUT
capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]
In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows
CDUT =Im minus Ire f
VDD middot f(32)
The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of
34 electrical characterization 27
AB
AC
DE
FE
DE
FE
AB
AC
B
Figure 34 CBCM pseudo-inverters
Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot
The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET
capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz
should be adequate to avoid the MOS behaviour of the TSV
34 electrical characterization
A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented
34 electrical characterization 28
AB
CADE
F
AB
E
CAD
B
Figure 35 Current as a function of frequency in CBCM
on the same chip that went beyond the core objectives of thiswork
The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed
341 Physical implementation
In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result
Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics
34 electrical characterization 29
Poly dummy fills
Metal2Metal1PolysiliconActive
1μm 5μm
Figure 36 CBCM pseudo-inverter pair in layout
Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38
The CBCM pseudo-inverters were implemented on either TOP
or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below
Structure rsquoArsquo Single TSV capacitance (TOP die)
The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to
34 electrical characterization 30
DTOP DBTM
B10 A10
BREF AREF
B1 A1
R1 R2
SET VDD
RESET GND
C15 C10
C50 C20
SET VDD
ERC SET
EREF EC Metal2 (TOP)Metal1 (TOP)
Metal1 (BTM)Metal2 (BTM)
Figure 37 PRC1 test pad module(layout)
Figure 38 PRC1 test pad module(micrograph)
34 electrical characterization 31
Structure A B
TSVs - 1 2x5 - 1 2x5
Inverterlocation
TOP TOP TOP BTM BTM BTM
TSVconnection
TOP TOP TOP BTM BTM BTM
Multi-TSVconnection
- - TOP - - BTM
Multi-TSVpitch (microm)
- - 15 - - 15
Pad name AREF A1 A10 BREF B1 B10
Structure C D
TSVs 2x2 2x2 2x2 2x2 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP TOP
TSVconnection
TOP TOP TOP TOP TOP TOP
Multi-TSVconnection
TOP TOP TOP TOP TOP BTM
Multi-TSVpitch (microm)
10 15 20 50 20 20
Pad name C10 C15 C20 C50 DTOP DBTM
Structure R E
TSVs 1 1 - 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP
TSVconnection
TOP TOP - TOP TOP
Multi-TSVconnection
- - - TOP BTM
Multi-TSVpitch (microm)
- - - 20 20
Pad name R1 R2 EREF EC ERC
Table 31 Test structures on the PRC1 module
34 electrical characterization 32
Pseudo-inverters
2x5 TSVs
TSV
Reference
Metal2 (TOP)Metal1 (TOP)
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)
the single TSV DUT the structure also contains 10 parallel TSVs
arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude
Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)
The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side
Structure rsquoCrsquo Variable pitch TSV capacitance
The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the
34 electrical characterization 33
2x5 TSVs
Reference
TSV
Pseudo-inverters
Metal2 (BTM)Metal1 (BTM)
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)
literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process
The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative
Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-
TOM die
The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking
34 electrical characterization 34
2x2 TSVs - 10μm
2x2 TSVs - 15μm
2x2 TSVs - 20μm
2x2 TSVs - 50μm
Metal2 (TOP)Metal1 (TOP)
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die
34 electrical characterization 35
Metal2 (TOP)Metal1 (TOP)
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Driver
Reference
Driver
Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)
Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy
Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability
Structure rsquoErsquo TSV capacitance effect on signal propagation delay
Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult
34 electrical characterization 36
ABC
D
ECBFC
ABC
Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)
ABC
DEF
CD
A
CA
A
Figure 316 Measurement setup
342 Measurement setup
The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316
The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-
2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup
34 electrical characterization 37
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
SET InputPulse generator
(Non-overlapping with RESET)
RESET InputPulse generator
(Non-overlapping with SET)
A[REF 1 10] InputSMU (Inverter source - 1V
supply)
B[REF 1 10] InputSMU (Inverter source - 1V
supply)
C[10 15 2050]
InputSMU (Inverter source - 1V
supply)
D[TOP BTM] InputSMU (Inverter source - 1V
supply)
R[1 2] InputSMU (Inverter source - 1V
supply)
E[REF C RC] Output Oscilloscope (high-impedance)
Table 32 PRC1 test pad instrument connections
sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32
343 Experimental results
There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods
When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy
34 electrical characterization 38
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 454
Mean capacitance 709 f F
Standard deviation (σ) 19 f F
Coefficient of variation (3σmicro) 80
Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo
The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability
Die to die (D2D) variability
The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)
On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid
On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691
34 electrical characterization 39
-60 -40 -20 00 20 40 60 80
σ-σ
706
Deviation from mean (fF)
0
005
01
015
02
025
Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 414
Mean capacitance 895 f F
Standard deviation (σ) 22 f F
Coefficient of variation (3σmicro) 74
Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo
of the experimental data within plusmnσ of the mean capacitancevalue
Correlation between TSV capacitance and oxide liner thickness vari-
ation
The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers
Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication
34 electrical characterization 40
σ-σ
691
Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70
0
002
004
006
008
01
012
014
016
018
02
Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function
075
080
085
090
095
100
105
110
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
075
080
085
090
095
100
105
Wafer area
Figure 319 TSV capacitance measurement results on wafer D11
34 electrical characterization 41
086088090092094096098100102104106108
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
086
088
090
092
094
096
098
100
102
104
106
108
Wafer area
Figure 320 TSV capacitance measurement results on wafer D18
tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo
Wafer to wafer (W2W) variability
Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters
34 electrical characterization 42
109
100
095Nor
mal
ized
oxi
de th
ickn
ess
-5-10 -15 -20 -25 -30
510
1520
2530
0Die Y
Die X
Figure 321 Simulated oxide liner thickness variation
60 65 70 75 80 85 90 95 1000
005
01
015
02
025D11 D18
plusmnσ
plusmnσ
Capacitance (fF)
Figure 322 TSV capacitance W2W variability - Probability density func-tions
34 electrical characterization 43
Structure rsquoCrsquo (Fig 311)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 424
Experiment C10 C50
Number of TSVs 4 4
Mean capacitance 4073 f F 4125 f F
Standard deviation (σ) 84 f F 74 f F
Population within σ 695 653
Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo
Variable pitch TSV capacitance
The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm
were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires
As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50
are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability
To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]
Measurement accuracy of CBCM
The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we
3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)
35 summary and conclusions 44
376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320
001
002
003
004
005
006C10 C50
Capacitance (fF)
plusmnσC10
plusmnσC50
695 653
Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function
-80 -60 -40 -20 00 20 40 60 80 1000
002
004
006
008
01
012
014
016
018σ-σ
688
Deviation from mean (fF)
Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function
observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs
connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance
35 summary and conclusions
In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement
35 summary and conclusions 45
Method LCR CBCM
Total dies 472 472
Processed dies (excluding outliers) 393 414
Mean capacitance 820 f F 895 f F
Standard deviation (σ) 23 f F 22 f F
Coefficient of variation (3σmicro) 86 74
Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo
results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]
The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration
The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV
parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM
In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process
4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E
41 introduction
The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV
filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance
Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models
In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]
42 tsv-induced thermomechanical stress
Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or
46
42 tsv-induced thermomechanical stress 47
Material CTE at 20 ˚C (10minus6K)
Si 3
Cu 17
W 45
Table 41 Coefficients of thermal expansion
Figure 41 TSV thermomechanical stress simulation [40]
polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate
Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction
It is well known that stress can affect carrier mobility in MOS-
FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance
Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type
43 test structure implementation 48
Figure 42 TSV thermomechanical stress interaction between two TSVs[40]
proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative
The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections
43 test structure implementation
TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible
43 test structure implementation 49
Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]
The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET
devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types
Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP
die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45
43 test structure implementation 50
A0
B0
C0
D0
E0
F0
G0
H0
I0
J0
A90
B90
E90
F90
G90
J90
DS3 DF3
DF2 DS0
DS2 DF0
DF1 DS1
SS SF
DEC6 VDD
DEC7 GND
DEC8 VG
DEC4 DEC9
DEC5 GND
DEC2 DEC3
DEC0 DEC1
Arraydecoder
Draindecoder
Figure 44 PTP1 test pad mod-ule (layout)
Figure 45 PTP1 test pad mod-ule (micrograph)
43 test structure implementation 51
Metal2Metal1PolyActive
TSV
MOSFETs
07μm007μm 0deg
07μm007μm 90deg
05μm05μm 90deg
05μm05μm 0deg
Figure 46 MOSFET dimensions and rotations
431 MOSFET arrays
Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)
The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs
lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44
The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)
Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution
43 test structure implementation 52
TSV
MOSFETs Substrate taps
Figure 47 1 TSV MOSFET array configuration
MOSFETs Substrate taps
TSV TSV
TSV TSV
Figure 48 4 diagonal TSVs MOSFET array configuration
43 test structure implementation 53
MOSFETs Substrate taps
TSV
TSV TSV
TSV
Figure 49 4 lateral TSVs MOSFET array configuration
MOSFETs Substrate taps
TSV TSV TSV
TSV TSV TSV
TSV TSV TSV
Figure 410 9 TSVs MOSFET array configuration
43 test structure implementation 54
Configuration RotationFET
width(microm)
FETlength(microm)
Name
0 0deg 07 007 A0
1 0deg 07 007 B0
4diagonal 0deg 07 007 C0
4lateral 0deg 07 007 D0
9 0deg 07 007 E0
0 0deg 05 05 F0
1 0deg 05 05 G0
4diagonal 0deg 05 05 H0
4lateral 0deg 05 05 I0
9 0deg 05 05 J0
0 90deg 07 007 A90
1 90deg 07 007 B90
9 90deg 07 007 E90
0 90deg 05 05 F90
1 90deg 05 05 G90
9 90deg 05 05 J90
Table 42 MOSFET array configurations for test pad modulesPTP1PTP2
43 test structure implementation 55
2μm 1μm
Metal2Metal1PolyActive
FET
DUMMY
TAP
TSV
Figure 411 MOSFET array detail
432 Digital selection logic
Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement
The procedure for selecting a single MOSFET device for meas-urement was as follows
bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled
bull MOSFET Gate terminal (G0 to G15) was connected to IO
pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates
bull MOSFET Drain terminal (D0 to D15) was connected to IO
pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder
43 test structure implementation 56
D0
D15
DF[0-3]
DS[0-3]
Drain-select decoder (2-bit)
SSF
SS
Array-select decoder (4-bit)
VG
Gate-select decoder (4-bit)
G0 G15
MOSFET Array
Figure 412 Digital selection logic
and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit
Drain-select decoder was adequate for accessing all 16Drain terminals in the array
When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)
A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating
43 test structure implementation 57
Gatedecoder
Metal2Metal1PolyActive
Transmissiongates (DS)
Transmission gates (G)
MOSFETArray
Figure 413 MOSFET array with digital selection logic
43 test structure implementation 58
BLK
A[0] A[1] A[2] A[3]
DEC[0-15]
Figure 414 4-bit Gate-enableArray-enable decoder
43 test structure implementation 59
DEC
VGGVDD
Transmission gate
Pull-downtransistor
Figure 415 Transmission gate for MOSFET Gate terminals
Transmission gates
FORCE
SENSE
DRAINSOURCE
EN
VDD
Figure 416 Transmission gates for MOSFET DrainSource terminals
43 test structure implementation 60
A[0] A[1]
DEC[0-3]
Figure 417 2-bit Drain-select decoder
44 experimental measurements 61
Configuration Type Name Wafer
1 No-TSVPMOS
(05microm times 05microm)F0 D08
2 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D08
3 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D20
4 1-TSV 0ordm 60 ordmCPMOS
(05microm times 05microm)G0 D08
54-lateral TSV 0ordm 25
ordmCPMOS
(05microm times 05microm)I0 D08
6 1-TSV 90ordm 25 ordmCPMOS
(05microm times 05microm)G90 D08
7 1-TSV 0ordm 25 ordmCPMOS (07microm times
007microm)B0 D08
Table 43 MOSFET array configurations selected for experimental eval-uation
44 experimental measurements
Test modules PTP1 and PTP2 implementing the presented MOS-
FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43
441 Data processing methodology
For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress
Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing
44 experimental measurements 62
ABCDEFBBC
FCDDFBCFFC
FCB
A
BAACADE
CECBBC
CFFCDE
B$BCFFBDBCBF
amp(BFC)B
BC
FCDDFA
CAF
Figure 418 Processing of measurement results
the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (
radic98) The described procedure for
processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not
placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)
442 Measurement setup
For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO
signal generation and measurement was provided by standard
44 experimental measurements 63
ABCDECDBF
ABECBF
AB
AB
Figure 419 Interpolation of measurement results
laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420
The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44
ABCCDE
F
AA
A
DBCB
Figure 420 Measurement setup
44 experimental measurements 64
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
DEC[0-9] InputDigital Output
(GATEDRAINARRAY SelectDecoders Input)
DF[0-3] Input SMU (Drain Force terminal)
DS[0-3] Output SMU (Drain Sense terminal)
SF Input SMU (Source Force terminal)
SS Output SMU (Source Sense terminal)
VG Input Digital Output (Gate voltage)
Table 44 PTP1PTP2 test pad module instrument connections
443 PMOS (long channel)
No-TSV configuration
This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance
As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure
1 TSV configuration
In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo
44 experimental measurements 65
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
096
097
098
099
100
101
Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)
and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements
To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm
The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph
44 experimental measurements 66
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
105
110
115
120TSV
Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)
44 experimental measurements 67
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
13
TSV
Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)
we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV
even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after
increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
09
1
11
12
13
14
Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)
44 experimental measurements 68
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115TSV
Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)
at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)
4 lateral TSV configuration
In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430
respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)
90˚ rotation TSV configuration
In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has
44 experimental measurements 69
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
085
090
095
100
105
110
115
TSV
TSV TSV
TSV
Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)
44 experimental measurements 70
2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309
095
1
105
11
115
12
X16
Distance (μm)
Normalized
Ion
TSV TSV
Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)
3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098
1102
Y14
Distance (μm)
Normalized
Ion
TSV TSV
Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)
44 experimental measurements 71
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115
120TSV
Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)
been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm
444 PMOS (short channel)
In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
08509
0951
10511
11512
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)
44 experimental measurements 72
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
094
096
098
100
102
104
106
108
110
TSV
Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)
measurements for X = 16 and Y = 14 in Figure 434 At at 1microm
distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)
45 summary and conclusions 73
Size
Waf
er
Con
figu
rati
on
Rot
atio
n
Eff
ect
1micro
mla
t
Eff
ect
1micro
mlo
ng
KO
Z
L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm
L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm
L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm
L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm
L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm
S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm
Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices
45 summary and conclusions
In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET
device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors
The structure was implemented in Imecrsquos experimental 65nm
3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations
In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits
Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short
45 summary and conclusions 74
channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress
Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure
1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex
2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy
3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away
4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic
The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated
In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs
5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S
51 introduction
TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2
DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs
increases linearly with the number of tiers and interconnections(Figure 51)
The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm
node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)
In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]
52 energy-recovery tsv interconnects
As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-
75
52 energy-recovery tsv interconnects 76
1XTSV
Tier1
TSV
Tier0LOGIC 0
LOGIC 0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n Tier1
Tier0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n
Tier(m-1)
Tier(m)
mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-
tion density
recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]
EER prop
(
RCL
T
)
CLV2DD (51)
In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]
ECMOS =12
CLV2DD (52)
Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits
EER
ECMOSprop
2RCL
T(53)
From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small
In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC
52 energy-recovery tsv interconnects 77
Tier2Tier1TSV
LOGIC 1
LOGIC 2
driver
TSV
LOGIC 1
LOGIC 2
CMOS
driver
Convert
Energy-recovery
PCG
Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery
can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV
interconnectsThe differences between a typical CMOS driver in a TSV-based
3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form
Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values
52 energy-recovery tsv interconnects 78
IN IN
ININ
OUT OUT
IN
IN
OUT
OUT
Figure 53 Adiabatic driver
Figure 54 A simple pulse-to-level converter (P2LC)
An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded
The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)
For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with
53 analysis 79
VDD2
Rind
RTG
AdiabaticDriver
CL
M1
T=1f
L ind
RTG
CL
Figure 55 Resonant pulse generator with dual-rail data encoding
Tier2
Tier1
f-fPULSE
VDD2CLK IN
Delay1CLK1
CLK0
Delay2CLK2OUT1
Data In
Data Out
CLKRES
TSV TSV TSV
CLKBufferAdiabatic
Driver
P2LC
f-f
Data1
M1
Figure 56 Energy-recovery scheme for TSV interconnects
a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at
f =1
2πradic
LindCL(54)
A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56
53 analysis
The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy
53 analysis 80
losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme
The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be
EER = EAD + Eind + EM1 + EP2L (55)
Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs
architecture used and could be easily extracted from simulationdata as will be shown later in this chapter
531 Adiabatic driver
There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS
dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances
The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle
ETG = 2ξ
(
RTGCL12 T
)
CLV2DD =
π2
2(RTGCL f )CLV2
DD (56)
The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient
53 analysis 81
Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV
capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be
CL = CTSV + Cn + 6CD (57)
The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience
κTG = RTGCn rArr RTG =κTG
Cn(58)
Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2
DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver
EAD = CnV2DD +
π2
2κTG
Cnf [CTSV + Cn + 6CD]
2 V2DD (59)
The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2
2 κTG f to a variable (y = π2
2 κTG f ) Equation 59then becomes
EAD = D middot CnV2DD +
y
Cn[CTSV + (6b + 1)Cn]
2 V2DD
=
[
Cn
(
D
y+ (6b + 1)2
)
+1
CnC2
TSV
]
middot yV2DD
+(12b + 2)CTSV middot yV2DD (510)
Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when
53 analysis 82
they become equal Solving equation Cn
(
Dy + (6b + 1)2
)
= 1Cn
C2TSV
for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV
Cn(opt) =
radic
[D
y+ (6b + 1)2]minus1CTSV (511)
Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver
EAD(min) =
[radic
D
(6b + 1)2y+ 1 + 1
]
middot (12b+ 2)yCTSVV2DD (512)
532 Inductorrsquos parasitic resistance
The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as
Qind =1
Rind
radic
Lind
CLrArr Rind =
1QindCL2π f
(513)
Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513
Eind =π2
2(RindCL f )CLV2
DD =π
4CL
QindV2
DD (514)
533 Switch M1
Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1
Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as
EM1(min) = 2IM1(rms)VGM1
radic
mκM1
f(515)
54 simulations 83
Delay2
CLK1
CLK2Delay1
CLKW
PULSE
CLK0T
PD
OUT1
PULSE
P2L
RES
CLK
Figure 57 Energy-recovery timing constraints
534 Timing Constraints
The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57
The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)
54 simulations
To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the
54 simulations 84
M1IN IN
IN
OUT OUT
IN
Figure 58 Test case for the evaluation of the theoretical model
Q250MHz 500MHz 750MHz
T S e () T S e () T S e ()
6 360 409 -120 476 516 -78 578 613 -57
8 308 347 -114 417 449 -70 515 543 -52
10 276 312 -115 382 411 -70 477 498 -43
12 255 289 -119 358 386 -71 451 472 -44
Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)
adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58
The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51
The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations
In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency
54 simulations 85
5 6 7 8 9 10 11 12 130
10
20
30
40
50
60
70
T250 S250 T500 S500 T750 S750
Q factor
Energydissipation(fF)
Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)
5 6 7 8 9 10 11 12 13
-14
-12
-10
-8
-6
-4
-2
0
250MHz 500MHz 750MHz
Q factor
Error()
Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator
The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation
54 simulations 86
M1
TSV
TSV
TSV
Data in OutCMOS
OutER
Energy-recovery
CMOS
P2LC
Adiabaticdriver
Figure 511 Test case for comparison of energy-recovery to CMOS
541 Comparison to CMOS
When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F
for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is
described by Equation 52 Appending the data switching activity(D)
ECMOS = D middot 12
CLV2DD (516)
Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS
circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison
Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range
54 simulations 87
Figure 512 Improved P2LC design
0 100 200 300 400 500 600 700 800 9000
2
4
6
8
10
12
14
Frequency (MHz)
Energy
dissipation(fJ)
Figure 513 Simulated energy dissipation of improved P2LC design
Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current
The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor
54 simulations 88
4 6 8 10 12 14 16 18 20 22-20
-10
0
10
20
30
40
50
60
800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor
Energyred uction()
Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)
4 6 8 10 12 14 16 18 20 220050
100150200250300350400450
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)
values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors
In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies
In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q
factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total
In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS
circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in
54 simulations 89
4 6 8 10 12 14 16 18 20 2200
100
200
300
400
500
600
700
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)
20 40 60 80 100 120 140 160 180 200 220-100
00
100
200
300
400
500
Q6 Q10 Q14 Q18
TSV capacitance (fF)
Energy
red u
ction(
)
Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)
Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value
Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz
Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm
node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS
54 simulations 90
4 6 8 10 12 14 16 18 20 22-40
-20
0
20
40
60
80
D=05 D=07 D=09 D=1
Q factor
Energyred uction()
Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)
4 6 8 10 12 14 16 18 20 220
10
20
30
40
50
60
130nm 65nm
Q factor
Energyred uction()
Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)
55 summary and conclusions 91
55 summary and conclusions
The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process
The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement
Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme
The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process
6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R
61 introduction
In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance
Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements
62 physical implementation
The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5
The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61
The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz
were calculated using the methodology developed in Chapter 5
and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery
circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)
92
62 physical implementation 93
AB
AB
CDEFD
C
ABCD
EFD
C
Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)
Adiabatic driver 80-bit circuit
Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )
RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)
CL 207 f F (Eq57) Lind 613nH (Eq54)
Table 61 Energy-recovery IO circuit parameters
62 physical implementation 94
IntegratedInductor
VINDOER
PulseGenerator
S1AS2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
Figure 62 Energy-recovery IO circuit schematic diagram
Metal2Metal1PolyActiveVDD
GND
RES_CLK
DATA
OUT
OUT_B
DATA
OUT_B OUT
RES_CLK
GND
VDD
Figure 63 Adiabatic driver (layoutschematic view)
the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs
621 Adiabatic driver
The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63
62 physical implementation 95
Metal2Metal1PolyActive
IN
OUT_B
VDD
IN_B
OUT
GND
OUT_B
IN
GND
VDD
OUT
IN_B
Figure 64 P2LC (layoutschematic view)
622 Pulse-to-level converter
The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals
623 Data generator
A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1
As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE
624 Pulse generator
The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the
62 physical implementation 96
Metal2Metal1PolyActive EN
CLK
VDD
Q
GND
D
CLKGND
ENVDD
QF-F
D
GND
VDDCLK
D Q
Figure 65 Data generator (layoutschematic view)
Delay1
WPULSE
PULSE
CLKTCLK
DATA
CLK_RES
Figure 66 Timing constraints for the energy-recovery demonstratorcircuit
62 physical implementation 97
PULSES1A
S2
GND
VDD
Metal2Metal1PolyActive
VDD
S1
S2
GND
PULSE
Figure 67 Pulse generator (layoutschematic view)
S2ΔPhase
WPULSE
PULSE
S1A
Figure 68 Pulse generator InputOutput signals
gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control
For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68
625 Integrated inductor
The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-
1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)
62 physical implementation 98
100 200 300 400 500 600 700 8005
7
9
11
13
15
17
19
21
10
15
20
25
30
35
40
45
Q (Simulated) Energy Reduction ()
Frequency (MHz)
Qfactor
Energyreduction()
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS
lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)
As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited
For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610
626 Top level design
The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)
For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously
62 physical implementation 99
500μ
m
500μm
Figure 610 Planar square spiral inductor designed in ASITIC
shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance
The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG
80 = 16880 = 21Ω However as
the current flows to lower branch levels in the tree the perceivedresistance increases to RTG
16 RTG4 and on the last one to RTG Any
additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512
Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20
4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit
To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of
62 physical implementation 100
Adiabatic drivers
RTG
16
RTG
80
RTG
4
RTG
Level 1Level 2Level 3Level 4
Figure 611 Resonant pulse distribution tree
Branch level Wire resistance
1st 5 middot 16880 = 105mΩ
2nd 5 middot 16816 = 525mΩ
3rd 5 middot 1684 = 21Ω
4th 5 middot 168Ω = 84Ω
Total load 168+8480 + 21
20 + 05255 + 0105 = 252Ω
Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree
63 post-layout simulation and comparison 101
+-
+-Metal2 Metal1
180μ
m
100μm
Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)
the energy-recovery IO circuit in layout view is illustrated inFigure 613
627 CMOS IO circuit
The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance
The top level layout view of the CMOS IO circuit can be seenin Figure 616
63 post-layout simulation and comparison
The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed
63 post-layout simulation and comparison 102
22mm
12mm
GND GNDVBUF1 VDR
GND
VIND
S1A
S2
EN
OER
Figure 613 Energy-recovery IO circuit in layout view
63 post-layout simulation and comparison 103
OCMOS
S1B DataGenerator
TSV
CMOSDriver
CMOSDriver
TSV Buffer
1
80
EN
Figure 614 CMOS IO circuit schematic diagram
GND
VCMOS
DATA
078u
039u
105u
053u
316u
158u
950u
475u
OUT
Figure 615 CMOS driver schematic
12mm
12mm
OCMOS S1B GND VBUF2 VCMOSGND
GND GNDEN VCMOS
Figure 616 CMOS IO circuit in layout view
63 post-layout simulation and comparison 104
Energy dissipation (80-bits cycle)
Inductor 11pJ (Eq514)
Switch M1 092pJ [ ]
P2LCs 098pJ (Simulated)
Adiabatic drivers 315pJ (Eq512)
Total 615pJ
Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz
After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency
Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme
As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the
63 post-layout simulation and comparison 105
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14RES_CLK
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14DATA
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14OER
Time (ns)
V(V
)
Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit
64 experimental measurements 106
AreaEnergy
dissipation(Theory)
Energydissipation
(Post-layout)
CMOS 144mm2922pJ
(115 f Jbit)
(Eq52)
1265pJ
(158 f Jbit)
Energy recovery 264mm2 615pJ
(77 f Jbit)
768pJ
(96 f Jbit)
Difference +83 minus333 minus393
Table 64 Post-layout simulated energy dissipation Area require-ments
theoretically calculated 333 up to 393 in the post-layoutsimulations
64 experimental measurements
The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively
641 Measurement setup
For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620
The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66
642 Results
CMOS IO circuit
In Figure 621 the input signal S1B and the output signal OCMOS
are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency
64 experimental measurements 107
Figure 618 Energy-recovery IO circuit micrograph
Figure 619 CMOS IO circuit micrograph
AABC
DEFF
D
EEBA
EBBA
Figure 620 Measurement setup
64 experimental measurements 108
Test pad Direction Instrument connection
VIND Power Power supply - gt06V
GND Power Power supply - 0V
S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)
EN InputDigital Output (enable outputdata switching)
OER Output Oscilloscope
VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V
VBUF1 Power Power supply (Buffers) - 12V
Table 65 Energy-recovery circuit instrument connections
Test pad Direction Instrument connection
VCMOS PowerPower supply (CMOS drivers) -12V
GND Power Power supply - 0V
VBUF2 Power Power supply (Buffers) - 12V
S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
OCMOS Output Oscilloscope
EN InputDigital Output (enable outputdata switching)
Table 66 CMOS circuit instrument connections
64 experimental measurements 109
S1B
OCMOS
Figure 621 Oscilloscope output for signals S1B and OCMOS
100 200 300 400 500 600 700 800012345678910
Die1 Die2 Simulation
Frequency (MHz)
Powerdissipation(mW)
Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit
that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected
The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations
Energy-recovery IO circuit
Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER
64 experimental measurements 110
100 200 300 400 500 600 700 800024681012141618
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 623 Measuredsimulated energy dissipation for the CMOScircuit
S1A
OER
Figure 624 Oscilloscope output for signals S1A and OER
shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected
The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)
Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate
65 summary and conclusions 111
200 250 300 350 400 450 500 550 600 65002468101214161820
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit
IntegratedInductor
VINDOER
PulseGenerator
S1A
S2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
200pF
Figure 626 Energy-recovery IO circuit with fault hypothesis
properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER
switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation
65 summary and conclusions
In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them
The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-
65 summary and conclusions 112
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050
055
060
065
070
075
080RES_CLK
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500
02
04
06
08
10
12
14OER
Time (ns)
V(V
)
Figure 627 Simulation of circuit with fault hypothesis
65 summary and conclusions 113
200 250 300 350 400 450 500 550 600 6500
5
10
15
20
25
Die1 Die2 Simulation CL=200pF
Frequency (MHz)
Ene
rgy
diss
ipat
ion
(pJ)
Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis
ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS
circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated
CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
7C O N C L U S I O N S
Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density
As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV
devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs
Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-
FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process
Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process
The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work
114
71 main contributions 115
71 main contributions
From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data
The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations
In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design
72 future work 116
approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances
The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
72 future work
The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV
capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]
In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes
1 The transistors should be in a regular grid in the array forsimple processing of the measurement data
72 future work 117
2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy
3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV
4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues
Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility
In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory
AA P P E N D I X A P H O T O S
Figure A1 3D65 300mm wafer (on chuck)
Figure A2 Probe card
118
appendix a photos 119
Figure A3 3D130 200mm wafer (on chuck)
ProbeHead
Microscope
Figure A4 Probe head
BA P P E N D I X B S TAT I S T I C S
mean (micro)
In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as
micro =1n
n
sumi=1
xi
median (m)
The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean
standard deviation (σ)
Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation
A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability
standard deviation of the mean (σmicro )
The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated
Assuming there is a data set X with n values xi of mean micro
120
appendix b statistics 121
variance(X) = σ2X
variance(micro) = variance( 1n sum
ni=1 xi) =
1n2 variance(sumn
i=1 xi) =1n2 sum
ni=1 variance(xi) =
nn2 variance(X) =1n variance(X) =
σ2Xn rarr
σmicro = σXradicn
coefficient of variation
The coefficient of variation is the ratio of the standard deviationσ to the mean micro
CV =σ
micro
It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage
outlier
Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set
kruskalndash wallis [34]
KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal
B I B L I O G R A P H Y
[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac
currentsourcebroadpurposemn=2602
[2] Max-3d URL httpwwwmicromagiccom
[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet
[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet
[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001
[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI
Design Conf pages 171ndash174 2005 doi 101109ICVD200564
[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf
3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581
[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009
[9] C H Bennett Logical reversibility of computation IBM
Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525
[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038
scientificamerican0785-48
[11] E Beyne 3d system integration technologies In Proc Int
VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113
122
bibliography 123
[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs
22nm-Details_Presentationpdf
[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327
[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534
[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices
Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124
[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942
[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc
IEEE Int Interconnect Technology Conf and 2011 Materials for
Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326
[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf
(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823
[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS
rsquo04 volume 2 2004 doi 101109ISCAS20041329255
[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE
Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260
[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-
bibliography 124
asitics for 3-dimensional integrated circuits In Workshop
Notes Design Automation and Test in Europe (DATE) 2009
[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT
2008 pages 98ndash103 2008 doi 101109IDT20084802475
[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec
beScientificReportSR2009HTML1213307html
[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905
[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE
Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329
[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int
Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311
[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508
[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712
[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii
B9780750677295500446
[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455
bibliography 125
[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591
[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test
Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889
[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on
Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10
1145224081224115
[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical
Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779
[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom
pressroompdfkkuhnKuhn_IWCE_invited_textpdf
[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5
(3)183ndash191 1961 doi 101147rd530183
[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827
[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190
[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf
(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839
[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-
nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079
bibliography 126
[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System
level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm
org101145966747966750
[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453
[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf
PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725
[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE
Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793
[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629
[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp
Exhibition (DATE) pages 1689ndash1694 2010
[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573
[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190
[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi
bibliography 127
A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices
Meeting (IEDM) 2010 doi 101109IEDM20105703278
[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727
[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10
1109JPROC1998658762
[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test
Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103
[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc
56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676
[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443
[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88
(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom
sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on
[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303
bibliography 128
[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium
on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145
10576611057669
[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872
[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004
[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-
sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888
[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156
[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc
IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522
[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501
[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667
[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477
[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-
tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609
bibliography 129
[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology
Conf 2006 doi 101109ECTC20061645678
[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power
Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220
[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc
Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786
[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-
Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338
[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088
[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044
[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int
Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763
[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-
tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http
dxdoiorg101007BF01788562 101007BF01788562
bibliography 130
[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-
electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww
sciencedirectcomsciencearticleB6V44-4829XDJ-2N
2f5b2c221dcf240c38cd13b7b3432f913
[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182
[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541
[78] NHE Weste and DF Harris CMOS VLSI design a cir-
cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775
[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite
printsteve_wozniak_interview
[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-
tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716
[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect
Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710
[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746
[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764
[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915
bibliography 131
[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610
- Abstract
- Publications
- Acknowledgments
- Contents
- List of Figures
- List of Tables
- Acronyms
- 1 Introduction
-
- 11 Research goals and contribution
- 12 Thesis outline
-
- 2 Background
-
- 21 3D integrated circuits
-
- 211 3D integration approaches
- 212 TSV interconnects
-
- 22 Energy-recovery logic
-
- 221 Energy-recovery architectures
- 222 Power-clock generators
-
- 23 Full-custom design flow
- 24 Summary
-
- 3 Characterization of TSV capacitance
-
- 31 Introduction
- 32 TSV parasitic capacitance
- 33 Integrated capacitance measurement techniques
- 34 Electrical characterization
-
- 341 Physical implementation
- 342 Measurement setup
- 343 Experimental results
-
- 35 Summary and conclusions
-
- 4 TSV proximity impact on MOSFET performance
-
- 41 Introduction
- 42 TSV-induced thermomechanical stress
- 43 Test structure implementation
-
- 431 MOSFET arrays
- 432 Digital selection logic
-
- 44 Experimental measurements
-
- 441 Data processing methodology
- 442 Measurement setup
- 443 PMOS (long channel)
- 444 PMOS (short channel)
-
- 45 Summary and conclusions
-
- 5 Evaluation of energy-recovery for TSV interconnects
-
- 51 Introduction
- 52 Energy-recovery TSV interconnects
- 53 Analysis
-
- 531 Adiabatic driver
- 532 Inductors parasitic resistance
- 533 Switch M1
- 534 Timing Constraints
-
- 54 Simulations
-
- 541 Comparison to CMOS
-
- 55 Summary and conclusions
-
- 6 Energy-recovery 3D-IC demonstrator
-
- 61 Introduction
- 62 Physical implementation
-
- 621 Adiabatic driver
- 622 Pulse-to-level converter
- 623 Data generator
- 624 Pulse generator
- 625 Integrated inductor
- 626 Top level design
- 627 CMOS IO circuit
-
- 63 Post-layout simulation and comparison
- 64 Experimental measurements
-
- 641 Measurement setup
- 642 Results
-
- 65 Summary and conclusions
-
- 7 Conclusions
-
- 71 Main contributions
- 72 Future work
-
- A Appendix A Photos
- B Appendix B Statistics
- Bibliography
-
Panagiotis Asimakopoulos Optimizing the integration and energy
efficiency of through silicon via-based 3D interconnects PhD Thesiscopy November 2011
A B S T R A C T
The aggressive scaling of CMOS process technology has been driv-ing the rapid growth of the semiconductor industry for more thanthree decades In recent years the performance gains enabledby CMOS scaling have been increasingly challenged by highly-parasitic on-chip interconnects as wire parasitics do not scale atthe same pace Emerging 3D integration technologies based onvertical through-silicon vias (TSVs) promise a solution to the inter-connect performance bottleneck along with reduced fabricationcost and heterogeneous integration
As TSVs are a relatively recent interconnect technology innov-ative test structures are required to evaluate and optimise theprocess as well as extract parameters for the generation of designrules and models From the circuit designerrsquos perspective criticalTSV characteristics are its parasitic capacitance and thermomech-anical stress distribution This work proposes new test structuresfor extracting these characteristics The structures were fabric-ated on a 65nm 3D process and used for the evaluation of thattechnology
Furthermore as TSVs are implemented in large densely inter-connected 3D-system-on-chips (SoCs) the TSV parasitic capacitancemay become an important source of energy dissipation Typicallow-power techniques based on voltage scaling can be usedthough this represents a technical challenge in modern techno-logy nodes In this work a novel TSV interconnection scheme isproposed based on reversible computing which shows frequency-dependent energy dissipation The scheme is analysed using the-oretical modelling while a demonstrator IC was designed basedon the developed theory and fabricated on a 130nm 3D process
iii
P U B L I C AT I O N S
Some ideas and figures have appeared previously in the followingpublications
1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper
award)
2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-
makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010
3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010
4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009
5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008
iv
In the end I hope therersquos a little note somewhere
that says I designed a good computer
mdash Steve Wozniak [79]
A C K N O W L E D G E M E N T S
First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts
As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm
Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec
Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools
v
C O N T E N T S
1 Introduction 1
11 Research goals and contribution 2
12 Thesis outline 3
2 Background 5
21 3D integrated circuits 5
211 3D integration approaches 6
212 TSV interconnects 9
22 Energy-recovery logic 13
221 Energy-recovery architectures 17
222 Power-clock generators 18
23 Full-custom design flow 20
24 Summary 21
3 Characterization of TSV capacitance 23
31 Introduction 23
32 TSV parasitic capacitance 23
33 Integrated capacitance measurement techniques 24
34 Electrical characterization 27
341 Physical implementation 28
342 Measurement setup 36
343 Experimental results 37
35 Summary and conclusions 44
4 TSV proximity impact on MOSFET performance 46
41 Introduction 46
42 TSV-induced thermomechanical stress 46
43 Test structure implementation 48
431 MOSFET arrays 51
432 Digital selection logic 55
44 Experimental measurements 61
441 Data processing methodology 61
442 Measurement setup 62
443 PMOS (long channel) 64
444 PMOS (short channel) 71
45 Summary and conclusions 73
5 Evaluation of energy-recovery for TSV interconnects 75
51 Introduction 75
52 Energy-recovery TSV interconnects 75
53 Analysis 79
531 Adiabatic driver 80
532 Inductorrsquos parasitic resistance 82
533 Switch M1 82
534 Timing Constraints 83
54 Simulations 83
541 Comparison to CMOS 86
vi
contents vii
55 Summary and conclusions 91
6 Energy-recovery 3D-IC demonstrator 92
61 Introduction 92
62 Physical implementation 92
621 Adiabatic driver 94
622 Pulse-to-level converter 95
623 Data generator 95
624 Pulse generator 95
625 Integrated inductor 97
626 Top level design 98
627 CMOS IO circuit 101
63 Post-layout simulation and comparison 101
64 Experimental measurements 106
641 Measurement setup 106
642 Results 106
65 Summary and conclusions 111
7 Conclusions 114
71 Main contributions 115
72 Future work 116
a Appendix A Photos 118
b Appendix B Statistics 120
bibliography 122
L I S T O F F I G U R E S
Figure 11 Scaling trend of logic and interconnect delay[30] 2
Figure 21 Comparison of 2D3D Integration 5
Figure 22 System-in-a-package 6
Figure 23 Package-on-package 6
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7
Figure 25 3D wafer-level-packaging 7
Figure 26 3D-stacked IC 8
Figure 27 TSV process steps 9
Figure 28 Before thinningafter thinning 10
Figure 29 TSV stacking 10
Figure 210 Conventional AND gate 14
Figure 211 CMOS switching 16
Figure 212 Adiabatic switching 16
Figure 213 Basic 2N-2P adiabatic circuit 17
Figure 214 2N-2P timing 17
Figure 215 Adiabatic driver 18
Figure 216 Stepwise charging generator 19
Figure 217 1N power clock generator 20
Figure 218 Design flow 22
Figure 219 Inverter in Virtuoso schematic 22
Figure 220 Inverter in Virtuoso layout 22
Figure 31 Isolated 2D wire capacitance 24
Figure 32 TSV parasitic capacitance 25
Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25
Figure 34 CBCM pseudo-inverters 27
Figure 35 Current as a function of frequency in CBCM 28
Figure 36 CBCM pseudo-inverter pair in layout 29
Figure 37 PRC1 test pad module (layout) 30
Figure 38 PRC1 test pad module (micrograph) 30
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34
Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35
viii
List of Figures ix
Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35
Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36
Figure 316 Measurement setup 36
Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39
Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40
Figure 319 TSV capacitance measurement results onwafer D11 40
Figure 320 TSV capacitance measurement results onwafer D18 41
Figure 321 Simulated oxide liner thickness variation 42
Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42
Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44
Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44
Figure 41 TSV thermomechanical stress simulation [40] 47
Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48
Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49
Figure 44 PTP1 test pad module (layout) 50
Figure 45 PTP1 test pad module (micrograph) 50
Figure 46 MOSFET dimensions and rotations 51
Figure 47 1 TSV MOSFET array configuration 52
Figure 48 4 diagonal TSVs MOSFET array configuration52
Figure 49 4 lateral TSVs MOSFET array configuration 53
Figure 410 9 TSVs MOSFET array configuration 53
Figure 411 MOSFET array detail 55
Figure 412 Digital selection logic 56
Figure 413 MOSFET array with digital selection logic 57
Figure 414 4-bit Gate-enableArray-enable decoder 58
Figure 415 Transmission gate for MOSFET Gate terminals59
Figure 416 Transmission gates for MOSFET Drain-Source terminals 59
Figure 417 2-bit Drain-select decoder 60
Figure 418 Processing of measurement results 62
Figure 419 Interpolation of measurement results 63
Figure 420 Measurement setup 63
Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65
List of Figures x
Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66
Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66
Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67
Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67
Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68
Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69
Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69
Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70
Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70
Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71
Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71
Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72
Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72
Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76
Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77
Figure 53 Adiabatic driver 78
Figure 54 A simple pulse-to-level converter (P2LC) 78
Figure 55 Resonant pulse generator with dual-rail dataencoding 79
Figure 56 Energy-recovery scheme for TSV interconnects79
Figure 57 Energy-recovery timing constraints 83
Figure 58 Test case for the evaluation of the theoret-ical model 84
Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85
Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85
Figure 511 Test case for comparison of energy-recoveryto CMOS 86
Figure 512 Improved P2LC design 87
List of Figures xi
Figure 513 Simulated energy dissipation of improvedP2LC design 87
Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88
Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88
Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89
Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =
80 f F f = 500MHz) 89
Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90
Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90
Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93
Figure 62 Energy-recovery IO circuit schematic dia-gram 94
Figure 63 Adiabatic driver (layoutschematic view) 94
Figure 64 P2LC (layoutschematic view) 95
Figure 65 Data generator (layoutschematic view) 96
Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96
Figure 67 Pulse generator (layoutschematic view) 97
Figure 68 Pulse generator InputOutput signals 97
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98
Figure 610 Planar square spiral inductor designed inASITIC 99
Figure 611 Resonant pulse distribution tree 100
Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101
Figure 613 Energy-recovery IO circuit in layout view 102
Figure 614 CMOS IO circuit schematic diagram 103
Figure 615 CMOS driver schematic 103
Figure 616 CMOS IO circuit in layout view 103
Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105
Figure 618 Energy-recovery IO circuit micrograph 107
List of Figures xii
Figure 619 CMOS IO circuit micrograph 107
Figure 620 Measurement setup 107
Figure 621 Oscilloscope output for signals S1B and OC-MOS 109
Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109
Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110
Figure 624 Oscilloscope output for signals S1A and OER110
Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111
Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111
Figure 627 Simulation of circuit with fault hypothesis 112
Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113
Figure A1 3D65 300mm wafer (on chuck) 118
Figure A2 Probe card 118
Figure A3 3D130 200mm wafer (on chuck) 119
Figure A4 Probe head 119
L I S T O F TA B L E S
Table 21 3D interconnect technologies comparison 8
Table 22 Technology parameters for Imec 3D130 process11
Table 23 Technology parameters for Imec 3D65 process11
Table 24 Comparison energy-recoveryconventionalCMOS 16
Table 31 Test structures on the PRC1 module 31
Table 32 PRC1 test pad instrument connections 37
Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38
Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39
Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43
Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45
Table 41 Coefficients of thermal expansion 47
Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54
Table 43 MOSFET array configurations selected forexperimental evaluation 61
Table 44 PTP1PTP2 test pad module instrumentconnections 64
Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73
Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84
Table 61 Energy-recovery IO circuit parameters 93
Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100
Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104
Table 64 Post-layout simulated energy dissipation Area requirements 106
Table 65 Energy-recovery circuit instrument connec-tions 108
Table 66 CMOS circuit instrument connections 108
xiii
A C R O N Y M S
2D two-dimentional
2N-2P 2 nMOS2 pMOS adiabatic logic
3D-IC three-dimentional integrated circuit
3D-SIC 3D stacked-IC
3D-SIP 3D system-in-package
3D three-dimentional
3D-WLP 3D wafer-level-packaging
BCB benzocyclobutene
BEOL back-end-of-line
CAL clocked adiabatic logic
CBCM charge-based capacitance measurement
CMOS complementary metal-oxide-semiconductor
CMP chemical-mechanical planarization
CTE coefficient of thermal expansion
Cu copper
CVD chemical vapor deposition
D2D die-to-die
D2W die-to-wafer
DCVSL differential cascode voltage switch logic
DRC design rule check
DUT device under test
ECRL efficient charge recovery logic
EDA electronic design automation
FEM finite element method
FEOL front-end-of-line
GPIB general purpose interface bus
IC integrated circuit
xiv
acronyms xv
IO inputoutput
KOZ keep-out-zone
LC inductance-capacitance
LVS layout versus schematic
MEMS microelectromechanical systems
MIM metal-insulator-metal
MOSFET metal-oxide-semiconductor field-effect transistor
MOS metal-oxide-semiconductor
MPU microprocessor unit
MuGFET multiple gate field-effect transistor
NMOS n-channel MOSFET
P2LC pulse-to-level converter
PAL pass-transistor adiabatic logic
PCB printed circuit board
PCG power-clock generator
PMD pre-metal dielectric
PMOS p-channel MOSFET
PVT process voltage temperature
QSERL quasi-static energy recovery logic
RC resistance-capacitance
RDL re-distribution layer
RLC resistance-inductance-capacitance
RMS root mean square
SIP system-in-a-package
Si silicon
SMU source measurement unit
SoC system-on-chip
TaN tantalum nitride
TSV through-silicon via
W tungsten
acronyms xvi
W2W wafer-to-wafer
WLP wafer-level-packaging
1I N T R O D U C T I O N
Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965
chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]
As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects
In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)
The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and
1 International Technology Roadmap for Semiconductors
1
11 research goals and contribution 2
Figure 11 Scaling trend of logic and interconnect delay [30]
innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]
Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D
interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density
11 research goals and contribution
As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-
ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]
The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future
12 thesis outline 3
3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection
The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution
The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs
are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process
The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements
12 thesis outline
This thesis is organised into seven Chapters and two Appendixesas follows
2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)
12 thesis outline 4
Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described
Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods
Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process
Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design
Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements
Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work
Appendix A includes photos used as reference in various partsof this work
Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data
2B A C K G R O U N D
The work presented in this thesis contributes in the fields of 3D
integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters
21 3d integrated circuits
CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package
There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy
Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of
ABCDEFBA ABCDEFBA
Figure 21 Comparison of 2D3D Integration
5
21 3d integrated circuits 6
ABCDEFBC
E
Figure 22 System-in-a-package
ABCDEFBC
E
Figure 23 Package-on-package
a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities
The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor
211 3D integration approaches
3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure
bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages
bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)
bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion
The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)
3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps
21 3d integrated circuits 7
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]
A
BCDEAFE
FFDC
CFE
Figure 25 3D wafer-level-packaging
take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]
3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]
The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-
SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost
A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21
21 3d integrated circuits 8
ABCD
EFFDB
F
Figure 26 3D-stacked IC
3D interconnect
technologyAdvantages Disadvantages
3D system-in-package (3D-SIP)
Simple existingtechnology bestheterogeneous
integration
Performance(speed power) low
densitymanufacturing costin high-quantities
3D wafer-level-packaging (3D-
WLP)
Existingtechnology good
compromisebetween
performance-manufacturing
cost
Averageperformance and
density
3D stacked-IC (3D-SIC)
Performance(speed power)
high densitymanufacturing costin high-quantities
Manufacturing costin low-quantities
not ready formanufacturing
Table 21 3D interconnect technologies comparison
21 3d integrated circuits 9
ABBCDE
FBACACDA
ACCD F
Figure 27 TSV process steps
212 TSV interconnects
Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection
In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie
Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively
As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]
21 3d integrated circuits 10
AAAB
CAAAB
ABC
DEF
Figure 28 Before thinningafter thinning
ABC ABC
DE DE
F
DEE
B
CBA
Figure 29 TSV stacking
21 3d integrated circuits 11
Minimum devicechannel length (nm)
130
Supply voltage (V) 12
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 22
TSV resistance (mΩ) ~20
TSV capacitance( f F)
~40
Wafer diameter(mm)
200
Table 22 Technology paramet-ers for Imec 3D130
process
Minimum devicechannel length (nm)
70
Supply voltage (V) 10
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 40
TSV resistance (mΩ) ~40
TSV capacitance( f F)
~80
Wafer diameter(mm)
300
Table 23 Technology paramet-ers for Imec 3D65 pro-cess
Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing
Design tools
Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs
Thermomechanical stress
TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur
21 3d integrated circuits 12
The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance
Thermal analysis
The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink
Electrical characteristics
The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]
Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV
structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process
22 energy-recovery logic 13
Testing
Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems
22 energy-recovery logic
TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology
1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections
2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them
Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that
22 energy-recovery logic 14
Figure 210 Conventional AND gate
were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6
Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded
Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output
Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to
22 energy-recovery logic 15
perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]
In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]
The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =
CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]
E = QVDD = CV2DD (21)
From the total transferred energy only half ( 12 CV2
DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C
The adiabatic charging of an equivalent node capacitance C
is modelled in Figure 212 In contrast to the conventional CMOS
circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to
EAdiabatic prop
(
RC
T
)
CV2DD (22)
The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters
22 energy-recovery logic 16
A
B
CC
D
E
D
Figure 211 CMOS switching
AA
B
C
B
Figure 212 Adiabatic switching
Conventional CMOS Energy-recovery CMOS
2 states True False 3 states True False Off
Energy of information bitdissipated as heat on the
pull-down network
Energy of information bitrecovered by the
power-supply
High speed Lowmedium speed
High energy dissipation Very low energy dissipation
Area efficient Some area overhead
Table 24 Comparison energy-recoveryconventional CMOS
Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]
A comparison between energy-recovery and conventional CMOSis summarised in Table 24
22 energy-recovery logic 17
ABCABC
Figure 213 Basic 2N-2P adiabaticcircuit
ABC
ABC
DEF
EE
DEF
Figure 214 2N-2P timing
221 Energy-recovery architectures
Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency
A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P
circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern
The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel
22 energy-recovery logic 18
Figure 215 Adiabatic driver
MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network
The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]
The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle
Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]
222 Power-clock generators
Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and
22 energy-recovery logic 19
A
A
B
C
C
Figure 216 Stepwise charging generator
recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators
Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow
Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at
f =1
2πradic
LCLOAD
(23)
As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is
23 full-custom design flow 20
A
BCD
E
DFF
CD
Figure 217 1N power clock generator
partially recovered and stored in the inductor for later use RLC
oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip
23 full-custom design flow
All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal
In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available
The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case
1 Cadence Design Systems Inc (httpwwwcadencecom)
24 summary 21
the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step
Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics
24 summary
In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process
2 Mentor Graphics Inc (httpwwwmentorcom)
24 summary 22
ABCDEA
FCCAAE
ABCDEAEC
DE
E
F
CEA
EEAEA
EDE
CAC
E
EC
CAC
Figure 218 Design flow
Figure 219 Inverter in Virtuososchematic
Figure 220 Inverter in Virtuosolayout
3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E
31 introduction
In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology
In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]
From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system
Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated
32 tsv parasitic capacitance
A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric
1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments
23
33 integrated capacitance measurement techniques 24
ABC
Figure 31 Isolated 2D wire capacitance
material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is
C =εox
hwl (31)
In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material
In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)
The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)
33 integrated capacitance measurement techniques
Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV
33 integrated capacitance measurement techniques 25
ABCDE
F
FD
F
F
FD
Figure 32 TSV parasitic capacitance
ABCDE FACDE
ECDE
BCFACDE
FCCDED DBCDE
D
DAB
Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor
33 integrated capacitance measurement techniques 26
characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories
1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]
2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]
The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer
On-chip techniques can achieve better accuracy when the DUT
capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]
In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows
CDUT =Im minus Ire f
VDD middot f(32)
The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of
34 electrical characterization 27
AB
AC
DE
FE
DE
FE
AB
AC
B
Figure 34 CBCM pseudo-inverters
Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot
The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET
capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz
should be adequate to avoid the MOS behaviour of the TSV
34 electrical characterization
A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented
34 electrical characterization 28
AB
CADE
F
AB
E
CAD
B
Figure 35 Current as a function of frequency in CBCM
on the same chip that went beyond the core objectives of thiswork
The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed
341 Physical implementation
In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result
Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics
34 electrical characterization 29
Poly dummy fills
Metal2Metal1PolysiliconActive
1μm 5μm
Figure 36 CBCM pseudo-inverter pair in layout
Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38
The CBCM pseudo-inverters were implemented on either TOP
or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below
Structure rsquoArsquo Single TSV capacitance (TOP die)
The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to
34 electrical characterization 30
DTOP DBTM
B10 A10
BREF AREF
B1 A1
R1 R2
SET VDD
RESET GND
C15 C10
C50 C20
SET VDD
ERC SET
EREF EC Metal2 (TOP)Metal1 (TOP)
Metal1 (BTM)Metal2 (BTM)
Figure 37 PRC1 test pad module(layout)
Figure 38 PRC1 test pad module(micrograph)
34 electrical characterization 31
Structure A B
TSVs - 1 2x5 - 1 2x5
Inverterlocation
TOP TOP TOP BTM BTM BTM
TSVconnection
TOP TOP TOP BTM BTM BTM
Multi-TSVconnection
- - TOP - - BTM
Multi-TSVpitch (microm)
- - 15 - - 15
Pad name AREF A1 A10 BREF B1 B10
Structure C D
TSVs 2x2 2x2 2x2 2x2 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP TOP
TSVconnection
TOP TOP TOP TOP TOP TOP
Multi-TSVconnection
TOP TOP TOP TOP TOP BTM
Multi-TSVpitch (microm)
10 15 20 50 20 20
Pad name C10 C15 C20 C50 DTOP DBTM
Structure R E
TSVs 1 1 - 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP
TSVconnection
TOP TOP - TOP TOP
Multi-TSVconnection
- - - TOP BTM
Multi-TSVpitch (microm)
- - - 20 20
Pad name R1 R2 EREF EC ERC
Table 31 Test structures on the PRC1 module
34 electrical characterization 32
Pseudo-inverters
2x5 TSVs
TSV
Reference
Metal2 (TOP)Metal1 (TOP)
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)
the single TSV DUT the structure also contains 10 parallel TSVs
arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude
Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)
The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side
Structure rsquoCrsquo Variable pitch TSV capacitance
The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the
34 electrical characterization 33
2x5 TSVs
Reference
TSV
Pseudo-inverters
Metal2 (BTM)Metal1 (BTM)
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)
literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process
The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative
Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-
TOM die
The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking
34 electrical characterization 34
2x2 TSVs - 10μm
2x2 TSVs - 15μm
2x2 TSVs - 20μm
2x2 TSVs - 50μm
Metal2 (TOP)Metal1 (TOP)
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die
34 electrical characterization 35
Metal2 (TOP)Metal1 (TOP)
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Driver
Reference
Driver
Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)
Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy
Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability
Structure rsquoErsquo TSV capacitance effect on signal propagation delay
Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult
34 electrical characterization 36
ABC
D
ECBFC
ABC
Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)
ABC
DEF
CD
A
CA
A
Figure 316 Measurement setup
342 Measurement setup
The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316
The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-
2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup
34 electrical characterization 37
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
SET InputPulse generator
(Non-overlapping with RESET)
RESET InputPulse generator
(Non-overlapping with SET)
A[REF 1 10] InputSMU (Inverter source - 1V
supply)
B[REF 1 10] InputSMU (Inverter source - 1V
supply)
C[10 15 2050]
InputSMU (Inverter source - 1V
supply)
D[TOP BTM] InputSMU (Inverter source - 1V
supply)
R[1 2] InputSMU (Inverter source - 1V
supply)
E[REF C RC] Output Oscilloscope (high-impedance)
Table 32 PRC1 test pad instrument connections
sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32
343 Experimental results
There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods
When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy
34 electrical characterization 38
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 454
Mean capacitance 709 f F
Standard deviation (σ) 19 f F
Coefficient of variation (3σmicro) 80
Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo
The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability
Die to die (D2D) variability
The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)
On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid
On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691
34 electrical characterization 39
-60 -40 -20 00 20 40 60 80
σ-σ
706
Deviation from mean (fF)
0
005
01
015
02
025
Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 414
Mean capacitance 895 f F
Standard deviation (σ) 22 f F
Coefficient of variation (3σmicro) 74
Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo
of the experimental data within plusmnσ of the mean capacitancevalue
Correlation between TSV capacitance and oxide liner thickness vari-
ation
The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers
Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication
34 electrical characterization 40
σ-σ
691
Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70
0
002
004
006
008
01
012
014
016
018
02
Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function
075
080
085
090
095
100
105
110
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
075
080
085
090
095
100
105
Wafer area
Figure 319 TSV capacitance measurement results on wafer D11
34 electrical characterization 41
086088090092094096098100102104106108
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
086
088
090
092
094
096
098
100
102
104
106
108
Wafer area
Figure 320 TSV capacitance measurement results on wafer D18
tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo
Wafer to wafer (W2W) variability
Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters
34 electrical characterization 42
109
100
095Nor
mal
ized
oxi
de th
ickn
ess
-5-10 -15 -20 -25 -30
510
1520
2530
0Die Y
Die X
Figure 321 Simulated oxide liner thickness variation
60 65 70 75 80 85 90 95 1000
005
01
015
02
025D11 D18
plusmnσ
plusmnσ
Capacitance (fF)
Figure 322 TSV capacitance W2W variability - Probability density func-tions
34 electrical characterization 43
Structure rsquoCrsquo (Fig 311)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 424
Experiment C10 C50
Number of TSVs 4 4
Mean capacitance 4073 f F 4125 f F
Standard deviation (σ) 84 f F 74 f F
Population within σ 695 653
Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo
Variable pitch TSV capacitance
The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm
were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires
As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50
are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability
To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]
Measurement accuracy of CBCM
The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we
3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)
35 summary and conclusions 44
376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320
001
002
003
004
005
006C10 C50
Capacitance (fF)
plusmnσC10
plusmnσC50
695 653
Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function
-80 -60 -40 -20 00 20 40 60 80 1000
002
004
006
008
01
012
014
016
018σ-σ
688
Deviation from mean (fF)
Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function
observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs
connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance
35 summary and conclusions
In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement
35 summary and conclusions 45
Method LCR CBCM
Total dies 472 472
Processed dies (excluding outliers) 393 414
Mean capacitance 820 f F 895 f F
Standard deviation (σ) 23 f F 22 f F
Coefficient of variation (3σmicro) 86 74
Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo
results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]
The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration
The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV
parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM
In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process
4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E
41 introduction
The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV
filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance
Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models
In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]
42 tsv-induced thermomechanical stress
Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or
46
42 tsv-induced thermomechanical stress 47
Material CTE at 20 ˚C (10minus6K)
Si 3
Cu 17
W 45
Table 41 Coefficients of thermal expansion
Figure 41 TSV thermomechanical stress simulation [40]
polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate
Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction
It is well known that stress can affect carrier mobility in MOS-
FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance
Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type
43 test structure implementation 48
Figure 42 TSV thermomechanical stress interaction between two TSVs[40]
proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative
The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections
43 test structure implementation
TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible
43 test structure implementation 49
Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]
The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET
devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types
Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP
die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45
43 test structure implementation 50
A0
B0
C0
D0
E0
F0
G0
H0
I0
J0
A90
B90
E90
F90
G90
J90
DS3 DF3
DF2 DS0
DS2 DF0
DF1 DS1
SS SF
DEC6 VDD
DEC7 GND
DEC8 VG
DEC4 DEC9
DEC5 GND
DEC2 DEC3
DEC0 DEC1
Arraydecoder
Draindecoder
Figure 44 PTP1 test pad mod-ule (layout)
Figure 45 PTP1 test pad mod-ule (micrograph)
43 test structure implementation 51
Metal2Metal1PolyActive
TSV
MOSFETs
07μm007μm 0deg
07μm007μm 90deg
05μm05μm 90deg
05μm05μm 0deg
Figure 46 MOSFET dimensions and rotations
431 MOSFET arrays
Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)
The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs
lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44
The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)
Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution
43 test structure implementation 52
TSV
MOSFETs Substrate taps
Figure 47 1 TSV MOSFET array configuration
MOSFETs Substrate taps
TSV TSV
TSV TSV
Figure 48 4 diagonal TSVs MOSFET array configuration
43 test structure implementation 53
MOSFETs Substrate taps
TSV
TSV TSV
TSV
Figure 49 4 lateral TSVs MOSFET array configuration
MOSFETs Substrate taps
TSV TSV TSV
TSV TSV TSV
TSV TSV TSV
Figure 410 9 TSVs MOSFET array configuration
43 test structure implementation 54
Configuration RotationFET
width(microm)
FETlength(microm)
Name
0 0deg 07 007 A0
1 0deg 07 007 B0
4diagonal 0deg 07 007 C0
4lateral 0deg 07 007 D0
9 0deg 07 007 E0
0 0deg 05 05 F0
1 0deg 05 05 G0
4diagonal 0deg 05 05 H0
4lateral 0deg 05 05 I0
9 0deg 05 05 J0
0 90deg 07 007 A90
1 90deg 07 007 B90
9 90deg 07 007 E90
0 90deg 05 05 F90
1 90deg 05 05 G90
9 90deg 05 05 J90
Table 42 MOSFET array configurations for test pad modulesPTP1PTP2
43 test structure implementation 55
2μm 1μm
Metal2Metal1PolyActive
FET
DUMMY
TAP
TSV
Figure 411 MOSFET array detail
432 Digital selection logic
Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement
The procedure for selecting a single MOSFET device for meas-urement was as follows
bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled
bull MOSFET Gate terminal (G0 to G15) was connected to IO
pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates
bull MOSFET Drain terminal (D0 to D15) was connected to IO
pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder
43 test structure implementation 56
D0
D15
DF[0-3]
DS[0-3]
Drain-select decoder (2-bit)
SSF
SS
Array-select decoder (4-bit)
VG
Gate-select decoder (4-bit)
G0 G15
MOSFET Array
Figure 412 Digital selection logic
and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit
Drain-select decoder was adequate for accessing all 16Drain terminals in the array
When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)
A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating
43 test structure implementation 57
Gatedecoder
Metal2Metal1PolyActive
Transmissiongates (DS)
Transmission gates (G)
MOSFETArray
Figure 413 MOSFET array with digital selection logic
43 test structure implementation 58
BLK
A[0] A[1] A[2] A[3]
DEC[0-15]
Figure 414 4-bit Gate-enableArray-enable decoder
43 test structure implementation 59
DEC
VGGVDD
Transmission gate
Pull-downtransistor
Figure 415 Transmission gate for MOSFET Gate terminals
Transmission gates
FORCE
SENSE
DRAINSOURCE
EN
VDD
Figure 416 Transmission gates for MOSFET DrainSource terminals
43 test structure implementation 60
A[0] A[1]
DEC[0-3]
Figure 417 2-bit Drain-select decoder
44 experimental measurements 61
Configuration Type Name Wafer
1 No-TSVPMOS
(05microm times 05microm)F0 D08
2 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D08
3 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D20
4 1-TSV 0ordm 60 ordmCPMOS
(05microm times 05microm)G0 D08
54-lateral TSV 0ordm 25
ordmCPMOS
(05microm times 05microm)I0 D08
6 1-TSV 90ordm 25 ordmCPMOS
(05microm times 05microm)G90 D08
7 1-TSV 0ordm 25 ordmCPMOS (07microm times
007microm)B0 D08
Table 43 MOSFET array configurations selected for experimental eval-uation
44 experimental measurements
Test modules PTP1 and PTP2 implementing the presented MOS-
FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43
441 Data processing methodology
For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress
Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing
44 experimental measurements 62
ABCDEFBBC
FCDDFBCFFC
FCB
A
BAACADE
CECBBC
CFFCDE
B$BCFFBDBCBF
amp(BFC)B
BC
FCDDFA
CAF
Figure 418 Processing of measurement results
the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (
radic98) The described procedure for
processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not
placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)
442 Measurement setup
For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO
signal generation and measurement was provided by standard
44 experimental measurements 63
ABCDECDBF
ABECBF
AB
AB
Figure 419 Interpolation of measurement results
laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420
The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44
ABCCDE
F
AA
A
DBCB
Figure 420 Measurement setup
44 experimental measurements 64
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
DEC[0-9] InputDigital Output
(GATEDRAINARRAY SelectDecoders Input)
DF[0-3] Input SMU (Drain Force terminal)
DS[0-3] Output SMU (Drain Sense terminal)
SF Input SMU (Source Force terminal)
SS Output SMU (Source Sense terminal)
VG Input Digital Output (Gate voltage)
Table 44 PTP1PTP2 test pad module instrument connections
443 PMOS (long channel)
No-TSV configuration
This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance
As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure
1 TSV configuration
In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo
44 experimental measurements 65
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
096
097
098
099
100
101
Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)
and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements
To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm
The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph
44 experimental measurements 66
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
105
110
115
120TSV
Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)
44 experimental measurements 67
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
13
TSV
Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)
we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV
even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after
increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
09
1
11
12
13
14
Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)
44 experimental measurements 68
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115TSV
Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)
at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)
4 lateral TSV configuration
In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430
respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)
90˚ rotation TSV configuration
In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has
44 experimental measurements 69
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
085
090
095
100
105
110
115
TSV
TSV TSV
TSV
Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)
44 experimental measurements 70
2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309
095
1
105
11
115
12
X16
Distance (μm)
Normalized
Ion
TSV TSV
Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)
3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098
1102
Y14
Distance (μm)
Normalized
Ion
TSV TSV
Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)
44 experimental measurements 71
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115
120TSV
Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)
been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm
444 PMOS (short channel)
In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
08509
0951
10511
11512
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)
44 experimental measurements 72
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
094
096
098
100
102
104
106
108
110
TSV
Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)
measurements for X = 16 and Y = 14 in Figure 434 At at 1microm
distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)
45 summary and conclusions 73
Size
Waf
er
Con
figu
rati
on
Rot
atio
n
Eff
ect
1micro
mla
t
Eff
ect
1micro
mlo
ng
KO
Z
L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm
L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm
L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm
L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm
L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm
S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm
Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices
45 summary and conclusions
In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET
device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors
The structure was implemented in Imecrsquos experimental 65nm
3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations
In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits
Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short
45 summary and conclusions 74
channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress
Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure
1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex
2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy
3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away
4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic
The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated
In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs
5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S
51 introduction
TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2
DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs
increases linearly with the number of tiers and interconnections(Figure 51)
The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm
node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)
In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]
52 energy-recovery tsv interconnects
As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-
75
52 energy-recovery tsv interconnects 76
1XTSV
Tier1
TSV
Tier0LOGIC 0
LOGIC 0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n Tier1
Tier0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n
Tier(m-1)
Tier(m)
mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-
tion density
recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]
EER prop
(
RCL
T
)
CLV2DD (51)
In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]
ECMOS =12
CLV2DD (52)
Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits
EER
ECMOSprop
2RCL
T(53)
From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small
In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC
52 energy-recovery tsv interconnects 77
Tier2Tier1TSV
LOGIC 1
LOGIC 2
driver
TSV
LOGIC 1
LOGIC 2
CMOS
driver
Convert
Energy-recovery
PCG
Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery
can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV
interconnectsThe differences between a typical CMOS driver in a TSV-based
3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form
Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values
52 energy-recovery tsv interconnects 78
IN IN
ININ
OUT OUT
IN
IN
OUT
OUT
Figure 53 Adiabatic driver
Figure 54 A simple pulse-to-level converter (P2LC)
An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded
The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)
For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with
53 analysis 79
VDD2
Rind
RTG
AdiabaticDriver
CL
M1
T=1f
L ind
RTG
CL
Figure 55 Resonant pulse generator with dual-rail data encoding
Tier2
Tier1
f-fPULSE
VDD2CLK IN
Delay1CLK1
CLK0
Delay2CLK2OUT1
Data In
Data Out
CLKRES
TSV TSV TSV
CLKBufferAdiabatic
Driver
P2LC
f-f
Data1
M1
Figure 56 Energy-recovery scheme for TSV interconnects
a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at
f =1
2πradic
LindCL(54)
A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56
53 analysis
The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy
53 analysis 80
losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme
The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be
EER = EAD + Eind + EM1 + EP2L (55)
Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs
architecture used and could be easily extracted from simulationdata as will be shown later in this chapter
531 Adiabatic driver
There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS
dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances
The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle
ETG = 2ξ
(
RTGCL12 T
)
CLV2DD =
π2
2(RTGCL f )CLV2
DD (56)
The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient
53 analysis 81
Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV
capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be
CL = CTSV + Cn + 6CD (57)
The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience
κTG = RTGCn rArr RTG =κTG
Cn(58)
Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2
DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver
EAD = CnV2DD +
π2
2κTG
Cnf [CTSV + Cn + 6CD]
2 V2DD (59)
The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2
2 κTG f to a variable (y = π2
2 κTG f ) Equation 59then becomes
EAD = D middot CnV2DD +
y
Cn[CTSV + (6b + 1)Cn]
2 V2DD
=
[
Cn
(
D
y+ (6b + 1)2
)
+1
CnC2
TSV
]
middot yV2DD
+(12b + 2)CTSV middot yV2DD (510)
Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when
53 analysis 82
they become equal Solving equation Cn
(
Dy + (6b + 1)2
)
= 1Cn
C2TSV
for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV
Cn(opt) =
radic
[D
y+ (6b + 1)2]minus1CTSV (511)
Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver
EAD(min) =
[radic
D
(6b + 1)2y+ 1 + 1
]
middot (12b+ 2)yCTSVV2DD (512)
532 Inductorrsquos parasitic resistance
The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as
Qind =1
Rind
radic
Lind
CLrArr Rind =
1QindCL2π f
(513)
Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513
Eind =π2
2(RindCL f )CLV2
DD =π
4CL
QindV2
DD (514)
533 Switch M1
Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1
Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as
EM1(min) = 2IM1(rms)VGM1
radic
mκM1
f(515)
54 simulations 83
Delay2
CLK1
CLK2Delay1
CLKW
PULSE
CLK0T
PD
OUT1
PULSE
P2L
RES
CLK
Figure 57 Energy-recovery timing constraints
534 Timing Constraints
The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57
The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)
54 simulations
To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the
54 simulations 84
M1IN IN
IN
OUT OUT
IN
Figure 58 Test case for the evaluation of the theoretical model
Q250MHz 500MHz 750MHz
T S e () T S e () T S e ()
6 360 409 -120 476 516 -78 578 613 -57
8 308 347 -114 417 449 -70 515 543 -52
10 276 312 -115 382 411 -70 477 498 -43
12 255 289 -119 358 386 -71 451 472 -44
Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)
adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58
The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51
The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations
In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency
54 simulations 85
5 6 7 8 9 10 11 12 130
10
20
30
40
50
60
70
T250 S250 T500 S500 T750 S750
Q factor
Energydissipation(fF)
Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)
5 6 7 8 9 10 11 12 13
-14
-12
-10
-8
-6
-4
-2
0
250MHz 500MHz 750MHz
Q factor
Error()
Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator
The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation
54 simulations 86
M1
TSV
TSV
TSV
Data in OutCMOS
OutER
Energy-recovery
CMOS
P2LC
Adiabaticdriver
Figure 511 Test case for comparison of energy-recovery to CMOS
541 Comparison to CMOS
When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F
for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is
described by Equation 52 Appending the data switching activity(D)
ECMOS = D middot 12
CLV2DD (516)
Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS
circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison
Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range
54 simulations 87
Figure 512 Improved P2LC design
0 100 200 300 400 500 600 700 800 9000
2
4
6
8
10
12
14
Frequency (MHz)
Energy
dissipation(fJ)
Figure 513 Simulated energy dissipation of improved P2LC design
Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current
The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor
54 simulations 88
4 6 8 10 12 14 16 18 20 22-20
-10
0
10
20
30
40
50
60
800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor
Energyred uction()
Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)
4 6 8 10 12 14 16 18 20 220050
100150200250300350400450
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)
values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors
In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies
In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q
factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total
In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS
circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in
54 simulations 89
4 6 8 10 12 14 16 18 20 2200
100
200
300
400
500
600
700
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)
20 40 60 80 100 120 140 160 180 200 220-100
00
100
200
300
400
500
Q6 Q10 Q14 Q18
TSV capacitance (fF)
Energy
red u
ction(
)
Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)
Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value
Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz
Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm
node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS
54 simulations 90
4 6 8 10 12 14 16 18 20 22-40
-20
0
20
40
60
80
D=05 D=07 D=09 D=1
Q factor
Energyred uction()
Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)
4 6 8 10 12 14 16 18 20 220
10
20
30
40
50
60
130nm 65nm
Q factor
Energyred uction()
Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)
55 summary and conclusions 91
55 summary and conclusions
The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process
The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement
Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme
The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process
6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R
61 introduction
In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance
Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements
62 physical implementation
The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5
The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61
The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz
were calculated using the methodology developed in Chapter 5
and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery
circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)
92
62 physical implementation 93
AB
AB
CDEFD
C
ABCD
EFD
C
Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)
Adiabatic driver 80-bit circuit
Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )
RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)
CL 207 f F (Eq57) Lind 613nH (Eq54)
Table 61 Energy-recovery IO circuit parameters
62 physical implementation 94
IntegratedInductor
VINDOER
PulseGenerator
S1AS2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
Figure 62 Energy-recovery IO circuit schematic diagram
Metal2Metal1PolyActiveVDD
GND
RES_CLK
DATA
OUT
OUT_B
DATA
OUT_B OUT
RES_CLK
GND
VDD
Figure 63 Adiabatic driver (layoutschematic view)
the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs
621 Adiabatic driver
The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63
62 physical implementation 95
Metal2Metal1PolyActive
IN
OUT_B
VDD
IN_B
OUT
GND
OUT_B
IN
GND
VDD
OUT
IN_B
Figure 64 P2LC (layoutschematic view)
622 Pulse-to-level converter
The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals
623 Data generator
A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1
As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE
624 Pulse generator
The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the
62 physical implementation 96
Metal2Metal1PolyActive EN
CLK
VDD
Q
GND
D
CLKGND
ENVDD
QF-F
D
GND
VDDCLK
D Q
Figure 65 Data generator (layoutschematic view)
Delay1
WPULSE
PULSE
CLKTCLK
DATA
CLK_RES
Figure 66 Timing constraints for the energy-recovery demonstratorcircuit
62 physical implementation 97
PULSES1A
S2
GND
VDD
Metal2Metal1PolyActive
VDD
S1
S2
GND
PULSE
Figure 67 Pulse generator (layoutschematic view)
S2ΔPhase
WPULSE
PULSE
S1A
Figure 68 Pulse generator InputOutput signals
gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control
For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68
625 Integrated inductor
The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-
1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)
62 physical implementation 98
100 200 300 400 500 600 700 8005
7
9
11
13
15
17
19
21
10
15
20
25
30
35
40
45
Q (Simulated) Energy Reduction ()
Frequency (MHz)
Qfactor
Energyreduction()
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS
lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)
As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited
For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610
626 Top level design
The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)
For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously
62 physical implementation 99
500μ
m
500μm
Figure 610 Planar square spiral inductor designed in ASITIC
shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance
The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG
80 = 16880 = 21Ω However as
the current flows to lower branch levels in the tree the perceivedresistance increases to RTG
16 RTG4 and on the last one to RTG Any
additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512
Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20
4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit
To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of
62 physical implementation 100
Adiabatic drivers
RTG
16
RTG
80
RTG
4
RTG
Level 1Level 2Level 3Level 4
Figure 611 Resonant pulse distribution tree
Branch level Wire resistance
1st 5 middot 16880 = 105mΩ
2nd 5 middot 16816 = 525mΩ
3rd 5 middot 1684 = 21Ω
4th 5 middot 168Ω = 84Ω
Total load 168+8480 + 21
20 + 05255 + 0105 = 252Ω
Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree
63 post-layout simulation and comparison 101
+-
+-Metal2 Metal1
180μ
m
100μm
Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)
the energy-recovery IO circuit in layout view is illustrated inFigure 613
627 CMOS IO circuit
The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance
The top level layout view of the CMOS IO circuit can be seenin Figure 616
63 post-layout simulation and comparison
The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed
63 post-layout simulation and comparison 102
22mm
12mm
GND GNDVBUF1 VDR
GND
VIND
S1A
S2
EN
OER
Figure 613 Energy-recovery IO circuit in layout view
63 post-layout simulation and comparison 103
OCMOS
S1B DataGenerator
TSV
CMOSDriver
CMOSDriver
TSV Buffer
1
80
EN
Figure 614 CMOS IO circuit schematic diagram
GND
VCMOS
DATA
078u
039u
105u
053u
316u
158u
950u
475u
OUT
Figure 615 CMOS driver schematic
12mm
12mm
OCMOS S1B GND VBUF2 VCMOSGND
GND GNDEN VCMOS
Figure 616 CMOS IO circuit in layout view
63 post-layout simulation and comparison 104
Energy dissipation (80-bits cycle)
Inductor 11pJ (Eq514)
Switch M1 092pJ [ ]
P2LCs 098pJ (Simulated)
Adiabatic drivers 315pJ (Eq512)
Total 615pJ
Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz
After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency
Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme
As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the
63 post-layout simulation and comparison 105
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14RES_CLK
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14DATA
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14OER
Time (ns)
V(V
)
Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit
64 experimental measurements 106
AreaEnergy
dissipation(Theory)
Energydissipation
(Post-layout)
CMOS 144mm2922pJ
(115 f Jbit)
(Eq52)
1265pJ
(158 f Jbit)
Energy recovery 264mm2 615pJ
(77 f Jbit)
768pJ
(96 f Jbit)
Difference +83 minus333 minus393
Table 64 Post-layout simulated energy dissipation Area require-ments
theoretically calculated 333 up to 393 in the post-layoutsimulations
64 experimental measurements
The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively
641 Measurement setup
For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620
The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66
642 Results
CMOS IO circuit
In Figure 621 the input signal S1B and the output signal OCMOS
are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency
64 experimental measurements 107
Figure 618 Energy-recovery IO circuit micrograph
Figure 619 CMOS IO circuit micrograph
AABC
DEFF
D
EEBA
EBBA
Figure 620 Measurement setup
64 experimental measurements 108
Test pad Direction Instrument connection
VIND Power Power supply - gt06V
GND Power Power supply - 0V
S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)
EN InputDigital Output (enable outputdata switching)
OER Output Oscilloscope
VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V
VBUF1 Power Power supply (Buffers) - 12V
Table 65 Energy-recovery circuit instrument connections
Test pad Direction Instrument connection
VCMOS PowerPower supply (CMOS drivers) -12V
GND Power Power supply - 0V
VBUF2 Power Power supply (Buffers) - 12V
S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
OCMOS Output Oscilloscope
EN InputDigital Output (enable outputdata switching)
Table 66 CMOS circuit instrument connections
64 experimental measurements 109
S1B
OCMOS
Figure 621 Oscilloscope output for signals S1B and OCMOS
100 200 300 400 500 600 700 800012345678910
Die1 Die2 Simulation
Frequency (MHz)
Powerdissipation(mW)
Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit
that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected
The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations
Energy-recovery IO circuit
Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER
64 experimental measurements 110
100 200 300 400 500 600 700 800024681012141618
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 623 Measuredsimulated energy dissipation for the CMOScircuit
S1A
OER
Figure 624 Oscilloscope output for signals S1A and OER
shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected
The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)
Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate
65 summary and conclusions 111
200 250 300 350 400 450 500 550 600 65002468101214161820
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit
IntegratedInductor
VINDOER
PulseGenerator
S1A
S2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
200pF
Figure 626 Energy-recovery IO circuit with fault hypothesis
properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER
switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation
65 summary and conclusions
In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them
The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-
65 summary and conclusions 112
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050
055
060
065
070
075
080RES_CLK
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500
02
04
06
08
10
12
14OER
Time (ns)
V(V
)
Figure 627 Simulation of circuit with fault hypothesis
65 summary and conclusions 113
200 250 300 350 400 450 500 550 600 6500
5
10
15
20
25
Die1 Die2 Simulation CL=200pF
Frequency (MHz)
Ene
rgy
diss
ipat
ion
(pJ)
Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis
ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS
circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated
CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
7C O N C L U S I O N S
Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density
As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV
devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs
Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-
FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process
Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process
The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work
114
71 main contributions 115
71 main contributions
From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data
The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations
In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design
72 future work 116
approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances
The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
72 future work
The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV
capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]
In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes
1 The transistors should be in a regular grid in the array forsimple processing of the measurement data
72 future work 117
2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy
3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV
4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues
Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility
In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory
AA P P E N D I X A P H O T O S
Figure A1 3D65 300mm wafer (on chuck)
Figure A2 Probe card
118
appendix a photos 119
Figure A3 3D130 200mm wafer (on chuck)
ProbeHead
Microscope
Figure A4 Probe head
BA P P E N D I X B S TAT I S T I C S
mean (micro)
In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as
micro =1n
n
sumi=1
xi
median (m)
The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean
standard deviation (σ)
Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation
A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability
standard deviation of the mean (σmicro )
The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated
Assuming there is a data set X with n values xi of mean micro
120
appendix b statistics 121
variance(X) = σ2X
variance(micro) = variance( 1n sum
ni=1 xi) =
1n2 variance(sumn
i=1 xi) =1n2 sum
ni=1 variance(xi) =
nn2 variance(X) =1n variance(X) =
σ2Xn rarr
σmicro = σXradicn
coefficient of variation
The coefficient of variation is the ratio of the standard deviationσ to the mean micro
CV =σ
micro
It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage
outlier
Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set
kruskalndash wallis [34]
KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal
B I B L I O G R A P H Y
[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac
currentsourcebroadpurposemn=2602
[2] Max-3d URL httpwwwmicromagiccom
[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet
[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet
[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001
[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI
Design Conf pages 171ndash174 2005 doi 101109ICVD200564
[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf
3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581
[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009
[9] C H Bennett Logical reversibility of computation IBM
Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525
[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038
scientificamerican0785-48
[11] E Beyne 3d system integration technologies In Proc Int
VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113
122
bibliography 123
[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs
22nm-Details_Presentationpdf
[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327
[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534
[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices
Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124
[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942
[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc
IEEE Int Interconnect Technology Conf and 2011 Materials for
Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326
[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf
(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823
[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS
rsquo04 volume 2 2004 doi 101109ISCAS20041329255
[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE
Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260
[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-
bibliography 124
asitics for 3-dimensional integrated circuits In Workshop
Notes Design Automation and Test in Europe (DATE) 2009
[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT
2008 pages 98ndash103 2008 doi 101109IDT20084802475
[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec
beScientificReportSR2009HTML1213307html
[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905
[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE
Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329
[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int
Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311
[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508
[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712
[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii
B9780750677295500446
[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455
bibliography 125
[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591
[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test
Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889
[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on
Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10
1145224081224115
[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical
Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779
[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom
pressroompdfkkuhnKuhn_IWCE_invited_textpdf
[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5
(3)183ndash191 1961 doi 101147rd530183
[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827
[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190
[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf
(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839
[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-
nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079
bibliography 126
[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System
level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm
org101145966747966750
[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453
[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf
PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725
[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE
Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793
[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629
[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp
Exhibition (DATE) pages 1689ndash1694 2010
[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573
[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190
[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi
bibliography 127
A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices
Meeting (IEDM) 2010 doi 101109IEDM20105703278
[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727
[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10
1109JPROC1998658762
[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test
Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103
[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc
56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676
[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443
[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88
(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom
sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on
[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303
bibliography 128
[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium
on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145
10576611057669
[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872
[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004
[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-
sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888
[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156
[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc
IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522
[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501
[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667
[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477
[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-
tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609
bibliography 129
[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology
Conf 2006 doi 101109ECTC20061645678
[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power
Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220
[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc
Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786
[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-
Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338
[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088
[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044
[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int
Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763
[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-
tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http
dxdoiorg101007BF01788562 101007BF01788562
bibliography 130
[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-
electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww
sciencedirectcomsciencearticleB6V44-4829XDJ-2N
2f5b2c221dcf240c38cd13b7b3432f913
[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182
[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541
[78] NHE Weste and DF Harris CMOS VLSI design a cir-
cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775
[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite
printsteve_wozniak_interview
[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-
tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716
[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect
Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710
[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746
[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764
[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915
bibliography 131
[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610
- Abstract
- Publications
- Acknowledgments
- Contents
- List of Figures
- List of Tables
- Acronyms
- 1 Introduction
-
- 11 Research goals and contribution
- 12 Thesis outline
-
- 2 Background
-
- 21 3D integrated circuits
-
- 211 3D integration approaches
- 212 TSV interconnects
-
- 22 Energy-recovery logic
-
- 221 Energy-recovery architectures
- 222 Power-clock generators
-
- 23 Full-custom design flow
- 24 Summary
-
- 3 Characterization of TSV capacitance
-
- 31 Introduction
- 32 TSV parasitic capacitance
- 33 Integrated capacitance measurement techniques
- 34 Electrical characterization
-
- 341 Physical implementation
- 342 Measurement setup
- 343 Experimental results
-
- 35 Summary and conclusions
-
- 4 TSV proximity impact on MOSFET performance
-
- 41 Introduction
- 42 TSV-induced thermomechanical stress
- 43 Test structure implementation
-
- 431 MOSFET arrays
- 432 Digital selection logic
-
- 44 Experimental measurements
-
- 441 Data processing methodology
- 442 Measurement setup
- 443 PMOS (long channel)
- 444 PMOS (short channel)
-
- 45 Summary and conclusions
-
- 5 Evaluation of energy-recovery for TSV interconnects
-
- 51 Introduction
- 52 Energy-recovery TSV interconnects
- 53 Analysis
-
- 531 Adiabatic driver
- 532 Inductors parasitic resistance
- 533 Switch M1
- 534 Timing Constraints
-
- 54 Simulations
-
- 541 Comparison to CMOS
-
- 55 Summary and conclusions
-
- 6 Energy-recovery 3D-IC demonstrator
-
- 61 Introduction
- 62 Physical implementation
-
- 621 Adiabatic driver
- 622 Pulse-to-level converter
- 623 Data generator
- 624 Pulse generator
- 625 Integrated inductor
- 626 Top level design
- 627 CMOS IO circuit
-
- 63 Post-layout simulation and comparison
- 64 Experimental measurements
-
- 641 Measurement setup
- 642 Results
-
- 65 Summary and conclusions
-
- 7 Conclusions
-
- 71 Main contributions
- 72 Future work
-
- A Appendix A Photos
- B Appendix B Statistics
- Bibliography
-
A B S T R A C T
The aggressive scaling of CMOS process technology has been driv-ing the rapid growth of the semiconductor industry for more thanthree decades In recent years the performance gains enabledby CMOS scaling have been increasingly challenged by highly-parasitic on-chip interconnects as wire parasitics do not scale atthe same pace Emerging 3D integration technologies based onvertical through-silicon vias (TSVs) promise a solution to the inter-connect performance bottleneck along with reduced fabricationcost and heterogeneous integration
As TSVs are a relatively recent interconnect technology innov-ative test structures are required to evaluate and optimise theprocess as well as extract parameters for the generation of designrules and models From the circuit designerrsquos perspective criticalTSV characteristics are its parasitic capacitance and thermomech-anical stress distribution This work proposes new test structuresfor extracting these characteristics The structures were fabric-ated on a 65nm 3D process and used for the evaluation of thattechnology
Furthermore as TSVs are implemented in large densely inter-connected 3D-system-on-chips (SoCs) the TSV parasitic capacitancemay become an important source of energy dissipation Typicallow-power techniques based on voltage scaling can be usedthough this represents a technical challenge in modern techno-logy nodes In this work a novel TSV interconnection scheme isproposed based on reversible computing which shows frequency-dependent energy dissipation The scheme is analysed using the-oretical modelling while a demonstrator IC was designed basedon the developed theory and fabricated on a 130nm 3D process
iii
P U B L I C AT I O N S
Some ideas and figures have appeared previously in the followingpublications
1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper
award)
2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-
makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010
3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010
4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009
5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008
iv
In the end I hope therersquos a little note somewhere
that says I designed a good computer
mdash Steve Wozniak [79]
A C K N O W L E D G E M E N T S
First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts
As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm
Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec
Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools
v
C O N T E N T S
1 Introduction 1
11 Research goals and contribution 2
12 Thesis outline 3
2 Background 5
21 3D integrated circuits 5
211 3D integration approaches 6
212 TSV interconnects 9
22 Energy-recovery logic 13
221 Energy-recovery architectures 17
222 Power-clock generators 18
23 Full-custom design flow 20
24 Summary 21
3 Characterization of TSV capacitance 23
31 Introduction 23
32 TSV parasitic capacitance 23
33 Integrated capacitance measurement techniques 24
34 Electrical characterization 27
341 Physical implementation 28
342 Measurement setup 36
343 Experimental results 37
35 Summary and conclusions 44
4 TSV proximity impact on MOSFET performance 46
41 Introduction 46
42 TSV-induced thermomechanical stress 46
43 Test structure implementation 48
431 MOSFET arrays 51
432 Digital selection logic 55
44 Experimental measurements 61
441 Data processing methodology 61
442 Measurement setup 62
443 PMOS (long channel) 64
444 PMOS (short channel) 71
45 Summary and conclusions 73
5 Evaluation of energy-recovery for TSV interconnects 75
51 Introduction 75
52 Energy-recovery TSV interconnects 75
53 Analysis 79
531 Adiabatic driver 80
532 Inductorrsquos parasitic resistance 82
533 Switch M1 82
534 Timing Constraints 83
54 Simulations 83
541 Comparison to CMOS 86
vi
contents vii
55 Summary and conclusions 91
6 Energy-recovery 3D-IC demonstrator 92
61 Introduction 92
62 Physical implementation 92
621 Adiabatic driver 94
622 Pulse-to-level converter 95
623 Data generator 95
624 Pulse generator 95
625 Integrated inductor 97
626 Top level design 98
627 CMOS IO circuit 101
63 Post-layout simulation and comparison 101
64 Experimental measurements 106
641 Measurement setup 106
642 Results 106
65 Summary and conclusions 111
7 Conclusions 114
71 Main contributions 115
72 Future work 116
a Appendix A Photos 118
b Appendix B Statistics 120
bibliography 122
L I S T O F F I G U R E S
Figure 11 Scaling trend of logic and interconnect delay[30] 2
Figure 21 Comparison of 2D3D Integration 5
Figure 22 System-in-a-package 6
Figure 23 Package-on-package 6
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7
Figure 25 3D wafer-level-packaging 7
Figure 26 3D-stacked IC 8
Figure 27 TSV process steps 9
Figure 28 Before thinningafter thinning 10
Figure 29 TSV stacking 10
Figure 210 Conventional AND gate 14
Figure 211 CMOS switching 16
Figure 212 Adiabatic switching 16
Figure 213 Basic 2N-2P adiabatic circuit 17
Figure 214 2N-2P timing 17
Figure 215 Adiabatic driver 18
Figure 216 Stepwise charging generator 19
Figure 217 1N power clock generator 20
Figure 218 Design flow 22
Figure 219 Inverter in Virtuoso schematic 22
Figure 220 Inverter in Virtuoso layout 22
Figure 31 Isolated 2D wire capacitance 24
Figure 32 TSV parasitic capacitance 25
Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25
Figure 34 CBCM pseudo-inverters 27
Figure 35 Current as a function of frequency in CBCM 28
Figure 36 CBCM pseudo-inverter pair in layout 29
Figure 37 PRC1 test pad module (layout) 30
Figure 38 PRC1 test pad module (micrograph) 30
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34
Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35
viii
List of Figures ix
Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35
Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36
Figure 316 Measurement setup 36
Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39
Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40
Figure 319 TSV capacitance measurement results onwafer D11 40
Figure 320 TSV capacitance measurement results onwafer D18 41
Figure 321 Simulated oxide liner thickness variation 42
Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42
Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44
Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44
Figure 41 TSV thermomechanical stress simulation [40] 47
Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48
Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49
Figure 44 PTP1 test pad module (layout) 50
Figure 45 PTP1 test pad module (micrograph) 50
Figure 46 MOSFET dimensions and rotations 51
Figure 47 1 TSV MOSFET array configuration 52
Figure 48 4 diagonal TSVs MOSFET array configuration52
Figure 49 4 lateral TSVs MOSFET array configuration 53
Figure 410 9 TSVs MOSFET array configuration 53
Figure 411 MOSFET array detail 55
Figure 412 Digital selection logic 56
Figure 413 MOSFET array with digital selection logic 57
Figure 414 4-bit Gate-enableArray-enable decoder 58
Figure 415 Transmission gate for MOSFET Gate terminals59
Figure 416 Transmission gates for MOSFET Drain-Source terminals 59
Figure 417 2-bit Drain-select decoder 60
Figure 418 Processing of measurement results 62
Figure 419 Interpolation of measurement results 63
Figure 420 Measurement setup 63
Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65
List of Figures x
Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66
Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66
Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67
Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67
Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68
Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69
Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69
Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70
Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70
Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71
Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71
Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72
Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72
Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76
Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77
Figure 53 Adiabatic driver 78
Figure 54 A simple pulse-to-level converter (P2LC) 78
Figure 55 Resonant pulse generator with dual-rail dataencoding 79
Figure 56 Energy-recovery scheme for TSV interconnects79
Figure 57 Energy-recovery timing constraints 83
Figure 58 Test case for the evaluation of the theoret-ical model 84
Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85
Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85
Figure 511 Test case for comparison of energy-recoveryto CMOS 86
Figure 512 Improved P2LC design 87
List of Figures xi
Figure 513 Simulated energy dissipation of improvedP2LC design 87
Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88
Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88
Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89
Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =
80 f F f = 500MHz) 89
Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90
Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90
Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93
Figure 62 Energy-recovery IO circuit schematic dia-gram 94
Figure 63 Adiabatic driver (layoutschematic view) 94
Figure 64 P2LC (layoutschematic view) 95
Figure 65 Data generator (layoutschematic view) 96
Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96
Figure 67 Pulse generator (layoutschematic view) 97
Figure 68 Pulse generator InputOutput signals 97
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98
Figure 610 Planar square spiral inductor designed inASITIC 99
Figure 611 Resonant pulse distribution tree 100
Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101
Figure 613 Energy-recovery IO circuit in layout view 102
Figure 614 CMOS IO circuit schematic diagram 103
Figure 615 CMOS driver schematic 103
Figure 616 CMOS IO circuit in layout view 103
Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105
Figure 618 Energy-recovery IO circuit micrograph 107
List of Figures xii
Figure 619 CMOS IO circuit micrograph 107
Figure 620 Measurement setup 107
Figure 621 Oscilloscope output for signals S1B and OC-MOS 109
Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109
Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110
Figure 624 Oscilloscope output for signals S1A and OER110
Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111
Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111
Figure 627 Simulation of circuit with fault hypothesis 112
Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113
Figure A1 3D65 300mm wafer (on chuck) 118
Figure A2 Probe card 118
Figure A3 3D130 200mm wafer (on chuck) 119
Figure A4 Probe head 119
L I S T O F TA B L E S
Table 21 3D interconnect technologies comparison 8
Table 22 Technology parameters for Imec 3D130 process11
Table 23 Technology parameters for Imec 3D65 process11
Table 24 Comparison energy-recoveryconventionalCMOS 16
Table 31 Test structures on the PRC1 module 31
Table 32 PRC1 test pad instrument connections 37
Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38
Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39
Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43
Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45
Table 41 Coefficients of thermal expansion 47
Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54
Table 43 MOSFET array configurations selected forexperimental evaluation 61
Table 44 PTP1PTP2 test pad module instrumentconnections 64
Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73
Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84
Table 61 Energy-recovery IO circuit parameters 93
Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100
Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104
Table 64 Post-layout simulated energy dissipation Area requirements 106
Table 65 Energy-recovery circuit instrument connec-tions 108
Table 66 CMOS circuit instrument connections 108
xiii
A C R O N Y M S
2D two-dimentional
2N-2P 2 nMOS2 pMOS adiabatic logic
3D-IC three-dimentional integrated circuit
3D-SIC 3D stacked-IC
3D-SIP 3D system-in-package
3D three-dimentional
3D-WLP 3D wafer-level-packaging
BCB benzocyclobutene
BEOL back-end-of-line
CAL clocked adiabatic logic
CBCM charge-based capacitance measurement
CMOS complementary metal-oxide-semiconductor
CMP chemical-mechanical planarization
CTE coefficient of thermal expansion
Cu copper
CVD chemical vapor deposition
D2D die-to-die
D2W die-to-wafer
DCVSL differential cascode voltage switch logic
DRC design rule check
DUT device under test
ECRL efficient charge recovery logic
EDA electronic design automation
FEM finite element method
FEOL front-end-of-line
GPIB general purpose interface bus
IC integrated circuit
xiv
acronyms xv
IO inputoutput
KOZ keep-out-zone
LC inductance-capacitance
LVS layout versus schematic
MEMS microelectromechanical systems
MIM metal-insulator-metal
MOSFET metal-oxide-semiconductor field-effect transistor
MOS metal-oxide-semiconductor
MPU microprocessor unit
MuGFET multiple gate field-effect transistor
NMOS n-channel MOSFET
P2LC pulse-to-level converter
PAL pass-transistor adiabatic logic
PCB printed circuit board
PCG power-clock generator
PMD pre-metal dielectric
PMOS p-channel MOSFET
PVT process voltage temperature
QSERL quasi-static energy recovery logic
RC resistance-capacitance
RDL re-distribution layer
RLC resistance-inductance-capacitance
RMS root mean square
SIP system-in-a-package
Si silicon
SMU source measurement unit
SoC system-on-chip
TaN tantalum nitride
TSV through-silicon via
W tungsten
acronyms xvi
W2W wafer-to-wafer
WLP wafer-level-packaging
1I N T R O D U C T I O N
Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965
chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]
As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects
In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)
The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and
1 International Technology Roadmap for Semiconductors
1
11 research goals and contribution 2
Figure 11 Scaling trend of logic and interconnect delay [30]
innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]
Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D
interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density
11 research goals and contribution
As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-
ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]
The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future
12 thesis outline 3
3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection
The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution
The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs
are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process
The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements
12 thesis outline
This thesis is organised into seven Chapters and two Appendixesas follows
2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)
12 thesis outline 4
Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described
Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods
Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process
Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design
Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements
Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work
Appendix A includes photos used as reference in various partsof this work
Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data
2B A C K G R O U N D
The work presented in this thesis contributes in the fields of 3D
integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters
21 3d integrated circuits
CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package
There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy
Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of
ABCDEFBA ABCDEFBA
Figure 21 Comparison of 2D3D Integration
5
21 3d integrated circuits 6
ABCDEFBC
E
Figure 22 System-in-a-package
ABCDEFBC
E
Figure 23 Package-on-package
a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities
The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor
211 3D integration approaches
3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure
bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages
bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)
bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion
The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)
3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps
21 3d integrated circuits 7
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]
A
BCDEAFE
FFDC
CFE
Figure 25 3D wafer-level-packaging
take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]
3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]
The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-
SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost
A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21
21 3d integrated circuits 8
ABCD
EFFDB
F
Figure 26 3D-stacked IC
3D interconnect
technologyAdvantages Disadvantages
3D system-in-package (3D-SIP)
Simple existingtechnology bestheterogeneous
integration
Performance(speed power) low
densitymanufacturing costin high-quantities
3D wafer-level-packaging (3D-
WLP)
Existingtechnology good
compromisebetween
performance-manufacturing
cost
Averageperformance and
density
3D stacked-IC (3D-SIC)
Performance(speed power)
high densitymanufacturing costin high-quantities
Manufacturing costin low-quantities
not ready formanufacturing
Table 21 3D interconnect technologies comparison
21 3d integrated circuits 9
ABBCDE
FBACACDA
ACCD F
Figure 27 TSV process steps
212 TSV interconnects
Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection
In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie
Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively
As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]
21 3d integrated circuits 10
AAAB
CAAAB
ABC
DEF
Figure 28 Before thinningafter thinning
ABC ABC
DE DE
F
DEE
B
CBA
Figure 29 TSV stacking
21 3d integrated circuits 11
Minimum devicechannel length (nm)
130
Supply voltage (V) 12
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 22
TSV resistance (mΩ) ~20
TSV capacitance( f F)
~40
Wafer diameter(mm)
200
Table 22 Technology paramet-ers for Imec 3D130
process
Minimum devicechannel length (nm)
70
Supply voltage (V) 10
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 40
TSV resistance (mΩ) ~40
TSV capacitance( f F)
~80
Wafer diameter(mm)
300
Table 23 Technology paramet-ers for Imec 3D65 pro-cess
Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing
Design tools
Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs
Thermomechanical stress
TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur
21 3d integrated circuits 12
The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance
Thermal analysis
The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink
Electrical characteristics
The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]
Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV
structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process
22 energy-recovery logic 13
Testing
Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems
22 energy-recovery logic
TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology
1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections
2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them
Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that
22 energy-recovery logic 14
Figure 210 Conventional AND gate
were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6
Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded
Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output
Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to
22 energy-recovery logic 15
perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]
In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]
The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =
CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]
E = QVDD = CV2DD (21)
From the total transferred energy only half ( 12 CV2
DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C
The adiabatic charging of an equivalent node capacitance C
is modelled in Figure 212 In contrast to the conventional CMOS
circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to
EAdiabatic prop
(
RC
T
)
CV2DD (22)
The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters
22 energy-recovery logic 16
A
B
CC
D
E
D
Figure 211 CMOS switching
AA
B
C
B
Figure 212 Adiabatic switching
Conventional CMOS Energy-recovery CMOS
2 states True False 3 states True False Off
Energy of information bitdissipated as heat on the
pull-down network
Energy of information bitrecovered by the
power-supply
High speed Lowmedium speed
High energy dissipation Very low energy dissipation
Area efficient Some area overhead
Table 24 Comparison energy-recoveryconventional CMOS
Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]
A comparison between energy-recovery and conventional CMOSis summarised in Table 24
22 energy-recovery logic 17
ABCABC
Figure 213 Basic 2N-2P adiabaticcircuit
ABC
ABC
DEF
EE
DEF
Figure 214 2N-2P timing
221 Energy-recovery architectures
Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency
A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P
circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern
The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel
22 energy-recovery logic 18
Figure 215 Adiabatic driver
MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network
The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]
The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle
Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]
222 Power-clock generators
Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and
22 energy-recovery logic 19
A
A
B
C
C
Figure 216 Stepwise charging generator
recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators
Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow
Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at
f =1
2πradic
LCLOAD
(23)
As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is
23 full-custom design flow 20
A
BCD
E
DFF
CD
Figure 217 1N power clock generator
partially recovered and stored in the inductor for later use RLC
oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip
23 full-custom design flow
All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal
In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available
The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case
1 Cadence Design Systems Inc (httpwwwcadencecom)
24 summary 21
the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step
Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics
24 summary
In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process
2 Mentor Graphics Inc (httpwwwmentorcom)
24 summary 22
ABCDEA
FCCAAE
ABCDEAEC
DE
E
F
CEA
EEAEA
EDE
CAC
E
EC
CAC
Figure 218 Design flow
Figure 219 Inverter in Virtuososchematic
Figure 220 Inverter in Virtuosolayout
3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E
31 introduction
In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology
In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]
From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system
Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated
32 tsv parasitic capacitance
A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric
1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments
23
33 integrated capacitance measurement techniques 24
ABC
Figure 31 Isolated 2D wire capacitance
material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is
C =εox
hwl (31)
In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material
In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)
The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)
33 integrated capacitance measurement techniques
Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV
33 integrated capacitance measurement techniques 25
ABCDE
F
FD
F
F
FD
Figure 32 TSV parasitic capacitance
ABCDE FACDE
ECDE
BCFACDE
FCCDED DBCDE
D
DAB
Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor
33 integrated capacitance measurement techniques 26
characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories
1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]
2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]
The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer
On-chip techniques can achieve better accuracy when the DUT
capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]
In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows
CDUT =Im minus Ire f
VDD middot f(32)
The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of
34 electrical characterization 27
AB
AC
DE
FE
DE
FE
AB
AC
B
Figure 34 CBCM pseudo-inverters
Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot
The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET
capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz
should be adequate to avoid the MOS behaviour of the TSV
34 electrical characterization
A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented
34 electrical characterization 28
AB
CADE
F
AB
E
CAD
B
Figure 35 Current as a function of frequency in CBCM
on the same chip that went beyond the core objectives of thiswork
The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed
341 Physical implementation
In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result
Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics
34 electrical characterization 29
Poly dummy fills
Metal2Metal1PolysiliconActive
1μm 5μm
Figure 36 CBCM pseudo-inverter pair in layout
Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38
The CBCM pseudo-inverters were implemented on either TOP
or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below
Structure rsquoArsquo Single TSV capacitance (TOP die)
The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to
34 electrical characterization 30
DTOP DBTM
B10 A10
BREF AREF
B1 A1
R1 R2
SET VDD
RESET GND
C15 C10
C50 C20
SET VDD
ERC SET
EREF EC Metal2 (TOP)Metal1 (TOP)
Metal1 (BTM)Metal2 (BTM)
Figure 37 PRC1 test pad module(layout)
Figure 38 PRC1 test pad module(micrograph)
34 electrical characterization 31
Structure A B
TSVs - 1 2x5 - 1 2x5
Inverterlocation
TOP TOP TOP BTM BTM BTM
TSVconnection
TOP TOP TOP BTM BTM BTM
Multi-TSVconnection
- - TOP - - BTM
Multi-TSVpitch (microm)
- - 15 - - 15
Pad name AREF A1 A10 BREF B1 B10
Structure C D
TSVs 2x2 2x2 2x2 2x2 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP TOP
TSVconnection
TOP TOP TOP TOP TOP TOP
Multi-TSVconnection
TOP TOP TOP TOP TOP BTM
Multi-TSVpitch (microm)
10 15 20 50 20 20
Pad name C10 C15 C20 C50 DTOP DBTM
Structure R E
TSVs 1 1 - 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP
TSVconnection
TOP TOP - TOP TOP
Multi-TSVconnection
- - - TOP BTM
Multi-TSVpitch (microm)
- - - 20 20
Pad name R1 R2 EREF EC ERC
Table 31 Test structures on the PRC1 module
34 electrical characterization 32
Pseudo-inverters
2x5 TSVs
TSV
Reference
Metal2 (TOP)Metal1 (TOP)
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)
the single TSV DUT the structure also contains 10 parallel TSVs
arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude
Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)
The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side
Structure rsquoCrsquo Variable pitch TSV capacitance
The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the
34 electrical characterization 33
2x5 TSVs
Reference
TSV
Pseudo-inverters
Metal2 (BTM)Metal1 (BTM)
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)
literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process
The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative
Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-
TOM die
The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking
34 electrical characterization 34
2x2 TSVs - 10μm
2x2 TSVs - 15μm
2x2 TSVs - 20μm
2x2 TSVs - 50μm
Metal2 (TOP)Metal1 (TOP)
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die
34 electrical characterization 35
Metal2 (TOP)Metal1 (TOP)
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Driver
Reference
Driver
Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)
Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy
Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability
Structure rsquoErsquo TSV capacitance effect on signal propagation delay
Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult
34 electrical characterization 36
ABC
D
ECBFC
ABC
Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)
ABC
DEF
CD
A
CA
A
Figure 316 Measurement setup
342 Measurement setup
The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316
The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-
2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup
34 electrical characterization 37
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
SET InputPulse generator
(Non-overlapping with RESET)
RESET InputPulse generator
(Non-overlapping with SET)
A[REF 1 10] InputSMU (Inverter source - 1V
supply)
B[REF 1 10] InputSMU (Inverter source - 1V
supply)
C[10 15 2050]
InputSMU (Inverter source - 1V
supply)
D[TOP BTM] InputSMU (Inverter source - 1V
supply)
R[1 2] InputSMU (Inverter source - 1V
supply)
E[REF C RC] Output Oscilloscope (high-impedance)
Table 32 PRC1 test pad instrument connections
sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32
343 Experimental results
There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods
When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy
34 electrical characterization 38
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 454
Mean capacitance 709 f F
Standard deviation (σ) 19 f F
Coefficient of variation (3σmicro) 80
Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo
The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability
Die to die (D2D) variability
The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)
On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid
On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691
34 electrical characterization 39
-60 -40 -20 00 20 40 60 80
σ-σ
706
Deviation from mean (fF)
0
005
01
015
02
025
Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 414
Mean capacitance 895 f F
Standard deviation (σ) 22 f F
Coefficient of variation (3σmicro) 74
Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo
of the experimental data within plusmnσ of the mean capacitancevalue
Correlation between TSV capacitance and oxide liner thickness vari-
ation
The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers
Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication
34 electrical characterization 40
σ-σ
691
Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70
0
002
004
006
008
01
012
014
016
018
02
Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function
075
080
085
090
095
100
105
110
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
075
080
085
090
095
100
105
Wafer area
Figure 319 TSV capacitance measurement results on wafer D11
34 electrical characterization 41
086088090092094096098100102104106108
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
086
088
090
092
094
096
098
100
102
104
106
108
Wafer area
Figure 320 TSV capacitance measurement results on wafer D18
tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo
Wafer to wafer (W2W) variability
Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters
34 electrical characterization 42
109
100
095Nor
mal
ized
oxi
de th
ickn
ess
-5-10 -15 -20 -25 -30
510
1520
2530
0Die Y
Die X
Figure 321 Simulated oxide liner thickness variation
60 65 70 75 80 85 90 95 1000
005
01
015
02
025D11 D18
plusmnσ
plusmnσ
Capacitance (fF)
Figure 322 TSV capacitance W2W variability - Probability density func-tions
34 electrical characterization 43
Structure rsquoCrsquo (Fig 311)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 424
Experiment C10 C50
Number of TSVs 4 4
Mean capacitance 4073 f F 4125 f F
Standard deviation (σ) 84 f F 74 f F
Population within σ 695 653
Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo
Variable pitch TSV capacitance
The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm
were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires
As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50
are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability
To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]
Measurement accuracy of CBCM
The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we
3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)
35 summary and conclusions 44
376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320
001
002
003
004
005
006C10 C50
Capacitance (fF)
plusmnσC10
plusmnσC50
695 653
Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function
-80 -60 -40 -20 00 20 40 60 80 1000
002
004
006
008
01
012
014
016
018σ-σ
688
Deviation from mean (fF)
Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function
observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs
connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance
35 summary and conclusions
In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement
35 summary and conclusions 45
Method LCR CBCM
Total dies 472 472
Processed dies (excluding outliers) 393 414
Mean capacitance 820 f F 895 f F
Standard deviation (σ) 23 f F 22 f F
Coefficient of variation (3σmicro) 86 74
Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo
results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]
The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration
The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV
parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM
In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process
4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E
41 introduction
The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV
filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance
Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models
In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]
42 tsv-induced thermomechanical stress
Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or
46
42 tsv-induced thermomechanical stress 47
Material CTE at 20 ˚C (10minus6K)
Si 3
Cu 17
W 45
Table 41 Coefficients of thermal expansion
Figure 41 TSV thermomechanical stress simulation [40]
polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate
Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction
It is well known that stress can affect carrier mobility in MOS-
FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance
Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type
43 test structure implementation 48
Figure 42 TSV thermomechanical stress interaction between two TSVs[40]
proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative
The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections
43 test structure implementation
TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible
43 test structure implementation 49
Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]
The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET
devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types
Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP
die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45
43 test structure implementation 50
A0
B0
C0
D0
E0
F0
G0
H0
I0
J0
A90
B90
E90
F90
G90
J90
DS3 DF3
DF2 DS0
DS2 DF0
DF1 DS1
SS SF
DEC6 VDD
DEC7 GND
DEC8 VG
DEC4 DEC9
DEC5 GND
DEC2 DEC3
DEC0 DEC1
Arraydecoder
Draindecoder
Figure 44 PTP1 test pad mod-ule (layout)
Figure 45 PTP1 test pad mod-ule (micrograph)
43 test structure implementation 51
Metal2Metal1PolyActive
TSV
MOSFETs
07μm007μm 0deg
07μm007μm 90deg
05μm05μm 90deg
05μm05μm 0deg
Figure 46 MOSFET dimensions and rotations
431 MOSFET arrays
Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)
The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs
lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44
The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)
Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution
43 test structure implementation 52
TSV
MOSFETs Substrate taps
Figure 47 1 TSV MOSFET array configuration
MOSFETs Substrate taps
TSV TSV
TSV TSV
Figure 48 4 diagonal TSVs MOSFET array configuration
43 test structure implementation 53
MOSFETs Substrate taps
TSV
TSV TSV
TSV
Figure 49 4 lateral TSVs MOSFET array configuration
MOSFETs Substrate taps
TSV TSV TSV
TSV TSV TSV
TSV TSV TSV
Figure 410 9 TSVs MOSFET array configuration
43 test structure implementation 54
Configuration RotationFET
width(microm)
FETlength(microm)
Name
0 0deg 07 007 A0
1 0deg 07 007 B0
4diagonal 0deg 07 007 C0
4lateral 0deg 07 007 D0
9 0deg 07 007 E0
0 0deg 05 05 F0
1 0deg 05 05 G0
4diagonal 0deg 05 05 H0
4lateral 0deg 05 05 I0
9 0deg 05 05 J0
0 90deg 07 007 A90
1 90deg 07 007 B90
9 90deg 07 007 E90
0 90deg 05 05 F90
1 90deg 05 05 G90
9 90deg 05 05 J90
Table 42 MOSFET array configurations for test pad modulesPTP1PTP2
43 test structure implementation 55
2μm 1μm
Metal2Metal1PolyActive
FET
DUMMY
TAP
TSV
Figure 411 MOSFET array detail
432 Digital selection logic
Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement
The procedure for selecting a single MOSFET device for meas-urement was as follows
bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled
bull MOSFET Gate terminal (G0 to G15) was connected to IO
pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates
bull MOSFET Drain terminal (D0 to D15) was connected to IO
pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder
43 test structure implementation 56
D0
D15
DF[0-3]
DS[0-3]
Drain-select decoder (2-bit)
SSF
SS
Array-select decoder (4-bit)
VG
Gate-select decoder (4-bit)
G0 G15
MOSFET Array
Figure 412 Digital selection logic
and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit
Drain-select decoder was adequate for accessing all 16Drain terminals in the array
When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)
A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating
43 test structure implementation 57
Gatedecoder
Metal2Metal1PolyActive
Transmissiongates (DS)
Transmission gates (G)
MOSFETArray
Figure 413 MOSFET array with digital selection logic
43 test structure implementation 58
BLK
A[0] A[1] A[2] A[3]
DEC[0-15]
Figure 414 4-bit Gate-enableArray-enable decoder
43 test structure implementation 59
DEC
VGGVDD
Transmission gate
Pull-downtransistor
Figure 415 Transmission gate for MOSFET Gate terminals
Transmission gates
FORCE
SENSE
DRAINSOURCE
EN
VDD
Figure 416 Transmission gates for MOSFET DrainSource terminals
43 test structure implementation 60
A[0] A[1]
DEC[0-3]
Figure 417 2-bit Drain-select decoder
44 experimental measurements 61
Configuration Type Name Wafer
1 No-TSVPMOS
(05microm times 05microm)F0 D08
2 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D08
3 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D20
4 1-TSV 0ordm 60 ordmCPMOS
(05microm times 05microm)G0 D08
54-lateral TSV 0ordm 25
ordmCPMOS
(05microm times 05microm)I0 D08
6 1-TSV 90ordm 25 ordmCPMOS
(05microm times 05microm)G90 D08
7 1-TSV 0ordm 25 ordmCPMOS (07microm times
007microm)B0 D08
Table 43 MOSFET array configurations selected for experimental eval-uation
44 experimental measurements
Test modules PTP1 and PTP2 implementing the presented MOS-
FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43
441 Data processing methodology
For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress
Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing
44 experimental measurements 62
ABCDEFBBC
FCDDFBCFFC
FCB
A
BAACADE
CECBBC
CFFCDE
B$BCFFBDBCBF
amp(BFC)B
BC
FCDDFA
CAF
Figure 418 Processing of measurement results
the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (
radic98) The described procedure for
processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not
placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)
442 Measurement setup
For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO
signal generation and measurement was provided by standard
44 experimental measurements 63
ABCDECDBF
ABECBF
AB
AB
Figure 419 Interpolation of measurement results
laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420
The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44
ABCCDE
F
AA
A
DBCB
Figure 420 Measurement setup
44 experimental measurements 64
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
DEC[0-9] InputDigital Output
(GATEDRAINARRAY SelectDecoders Input)
DF[0-3] Input SMU (Drain Force terminal)
DS[0-3] Output SMU (Drain Sense terminal)
SF Input SMU (Source Force terminal)
SS Output SMU (Source Sense terminal)
VG Input Digital Output (Gate voltage)
Table 44 PTP1PTP2 test pad module instrument connections
443 PMOS (long channel)
No-TSV configuration
This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance
As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure
1 TSV configuration
In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo
44 experimental measurements 65
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
096
097
098
099
100
101
Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)
and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements
To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm
The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph
44 experimental measurements 66
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
105
110
115
120TSV
Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)
44 experimental measurements 67
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
13
TSV
Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)
we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV
even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after
increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
09
1
11
12
13
14
Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)
44 experimental measurements 68
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115TSV
Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)
at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)
4 lateral TSV configuration
In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430
respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)
90˚ rotation TSV configuration
In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has
44 experimental measurements 69
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
085
090
095
100
105
110
115
TSV
TSV TSV
TSV
Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)
44 experimental measurements 70
2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309
095
1
105
11
115
12
X16
Distance (μm)
Normalized
Ion
TSV TSV
Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)
3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098
1102
Y14
Distance (μm)
Normalized
Ion
TSV TSV
Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)
44 experimental measurements 71
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115
120TSV
Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)
been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm
444 PMOS (short channel)
In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
08509
0951
10511
11512
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)
44 experimental measurements 72
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
094
096
098
100
102
104
106
108
110
TSV
Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)
measurements for X = 16 and Y = 14 in Figure 434 At at 1microm
distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)
45 summary and conclusions 73
Size
Waf
er
Con
figu
rati
on
Rot
atio
n
Eff
ect
1micro
mla
t
Eff
ect
1micro
mlo
ng
KO
Z
L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm
L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm
L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm
L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm
L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm
S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm
Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices
45 summary and conclusions
In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET
device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors
The structure was implemented in Imecrsquos experimental 65nm
3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations
In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits
Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short
45 summary and conclusions 74
channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress
Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure
1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex
2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy
3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away
4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic
The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated
In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs
5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S
51 introduction
TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2
DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs
increases linearly with the number of tiers and interconnections(Figure 51)
The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm
node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)
In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]
52 energy-recovery tsv interconnects
As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-
75
52 energy-recovery tsv interconnects 76
1XTSV
Tier1
TSV
Tier0LOGIC 0
LOGIC 0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n Tier1
Tier0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n
Tier(m-1)
Tier(m)
mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-
tion density
recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]
EER prop
(
RCL
T
)
CLV2DD (51)
In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]
ECMOS =12
CLV2DD (52)
Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits
EER
ECMOSprop
2RCL
T(53)
From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small
In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC
52 energy-recovery tsv interconnects 77
Tier2Tier1TSV
LOGIC 1
LOGIC 2
driver
TSV
LOGIC 1
LOGIC 2
CMOS
driver
Convert
Energy-recovery
PCG
Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery
can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV
interconnectsThe differences between a typical CMOS driver in a TSV-based
3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form
Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values
52 energy-recovery tsv interconnects 78
IN IN
ININ
OUT OUT
IN
IN
OUT
OUT
Figure 53 Adiabatic driver
Figure 54 A simple pulse-to-level converter (P2LC)
An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded
The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)
For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with
53 analysis 79
VDD2
Rind
RTG
AdiabaticDriver
CL
M1
T=1f
L ind
RTG
CL
Figure 55 Resonant pulse generator with dual-rail data encoding
Tier2
Tier1
f-fPULSE
VDD2CLK IN
Delay1CLK1
CLK0
Delay2CLK2OUT1
Data In
Data Out
CLKRES
TSV TSV TSV
CLKBufferAdiabatic
Driver
P2LC
f-f
Data1
M1
Figure 56 Energy-recovery scheme for TSV interconnects
a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at
f =1
2πradic
LindCL(54)
A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56
53 analysis
The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy
53 analysis 80
losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme
The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be
EER = EAD + Eind + EM1 + EP2L (55)
Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs
architecture used and could be easily extracted from simulationdata as will be shown later in this chapter
531 Adiabatic driver
There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS
dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances
The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle
ETG = 2ξ
(
RTGCL12 T
)
CLV2DD =
π2
2(RTGCL f )CLV2
DD (56)
The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient
53 analysis 81
Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV
capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be
CL = CTSV + Cn + 6CD (57)
The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience
κTG = RTGCn rArr RTG =κTG
Cn(58)
Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2
DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver
EAD = CnV2DD +
π2
2κTG
Cnf [CTSV + Cn + 6CD]
2 V2DD (59)
The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2
2 κTG f to a variable (y = π2
2 κTG f ) Equation 59then becomes
EAD = D middot CnV2DD +
y
Cn[CTSV + (6b + 1)Cn]
2 V2DD
=
[
Cn
(
D
y+ (6b + 1)2
)
+1
CnC2
TSV
]
middot yV2DD
+(12b + 2)CTSV middot yV2DD (510)
Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when
53 analysis 82
they become equal Solving equation Cn
(
Dy + (6b + 1)2
)
= 1Cn
C2TSV
for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV
Cn(opt) =
radic
[D
y+ (6b + 1)2]minus1CTSV (511)
Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver
EAD(min) =
[radic
D
(6b + 1)2y+ 1 + 1
]
middot (12b+ 2)yCTSVV2DD (512)
532 Inductorrsquos parasitic resistance
The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as
Qind =1
Rind
radic
Lind
CLrArr Rind =
1QindCL2π f
(513)
Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513
Eind =π2
2(RindCL f )CLV2
DD =π
4CL
QindV2
DD (514)
533 Switch M1
Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1
Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as
EM1(min) = 2IM1(rms)VGM1
radic
mκM1
f(515)
54 simulations 83
Delay2
CLK1
CLK2Delay1
CLKW
PULSE
CLK0T
PD
OUT1
PULSE
P2L
RES
CLK
Figure 57 Energy-recovery timing constraints
534 Timing Constraints
The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57
The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)
54 simulations
To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the
54 simulations 84
M1IN IN
IN
OUT OUT
IN
Figure 58 Test case for the evaluation of the theoretical model
Q250MHz 500MHz 750MHz
T S e () T S e () T S e ()
6 360 409 -120 476 516 -78 578 613 -57
8 308 347 -114 417 449 -70 515 543 -52
10 276 312 -115 382 411 -70 477 498 -43
12 255 289 -119 358 386 -71 451 472 -44
Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)
adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58
The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51
The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations
In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency
54 simulations 85
5 6 7 8 9 10 11 12 130
10
20
30
40
50
60
70
T250 S250 T500 S500 T750 S750
Q factor
Energydissipation(fF)
Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)
5 6 7 8 9 10 11 12 13
-14
-12
-10
-8
-6
-4
-2
0
250MHz 500MHz 750MHz
Q factor
Error()
Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator
The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation
54 simulations 86
M1
TSV
TSV
TSV
Data in OutCMOS
OutER
Energy-recovery
CMOS
P2LC
Adiabaticdriver
Figure 511 Test case for comparison of energy-recovery to CMOS
541 Comparison to CMOS
When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F
for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is
described by Equation 52 Appending the data switching activity(D)
ECMOS = D middot 12
CLV2DD (516)
Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS
circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison
Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range
54 simulations 87
Figure 512 Improved P2LC design
0 100 200 300 400 500 600 700 800 9000
2
4
6
8
10
12
14
Frequency (MHz)
Energy
dissipation(fJ)
Figure 513 Simulated energy dissipation of improved P2LC design
Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current
The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor
54 simulations 88
4 6 8 10 12 14 16 18 20 22-20
-10
0
10
20
30
40
50
60
800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor
Energyred uction()
Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)
4 6 8 10 12 14 16 18 20 220050
100150200250300350400450
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)
values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors
In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies
In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q
factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total
In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS
circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in
54 simulations 89
4 6 8 10 12 14 16 18 20 2200
100
200
300
400
500
600
700
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)
20 40 60 80 100 120 140 160 180 200 220-100
00
100
200
300
400
500
Q6 Q10 Q14 Q18
TSV capacitance (fF)
Energy
red u
ction(
)
Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)
Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value
Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz
Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm
node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS
54 simulations 90
4 6 8 10 12 14 16 18 20 22-40
-20
0
20
40
60
80
D=05 D=07 D=09 D=1
Q factor
Energyred uction()
Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)
4 6 8 10 12 14 16 18 20 220
10
20
30
40
50
60
130nm 65nm
Q factor
Energyred uction()
Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)
55 summary and conclusions 91
55 summary and conclusions
The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process
The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement
Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme
The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process
6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R
61 introduction
In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance
Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements
62 physical implementation
The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5
The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61
The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz
were calculated using the methodology developed in Chapter 5
and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery
circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)
92
62 physical implementation 93
AB
AB
CDEFD
C
ABCD
EFD
C
Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)
Adiabatic driver 80-bit circuit
Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )
RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)
CL 207 f F (Eq57) Lind 613nH (Eq54)
Table 61 Energy-recovery IO circuit parameters
62 physical implementation 94
IntegratedInductor
VINDOER
PulseGenerator
S1AS2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
Figure 62 Energy-recovery IO circuit schematic diagram
Metal2Metal1PolyActiveVDD
GND
RES_CLK
DATA
OUT
OUT_B
DATA
OUT_B OUT
RES_CLK
GND
VDD
Figure 63 Adiabatic driver (layoutschematic view)
the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs
621 Adiabatic driver
The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63
62 physical implementation 95
Metal2Metal1PolyActive
IN
OUT_B
VDD
IN_B
OUT
GND
OUT_B
IN
GND
VDD
OUT
IN_B
Figure 64 P2LC (layoutschematic view)
622 Pulse-to-level converter
The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals
623 Data generator
A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1
As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE
624 Pulse generator
The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the
62 physical implementation 96
Metal2Metal1PolyActive EN
CLK
VDD
Q
GND
D
CLKGND
ENVDD
QF-F
D
GND
VDDCLK
D Q
Figure 65 Data generator (layoutschematic view)
Delay1
WPULSE
PULSE
CLKTCLK
DATA
CLK_RES
Figure 66 Timing constraints for the energy-recovery demonstratorcircuit
62 physical implementation 97
PULSES1A
S2
GND
VDD
Metal2Metal1PolyActive
VDD
S1
S2
GND
PULSE
Figure 67 Pulse generator (layoutschematic view)
S2ΔPhase
WPULSE
PULSE
S1A
Figure 68 Pulse generator InputOutput signals
gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control
For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68
625 Integrated inductor
The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-
1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)
62 physical implementation 98
100 200 300 400 500 600 700 8005
7
9
11
13
15
17
19
21
10
15
20
25
30
35
40
45
Q (Simulated) Energy Reduction ()
Frequency (MHz)
Qfactor
Energyreduction()
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS
lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)
As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited
For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610
626 Top level design
The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)
For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously
62 physical implementation 99
500μ
m
500μm
Figure 610 Planar square spiral inductor designed in ASITIC
shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance
The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG
80 = 16880 = 21Ω However as
the current flows to lower branch levels in the tree the perceivedresistance increases to RTG
16 RTG4 and on the last one to RTG Any
additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512
Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20
4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit
To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of
62 physical implementation 100
Adiabatic drivers
RTG
16
RTG
80
RTG
4
RTG
Level 1Level 2Level 3Level 4
Figure 611 Resonant pulse distribution tree
Branch level Wire resistance
1st 5 middot 16880 = 105mΩ
2nd 5 middot 16816 = 525mΩ
3rd 5 middot 1684 = 21Ω
4th 5 middot 168Ω = 84Ω
Total load 168+8480 + 21
20 + 05255 + 0105 = 252Ω
Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree
63 post-layout simulation and comparison 101
+-
+-Metal2 Metal1
180μ
m
100μm
Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)
the energy-recovery IO circuit in layout view is illustrated inFigure 613
627 CMOS IO circuit
The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance
The top level layout view of the CMOS IO circuit can be seenin Figure 616
63 post-layout simulation and comparison
The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed
63 post-layout simulation and comparison 102
22mm
12mm
GND GNDVBUF1 VDR
GND
VIND
S1A
S2
EN
OER
Figure 613 Energy-recovery IO circuit in layout view
63 post-layout simulation and comparison 103
OCMOS
S1B DataGenerator
TSV
CMOSDriver
CMOSDriver
TSV Buffer
1
80
EN
Figure 614 CMOS IO circuit schematic diagram
GND
VCMOS
DATA
078u
039u
105u
053u
316u
158u
950u
475u
OUT
Figure 615 CMOS driver schematic
12mm
12mm
OCMOS S1B GND VBUF2 VCMOSGND
GND GNDEN VCMOS
Figure 616 CMOS IO circuit in layout view
63 post-layout simulation and comparison 104
Energy dissipation (80-bits cycle)
Inductor 11pJ (Eq514)
Switch M1 092pJ [ ]
P2LCs 098pJ (Simulated)
Adiabatic drivers 315pJ (Eq512)
Total 615pJ
Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz
After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency
Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme
As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the
63 post-layout simulation and comparison 105
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14RES_CLK
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14DATA
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14OER
Time (ns)
V(V
)
Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit
64 experimental measurements 106
AreaEnergy
dissipation(Theory)
Energydissipation
(Post-layout)
CMOS 144mm2922pJ
(115 f Jbit)
(Eq52)
1265pJ
(158 f Jbit)
Energy recovery 264mm2 615pJ
(77 f Jbit)
768pJ
(96 f Jbit)
Difference +83 minus333 minus393
Table 64 Post-layout simulated energy dissipation Area require-ments
theoretically calculated 333 up to 393 in the post-layoutsimulations
64 experimental measurements
The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively
641 Measurement setup
For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620
The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66
642 Results
CMOS IO circuit
In Figure 621 the input signal S1B and the output signal OCMOS
are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency
64 experimental measurements 107
Figure 618 Energy-recovery IO circuit micrograph
Figure 619 CMOS IO circuit micrograph
AABC
DEFF
D
EEBA
EBBA
Figure 620 Measurement setup
64 experimental measurements 108
Test pad Direction Instrument connection
VIND Power Power supply - gt06V
GND Power Power supply - 0V
S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)
EN InputDigital Output (enable outputdata switching)
OER Output Oscilloscope
VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V
VBUF1 Power Power supply (Buffers) - 12V
Table 65 Energy-recovery circuit instrument connections
Test pad Direction Instrument connection
VCMOS PowerPower supply (CMOS drivers) -12V
GND Power Power supply - 0V
VBUF2 Power Power supply (Buffers) - 12V
S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
OCMOS Output Oscilloscope
EN InputDigital Output (enable outputdata switching)
Table 66 CMOS circuit instrument connections
64 experimental measurements 109
S1B
OCMOS
Figure 621 Oscilloscope output for signals S1B and OCMOS
100 200 300 400 500 600 700 800012345678910
Die1 Die2 Simulation
Frequency (MHz)
Powerdissipation(mW)
Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit
that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected
The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations
Energy-recovery IO circuit
Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER
64 experimental measurements 110
100 200 300 400 500 600 700 800024681012141618
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 623 Measuredsimulated energy dissipation for the CMOScircuit
S1A
OER
Figure 624 Oscilloscope output for signals S1A and OER
shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected
The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)
Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate
65 summary and conclusions 111
200 250 300 350 400 450 500 550 600 65002468101214161820
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit
IntegratedInductor
VINDOER
PulseGenerator
S1A
S2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
200pF
Figure 626 Energy-recovery IO circuit with fault hypothesis
properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER
switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation
65 summary and conclusions
In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them
The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-
65 summary and conclusions 112
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050
055
060
065
070
075
080RES_CLK
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500
02
04
06
08
10
12
14OER
Time (ns)
V(V
)
Figure 627 Simulation of circuit with fault hypothesis
65 summary and conclusions 113
200 250 300 350 400 450 500 550 600 6500
5
10
15
20
25
Die1 Die2 Simulation CL=200pF
Frequency (MHz)
Ene
rgy
diss
ipat
ion
(pJ)
Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis
ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS
circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated
CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
7C O N C L U S I O N S
Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density
As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV
devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs
Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-
FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process
Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process
The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work
114
71 main contributions 115
71 main contributions
From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data
The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations
In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design
72 future work 116
approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances
The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
72 future work
The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV
capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]
In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes
1 The transistors should be in a regular grid in the array forsimple processing of the measurement data
72 future work 117
2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy
3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV
4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues
Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility
In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory
AA P P E N D I X A P H O T O S
Figure A1 3D65 300mm wafer (on chuck)
Figure A2 Probe card
118
appendix a photos 119
Figure A3 3D130 200mm wafer (on chuck)
ProbeHead
Microscope
Figure A4 Probe head
BA P P E N D I X B S TAT I S T I C S
mean (micro)
In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as
micro =1n
n
sumi=1
xi
median (m)
The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean
standard deviation (σ)
Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation
A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability
standard deviation of the mean (σmicro )
The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated
Assuming there is a data set X with n values xi of mean micro
120
appendix b statistics 121
variance(X) = σ2X
variance(micro) = variance( 1n sum
ni=1 xi) =
1n2 variance(sumn
i=1 xi) =1n2 sum
ni=1 variance(xi) =
nn2 variance(X) =1n variance(X) =
σ2Xn rarr
σmicro = σXradicn
coefficient of variation
The coefficient of variation is the ratio of the standard deviationσ to the mean micro
CV =σ
micro
It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage
outlier
Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set
kruskalndash wallis [34]
KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal
B I B L I O G R A P H Y
[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac
currentsourcebroadpurposemn=2602
[2] Max-3d URL httpwwwmicromagiccom
[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet
[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet
[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001
[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI
Design Conf pages 171ndash174 2005 doi 101109ICVD200564
[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf
3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581
[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009
[9] C H Bennett Logical reversibility of computation IBM
Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525
[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038
scientificamerican0785-48
[11] E Beyne 3d system integration technologies In Proc Int
VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113
122
bibliography 123
[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs
22nm-Details_Presentationpdf
[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327
[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534
[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices
Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124
[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942
[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc
IEEE Int Interconnect Technology Conf and 2011 Materials for
Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326
[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf
(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823
[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS
rsquo04 volume 2 2004 doi 101109ISCAS20041329255
[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE
Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260
[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-
bibliography 124
asitics for 3-dimensional integrated circuits In Workshop
Notes Design Automation and Test in Europe (DATE) 2009
[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT
2008 pages 98ndash103 2008 doi 101109IDT20084802475
[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec
beScientificReportSR2009HTML1213307html
[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905
[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE
Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329
[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int
Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311
[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508
[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712
[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii
B9780750677295500446
[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455
bibliography 125
[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591
[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test
Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889
[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on
Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10
1145224081224115
[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical
Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779
[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom
pressroompdfkkuhnKuhn_IWCE_invited_textpdf
[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5
(3)183ndash191 1961 doi 101147rd530183
[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827
[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190
[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf
(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839
[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-
nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079
bibliography 126
[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System
level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm
org101145966747966750
[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453
[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf
PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725
[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE
Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793
[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629
[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp
Exhibition (DATE) pages 1689ndash1694 2010
[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573
[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190
[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi
bibliography 127
A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices
Meeting (IEDM) 2010 doi 101109IEDM20105703278
[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727
[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10
1109JPROC1998658762
[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test
Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103
[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc
56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676
[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443
[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88
(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom
sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on
[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303
bibliography 128
[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium
on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145
10576611057669
[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872
[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004
[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-
sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888
[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156
[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc
IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522
[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501
[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667
[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477
[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-
tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609
bibliography 129
[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology
Conf 2006 doi 101109ECTC20061645678
[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power
Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220
[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc
Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786
[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-
Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338
[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088
[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044
[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int
Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763
[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-
tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http
dxdoiorg101007BF01788562 101007BF01788562
bibliography 130
[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-
electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww
sciencedirectcomsciencearticleB6V44-4829XDJ-2N
2f5b2c221dcf240c38cd13b7b3432f913
[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182
[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541
[78] NHE Weste and DF Harris CMOS VLSI design a cir-
cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775
[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite
printsteve_wozniak_interview
[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-
tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716
[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect
Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710
[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746
[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764
[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915
bibliography 131
[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610
- Abstract
- Publications
- Acknowledgments
- Contents
- List of Figures
- List of Tables
- Acronyms
- 1 Introduction
-
- 11 Research goals and contribution
- 12 Thesis outline
-
- 2 Background
-
- 21 3D integrated circuits
-
- 211 3D integration approaches
- 212 TSV interconnects
-
- 22 Energy-recovery logic
-
- 221 Energy-recovery architectures
- 222 Power-clock generators
-
- 23 Full-custom design flow
- 24 Summary
-
- 3 Characterization of TSV capacitance
-
- 31 Introduction
- 32 TSV parasitic capacitance
- 33 Integrated capacitance measurement techniques
- 34 Electrical characterization
-
- 341 Physical implementation
- 342 Measurement setup
- 343 Experimental results
-
- 35 Summary and conclusions
-
- 4 TSV proximity impact on MOSFET performance
-
- 41 Introduction
- 42 TSV-induced thermomechanical stress
- 43 Test structure implementation
-
- 431 MOSFET arrays
- 432 Digital selection logic
-
- 44 Experimental measurements
-
- 441 Data processing methodology
- 442 Measurement setup
- 443 PMOS (long channel)
- 444 PMOS (short channel)
-
- 45 Summary and conclusions
-
- 5 Evaluation of energy-recovery for TSV interconnects
-
- 51 Introduction
- 52 Energy-recovery TSV interconnects
- 53 Analysis
-
- 531 Adiabatic driver
- 532 Inductors parasitic resistance
- 533 Switch M1
- 534 Timing Constraints
-
- 54 Simulations
-
- 541 Comparison to CMOS
-
- 55 Summary and conclusions
-
- 6 Energy-recovery 3D-IC demonstrator
-
- 61 Introduction
- 62 Physical implementation
-
- 621 Adiabatic driver
- 622 Pulse-to-level converter
- 623 Data generator
- 624 Pulse generator
- 625 Integrated inductor
- 626 Top level design
- 627 CMOS IO circuit
-
- 63 Post-layout simulation and comparison
- 64 Experimental measurements
-
- 641 Measurement setup
- 642 Results
-
- 65 Summary and conclusions
-
- 7 Conclusions
-
- 71 Main contributions
- 72 Future work
-
- A Appendix A Photos
- B Appendix B Statistics
- Bibliography
-
P U B L I C AT I O N S
Some ideas and figures have appeared previously in the followingpublications
1 D Perry J Cho S Domae P Asimakopoulos A YakovlevP Marchal G Van der Plas and N Minas An efficientarray structure to characterize the impact of through siliconvias on fet devices In Proc IEEE Int Microelectronic TestStructures (ICMTS) Conf pages 118ndash122 2011 (best paper
award)
2 A Mercha G Van der Plas V Moroz I De Wolf P Asi-
makopoulos N Minas S Domae D Perry M Choi ARedolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron DevicesMeeting (IEDM) 2010
3 A Mercha A Redolfi M Stucchi N Minas J Van OlmenS Thangaraju D Velenis S Domae Y Yang G Katti RLa- bie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D Perry G Vander Plas J H Cho P Marchal Y Travaly E Beyne S Biese-mans and B Swinnen Impact of thinning and throughsilicon via proximity on high-k metal gate first cmos per-formance In Proc Symp VLSI Technology (VLSIT) pages109ndash110 2010
4 P Asimakopoulos G Van der Plas A Yakovlev and PMarchal Evaluation of energy-recovering interconnects forlow-power 3d stacked ics In Proc IEEE Int Conf 3D Sys-tem Integration 3DIC 2009 pages 1ndash5 2009
5 P Asimakopoulos and A Yakovlev An AdiabaticPower-Supply Controller For Asynchronous Logic Circuits20th UK Asynchronous Forum pages 1ndash4 2008
iv
In the end I hope therersquos a little note somewhere
that says I designed a good computer
mdash Steve Wozniak [79]
A C K N O W L E D G E M E N T S
First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts
As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm
Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec
Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools
v
C O N T E N T S
1 Introduction 1
11 Research goals and contribution 2
12 Thesis outline 3
2 Background 5
21 3D integrated circuits 5
211 3D integration approaches 6
212 TSV interconnects 9
22 Energy-recovery logic 13
221 Energy-recovery architectures 17
222 Power-clock generators 18
23 Full-custom design flow 20
24 Summary 21
3 Characterization of TSV capacitance 23
31 Introduction 23
32 TSV parasitic capacitance 23
33 Integrated capacitance measurement techniques 24
34 Electrical characterization 27
341 Physical implementation 28
342 Measurement setup 36
343 Experimental results 37
35 Summary and conclusions 44
4 TSV proximity impact on MOSFET performance 46
41 Introduction 46
42 TSV-induced thermomechanical stress 46
43 Test structure implementation 48
431 MOSFET arrays 51
432 Digital selection logic 55
44 Experimental measurements 61
441 Data processing methodology 61
442 Measurement setup 62
443 PMOS (long channel) 64
444 PMOS (short channel) 71
45 Summary and conclusions 73
5 Evaluation of energy-recovery for TSV interconnects 75
51 Introduction 75
52 Energy-recovery TSV interconnects 75
53 Analysis 79
531 Adiabatic driver 80
532 Inductorrsquos parasitic resistance 82
533 Switch M1 82
534 Timing Constraints 83
54 Simulations 83
541 Comparison to CMOS 86
vi
contents vii
55 Summary and conclusions 91
6 Energy-recovery 3D-IC demonstrator 92
61 Introduction 92
62 Physical implementation 92
621 Adiabatic driver 94
622 Pulse-to-level converter 95
623 Data generator 95
624 Pulse generator 95
625 Integrated inductor 97
626 Top level design 98
627 CMOS IO circuit 101
63 Post-layout simulation and comparison 101
64 Experimental measurements 106
641 Measurement setup 106
642 Results 106
65 Summary and conclusions 111
7 Conclusions 114
71 Main contributions 115
72 Future work 116
a Appendix A Photos 118
b Appendix B Statistics 120
bibliography 122
L I S T O F F I G U R E S
Figure 11 Scaling trend of logic and interconnect delay[30] 2
Figure 21 Comparison of 2D3D Integration 5
Figure 22 System-in-a-package 6
Figure 23 Package-on-package 6
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7
Figure 25 3D wafer-level-packaging 7
Figure 26 3D-stacked IC 8
Figure 27 TSV process steps 9
Figure 28 Before thinningafter thinning 10
Figure 29 TSV stacking 10
Figure 210 Conventional AND gate 14
Figure 211 CMOS switching 16
Figure 212 Adiabatic switching 16
Figure 213 Basic 2N-2P adiabatic circuit 17
Figure 214 2N-2P timing 17
Figure 215 Adiabatic driver 18
Figure 216 Stepwise charging generator 19
Figure 217 1N power clock generator 20
Figure 218 Design flow 22
Figure 219 Inverter in Virtuoso schematic 22
Figure 220 Inverter in Virtuoso layout 22
Figure 31 Isolated 2D wire capacitance 24
Figure 32 TSV parasitic capacitance 25
Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25
Figure 34 CBCM pseudo-inverters 27
Figure 35 Current as a function of frequency in CBCM 28
Figure 36 CBCM pseudo-inverter pair in layout 29
Figure 37 PRC1 test pad module (layout) 30
Figure 38 PRC1 test pad module (micrograph) 30
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34
Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35
viii
List of Figures ix
Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35
Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36
Figure 316 Measurement setup 36
Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39
Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40
Figure 319 TSV capacitance measurement results onwafer D11 40
Figure 320 TSV capacitance measurement results onwafer D18 41
Figure 321 Simulated oxide liner thickness variation 42
Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42
Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44
Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44
Figure 41 TSV thermomechanical stress simulation [40] 47
Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48
Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49
Figure 44 PTP1 test pad module (layout) 50
Figure 45 PTP1 test pad module (micrograph) 50
Figure 46 MOSFET dimensions and rotations 51
Figure 47 1 TSV MOSFET array configuration 52
Figure 48 4 diagonal TSVs MOSFET array configuration52
Figure 49 4 lateral TSVs MOSFET array configuration 53
Figure 410 9 TSVs MOSFET array configuration 53
Figure 411 MOSFET array detail 55
Figure 412 Digital selection logic 56
Figure 413 MOSFET array with digital selection logic 57
Figure 414 4-bit Gate-enableArray-enable decoder 58
Figure 415 Transmission gate for MOSFET Gate terminals59
Figure 416 Transmission gates for MOSFET Drain-Source terminals 59
Figure 417 2-bit Drain-select decoder 60
Figure 418 Processing of measurement results 62
Figure 419 Interpolation of measurement results 63
Figure 420 Measurement setup 63
Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65
List of Figures x
Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66
Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66
Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67
Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67
Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68
Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69
Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69
Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70
Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70
Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71
Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71
Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72
Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72
Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76
Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77
Figure 53 Adiabatic driver 78
Figure 54 A simple pulse-to-level converter (P2LC) 78
Figure 55 Resonant pulse generator with dual-rail dataencoding 79
Figure 56 Energy-recovery scheme for TSV interconnects79
Figure 57 Energy-recovery timing constraints 83
Figure 58 Test case for the evaluation of the theoret-ical model 84
Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85
Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85
Figure 511 Test case for comparison of energy-recoveryto CMOS 86
Figure 512 Improved P2LC design 87
List of Figures xi
Figure 513 Simulated energy dissipation of improvedP2LC design 87
Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88
Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88
Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89
Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =
80 f F f = 500MHz) 89
Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90
Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90
Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93
Figure 62 Energy-recovery IO circuit schematic dia-gram 94
Figure 63 Adiabatic driver (layoutschematic view) 94
Figure 64 P2LC (layoutschematic view) 95
Figure 65 Data generator (layoutschematic view) 96
Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96
Figure 67 Pulse generator (layoutschematic view) 97
Figure 68 Pulse generator InputOutput signals 97
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98
Figure 610 Planar square spiral inductor designed inASITIC 99
Figure 611 Resonant pulse distribution tree 100
Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101
Figure 613 Energy-recovery IO circuit in layout view 102
Figure 614 CMOS IO circuit schematic diagram 103
Figure 615 CMOS driver schematic 103
Figure 616 CMOS IO circuit in layout view 103
Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105
Figure 618 Energy-recovery IO circuit micrograph 107
List of Figures xii
Figure 619 CMOS IO circuit micrograph 107
Figure 620 Measurement setup 107
Figure 621 Oscilloscope output for signals S1B and OC-MOS 109
Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109
Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110
Figure 624 Oscilloscope output for signals S1A and OER110
Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111
Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111
Figure 627 Simulation of circuit with fault hypothesis 112
Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113
Figure A1 3D65 300mm wafer (on chuck) 118
Figure A2 Probe card 118
Figure A3 3D130 200mm wafer (on chuck) 119
Figure A4 Probe head 119
L I S T O F TA B L E S
Table 21 3D interconnect technologies comparison 8
Table 22 Technology parameters for Imec 3D130 process11
Table 23 Technology parameters for Imec 3D65 process11
Table 24 Comparison energy-recoveryconventionalCMOS 16
Table 31 Test structures on the PRC1 module 31
Table 32 PRC1 test pad instrument connections 37
Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38
Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39
Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43
Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45
Table 41 Coefficients of thermal expansion 47
Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54
Table 43 MOSFET array configurations selected forexperimental evaluation 61
Table 44 PTP1PTP2 test pad module instrumentconnections 64
Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73
Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84
Table 61 Energy-recovery IO circuit parameters 93
Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100
Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104
Table 64 Post-layout simulated energy dissipation Area requirements 106
Table 65 Energy-recovery circuit instrument connec-tions 108
Table 66 CMOS circuit instrument connections 108
xiii
A C R O N Y M S
2D two-dimentional
2N-2P 2 nMOS2 pMOS adiabatic logic
3D-IC three-dimentional integrated circuit
3D-SIC 3D stacked-IC
3D-SIP 3D system-in-package
3D three-dimentional
3D-WLP 3D wafer-level-packaging
BCB benzocyclobutene
BEOL back-end-of-line
CAL clocked adiabatic logic
CBCM charge-based capacitance measurement
CMOS complementary metal-oxide-semiconductor
CMP chemical-mechanical planarization
CTE coefficient of thermal expansion
Cu copper
CVD chemical vapor deposition
D2D die-to-die
D2W die-to-wafer
DCVSL differential cascode voltage switch logic
DRC design rule check
DUT device under test
ECRL efficient charge recovery logic
EDA electronic design automation
FEM finite element method
FEOL front-end-of-line
GPIB general purpose interface bus
IC integrated circuit
xiv
acronyms xv
IO inputoutput
KOZ keep-out-zone
LC inductance-capacitance
LVS layout versus schematic
MEMS microelectromechanical systems
MIM metal-insulator-metal
MOSFET metal-oxide-semiconductor field-effect transistor
MOS metal-oxide-semiconductor
MPU microprocessor unit
MuGFET multiple gate field-effect transistor
NMOS n-channel MOSFET
P2LC pulse-to-level converter
PAL pass-transistor adiabatic logic
PCB printed circuit board
PCG power-clock generator
PMD pre-metal dielectric
PMOS p-channel MOSFET
PVT process voltage temperature
QSERL quasi-static energy recovery logic
RC resistance-capacitance
RDL re-distribution layer
RLC resistance-inductance-capacitance
RMS root mean square
SIP system-in-a-package
Si silicon
SMU source measurement unit
SoC system-on-chip
TaN tantalum nitride
TSV through-silicon via
W tungsten
acronyms xvi
W2W wafer-to-wafer
WLP wafer-level-packaging
1I N T R O D U C T I O N
Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965
chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]
As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects
In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)
The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and
1 International Technology Roadmap for Semiconductors
1
11 research goals and contribution 2
Figure 11 Scaling trend of logic and interconnect delay [30]
innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]
Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D
interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density
11 research goals and contribution
As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-
ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]
The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future
12 thesis outline 3
3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection
The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution
The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs
are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process
The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements
12 thesis outline
This thesis is organised into seven Chapters and two Appendixesas follows
2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)
12 thesis outline 4
Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described
Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods
Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process
Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design
Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements
Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work
Appendix A includes photos used as reference in various partsof this work
Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data
2B A C K G R O U N D
The work presented in this thesis contributes in the fields of 3D
integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters
21 3d integrated circuits
CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package
There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy
Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of
ABCDEFBA ABCDEFBA
Figure 21 Comparison of 2D3D Integration
5
21 3d integrated circuits 6
ABCDEFBC
E
Figure 22 System-in-a-package
ABCDEFBC
E
Figure 23 Package-on-package
a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities
The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor
211 3D integration approaches
3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure
bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages
bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)
bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion
The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)
3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps
21 3d integrated circuits 7
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]
A
BCDEAFE
FFDC
CFE
Figure 25 3D wafer-level-packaging
take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]
3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]
The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-
SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost
A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21
21 3d integrated circuits 8
ABCD
EFFDB
F
Figure 26 3D-stacked IC
3D interconnect
technologyAdvantages Disadvantages
3D system-in-package (3D-SIP)
Simple existingtechnology bestheterogeneous
integration
Performance(speed power) low
densitymanufacturing costin high-quantities
3D wafer-level-packaging (3D-
WLP)
Existingtechnology good
compromisebetween
performance-manufacturing
cost
Averageperformance and
density
3D stacked-IC (3D-SIC)
Performance(speed power)
high densitymanufacturing costin high-quantities
Manufacturing costin low-quantities
not ready formanufacturing
Table 21 3D interconnect technologies comparison
21 3d integrated circuits 9
ABBCDE
FBACACDA
ACCD F
Figure 27 TSV process steps
212 TSV interconnects
Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection
In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie
Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively
As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]
21 3d integrated circuits 10
AAAB
CAAAB
ABC
DEF
Figure 28 Before thinningafter thinning
ABC ABC
DE DE
F
DEE
B
CBA
Figure 29 TSV stacking
21 3d integrated circuits 11
Minimum devicechannel length (nm)
130
Supply voltage (V) 12
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 22
TSV resistance (mΩ) ~20
TSV capacitance( f F)
~40
Wafer diameter(mm)
200
Table 22 Technology paramet-ers for Imec 3D130
process
Minimum devicechannel length (nm)
70
Supply voltage (V) 10
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 40
TSV resistance (mΩ) ~40
TSV capacitance( f F)
~80
Wafer diameter(mm)
300
Table 23 Technology paramet-ers for Imec 3D65 pro-cess
Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing
Design tools
Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs
Thermomechanical stress
TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur
21 3d integrated circuits 12
The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance
Thermal analysis
The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink
Electrical characteristics
The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]
Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV
structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process
22 energy-recovery logic 13
Testing
Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems
22 energy-recovery logic
TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology
1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections
2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them
Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that
22 energy-recovery logic 14
Figure 210 Conventional AND gate
were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6
Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded
Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output
Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to
22 energy-recovery logic 15
perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]
In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]
The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =
CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]
E = QVDD = CV2DD (21)
From the total transferred energy only half ( 12 CV2
DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C
The adiabatic charging of an equivalent node capacitance C
is modelled in Figure 212 In contrast to the conventional CMOS
circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to
EAdiabatic prop
(
RC
T
)
CV2DD (22)
The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters
22 energy-recovery logic 16
A
B
CC
D
E
D
Figure 211 CMOS switching
AA
B
C
B
Figure 212 Adiabatic switching
Conventional CMOS Energy-recovery CMOS
2 states True False 3 states True False Off
Energy of information bitdissipated as heat on the
pull-down network
Energy of information bitrecovered by the
power-supply
High speed Lowmedium speed
High energy dissipation Very low energy dissipation
Area efficient Some area overhead
Table 24 Comparison energy-recoveryconventional CMOS
Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]
A comparison between energy-recovery and conventional CMOSis summarised in Table 24
22 energy-recovery logic 17
ABCABC
Figure 213 Basic 2N-2P adiabaticcircuit
ABC
ABC
DEF
EE
DEF
Figure 214 2N-2P timing
221 Energy-recovery architectures
Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency
A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P
circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern
The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel
22 energy-recovery logic 18
Figure 215 Adiabatic driver
MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network
The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]
The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle
Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]
222 Power-clock generators
Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and
22 energy-recovery logic 19
A
A
B
C
C
Figure 216 Stepwise charging generator
recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators
Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow
Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at
f =1
2πradic
LCLOAD
(23)
As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is
23 full-custom design flow 20
A
BCD
E
DFF
CD
Figure 217 1N power clock generator
partially recovered and stored in the inductor for later use RLC
oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip
23 full-custom design flow
All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal
In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available
The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case
1 Cadence Design Systems Inc (httpwwwcadencecom)
24 summary 21
the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step
Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics
24 summary
In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process
2 Mentor Graphics Inc (httpwwwmentorcom)
24 summary 22
ABCDEA
FCCAAE
ABCDEAEC
DE
E
F
CEA
EEAEA
EDE
CAC
E
EC
CAC
Figure 218 Design flow
Figure 219 Inverter in Virtuososchematic
Figure 220 Inverter in Virtuosolayout
3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E
31 introduction
In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology
In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]
From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system
Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated
32 tsv parasitic capacitance
A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric
1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments
23
33 integrated capacitance measurement techniques 24
ABC
Figure 31 Isolated 2D wire capacitance
material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is
C =εox
hwl (31)
In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material
In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)
The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)
33 integrated capacitance measurement techniques
Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV
33 integrated capacitance measurement techniques 25
ABCDE
F
FD
F
F
FD
Figure 32 TSV parasitic capacitance
ABCDE FACDE
ECDE
BCFACDE
FCCDED DBCDE
D
DAB
Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor
33 integrated capacitance measurement techniques 26
characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories
1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]
2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]
The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer
On-chip techniques can achieve better accuracy when the DUT
capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]
In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows
CDUT =Im minus Ire f
VDD middot f(32)
The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of
34 electrical characterization 27
AB
AC
DE
FE
DE
FE
AB
AC
B
Figure 34 CBCM pseudo-inverters
Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot
The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET
capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz
should be adequate to avoid the MOS behaviour of the TSV
34 electrical characterization
A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented
34 electrical characterization 28
AB
CADE
F
AB
E
CAD
B
Figure 35 Current as a function of frequency in CBCM
on the same chip that went beyond the core objectives of thiswork
The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed
341 Physical implementation
In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result
Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics
34 electrical characterization 29
Poly dummy fills
Metal2Metal1PolysiliconActive
1μm 5μm
Figure 36 CBCM pseudo-inverter pair in layout
Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38
The CBCM pseudo-inverters were implemented on either TOP
or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below
Structure rsquoArsquo Single TSV capacitance (TOP die)
The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to
34 electrical characterization 30
DTOP DBTM
B10 A10
BREF AREF
B1 A1
R1 R2
SET VDD
RESET GND
C15 C10
C50 C20
SET VDD
ERC SET
EREF EC Metal2 (TOP)Metal1 (TOP)
Metal1 (BTM)Metal2 (BTM)
Figure 37 PRC1 test pad module(layout)
Figure 38 PRC1 test pad module(micrograph)
34 electrical characterization 31
Structure A B
TSVs - 1 2x5 - 1 2x5
Inverterlocation
TOP TOP TOP BTM BTM BTM
TSVconnection
TOP TOP TOP BTM BTM BTM
Multi-TSVconnection
- - TOP - - BTM
Multi-TSVpitch (microm)
- - 15 - - 15
Pad name AREF A1 A10 BREF B1 B10
Structure C D
TSVs 2x2 2x2 2x2 2x2 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP TOP
TSVconnection
TOP TOP TOP TOP TOP TOP
Multi-TSVconnection
TOP TOP TOP TOP TOP BTM
Multi-TSVpitch (microm)
10 15 20 50 20 20
Pad name C10 C15 C20 C50 DTOP DBTM
Structure R E
TSVs 1 1 - 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP
TSVconnection
TOP TOP - TOP TOP
Multi-TSVconnection
- - - TOP BTM
Multi-TSVpitch (microm)
- - - 20 20
Pad name R1 R2 EREF EC ERC
Table 31 Test structures on the PRC1 module
34 electrical characterization 32
Pseudo-inverters
2x5 TSVs
TSV
Reference
Metal2 (TOP)Metal1 (TOP)
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)
the single TSV DUT the structure also contains 10 parallel TSVs
arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude
Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)
The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side
Structure rsquoCrsquo Variable pitch TSV capacitance
The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the
34 electrical characterization 33
2x5 TSVs
Reference
TSV
Pseudo-inverters
Metal2 (BTM)Metal1 (BTM)
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)
literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process
The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative
Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-
TOM die
The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking
34 electrical characterization 34
2x2 TSVs - 10μm
2x2 TSVs - 15μm
2x2 TSVs - 20μm
2x2 TSVs - 50μm
Metal2 (TOP)Metal1 (TOP)
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die
34 electrical characterization 35
Metal2 (TOP)Metal1 (TOP)
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Driver
Reference
Driver
Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)
Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy
Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability
Structure rsquoErsquo TSV capacitance effect on signal propagation delay
Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult
34 electrical characterization 36
ABC
D
ECBFC
ABC
Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)
ABC
DEF
CD
A
CA
A
Figure 316 Measurement setup
342 Measurement setup
The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316
The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-
2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup
34 electrical characterization 37
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
SET InputPulse generator
(Non-overlapping with RESET)
RESET InputPulse generator
(Non-overlapping with SET)
A[REF 1 10] InputSMU (Inverter source - 1V
supply)
B[REF 1 10] InputSMU (Inverter source - 1V
supply)
C[10 15 2050]
InputSMU (Inverter source - 1V
supply)
D[TOP BTM] InputSMU (Inverter source - 1V
supply)
R[1 2] InputSMU (Inverter source - 1V
supply)
E[REF C RC] Output Oscilloscope (high-impedance)
Table 32 PRC1 test pad instrument connections
sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32
343 Experimental results
There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods
When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy
34 electrical characterization 38
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 454
Mean capacitance 709 f F
Standard deviation (σ) 19 f F
Coefficient of variation (3σmicro) 80
Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo
The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability
Die to die (D2D) variability
The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)
On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid
On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691
34 electrical characterization 39
-60 -40 -20 00 20 40 60 80
σ-σ
706
Deviation from mean (fF)
0
005
01
015
02
025
Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 414
Mean capacitance 895 f F
Standard deviation (σ) 22 f F
Coefficient of variation (3σmicro) 74
Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo
of the experimental data within plusmnσ of the mean capacitancevalue
Correlation between TSV capacitance and oxide liner thickness vari-
ation
The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers
Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication
34 electrical characterization 40
σ-σ
691
Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70
0
002
004
006
008
01
012
014
016
018
02
Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function
075
080
085
090
095
100
105
110
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
075
080
085
090
095
100
105
Wafer area
Figure 319 TSV capacitance measurement results on wafer D11
34 electrical characterization 41
086088090092094096098100102104106108
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
086
088
090
092
094
096
098
100
102
104
106
108
Wafer area
Figure 320 TSV capacitance measurement results on wafer D18
tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo
Wafer to wafer (W2W) variability
Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters
34 electrical characterization 42
109
100
095Nor
mal
ized
oxi
de th
ickn
ess
-5-10 -15 -20 -25 -30
510
1520
2530
0Die Y
Die X
Figure 321 Simulated oxide liner thickness variation
60 65 70 75 80 85 90 95 1000
005
01
015
02
025D11 D18
plusmnσ
plusmnσ
Capacitance (fF)
Figure 322 TSV capacitance W2W variability - Probability density func-tions
34 electrical characterization 43
Structure rsquoCrsquo (Fig 311)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 424
Experiment C10 C50
Number of TSVs 4 4
Mean capacitance 4073 f F 4125 f F
Standard deviation (σ) 84 f F 74 f F
Population within σ 695 653
Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo
Variable pitch TSV capacitance
The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm
were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires
As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50
are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability
To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]
Measurement accuracy of CBCM
The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we
3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)
35 summary and conclusions 44
376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320
001
002
003
004
005
006C10 C50
Capacitance (fF)
plusmnσC10
plusmnσC50
695 653
Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function
-80 -60 -40 -20 00 20 40 60 80 1000
002
004
006
008
01
012
014
016
018σ-σ
688
Deviation from mean (fF)
Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function
observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs
connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance
35 summary and conclusions
In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement
35 summary and conclusions 45
Method LCR CBCM
Total dies 472 472
Processed dies (excluding outliers) 393 414
Mean capacitance 820 f F 895 f F
Standard deviation (σ) 23 f F 22 f F
Coefficient of variation (3σmicro) 86 74
Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo
results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]
The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration
The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV
parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM
In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process
4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E
41 introduction
The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV
filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance
Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models
In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]
42 tsv-induced thermomechanical stress
Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or
46
42 tsv-induced thermomechanical stress 47
Material CTE at 20 ˚C (10minus6K)
Si 3
Cu 17
W 45
Table 41 Coefficients of thermal expansion
Figure 41 TSV thermomechanical stress simulation [40]
polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate
Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction
It is well known that stress can affect carrier mobility in MOS-
FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance
Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type
43 test structure implementation 48
Figure 42 TSV thermomechanical stress interaction between two TSVs[40]
proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative
The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections
43 test structure implementation
TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible
43 test structure implementation 49
Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]
The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET
devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types
Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP
die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45
43 test structure implementation 50
A0
B0
C0
D0
E0
F0
G0
H0
I0
J0
A90
B90
E90
F90
G90
J90
DS3 DF3
DF2 DS0
DS2 DF0
DF1 DS1
SS SF
DEC6 VDD
DEC7 GND
DEC8 VG
DEC4 DEC9
DEC5 GND
DEC2 DEC3
DEC0 DEC1
Arraydecoder
Draindecoder
Figure 44 PTP1 test pad mod-ule (layout)
Figure 45 PTP1 test pad mod-ule (micrograph)
43 test structure implementation 51
Metal2Metal1PolyActive
TSV
MOSFETs
07μm007μm 0deg
07μm007μm 90deg
05μm05μm 90deg
05μm05μm 0deg
Figure 46 MOSFET dimensions and rotations
431 MOSFET arrays
Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)
The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs
lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44
The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)
Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution
43 test structure implementation 52
TSV
MOSFETs Substrate taps
Figure 47 1 TSV MOSFET array configuration
MOSFETs Substrate taps
TSV TSV
TSV TSV
Figure 48 4 diagonal TSVs MOSFET array configuration
43 test structure implementation 53
MOSFETs Substrate taps
TSV
TSV TSV
TSV
Figure 49 4 lateral TSVs MOSFET array configuration
MOSFETs Substrate taps
TSV TSV TSV
TSV TSV TSV
TSV TSV TSV
Figure 410 9 TSVs MOSFET array configuration
43 test structure implementation 54
Configuration RotationFET
width(microm)
FETlength(microm)
Name
0 0deg 07 007 A0
1 0deg 07 007 B0
4diagonal 0deg 07 007 C0
4lateral 0deg 07 007 D0
9 0deg 07 007 E0
0 0deg 05 05 F0
1 0deg 05 05 G0
4diagonal 0deg 05 05 H0
4lateral 0deg 05 05 I0
9 0deg 05 05 J0
0 90deg 07 007 A90
1 90deg 07 007 B90
9 90deg 07 007 E90
0 90deg 05 05 F90
1 90deg 05 05 G90
9 90deg 05 05 J90
Table 42 MOSFET array configurations for test pad modulesPTP1PTP2
43 test structure implementation 55
2μm 1μm
Metal2Metal1PolyActive
FET
DUMMY
TAP
TSV
Figure 411 MOSFET array detail
432 Digital selection logic
Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement
The procedure for selecting a single MOSFET device for meas-urement was as follows
bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled
bull MOSFET Gate terminal (G0 to G15) was connected to IO
pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates
bull MOSFET Drain terminal (D0 to D15) was connected to IO
pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder
43 test structure implementation 56
D0
D15
DF[0-3]
DS[0-3]
Drain-select decoder (2-bit)
SSF
SS
Array-select decoder (4-bit)
VG
Gate-select decoder (4-bit)
G0 G15
MOSFET Array
Figure 412 Digital selection logic
and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit
Drain-select decoder was adequate for accessing all 16Drain terminals in the array
When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)
A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating
43 test structure implementation 57
Gatedecoder
Metal2Metal1PolyActive
Transmissiongates (DS)
Transmission gates (G)
MOSFETArray
Figure 413 MOSFET array with digital selection logic
43 test structure implementation 58
BLK
A[0] A[1] A[2] A[3]
DEC[0-15]
Figure 414 4-bit Gate-enableArray-enable decoder
43 test structure implementation 59
DEC
VGGVDD
Transmission gate
Pull-downtransistor
Figure 415 Transmission gate for MOSFET Gate terminals
Transmission gates
FORCE
SENSE
DRAINSOURCE
EN
VDD
Figure 416 Transmission gates for MOSFET DrainSource terminals
43 test structure implementation 60
A[0] A[1]
DEC[0-3]
Figure 417 2-bit Drain-select decoder
44 experimental measurements 61
Configuration Type Name Wafer
1 No-TSVPMOS
(05microm times 05microm)F0 D08
2 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D08
3 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D20
4 1-TSV 0ordm 60 ordmCPMOS
(05microm times 05microm)G0 D08
54-lateral TSV 0ordm 25
ordmCPMOS
(05microm times 05microm)I0 D08
6 1-TSV 90ordm 25 ordmCPMOS
(05microm times 05microm)G90 D08
7 1-TSV 0ordm 25 ordmCPMOS (07microm times
007microm)B0 D08
Table 43 MOSFET array configurations selected for experimental eval-uation
44 experimental measurements
Test modules PTP1 and PTP2 implementing the presented MOS-
FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43
441 Data processing methodology
For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress
Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing
44 experimental measurements 62
ABCDEFBBC
FCDDFBCFFC
FCB
A
BAACADE
CECBBC
CFFCDE
B$BCFFBDBCBF
amp(BFC)B
BC
FCDDFA
CAF
Figure 418 Processing of measurement results
the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (
radic98) The described procedure for
processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not
placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)
442 Measurement setup
For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO
signal generation and measurement was provided by standard
44 experimental measurements 63
ABCDECDBF
ABECBF
AB
AB
Figure 419 Interpolation of measurement results
laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420
The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44
ABCCDE
F
AA
A
DBCB
Figure 420 Measurement setup
44 experimental measurements 64
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
DEC[0-9] InputDigital Output
(GATEDRAINARRAY SelectDecoders Input)
DF[0-3] Input SMU (Drain Force terminal)
DS[0-3] Output SMU (Drain Sense terminal)
SF Input SMU (Source Force terminal)
SS Output SMU (Source Sense terminal)
VG Input Digital Output (Gate voltage)
Table 44 PTP1PTP2 test pad module instrument connections
443 PMOS (long channel)
No-TSV configuration
This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance
As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure
1 TSV configuration
In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo
44 experimental measurements 65
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
096
097
098
099
100
101
Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)
and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements
To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm
The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph
44 experimental measurements 66
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
105
110
115
120TSV
Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)
44 experimental measurements 67
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
13
TSV
Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)
we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV
even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after
increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
09
1
11
12
13
14
Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)
44 experimental measurements 68
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115TSV
Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)
at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)
4 lateral TSV configuration
In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430
respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)
90˚ rotation TSV configuration
In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has
44 experimental measurements 69
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
085
090
095
100
105
110
115
TSV
TSV TSV
TSV
Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)
44 experimental measurements 70
2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309
095
1
105
11
115
12
X16
Distance (μm)
Normalized
Ion
TSV TSV
Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)
3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098
1102
Y14
Distance (μm)
Normalized
Ion
TSV TSV
Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)
44 experimental measurements 71
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115
120TSV
Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)
been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm
444 PMOS (short channel)
In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
08509
0951
10511
11512
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)
44 experimental measurements 72
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
094
096
098
100
102
104
106
108
110
TSV
Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)
measurements for X = 16 and Y = 14 in Figure 434 At at 1microm
distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)
45 summary and conclusions 73
Size
Waf
er
Con
figu
rati
on
Rot
atio
n
Eff
ect
1micro
mla
t
Eff
ect
1micro
mlo
ng
KO
Z
L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm
L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm
L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm
L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm
L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm
S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm
Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices
45 summary and conclusions
In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET
device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors
The structure was implemented in Imecrsquos experimental 65nm
3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations
In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits
Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short
45 summary and conclusions 74
channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress
Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure
1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex
2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy
3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away
4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic
The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated
In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs
5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S
51 introduction
TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2
DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs
increases linearly with the number of tiers and interconnections(Figure 51)
The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm
node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)
In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]
52 energy-recovery tsv interconnects
As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-
75
52 energy-recovery tsv interconnects 76
1XTSV
Tier1
TSV
Tier0LOGIC 0
LOGIC 0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n Tier1
Tier0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n
Tier(m-1)
Tier(m)
mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-
tion density
recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]
EER prop
(
RCL
T
)
CLV2DD (51)
In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]
ECMOS =12
CLV2DD (52)
Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits
EER
ECMOSprop
2RCL
T(53)
From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small
In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC
52 energy-recovery tsv interconnects 77
Tier2Tier1TSV
LOGIC 1
LOGIC 2
driver
TSV
LOGIC 1
LOGIC 2
CMOS
driver
Convert
Energy-recovery
PCG
Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery
can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV
interconnectsThe differences between a typical CMOS driver in a TSV-based
3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form
Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values
52 energy-recovery tsv interconnects 78
IN IN
ININ
OUT OUT
IN
IN
OUT
OUT
Figure 53 Adiabatic driver
Figure 54 A simple pulse-to-level converter (P2LC)
An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded
The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)
For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with
53 analysis 79
VDD2
Rind
RTG
AdiabaticDriver
CL
M1
T=1f
L ind
RTG
CL
Figure 55 Resonant pulse generator with dual-rail data encoding
Tier2
Tier1
f-fPULSE
VDD2CLK IN
Delay1CLK1
CLK0
Delay2CLK2OUT1
Data In
Data Out
CLKRES
TSV TSV TSV
CLKBufferAdiabatic
Driver
P2LC
f-f
Data1
M1
Figure 56 Energy-recovery scheme for TSV interconnects
a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at
f =1
2πradic
LindCL(54)
A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56
53 analysis
The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy
53 analysis 80
losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme
The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be
EER = EAD + Eind + EM1 + EP2L (55)
Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs
architecture used and could be easily extracted from simulationdata as will be shown later in this chapter
531 Adiabatic driver
There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS
dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances
The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle
ETG = 2ξ
(
RTGCL12 T
)
CLV2DD =
π2
2(RTGCL f )CLV2
DD (56)
The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient
53 analysis 81
Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV
capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be
CL = CTSV + Cn + 6CD (57)
The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience
κTG = RTGCn rArr RTG =κTG
Cn(58)
Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2
DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver
EAD = CnV2DD +
π2
2κTG
Cnf [CTSV + Cn + 6CD]
2 V2DD (59)
The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2
2 κTG f to a variable (y = π2
2 κTG f ) Equation 59then becomes
EAD = D middot CnV2DD +
y
Cn[CTSV + (6b + 1)Cn]
2 V2DD
=
[
Cn
(
D
y+ (6b + 1)2
)
+1
CnC2
TSV
]
middot yV2DD
+(12b + 2)CTSV middot yV2DD (510)
Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when
53 analysis 82
they become equal Solving equation Cn
(
Dy + (6b + 1)2
)
= 1Cn
C2TSV
for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV
Cn(opt) =
radic
[D
y+ (6b + 1)2]minus1CTSV (511)
Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver
EAD(min) =
[radic
D
(6b + 1)2y+ 1 + 1
]
middot (12b+ 2)yCTSVV2DD (512)
532 Inductorrsquos parasitic resistance
The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as
Qind =1
Rind
radic
Lind
CLrArr Rind =
1QindCL2π f
(513)
Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513
Eind =π2
2(RindCL f )CLV2
DD =π
4CL
QindV2
DD (514)
533 Switch M1
Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1
Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as
EM1(min) = 2IM1(rms)VGM1
radic
mκM1
f(515)
54 simulations 83
Delay2
CLK1
CLK2Delay1
CLKW
PULSE
CLK0T
PD
OUT1
PULSE
P2L
RES
CLK
Figure 57 Energy-recovery timing constraints
534 Timing Constraints
The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57
The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)
54 simulations
To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the
54 simulations 84
M1IN IN
IN
OUT OUT
IN
Figure 58 Test case for the evaluation of the theoretical model
Q250MHz 500MHz 750MHz
T S e () T S e () T S e ()
6 360 409 -120 476 516 -78 578 613 -57
8 308 347 -114 417 449 -70 515 543 -52
10 276 312 -115 382 411 -70 477 498 -43
12 255 289 -119 358 386 -71 451 472 -44
Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)
adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58
The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51
The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations
In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency
54 simulations 85
5 6 7 8 9 10 11 12 130
10
20
30
40
50
60
70
T250 S250 T500 S500 T750 S750
Q factor
Energydissipation(fF)
Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)
5 6 7 8 9 10 11 12 13
-14
-12
-10
-8
-6
-4
-2
0
250MHz 500MHz 750MHz
Q factor
Error()
Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator
The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation
54 simulations 86
M1
TSV
TSV
TSV
Data in OutCMOS
OutER
Energy-recovery
CMOS
P2LC
Adiabaticdriver
Figure 511 Test case for comparison of energy-recovery to CMOS
541 Comparison to CMOS
When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F
for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is
described by Equation 52 Appending the data switching activity(D)
ECMOS = D middot 12
CLV2DD (516)
Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS
circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison
Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range
54 simulations 87
Figure 512 Improved P2LC design
0 100 200 300 400 500 600 700 800 9000
2
4
6
8
10
12
14
Frequency (MHz)
Energy
dissipation(fJ)
Figure 513 Simulated energy dissipation of improved P2LC design
Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current
The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor
54 simulations 88
4 6 8 10 12 14 16 18 20 22-20
-10
0
10
20
30
40
50
60
800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor
Energyred uction()
Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)
4 6 8 10 12 14 16 18 20 220050
100150200250300350400450
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)
values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors
In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies
In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q
factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total
In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS
circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in
54 simulations 89
4 6 8 10 12 14 16 18 20 2200
100
200
300
400
500
600
700
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)
20 40 60 80 100 120 140 160 180 200 220-100
00
100
200
300
400
500
Q6 Q10 Q14 Q18
TSV capacitance (fF)
Energy
red u
ction(
)
Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)
Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value
Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz
Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm
node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS
54 simulations 90
4 6 8 10 12 14 16 18 20 22-40
-20
0
20
40
60
80
D=05 D=07 D=09 D=1
Q factor
Energyred uction()
Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)
4 6 8 10 12 14 16 18 20 220
10
20
30
40
50
60
130nm 65nm
Q factor
Energyred uction()
Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)
55 summary and conclusions 91
55 summary and conclusions
The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process
The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement
Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme
The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process
6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R
61 introduction
In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance
Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements
62 physical implementation
The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5
The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61
The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz
were calculated using the methodology developed in Chapter 5
and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery
circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)
92
62 physical implementation 93
AB
AB
CDEFD
C
ABCD
EFD
C
Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)
Adiabatic driver 80-bit circuit
Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )
RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)
CL 207 f F (Eq57) Lind 613nH (Eq54)
Table 61 Energy-recovery IO circuit parameters
62 physical implementation 94
IntegratedInductor
VINDOER
PulseGenerator
S1AS2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
Figure 62 Energy-recovery IO circuit schematic diagram
Metal2Metal1PolyActiveVDD
GND
RES_CLK
DATA
OUT
OUT_B
DATA
OUT_B OUT
RES_CLK
GND
VDD
Figure 63 Adiabatic driver (layoutschematic view)
the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs
621 Adiabatic driver
The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63
62 physical implementation 95
Metal2Metal1PolyActive
IN
OUT_B
VDD
IN_B
OUT
GND
OUT_B
IN
GND
VDD
OUT
IN_B
Figure 64 P2LC (layoutschematic view)
622 Pulse-to-level converter
The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals
623 Data generator
A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1
As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE
624 Pulse generator
The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the
62 physical implementation 96
Metal2Metal1PolyActive EN
CLK
VDD
Q
GND
D
CLKGND
ENVDD
QF-F
D
GND
VDDCLK
D Q
Figure 65 Data generator (layoutschematic view)
Delay1
WPULSE
PULSE
CLKTCLK
DATA
CLK_RES
Figure 66 Timing constraints for the energy-recovery demonstratorcircuit
62 physical implementation 97
PULSES1A
S2
GND
VDD
Metal2Metal1PolyActive
VDD
S1
S2
GND
PULSE
Figure 67 Pulse generator (layoutschematic view)
S2ΔPhase
WPULSE
PULSE
S1A
Figure 68 Pulse generator InputOutput signals
gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control
For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68
625 Integrated inductor
The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-
1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)
62 physical implementation 98
100 200 300 400 500 600 700 8005
7
9
11
13
15
17
19
21
10
15
20
25
30
35
40
45
Q (Simulated) Energy Reduction ()
Frequency (MHz)
Qfactor
Energyreduction()
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS
lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)
As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited
For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610
626 Top level design
The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)
For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously
62 physical implementation 99
500μ
m
500μm
Figure 610 Planar square spiral inductor designed in ASITIC
shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance
The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG
80 = 16880 = 21Ω However as
the current flows to lower branch levels in the tree the perceivedresistance increases to RTG
16 RTG4 and on the last one to RTG Any
additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512
Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20
4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit
To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of
62 physical implementation 100
Adiabatic drivers
RTG
16
RTG
80
RTG
4
RTG
Level 1Level 2Level 3Level 4
Figure 611 Resonant pulse distribution tree
Branch level Wire resistance
1st 5 middot 16880 = 105mΩ
2nd 5 middot 16816 = 525mΩ
3rd 5 middot 1684 = 21Ω
4th 5 middot 168Ω = 84Ω
Total load 168+8480 + 21
20 + 05255 + 0105 = 252Ω
Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree
63 post-layout simulation and comparison 101
+-
+-Metal2 Metal1
180μ
m
100μm
Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)
the energy-recovery IO circuit in layout view is illustrated inFigure 613
627 CMOS IO circuit
The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance
The top level layout view of the CMOS IO circuit can be seenin Figure 616
63 post-layout simulation and comparison
The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed
63 post-layout simulation and comparison 102
22mm
12mm
GND GNDVBUF1 VDR
GND
VIND
S1A
S2
EN
OER
Figure 613 Energy-recovery IO circuit in layout view
63 post-layout simulation and comparison 103
OCMOS
S1B DataGenerator
TSV
CMOSDriver
CMOSDriver
TSV Buffer
1
80
EN
Figure 614 CMOS IO circuit schematic diagram
GND
VCMOS
DATA
078u
039u
105u
053u
316u
158u
950u
475u
OUT
Figure 615 CMOS driver schematic
12mm
12mm
OCMOS S1B GND VBUF2 VCMOSGND
GND GNDEN VCMOS
Figure 616 CMOS IO circuit in layout view
63 post-layout simulation and comparison 104
Energy dissipation (80-bits cycle)
Inductor 11pJ (Eq514)
Switch M1 092pJ [ ]
P2LCs 098pJ (Simulated)
Adiabatic drivers 315pJ (Eq512)
Total 615pJ
Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz
After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency
Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme
As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the
63 post-layout simulation and comparison 105
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14RES_CLK
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14DATA
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14OER
Time (ns)
V(V
)
Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit
64 experimental measurements 106
AreaEnergy
dissipation(Theory)
Energydissipation
(Post-layout)
CMOS 144mm2922pJ
(115 f Jbit)
(Eq52)
1265pJ
(158 f Jbit)
Energy recovery 264mm2 615pJ
(77 f Jbit)
768pJ
(96 f Jbit)
Difference +83 minus333 minus393
Table 64 Post-layout simulated energy dissipation Area require-ments
theoretically calculated 333 up to 393 in the post-layoutsimulations
64 experimental measurements
The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively
641 Measurement setup
For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620
The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66
642 Results
CMOS IO circuit
In Figure 621 the input signal S1B and the output signal OCMOS
are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency
64 experimental measurements 107
Figure 618 Energy-recovery IO circuit micrograph
Figure 619 CMOS IO circuit micrograph
AABC
DEFF
D
EEBA
EBBA
Figure 620 Measurement setup
64 experimental measurements 108
Test pad Direction Instrument connection
VIND Power Power supply - gt06V
GND Power Power supply - 0V
S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)
EN InputDigital Output (enable outputdata switching)
OER Output Oscilloscope
VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V
VBUF1 Power Power supply (Buffers) - 12V
Table 65 Energy-recovery circuit instrument connections
Test pad Direction Instrument connection
VCMOS PowerPower supply (CMOS drivers) -12V
GND Power Power supply - 0V
VBUF2 Power Power supply (Buffers) - 12V
S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
OCMOS Output Oscilloscope
EN InputDigital Output (enable outputdata switching)
Table 66 CMOS circuit instrument connections
64 experimental measurements 109
S1B
OCMOS
Figure 621 Oscilloscope output for signals S1B and OCMOS
100 200 300 400 500 600 700 800012345678910
Die1 Die2 Simulation
Frequency (MHz)
Powerdissipation(mW)
Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit
that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected
The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations
Energy-recovery IO circuit
Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER
64 experimental measurements 110
100 200 300 400 500 600 700 800024681012141618
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 623 Measuredsimulated energy dissipation for the CMOScircuit
S1A
OER
Figure 624 Oscilloscope output for signals S1A and OER
shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected
The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)
Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate
65 summary and conclusions 111
200 250 300 350 400 450 500 550 600 65002468101214161820
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit
IntegratedInductor
VINDOER
PulseGenerator
S1A
S2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
200pF
Figure 626 Energy-recovery IO circuit with fault hypothesis
properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER
switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation
65 summary and conclusions
In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them
The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-
65 summary and conclusions 112
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050
055
060
065
070
075
080RES_CLK
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500
02
04
06
08
10
12
14OER
Time (ns)
V(V
)
Figure 627 Simulation of circuit with fault hypothesis
65 summary and conclusions 113
200 250 300 350 400 450 500 550 600 6500
5
10
15
20
25
Die1 Die2 Simulation CL=200pF
Frequency (MHz)
Ene
rgy
diss
ipat
ion
(pJ)
Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis
ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS
circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated
CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
7C O N C L U S I O N S
Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density
As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV
devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs
Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-
FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process
Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process
The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work
114
71 main contributions 115
71 main contributions
From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data
The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations
In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design
72 future work 116
approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances
The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
72 future work
The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV
capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]
In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes
1 The transistors should be in a regular grid in the array forsimple processing of the measurement data
72 future work 117
2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy
3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV
4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues
Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility
In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory
AA P P E N D I X A P H O T O S
Figure A1 3D65 300mm wafer (on chuck)
Figure A2 Probe card
118
appendix a photos 119
Figure A3 3D130 200mm wafer (on chuck)
ProbeHead
Microscope
Figure A4 Probe head
BA P P E N D I X B S TAT I S T I C S
mean (micro)
In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as
micro =1n
n
sumi=1
xi
median (m)
The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean
standard deviation (σ)
Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation
A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability
standard deviation of the mean (σmicro )
The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated
Assuming there is a data set X with n values xi of mean micro
120
appendix b statistics 121
variance(X) = σ2X
variance(micro) = variance( 1n sum
ni=1 xi) =
1n2 variance(sumn
i=1 xi) =1n2 sum
ni=1 variance(xi) =
nn2 variance(X) =1n variance(X) =
σ2Xn rarr
σmicro = σXradicn
coefficient of variation
The coefficient of variation is the ratio of the standard deviationσ to the mean micro
CV =σ
micro
It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage
outlier
Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set
kruskalndash wallis [34]
KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal
B I B L I O G R A P H Y
[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac
currentsourcebroadpurposemn=2602
[2] Max-3d URL httpwwwmicromagiccom
[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet
[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet
[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001
[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI
Design Conf pages 171ndash174 2005 doi 101109ICVD200564
[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf
3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581
[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009
[9] C H Bennett Logical reversibility of computation IBM
Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525
[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038
scientificamerican0785-48
[11] E Beyne 3d system integration technologies In Proc Int
VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113
122
bibliography 123
[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs
22nm-Details_Presentationpdf
[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327
[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534
[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices
Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124
[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942
[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc
IEEE Int Interconnect Technology Conf and 2011 Materials for
Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326
[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf
(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823
[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS
rsquo04 volume 2 2004 doi 101109ISCAS20041329255
[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE
Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260
[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-
bibliography 124
asitics for 3-dimensional integrated circuits In Workshop
Notes Design Automation and Test in Europe (DATE) 2009
[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT
2008 pages 98ndash103 2008 doi 101109IDT20084802475
[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec
beScientificReportSR2009HTML1213307html
[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905
[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE
Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329
[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int
Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311
[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508
[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712
[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii
B9780750677295500446
[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455
bibliography 125
[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591
[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test
Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889
[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on
Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10
1145224081224115
[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical
Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779
[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom
pressroompdfkkuhnKuhn_IWCE_invited_textpdf
[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5
(3)183ndash191 1961 doi 101147rd530183
[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827
[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190
[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf
(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839
[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-
nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079
bibliography 126
[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System
level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm
org101145966747966750
[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453
[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf
PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725
[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE
Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793
[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629
[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp
Exhibition (DATE) pages 1689ndash1694 2010
[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573
[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190
[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi
bibliography 127
A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices
Meeting (IEDM) 2010 doi 101109IEDM20105703278
[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727
[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10
1109JPROC1998658762
[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test
Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103
[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc
56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676
[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443
[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88
(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom
sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on
[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303
bibliography 128
[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium
on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145
10576611057669
[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872
[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004
[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-
sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888
[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156
[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc
IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522
[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501
[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667
[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477
[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-
tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609
bibliography 129
[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology
Conf 2006 doi 101109ECTC20061645678
[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power
Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220
[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc
Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786
[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-
Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338
[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088
[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044
[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int
Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763
[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-
tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http
dxdoiorg101007BF01788562 101007BF01788562
bibliography 130
[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-
electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww
sciencedirectcomsciencearticleB6V44-4829XDJ-2N
2f5b2c221dcf240c38cd13b7b3432f913
[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182
[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541
[78] NHE Weste and DF Harris CMOS VLSI design a cir-
cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775
[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite
printsteve_wozniak_interview
[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-
tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716
[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect
Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710
[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746
[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764
[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915
bibliography 131
[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610
- Abstract
- Publications
- Acknowledgments
- Contents
- List of Figures
- List of Tables
- Acronyms
- 1 Introduction
-
- 11 Research goals and contribution
- 12 Thesis outline
-
- 2 Background
-
- 21 3D integrated circuits
-
- 211 3D integration approaches
- 212 TSV interconnects
-
- 22 Energy-recovery logic
-
- 221 Energy-recovery architectures
- 222 Power-clock generators
-
- 23 Full-custom design flow
- 24 Summary
-
- 3 Characterization of TSV capacitance
-
- 31 Introduction
- 32 TSV parasitic capacitance
- 33 Integrated capacitance measurement techniques
- 34 Electrical characterization
-
- 341 Physical implementation
- 342 Measurement setup
- 343 Experimental results
-
- 35 Summary and conclusions
-
- 4 TSV proximity impact on MOSFET performance
-
- 41 Introduction
- 42 TSV-induced thermomechanical stress
- 43 Test structure implementation
-
- 431 MOSFET arrays
- 432 Digital selection logic
-
- 44 Experimental measurements
-
- 441 Data processing methodology
- 442 Measurement setup
- 443 PMOS (long channel)
- 444 PMOS (short channel)
-
- 45 Summary and conclusions
-
- 5 Evaluation of energy-recovery for TSV interconnects
-
- 51 Introduction
- 52 Energy-recovery TSV interconnects
- 53 Analysis
-
- 531 Adiabatic driver
- 532 Inductors parasitic resistance
- 533 Switch M1
- 534 Timing Constraints
-
- 54 Simulations
-
- 541 Comparison to CMOS
-
- 55 Summary and conclusions
-
- 6 Energy-recovery 3D-IC demonstrator
-
- 61 Introduction
- 62 Physical implementation
-
- 621 Adiabatic driver
- 622 Pulse-to-level converter
- 623 Data generator
- 624 Pulse generator
- 625 Integrated inductor
- 626 Top level design
- 627 CMOS IO circuit
-
- 63 Post-layout simulation and comparison
- 64 Experimental measurements
-
- 641 Measurement setup
- 642 Results
-
- 65 Summary and conclusions
-
- 7 Conclusions
-
- 71 Main contributions
- 72 Future work
-
- A Appendix A Photos
- B Appendix B Statistics
- Bibliography
-
In the end I hope therersquos a little note somewhere
that says I designed a good computer
mdash Steve Wozniak [79]
A C K N O W L E D G E M E N T S
First and foremost I would like to express my gratitude to mysupervisor Prof Alex Yakovlev who provided the perfect balancebetween freedom and guidance allowing me to obtain sphericalknowledge and scope in microelectronics but at the same time topromptly complete my studies Irsquom also grateful to the Engineer-ing and Physical Science Research Council (EPSRC) for fundingmy PhD studies through their Doctoral Training Accounts
As part of my research project I had the opportunity to spendsome time at Imec Belgium The experiences and lessons learnedduring my stay at Imec will undoubtedly prove valuable in myforthcoming professional career I would like to thank Dan Perryfrom Qualcomm for sharing his experience and ideas for the de-sign of the characterization structures implemented on the 65nm
Imec process Special thanks to Shinichi Domae from Panasonicand Dr Dimitrios Velenis for their assistance with the 65nm wafermeasurements as well as Marco Facchini for the time he spendtrying to measure the energy-recovery circuits Last but not leastIrsquod like to thank Geert Van der Plas and Paul Marchal for theiradvices and guidance throughout my stay at Imec
Finally many thanks to my colleagues at Newcastle Universityfor assisting me in diverse ways during my studies I would liketo especially mention Dr Santosh Shedabale and James Dochertyfor reviewing and commenting on parts of this text as well asDr Robin Emery for the numerous technical discussions and forhelping setup the design tools
v
C O N T E N T S
1 Introduction 1
11 Research goals and contribution 2
12 Thesis outline 3
2 Background 5
21 3D integrated circuits 5
211 3D integration approaches 6
212 TSV interconnects 9
22 Energy-recovery logic 13
221 Energy-recovery architectures 17
222 Power-clock generators 18
23 Full-custom design flow 20
24 Summary 21
3 Characterization of TSV capacitance 23
31 Introduction 23
32 TSV parasitic capacitance 23
33 Integrated capacitance measurement techniques 24
34 Electrical characterization 27
341 Physical implementation 28
342 Measurement setup 36
343 Experimental results 37
35 Summary and conclusions 44
4 TSV proximity impact on MOSFET performance 46
41 Introduction 46
42 TSV-induced thermomechanical stress 46
43 Test structure implementation 48
431 MOSFET arrays 51
432 Digital selection logic 55
44 Experimental measurements 61
441 Data processing methodology 61
442 Measurement setup 62
443 PMOS (long channel) 64
444 PMOS (short channel) 71
45 Summary and conclusions 73
5 Evaluation of energy-recovery for TSV interconnects 75
51 Introduction 75
52 Energy-recovery TSV interconnects 75
53 Analysis 79
531 Adiabatic driver 80
532 Inductorrsquos parasitic resistance 82
533 Switch M1 82
534 Timing Constraints 83
54 Simulations 83
541 Comparison to CMOS 86
vi
contents vii
55 Summary and conclusions 91
6 Energy-recovery 3D-IC demonstrator 92
61 Introduction 92
62 Physical implementation 92
621 Adiabatic driver 94
622 Pulse-to-level converter 95
623 Data generator 95
624 Pulse generator 95
625 Integrated inductor 97
626 Top level design 98
627 CMOS IO circuit 101
63 Post-layout simulation and comparison 101
64 Experimental measurements 106
641 Measurement setup 106
642 Results 106
65 Summary and conclusions 111
7 Conclusions 114
71 Main contributions 115
72 Future work 116
a Appendix A Photos 118
b Appendix B Statistics 120
bibliography 122
L I S T O F F I G U R E S
Figure 11 Scaling trend of logic and interconnect delay[30] 2
Figure 21 Comparison of 2D3D Integration 5
Figure 22 System-in-a-package 6
Figure 23 Package-on-package 6
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7
Figure 25 3D wafer-level-packaging 7
Figure 26 3D-stacked IC 8
Figure 27 TSV process steps 9
Figure 28 Before thinningafter thinning 10
Figure 29 TSV stacking 10
Figure 210 Conventional AND gate 14
Figure 211 CMOS switching 16
Figure 212 Adiabatic switching 16
Figure 213 Basic 2N-2P adiabatic circuit 17
Figure 214 2N-2P timing 17
Figure 215 Adiabatic driver 18
Figure 216 Stepwise charging generator 19
Figure 217 1N power clock generator 20
Figure 218 Design flow 22
Figure 219 Inverter in Virtuoso schematic 22
Figure 220 Inverter in Virtuoso layout 22
Figure 31 Isolated 2D wire capacitance 24
Figure 32 TSV parasitic capacitance 25
Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25
Figure 34 CBCM pseudo-inverters 27
Figure 35 Current as a function of frequency in CBCM 28
Figure 36 CBCM pseudo-inverter pair in layout 29
Figure 37 PRC1 test pad module (layout) 30
Figure 38 PRC1 test pad module (micrograph) 30
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34
Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35
viii
List of Figures ix
Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35
Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36
Figure 316 Measurement setup 36
Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39
Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40
Figure 319 TSV capacitance measurement results onwafer D11 40
Figure 320 TSV capacitance measurement results onwafer D18 41
Figure 321 Simulated oxide liner thickness variation 42
Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42
Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44
Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44
Figure 41 TSV thermomechanical stress simulation [40] 47
Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48
Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49
Figure 44 PTP1 test pad module (layout) 50
Figure 45 PTP1 test pad module (micrograph) 50
Figure 46 MOSFET dimensions and rotations 51
Figure 47 1 TSV MOSFET array configuration 52
Figure 48 4 diagonal TSVs MOSFET array configuration52
Figure 49 4 lateral TSVs MOSFET array configuration 53
Figure 410 9 TSVs MOSFET array configuration 53
Figure 411 MOSFET array detail 55
Figure 412 Digital selection logic 56
Figure 413 MOSFET array with digital selection logic 57
Figure 414 4-bit Gate-enableArray-enable decoder 58
Figure 415 Transmission gate for MOSFET Gate terminals59
Figure 416 Transmission gates for MOSFET Drain-Source terminals 59
Figure 417 2-bit Drain-select decoder 60
Figure 418 Processing of measurement results 62
Figure 419 Interpolation of measurement results 63
Figure 420 Measurement setup 63
Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65
List of Figures x
Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66
Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66
Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67
Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67
Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68
Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69
Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69
Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70
Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70
Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71
Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71
Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72
Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72
Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76
Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77
Figure 53 Adiabatic driver 78
Figure 54 A simple pulse-to-level converter (P2LC) 78
Figure 55 Resonant pulse generator with dual-rail dataencoding 79
Figure 56 Energy-recovery scheme for TSV interconnects79
Figure 57 Energy-recovery timing constraints 83
Figure 58 Test case for the evaluation of the theoret-ical model 84
Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85
Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85
Figure 511 Test case for comparison of energy-recoveryto CMOS 86
Figure 512 Improved P2LC design 87
List of Figures xi
Figure 513 Simulated energy dissipation of improvedP2LC design 87
Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88
Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88
Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89
Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =
80 f F f = 500MHz) 89
Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90
Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90
Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93
Figure 62 Energy-recovery IO circuit schematic dia-gram 94
Figure 63 Adiabatic driver (layoutschematic view) 94
Figure 64 P2LC (layoutschematic view) 95
Figure 65 Data generator (layoutschematic view) 96
Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96
Figure 67 Pulse generator (layoutschematic view) 97
Figure 68 Pulse generator InputOutput signals 97
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98
Figure 610 Planar square spiral inductor designed inASITIC 99
Figure 611 Resonant pulse distribution tree 100
Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101
Figure 613 Energy-recovery IO circuit in layout view 102
Figure 614 CMOS IO circuit schematic diagram 103
Figure 615 CMOS driver schematic 103
Figure 616 CMOS IO circuit in layout view 103
Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105
Figure 618 Energy-recovery IO circuit micrograph 107
List of Figures xii
Figure 619 CMOS IO circuit micrograph 107
Figure 620 Measurement setup 107
Figure 621 Oscilloscope output for signals S1B and OC-MOS 109
Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109
Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110
Figure 624 Oscilloscope output for signals S1A and OER110
Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111
Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111
Figure 627 Simulation of circuit with fault hypothesis 112
Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113
Figure A1 3D65 300mm wafer (on chuck) 118
Figure A2 Probe card 118
Figure A3 3D130 200mm wafer (on chuck) 119
Figure A4 Probe head 119
L I S T O F TA B L E S
Table 21 3D interconnect technologies comparison 8
Table 22 Technology parameters for Imec 3D130 process11
Table 23 Technology parameters for Imec 3D65 process11
Table 24 Comparison energy-recoveryconventionalCMOS 16
Table 31 Test structures on the PRC1 module 31
Table 32 PRC1 test pad instrument connections 37
Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38
Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39
Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43
Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45
Table 41 Coefficients of thermal expansion 47
Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54
Table 43 MOSFET array configurations selected forexperimental evaluation 61
Table 44 PTP1PTP2 test pad module instrumentconnections 64
Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73
Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84
Table 61 Energy-recovery IO circuit parameters 93
Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100
Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104
Table 64 Post-layout simulated energy dissipation Area requirements 106
Table 65 Energy-recovery circuit instrument connec-tions 108
Table 66 CMOS circuit instrument connections 108
xiii
A C R O N Y M S
2D two-dimentional
2N-2P 2 nMOS2 pMOS adiabatic logic
3D-IC three-dimentional integrated circuit
3D-SIC 3D stacked-IC
3D-SIP 3D system-in-package
3D three-dimentional
3D-WLP 3D wafer-level-packaging
BCB benzocyclobutene
BEOL back-end-of-line
CAL clocked adiabatic logic
CBCM charge-based capacitance measurement
CMOS complementary metal-oxide-semiconductor
CMP chemical-mechanical planarization
CTE coefficient of thermal expansion
Cu copper
CVD chemical vapor deposition
D2D die-to-die
D2W die-to-wafer
DCVSL differential cascode voltage switch logic
DRC design rule check
DUT device under test
ECRL efficient charge recovery logic
EDA electronic design automation
FEM finite element method
FEOL front-end-of-line
GPIB general purpose interface bus
IC integrated circuit
xiv
acronyms xv
IO inputoutput
KOZ keep-out-zone
LC inductance-capacitance
LVS layout versus schematic
MEMS microelectromechanical systems
MIM metal-insulator-metal
MOSFET metal-oxide-semiconductor field-effect transistor
MOS metal-oxide-semiconductor
MPU microprocessor unit
MuGFET multiple gate field-effect transistor
NMOS n-channel MOSFET
P2LC pulse-to-level converter
PAL pass-transistor adiabatic logic
PCB printed circuit board
PCG power-clock generator
PMD pre-metal dielectric
PMOS p-channel MOSFET
PVT process voltage temperature
QSERL quasi-static energy recovery logic
RC resistance-capacitance
RDL re-distribution layer
RLC resistance-inductance-capacitance
RMS root mean square
SIP system-in-a-package
Si silicon
SMU source measurement unit
SoC system-on-chip
TaN tantalum nitride
TSV through-silicon via
W tungsten
acronyms xvi
W2W wafer-to-wafer
WLP wafer-level-packaging
1I N T R O D U C T I O N
Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965
chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]
As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects
In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)
The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and
1 International Technology Roadmap for Semiconductors
1
11 research goals and contribution 2
Figure 11 Scaling trend of logic and interconnect delay [30]
innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]
Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D
interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density
11 research goals and contribution
As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-
ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]
The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future
12 thesis outline 3
3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection
The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution
The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs
are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process
The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements
12 thesis outline
This thesis is organised into seven Chapters and two Appendixesas follows
2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)
12 thesis outline 4
Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described
Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods
Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process
Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design
Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements
Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work
Appendix A includes photos used as reference in various partsof this work
Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data
2B A C K G R O U N D
The work presented in this thesis contributes in the fields of 3D
integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters
21 3d integrated circuits
CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package
There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy
Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of
ABCDEFBA ABCDEFBA
Figure 21 Comparison of 2D3D Integration
5
21 3d integrated circuits 6
ABCDEFBC
E
Figure 22 System-in-a-package
ABCDEFBC
E
Figure 23 Package-on-package
a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities
The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor
211 3D integration approaches
3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure
bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages
bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)
bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion
The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)
3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps
21 3d integrated circuits 7
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]
A
BCDEAFE
FFDC
CFE
Figure 25 3D wafer-level-packaging
take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]
3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]
The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-
SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost
A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21
21 3d integrated circuits 8
ABCD
EFFDB
F
Figure 26 3D-stacked IC
3D interconnect
technologyAdvantages Disadvantages
3D system-in-package (3D-SIP)
Simple existingtechnology bestheterogeneous
integration
Performance(speed power) low
densitymanufacturing costin high-quantities
3D wafer-level-packaging (3D-
WLP)
Existingtechnology good
compromisebetween
performance-manufacturing
cost
Averageperformance and
density
3D stacked-IC (3D-SIC)
Performance(speed power)
high densitymanufacturing costin high-quantities
Manufacturing costin low-quantities
not ready formanufacturing
Table 21 3D interconnect technologies comparison
21 3d integrated circuits 9
ABBCDE
FBACACDA
ACCD F
Figure 27 TSV process steps
212 TSV interconnects
Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection
In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie
Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively
As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]
21 3d integrated circuits 10
AAAB
CAAAB
ABC
DEF
Figure 28 Before thinningafter thinning
ABC ABC
DE DE
F
DEE
B
CBA
Figure 29 TSV stacking
21 3d integrated circuits 11
Minimum devicechannel length (nm)
130
Supply voltage (V) 12
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 22
TSV resistance (mΩ) ~20
TSV capacitance( f F)
~40
Wafer diameter(mm)
200
Table 22 Technology paramet-ers for Imec 3D130
process
Minimum devicechannel length (nm)
70
Supply voltage (V) 10
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 40
TSV resistance (mΩ) ~40
TSV capacitance( f F)
~80
Wafer diameter(mm)
300
Table 23 Technology paramet-ers for Imec 3D65 pro-cess
Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing
Design tools
Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs
Thermomechanical stress
TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur
21 3d integrated circuits 12
The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance
Thermal analysis
The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink
Electrical characteristics
The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]
Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV
structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process
22 energy-recovery logic 13
Testing
Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems
22 energy-recovery logic
TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology
1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections
2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them
Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that
22 energy-recovery logic 14
Figure 210 Conventional AND gate
were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6
Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded
Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output
Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to
22 energy-recovery logic 15
perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]
In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]
The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =
CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]
E = QVDD = CV2DD (21)
From the total transferred energy only half ( 12 CV2
DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C
The adiabatic charging of an equivalent node capacitance C
is modelled in Figure 212 In contrast to the conventional CMOS
circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to
EAdiabatic prop
(
RC
T
)
CV2DD (22)
The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters
22 energy-recovery logic 16
A
B
CC
D
E
D
Figure 211 CMOS switching
AA
B
C
B
Figure 212 Adiabatic switching
Conventional CMOS Energy-recovery CMOS
2 states True False 3 states True False Off
Energy of information bitdissipated as heat on the
pull-down network
Energy of information bitrecovered by the
power-supply
High speed Lowmedium speed
High energy dissipation Very low energy dissipation
Area efficient Some area overhead
Table 24 Comparison energy-recoveryconventional CMOS
Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]
A comparison between energy-recovery and conventional CMOSis summarised in Table 24
22 energy-recovery logic 17
ABCABC
Figure 213 Basic 2N-2P adiabaticcircuit
ABC
ABC
DEF
EE
DEF
Figure 214 2N-2P timing
221 Energy-recovery architectures
Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency
A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P
circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern
The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel
22 energy-recovery logic 18
Figure 215 Adiabatic driver
MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network
The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]
The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle
Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]
222 Power-clock generators
Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and
22 energy-recovery logic 19
A
A
B
C
C
Figure 216 Stepwise charging generator
recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators
Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow
Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at
f =1
2πradic
LCLOAD
(23)
As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is
23 full-custom design flow 20
A
BCD
E
DFF
CD
Figure 217 1N power clock generator
partially recovered and stored in the inductor for later use RLC
oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip
23 full-custom design flow
All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal
In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available
The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case
1 Cadence Design Systems Inc (httpwwwcadencecom)
24 summary 21
the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step
Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics
24 summary
In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process
2 Mentor Graphics Inc (httpwwwmentorcom)
24 summary 22
ABCDEA
FCCAAE
ABCDEAEC
DE
E
F
CEA
EEAEA
EDE
CAC
E
EC
CAC
Figure 218 Design flow
Figure 219 Inverter in Virtuososchematic
Figure 220 Inverter in Virtuosolayout
3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E
31 introduction
In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology
In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]
From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system
Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated
32 tsv parasitic capacitance
A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric
1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments
23
33 integrated capacitance measurement techniques 24
ABC
Figure 31 Isolated 2D wire capacitance
material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is
C =εox
hwl (31)
In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material
In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)
The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)
33 integrated capacitance measurement techniques
Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV
33 integrated capacitance measurement techniques 25
ABCDE
F
FD
F
F
FD
Figure 32 TSV parasitic capacitance
ABCDE FACDE
ECDE
BCFACDE
FCCDED DBCDE
D
DAB
Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor
33 integrated capacitance measurement techniques 26
characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories
1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]
2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]
The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer
On-chip techniques can achieve better accuracy when the DUT
capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]
In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows
CDUT =Im minus Ire f
VDD middot f(32)
The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of
34 electrical characterization 27
AB
AC
DE
FE
DE
FE
AB
AC
B
Figure 34 CBCM pseudo-inverters
Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot
The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET
capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz
should be adequate to avoid the MOS behaviour of the TSV
34 electrical characterization
A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented
34 electrical characterization 28
AB
CADE
F
AB
E
CAD
B
Figure 35 Current as a function of frequency in CBCM
on the same chip that went beyond the core objectives of thiswork
The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed
341 Physical implementation
In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result
Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics
34 electrical characterization 29
Poly dummy fills
Metal2Metal1PolysiliconActive
1μm 5μm
Figure 36 CBCM pseudo-inverter pair in layout
Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38
The CBCM pseudo-inverters were implemented on either TOP
or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below
Structure rsquoArsquo Single TSV capacitance (TOP die)
The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to
34 electrical characterization 30
DTOP DBTM
B10 A10
BREF AREF
B1 A1
R1 R2
SET VDD
RESET GND
C15 C10
C50 C20
SET VDD
ERC SET
EREF EC Metal2 (TOP)Metal1 (TOP)
Metal1 (BTM)Metal2 (BTM)
Figure 37 PRC1 test pad module(layout)
Figure 38 PRC1 test pad module(micrograph)
34 electrical characterization 31
Structure A B
TSVs - 1 2x5 - 1 2x5
Inverterlocation
TOP TOP TOP BTM BTM BTM
TSVconnection
TOP TOP TOP BTM BTM BTM
Multi-TSVconnection
- - TOP - - BTM
Multi-TSVpitch (microm)
- - 15 - - 15
Pad name AREF A1 A10 BREF B1 B10
Structure C D
TSVs 2x2 2x2 2x2 2x2 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP TOP
TSVconnection
TOP TOP TOP TOP TOP TOP
Multi-TSVconnection
TOP TOP TOP TOP TOP BTM
Multi-TSVpitch (microm)
10 15 20 50 20 20
Pad name C10 C15 C20 C50 DTOP DBTM
Structure R E
TSVs 1 1 - 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP
TSVconnection
TOP TOP - TOP TOP
Multi-TSVconnection
- - - TOP BTM
Multi-TSVpitch (microm)
- - - 20 20
Pad name R1 R2 EREF EC ERC
Table 31 Test structures on the PRC1 module
34 electrical characterization 32
Pseudo-inverters
2x5 TSVs
TSV
Reference
Metal2 (TOP)Metal1 (TOP)
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)
the single TSV DUT the structure also contains 10 parallel TSVs
arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude
Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)
The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side
Structure rsquoCrsquo Variable pitch TSV capacitance
The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the
34 electrical characterization 33
2x5 TSVs
Reference
TSV
Pseudo-inverters
Metal2 (BTM)Metal1 (BTM)
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)
literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process
The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative
Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-
TOM die
The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking
34 electrical characterization 34
2x2 TSVs - 10μm
2x2 TSVs - 15μm
2x2 TSVs - 20μm
2x2 TSVs - 50μm
Metal2 (TOP)Metal1 (TOP)
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die
34 electrical characterization 35
Metal2 (TOP)Metal1 (TOP)
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Driver
Reference
Driver
Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)
Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy
Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability
Structure rsquoErsquo TSV capacitance effect on signal propagation delay
Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult
34 electrical characterization 36
ABC
D
ECBFC
ABC
Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)
ABC
DEF
CD
A
CA
A
Figure 316 Measurement setup
342 Measurement setup
The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316
The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-
2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup
34 electrical characterization 37
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
SET InputPulse generator
(Non-overlapping with RESET)
RESET InputPulse generator
(Non-overlapping with SET)
A[REF 1 10] InputSMU (Inverter source - 1V
supply)
B[REF 1 10] InputSMU (Inverter source - 1V
supply)
C[10 15 2050]
InputSMU (Inverter source - 1V
supply)
D[TOP BTM] InputSMU (Inverter source - 1V
supply)
R[1 2] InputSMU (Inverter source - 1V
supply)
E[REF C RC] Output Oscilloscope (high-impedance)
Table 32 PRC1 test pad instrument connections
sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32
343 Experimental results
There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods
When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy
34 electrical characterization 38
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 454
Mean capacitance 709 f F
Standard deviation (σ) 19 f F
Coefficient of variation (3σmicro) 80
Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo
The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability
Die to die (D2D) variability
The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)
On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid
On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691
34 electrical characterization 39
-60 -40 -20 00 20 40 60 80
σ-σ
706
Deviation from mean (fF)
0
005
01
015
02
025
Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 414
Mean capacitance 895 f F
Standard deviation (σ) 22 f F
Coefficient of variation (3σmicro) 74
Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo
of the experimental data within plusmnσ of the mean capacitancevalue
Correlation between TSV capacitance and oxide liner thickness vari-
ation
The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers
Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication
34 electrical characterization 40
σ-σ
691
Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70
0
002
004
006
008
01
012
014
016
018
02
Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function
075
080
085
090
095
100
105
110
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
075
080
085
090
095
100
105
Wafer area
Figure 319 TSV capacitance measurement results on wafer D11
34 electrical characterization 41
086088090092094096098100102104106108
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
086
088
090
092
094
096
098
100
102
104
106
108
Wafer area
Figure 320 TSV capacitance measurement results on wafer D18
tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo
Wafer to wafer (W2W) variability
Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters
34 electrical characterization 42
109
100
095Nor
mal
ized
oxi
de th
ickn
ess
-5-10 -15 -20 -25 -30
510
1520
2530
0Die Y
Die X
Figure 321 Simulated oxide liner thickness variation
60 65 70 75 80 85 90 95 1000
005
01
015
02
025D11 D18
plusmnσ
plusmnσ
Capacitance (fF)
Figure 322 TSV capacitance W2W variability - Probability density func-tions
34 electrical characterization 43
Structure rsquoCrsquo (Fig 311)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 424
Experiment C10 C50
Number of TSVs 4 4
Mean capacitance 4073 f F 4125 f F
Standard deviation (σ) 84 f F 74 f F
Population within σ 695 653
Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo
Variable pitch TSV capacitance
The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm
were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires
As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50
are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability
To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]
Measurement accuracy of CBCM
The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we
3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)
35 summary and conclusions 44
376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320
001
002
003
004
005
006C10 C50
Capacitance (fF)
plusmnσC10
plusmnσC50
695 653
Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function
-80 -60 -40 -20 00 20 40 60 80 1000
002
004
006
008
01
012
014
016
018σ-σ
688
Deviation from mean (fF)
Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function
observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs
connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance
35 summary and conclusions
In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement
35 summary and conclusions 45
Method LCR CBCM
Total dies 472 472
Processed dies (excluding outliers) 393 414
Mean capacitance 820 f F 895 f F
Standard deviation (σ) 23 f F 22 f F
Coefficient of variation (3σmicro) 86 74
Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo
results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]
The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration
The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV
parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM
In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process
4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E
41 introduction
The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV
filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance
Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models
In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]
42 tsv-induced thermomechanical stress
Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or
46
42 tsv-induced thermomechanical stress 47
Material CTE at 20 ˚C (10minus6K)
Si 3
Cu 17
W 45
Table 41 Coefficients of thermal expansion
Figure 41 TSV thermomechanical stress simulation [40]
polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate
Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction
It is well known that stress can affect carrier mobility in MOS-
FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance
Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type
43 test structure implementation 48
Figure 42 TSV thermomechanical stress interaction between two TSVs[40]
proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative
The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections
43 test structure implementation
TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible
43 test structure implementation 49
Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]
The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET
devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types
Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP
die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45
43 test structure implementation 50
A0
B0
C0
D0
E0
F0
G0
H0
I0
J0
A90
B90
E90
F90
G90
J90
DS3 DF3
DF2 DS0
DS2 DF0
DF1 DS1
SS SF
DEC6 VDD
DEC7 GND
DEC8 VG
DEC4 DEC9
DEC5 GND
DEC2 DEC3
DEC0 DEC1
Arraydecoder
Draindecoder
Figure 44 PTP1 test pad mod-ule (layout)
Figure 45 PTP1 test pad mod-ule (micrograph)
43 test structure implementation 51
Metal2Metal1PolyActive
TSV
MOSFETs
07μm007μm 0deg
07μm007μm 90deg
05μm05μm 90deg
05μm05μm 0deg
Figure 46 MOSFET dimensions and rotations
431 MOSFET arrays
Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)
The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs
lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44
The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)
Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution
43 test structure implementation 52
TSV
MOSFETs Substrate taps
Figure 47 1 TSV MOSFET array configuration
MOSFETs Substrate taps
TSV TSV
TSV TSV
Figure 48 4 diagonal TSVs MOSFET array configuration
43 test structure implementation 53
MOSFETs Substrate taps
TSV
TSV TSV
TSV
Figure 49 4 lateral TSVs MOSFET array configuration
MOSFETs Substrate taps
TSV TSV TSV
TSV TSV TSV
TSV TSV TSV
Figure 410 9 TSVs MOSFET array configuration
43 test structure implementation 54
Configuration RotationFET
width(microm)
FETlength(microm)
Name
0 0deg 07 007 A0
1 0deg 07 007 B0
4diagonal 0deg 07 007 C0
4lateral 0deg 07 007 D0
9 0deg 07 007 E0
0 0deg 05 05 F0
1 0deg 05 05 G0
4diagonal 0deg 05 05 H0
4lateral 0deg 05 05 I0
9 0deg 05 05 J0
0 90deg 07 007 A90
1 90deg 07 007 B90
9 90deg 07 007 E90
0 90deg 05 05 F90
1 90deg 05 05 G90
9 90deg 05 05 J90
Table 42 MOSFET array configurations for test pad modulesPTP1PTP2
43 test structure implementation 55
2μm 1μm
Metal2Metal1PolyActive
FET
DUMMY
TAP
TSV
Figure 411 MOSFET array detail
432 Digital selection logic
Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement
The procedure for selecting a single MOSFET device for meas-urement was as follows
bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled
bull MOSFET Gate terminal (G0 to G15) was connected to IO
pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates
bull MOSFET Drain terminal (D0 to D15) was connected to IO
pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder
43 test structure implementation 56
D0
D15
DF[0-3]
DS[0-3]
Drain-select decoder (2-bit)
SSF
SS
Array-select decoder (4-bit)
VG
Gate-select decoder (4-bit)
G0 G15
MOSFET Array
Figure 412 Digital selection logic
and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit
Drain-select decoder was adequate for accessing all 16Drain terminals in the array
When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)
A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating
43 test structure implementation 57
Gatedecoder
Metal2Metal1PolyActive
Transmissiongates (DS)
Transmission gates (G)
MOSFETArray
Figure 413 MOSFET array with digital selection logic
43 test structure implementation 58
BLK
A[0] A[1] A[2] A[3]
DEC[0-15]
Figure 414 4-bit Gate-enableArray-enable decoder
43 test structure implementation 59
DEC
VGGVDD
Transmission gate
Pull-downtransistor
Figure 415 Transmission gate for MOSFET Gate terminals
Transmission gates
FORCE
SENSE
DRAINSOURCE
EN
VDD
Figure 416 Transmission gates for MOSFET DrainSource terminals
43 test structure implementation 60
A[0] A[1]
DEC[0-3]
Figure 417 2-bit Drain-select decoder
44 experimental measurements 61
Configuration Type Name Wafer
1 No-TSVPMOS
(05microm times 05microm)F0 D08
2 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D08
3 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D20
4 1-TSV 0ordm 60 ordmCPMOS
(05microm times 05microm)G0 D08
54-lateral TSV 0ordm 25
ordmCPMOS
(05microm times 05microm)I0 D08
6 1-TSV 90ordm 25 ordmCPMOS
(05microm times 05microm)G90 D08
7 1-TSV 0ordm 25 ordmCPMOS (07microm times
007microm)B0 D08
Table 43 MOSFET array configurations selected for experimental eval-uation
44 experimental measurements
Test modules PTP1 and PTP2 implementing the presented MOS-
FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43
441 Data processing methodology
For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress
Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing
44 experimental measurements 62
ABCDEFBBC
FCDDFBCFFC
FCB
A
BAACADE
CECBBC
CFFCDE
B$BCFFBDBCBF
amp(BFC)B
BC
FCDDFA
CAF
Figure 418 Processing of measurement results
the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (
radic98) The described procedure for
processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not
placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)
442 Measurement setup
For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO
signal generation and measurement was provided by standard
44 experimental measurements 63
ABCDECDBF
ABECBF
AB
AB
Figure 419 Interpolation of measurement results
laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420
The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44
ABCCDE
F
AA
A
DBCB
Figure 420 Measurement setup
44 experimental measurements 64
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
DEC[0-9] InputDigital Output
(GATEDRAINARRAY SelectDecoders Input)
DF[0-3] Input SMU (Drain Force terminal)
DS[0-3] Output SMU (Drain Sense terminal)
SF Input SMU (Source Force terminal)
SS Output SMU (Source Sense terminal)
VG Input Digital Output (Gate voltage)
Table 44 PTP1PTP2 test pad module instrument connections
443 PMOS (long channel)
No-TSV configuration
This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance
As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure
1 TSV configuration
In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo
44 experimental measurements 65
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
096
097
098
099
100
101
Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)
and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements
To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm
The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph
44 experimental measurements 66
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
105
110
115
120TSV
Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)
44 experimental measurements 67
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
13
TSV
Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)
we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV
even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after
increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
09
1
11
12
13
14
Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)
44 experimental measurements 68
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115TSV
Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)
at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)
4 lateral TSV configuration
In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430
respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)
90˚ rotation TSV configuration
In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has
44 experimental measurements 69
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
085
090
095
100
105
110
115
TSV
TSV TSV
TSV
Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)
44 experimental measurements 70
2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309
095
1
105
11
115
12
X16
Distance (μm)
Normalized
Ion
TSV TSV
Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)
3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098
1102
Y14
Distance (μm)
Normalized
Ion
TSV TSV
Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)
44 experimental measurements 71
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115
120TSV
Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)
been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm
444 PMOS (short channel)
In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
08509
0951
10511
11512
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)
44 experimental measurements 72
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
094
096
098
100
102
104
106
108
110
TSV
Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)
measurements for X = 16 and Y = 14 in Figure 434 At at 1microm
distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)
45 summary and conclusions 73
Size
Waf
er
Con
figu
rati
on
Rot
atio
n
Eff
ect
1micro
mla
t
Eff
ect
1micro
mlo
ng
KO
Z
L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm
L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm
L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm
L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm
L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm
S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm
Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices
45 summary and conclusions
In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET
device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors
The structure was implemented in Imecrsquos experimental 65nm
3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations
In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits
Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short
45 summary and conclusions 74
channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress
Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure
1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex
2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy
3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away
4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic
The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated
In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs
5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S
51 introduction
TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2
DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs
increases linearly with the number of tiers and interconnections(Figure 51)
The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm
node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)
In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]
52 energy-recovery tsv interconnects
As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-
75
52 energy-recovery tsv interconnects 76
1XTSV
Tier1
TSV
Tier0LOGIC 0
LOGIC 0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n Tier1
Tier0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n
Tier(m-1)
Tier(m)
mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-
tion density
recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]
EER prop
(
RCL
T
)
CLV2DD (51)
In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]
ECMOS =12
CLV2DD (52)
Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits
EER
ECMOSprop
2RCL
T(53)
From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small
In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC
52 energy-recovery tsv interconnects 77
Tier2Tier1TSV
LOGIC 1
LOGIC 2
driver
TSV
LOGIC 1
LOGIC 2
CMOS
driver
Convert
Energy-recovery
PCG
Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery
can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV
interconnectsThe differences between a typical CMOS driver in a TSV-based
3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form
Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values
52 energy-recovery tsv interconnects 78
IN IN
ININ
OUT OUT
IN
IN
OUT
OUT
Figure 53 Adiabatic driver
Figure 54 A simple pulse-to-level converter (P2LC)
An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded
The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)
For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with
53 analysis 79
VDD2
Rind
RTG
AdiabaticDriver
CL
M1
T=1f
L ind
RTG
CL
Figure 55 Resonant pulse generator with dual-rail data encoding
Tier2
Tier1
f-fPULSE
VDD2CLK IN
Delay1CLK1
CLK0
Delay2CLK2OUT1
Data In
Data Out
CLKRES
TSV TSV TSV
CLKBufferAdiabatic
Driver
P2LC
f-f
Data1
M1
Figure 56 Energy-recovery scheme for TSV interconnects
a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at
f =1
2πradic
LindCL(54)
A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56
53 analysis
The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy
53 analysis 80
losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme
The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be
EER = EAD + Eind + EM1 + EP2L (55)
Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs
architecture used and could be easily extracted from simulationdata as will be shown later in this chapter
531 Adiabatic driver
There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS
dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances
The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle
ETG = 2ξ
(
RTGCL12 T
)
CLV2DD =
π2
2(RTGCL f )CLV2
DD (56)
The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient
53 analysis 81
Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV
capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be
CL = CTSV + Cn + 6CD (57)
The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience
κTG = RTGCn rArr RTG =κTG
Cn(58)
Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2
DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver
EAD = CnV2DD +
π2
2κTG
Cnf [CTSV + Cn + 6CD]
2 V2DD (59)
The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2
2 κTG f to a variable (y = π2
2 κTG f ) Equation 59then becomes
EAD = D middot CnV2DD +
y
Cn[CTSV + (6b + 1)Cn]
2 V2DD
=
[
Cn
(
D
y+ (6b + 1)2
)
+1
CnC2
TSV
]
middot yV2DD
+(12b + 2)CTSV middot yV2DD (510)
Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when
53 analysis 82
they become equal Solving equation Cn
(
Dy + (6b + 1)2
)
= 1Cn
C2TSV
for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV
Cn(opt) =
radic
[D
y+ (6b + 1)2]minus1CTSV (511)
Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver
EAD(min) =
[radic
D
(6b + 1)2y+ 1 + 1
]
middot (12b+ 2)yCTSVV2DD (512)
532 Inductorrsquos parasitic resistance
The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as
Qind =1
Rind
radic
Lind
CLrArr Rind =
1QindCL2π f
(513)
Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513
Eind =π2
2(RindCL f )CLV2
DD =π
4CL
QindV2
DD (514)
533 Switch M1
Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1
Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as
EM1(min) = 2IM1(rms)VGM1
radic
mκM1
f(515)
54 simulations 83
Delay2
CLK1
CLK2Delay1
CLKW
PULSE
CLK0T
PD
OUT1
PULSE
P2L
RES
CLK
Figure 57 Energy-recovery timing constraints
534 Timing Constraints
The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57
The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)
54 simulations
To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the
54 simulations 84
M1IN IN
IN
OUT OUT
IN
Figure 58 Test case for the evaluation of the theoretical model
Q250MHz 500MHz 750MHz
T S e () T S e () T S e ()
6 360 409 -120 476 516 -78 578 613 -57
8 308 347 -114 417 449 -70 515 543 -52
10 276 312 -115 382 411 -70 477 498 -43
12 255 289 -119 358 386 -71 451 472 -44
Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)
adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58
The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51
The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations
In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency
54 simulations 85
5 6 7 8 9 10 11 12 130
10
20
30
40
50
60
70
T250 S250 T500 S500 T750 S750
Q factor
Energydissipation(fF)
Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)
5 6 7 8 9 10 11 12 13
-14
-12
-10
-8
-6
-4
-2
0
250MHz 500MHz 750MHz
Q factor
Error()
Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator
The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation
54 simulations 86
M1
TSV
TSV
TSV
Data in OutCMOS
OutER
Energy-recovery
CMOS
P2LC
Adiabaticdriver
Figure 511 Test case for comparison of energy-recovery to CMOS
541 Comparison to CMOS
When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F
for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is
described by Equation 52 Appending the data switching activity(D)
ECMOS = D middot 12
CLV2DD (516)
Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS
circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison
Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range
54 simulations 87
Figure 512 Improved P2LC design
0 100 200 300 400 500 600 700 800 9000
2
4
6
8
10
12
14
Frequency (MHz)
Energy
dissipation(fJ)
Figure 513 Simulated energy dissipation of improved P2LC design
Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current
The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor
54 simulations 88
4 6 8 10 12 14 16 18 20 22-20
-10
0
10
20
30
40
50
60
800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor
Energyred uction()
Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)
4 6 8 10 12 14 16 18 20 220050
100150200250300350400450
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)
values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors
In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies
In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q
factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total
In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS
circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in
54 simulations 89
4 6 8 10 12 14 16 18 20 2200
100
200
300
400
500
600
700
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)
20 40 60 80 100 120 140 160 180 200 220-100
00
100
200
300
400
500
Q6 Q10 Q14 Q18
TSV capacitance (fF)
Energy
red u
ction(
)
Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)
Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value
Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz
Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm
node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS
54 simulations 90
4 6 8 10 12 14 16 18 20 22-40
-20
0
20
40
60
80
D=05 D=07 D=09 D=1
Q factor
Energyred uction()
Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)
4 6 8 10 12 14 16 18 20 220
10
20
30
40
50
60
130nm 65nm
Q factor
Energyred uction()
Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)
55 summary and conclusions 91
55 summary and conclusions
The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process
The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement
Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme
The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process
6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R
61 introduction
In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance
Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements
62 physical implementation
The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5
The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61
The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz
were calculated using the methodology developed in Chapter 5
and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery
circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)
92
62 physical implementation 93
AB
AB
CDEFD
C
ABCD
EFD
C
Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)
Adiabatic driver 80-bit circuit
Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )
RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)
CL 207 f F (Eq57) Lind 613nH (Eq54)
Table 61 Energy-recovery IO circuit parameters
62 physical implementation 94
IntegratedInductor
VINDOER
PulseGenerator
S1AS2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
Figure 62 Energy-recovery IO circuit schematic diagram
Metal2Metal1PolyActiveVDD
GND
RES_CLK
DATA
OUT
OUT_B
DATA
OUT_B OUT
RES_CLK
GND
VDD
Figure 63 Adiabatic driver (layoutschematic view)
the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs
621 Adiabatic driver
The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63
62 physical implementation 95
Metal2Metal1PolyActive
IN
OUT_B
VDD
IN_B
OUT
GND
OUT_B
IN
GND
VDD
OUT
IN_B
Figure 64 P2LC (layoutschematic view)
622 Pulse-to-level converter
The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals
623 Data generator
A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1
As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE
624 Pulse generator
The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the
62 physical implementation 96
Metal2Metal1PolyActive EN
CLK
VDD
Q
GND
D
CLKGND
ENVDD
QF-F
D
GND
VDDCLK
D Q
Figure 65 Data generator (layoutschematic view)
Delay1
WPULSE
PULSE
CLKTCLK
DATA
CLK_RES
Figure 66 Timing constraints for the energy-recovery demonstratorcircuit
62 physical implementation 97
PULSES1A
S2
GND
VDD
Metal2Metal1PolyActive
VDD
S1
S2
GND
PULSE
Figure 67 Pulse generator (layoutschematic view)
S2ΔPhase
WPULSE
PULSE
S1A
Figure 68 Pulse generator InputOutput signals
gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control
For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68
625 Integrated inductor
The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-
1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)
62 physical implementation 98
100 200 300 400 500 600 700 8005
7
9
11
13
15
17
19
21
10
15
20
25
30
35
40
45
Q (Simulated) Energy Reduction ()
Frequency (MHz)
Qfactor
Energyreduction()
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS
lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)
As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited
For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610
626 Top level design
The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)
For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously
62 physical implementation 99
500μ
m
500μm
Figure 610 Planar square spiral inductor designed in ASITIC
shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance
The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG
80 = 16880 = 21Ω However as
the current flows to lower branch levels in the tree the perceivedresistance increases to RTG
16 RTG4 and on the last one to RTG Any
additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512
Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20
4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit
To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of
62 physical implementation 100
Adiabatic drivers
RTG
16
RTG
80
RTG
4
RTG
Level 1Level 2Level 3Level 4
Figure 611 Resonant pulse distribution tree
Branch level Wire resistance
1st 5 middot 16880 = 105mΩ
2nd 5 middot 16816 = 525mΩ
3rd 5 middot 1684 = 21Ω
4th 5 middot 168Ω = 84Ω
Total load 168+8480 + 21
20 + 05255 + 0105 = 252Ω
Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree
63 post-layout simulation and comparison 101
+-
+-Metal2 Metal1
180μ
m
100μm
Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)
the energy-recovery IO circuit in layout view is illustrated inFigure 613
627 CMOS IO circuit
The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance
The top level layout view of the CMOS IO circuit can be seenin Figure 616
63 post-layout simulation and comparison
The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed
63 post-layout simulation and comparison 102
22mm
12mm
GND GNDVBUF1 VDR
GND
VIND
S1A
S2
EN
OER
Figure 613 Energy-recovery IO circuit in layout view
63 post-layout simulation and comparison 103
OCMOS
S1B DataGenerator
TSV
CMOSDriver
CMOSDriver
TSV Buffer
1
80
EN
Figure 614 CMOS IO circuit schematic diagram
GND
VCMOS
DATA
078u
039u
105u
053u
316u
158u
950u
475u
OUT
Figure 615 CMOS driver schematic
12mm
12mm
OCMOS S1B GND VBUF2 VCMOSGND
GND GNDEN VCMOS
Figure 616 CMOS IO circuit in layout view
63 post-layout simulation and comparison 104
Energy dissipation (80-bits cycle)
Inductor 11pJ (Eq514)
Switch M1 092pJ [ ]
P2LCs 098pJ (Simulated)
Adiabatic drivers 315pJ (Eq512)
Total 615pJ
Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz
After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency
Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme
As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the
63 post-layout simulation and comparison 105
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14RES_CLK
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14DATA
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14OER
Time (ns)
V(V
)
Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit
64 experimental measurements 106
AreaEnergy
dissipation(Theory)
Energydissipation
(Post-layout)
CMOS 144mm2922pJ
(115 f Jbit)
(Eq52)
1265pJ
(158 f Jbit)
Energy recovery 264mm2 615pJ
(77 f Jbit)
768pJ
(96 f Jbit)
Difference +83 minus333 minus393
Table 64 Post-layout simulated energy dissipation Area require-ments
theoretically calculated 333 up to 393 in the post-layoutsimulations
64 experimental measurements
The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively
641 Measurement setup
For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620
The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66
642 Results
CMOS IO circuit
In Figure 621 the input signal S1B and the output signal OCMOS
are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency
64 experimental measurements 107
Figure 618 Energy-recovery IO circuit micrograph
Figure 619 CMOS IO circuit micrograph
AABC
DEFF
D
EEBA
EBBA
Figure 620 Measurement setup
64 experimental measurements 108
Test pad Direction Instrument connection
VIND Power Power supply - gt06V
GND Power Power supply - 0V
S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)
EN InputDigital Output (enable outputdata switching)
OER Output Oscilloscope
VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V
VBUF1 Power Power supply (Buffers) - 12V
Table 65 Energy-recovery circuit instrument connections
Test pad Direction Instrument connection
VCMOS PowerPower supply (CMOS drivers) -12V
GND Power Power supply - 0V
VBUF2 Power Power supply (Buffers) - 12V
S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
OCMOS Output Oscilloscope
EN InputDigital Output (enable outputdata switching)
Table 66 CMOS circuit instrument connections
64 experimental measurements 109
S1B
OCMOS
Figure 621 Oscilloscope output for signals S1B and OCMOS
100 200 300 400 500 600 700 800012345678910
Die1 Die2 Simulation
Frequency (MHz)
Powerdissipation(mW)
Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit
that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected
The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations
Energy-recovery IO circuit
Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER
64 experimental measurements 110
100 200 300 400 500 600 700 800024681012141618
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 623 Measuredsimulated energy dissipation for the CMOScircuit
S1A
OER
Figure 624 Oscilloscope output for signals S1A and OER
shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected
The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)
Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate
65 summary and conclusions 111
200 250 300 350 400 450 500 550 600 65002468101214161820
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit
IntegratedInductor
VINDOER
PulseGenerator
S1A
S2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
200pF
Figure 626 Energy-recovery IO circuit with fault hypothesis
properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER
switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation
65 summary and conclusions
In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them
The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-
65 summary and conclusions 112
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050
055
060
065
070
075
080RES_CLK
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500
02
04
06
08
10
12
14OER
Time (ns)
V(V
)
Figure 627 Simulation of circuit with fault hypothesis
65 summary and conclusions 113
200 250 300 350 400 450 500 550 600 6500
5
10
15
20
25
Die1 Die2 Simulation CL=200pF
Frequency (MHz)
Ene
rgy
diss
ipat
ion
(pJ)
Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis
ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS
circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated
CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
7C O N C L U S I O N S
Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density
As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV
devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs
Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-
FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process
Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process
The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work
114
71 main contributions 115
71 main contributions
From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data
The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations
In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design
72 future work 116
approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances
The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
72 future work
The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV
capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]
In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes
1 The transistors should be in a regular grid in the array forsimple processing of the measurement data
72 future work 117
2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy
3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV
4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues
Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility
In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory
AA P P E N D I X A P H O T O S
Figure A1 3D65 300mm wafer (on chuck)
Figure A2 Probe card
118
appendix a photos 119
Figure A3 3D130 200mm wafer (on chuck)
ProbeHead
Microscope
Figure A4 Probe head
BA P P E N D I X B S TAT I S T I C S
mean (micro)
In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as
micro =1n
n
sumi=1
xi
median (m)
The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean
standard deviation (σ)
Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation
A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability
standard deviation of the mean (σmicro )
The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated
Assuming there is a data set X with n values xi of mean micro
120
appendix b statistics 121
variance(X) = σ2X
variance(micro) = variance( 1n sum
ni=1 xi) =
1n2 variance(sumn
i=1 xi) =1n2 sum
ni=1 variance(xi) =
nn2 variance(X) =1n variance(X) =
σ2Xn rarr
σmicro = σXradicn
coefficient of variation
The coefficient of variation is the ratio of the standard deviationσ to the mean micro
CV =σ
micro
It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage
outlier
Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set
kruskalndash wallis [34]
KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal
B I B L I O G R A P H Y
[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac
currentsourcebroadpurposemn=2602
[2] Max-3d URL httpwwwmicromagiccom
[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet
[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet
[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001
[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI
Design Conf pages 171ndash174 2005 doi 101109ICVD200564
[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf
3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581
[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009
[9] C H Bennett Logical reversibility of computation IBM
Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525
[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038
scientificamerican0785-48
[11] E Beyne 3d system integration technologies In Proc Int
VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113
122
bibliography 123
[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs
22nm-Details_Presentationpdf
[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327
[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534
[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices
Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124
[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942
[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc
IEEE Int Interconnect Technology Conf and 2011 Materials for
Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326
[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf
(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823
[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS
rsquo04 volume 2 2004 doi 101109ISCAS20041329255
[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE
Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260
[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-
bibliography 124
asitics for 3-dimensional integrated circuits In Workshop
Notes Design Automation and Test in Europe (DATE) 2009
[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT
2008 pages 98ndash103 2008 doi 101109IDT20084802475
[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec
beScientificReportSR2009HTML1213307html
[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905
[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE
Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329
[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int
Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311
[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508
[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712
[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii
B9780750677295500446
[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455
bibliography 125
[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591
[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test
Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889
[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on
Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10
1145224081224115
[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical
Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779
[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom
pressroompdfkkuhnKuhn_IWCE_invited_textpdf
[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5
(3)183ndash191 1961 doi 101147rd530183
[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827
[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190
[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf
(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839
[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-
nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079
bibliography 126
[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System
level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm
org101145966747966750
[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453
[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf
PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725
[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE
Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793
[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629
[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp
Exhibition (DATE) pages 1689ndash1694 2010
[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573
[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190
[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi
bibliography 127
A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices
Meeting (IEDM) 2010 doi 101109IEDM20105703278
[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727
[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10
1109JPROC1998658762
[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test
Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103
[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc
56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676
[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443
[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88
(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom
sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on
[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303
bibliography 128
[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium
on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145
10576611057669
[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872
[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004
[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-
sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888
[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156
[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc
IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522
[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501
[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667
[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477
[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-
tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609
bibliography 129
[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology
Conf 2006 doi 101109ECTC20061645678
[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power
Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220
[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc
Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786
[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-
Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338
[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088
[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044
[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int
Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763
[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-
tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http
dxdoiorg101007BF01788562 101007BF01788562
bibliography 130
[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-
electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww
sciencedirectcomsciencearticleB6V44-4829XDJ-2N
2f5b2c221dcf240c38cd13b7b3432f913
[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182
[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541
[78] NHE Weste and DF Harris CMOS VLSI design a cir-
cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775
[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite
printsteve_wozniak_interview
[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-
tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716
[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect
Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710
[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746
[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764
[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915
bibliography 131
[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610
- Abstract
- Publications
- Acknowledgments
- Contents
- List of Figures
- List of Tables
- Acronyms
- 1 Introduction
-
- 11 Research goals and contribution
- 12 Thesis outline
-
- 2 Background
-
- 21 3D integrated circuits
-
- 211 3D integration approaches
- 212 TSV interconnects
-
- 22 Energy-recovery logic
-
- 221 Energy-recovery architectures
- 222 Power-clock generators
-
- 23 Full-custom design flow
- 24 Summary
-
- 3 Characterization of TSV capacitance
-
- 31 Introduction
- 32 TSV parasitic capacitance
- 33 Integrated capacitance measurement techniques
- 34 Electrical characterization
-
- 341 Physical implementation
- 342 Measurement setup
- 343 Experimental results
-
- 35 Summary and conclusions
-
- 4 TSV proximity impact on MOSFET performance
-
- 41 Introduction
- 42 TSV-induced thermomechanical stress
- 43 Test structure implementation
-
- 431 MOSFET arrays
- 432 Digital selection logic
-
- 44 Experimental measurements
-
- 441 Data processing methodology
- 442 Measurement setup
- 443 PMOS (long channel)
- 444 PMOS (short channel)
-
- 45 Summary and conclusions
-
- 5 Evaluation of energy-recovery for TSV interconnects
-
- 51 Introduction
- 52 Energy-recovery TSV interconnects
- 53 Analysis
-
- 531 Adiabatic driver
- 532 Inductors parasitic resistance
- 533 Switch M1
- 534 Timing Constraints
-
- 54 Simulations
-
- 541 Comparison to CMOS
-
- 55 Summary and conclusions
-
- 6 Energy-recovery 3D-IC demonstrator
-
- 61 Introduction
- 62 Physical implementation
-
- 621 Adiabatic driver
- 622 Pulse-to-level converter
- 623 Data generator
- 624 Pulse generator
- 625 Integrated inductor
- 626 Top level design
- 627 CMOS IO circuit
-
- 63 Post-layout simulation and comparison
- 64 Experimental measurements
-
- 641 Measurement setup
- 642 Results
-
- 65 Summary and conclusions
-
- 7 Conclusions
-
- 71 Main contributions
- 72 Future work
-
- A Appendix A Photos
- B Appendix B Statistics
- Bibliography
-
C O N T E N T S
1 Introduction 1
11 Research goals and contribution 2
12 Thesis outline 3
2 Background 5
21 3D integrated circuits 5
211 3D integration approaches 6
212 TSV interconnects 9
22 Energy-recovery logic 13
221 Energy-recovery architectures 17
222 Power-clock generators 18
23 Full-custom design flow 20
24 Summary 21
3 Characterization of TSV capacitance 23
31 Introduction 23
32 TSV parasitic capacitance 23
33 Integrated capacitance measurement techniques 24
34 Electrical characterization 27
341 Physical implementation 28
342 Measurement setup 36
343 Experimental results 37
35 Summary and conclusions 44
4 TSV proximity impact on MOSFET performance 46
41 Introduction 46
42 TSV-induced thermomechanical stress 46
43 Test structure implementation 48
431 MOSFET arrays 51
432 Digital selection logic 55
44 Experimental measurements 61
441 Data processing methodology 61
442 Measurement setup 62
443 PMOS (long channel) 64
444 PMOS (short channel) 71
45 Summary and conclusions 73
5 Evaluation of energy-recovery for TSV interconnects 75
51 Introduction 75
52 Energy-recovery TSV interconnects 75
53 Analysis 79
531 Adiabatic driver 80
532 Inductorrsquos parasitic resistance 82
533 Switch M1 82
534 Timing Constraints 83
54 Simulations 83
541 Comparison to CMOS 86
vi
contents vii
55 Summary and conclusions 91
6 Energy-recovery 3D-IC demonstrator 92
61 Introduction 92
62 Physical implementation 92
621 Adiabatic driver 94
622 Pulse-to-level converter 95
623 Data generator 95
624 Pulse generator 95
625 Integrated inductor 97
626 Top level design 98
627 CMOS IO circuit 101
63 Post-layout simulation and comparison 101
64 Experimental measurements 106
641 Measurement setup 106
642 Results 106
65 Summary and conclusions 111
7 Conclusions 114
71 Main contributions 115
72 Future work 116
a Appendix A Photos 118
b Appendix B Statistics 120
bibliography 122
L I S T O F F I G U R E S
Figure 11 Scaling trend of logic and interconnect delay[30] 2
Figure 21 Comparison of 2D3D Integration 5
Figure 22 System-in-a-package 6
Figure 23 Package-on-package 6
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7
Figure 25 3D wafer-level-packaging 7
Figure 26 3D-stacked IC 8
Figure 27 TSV process steps 9
Figure 28 Before thinningafter thinning 10
Figure 29 TSV stacking 10
Figure 210 Conventional AND gate 14
Figure 211 CMOS switching 16
Figure 212 Adiabatic switching 16
Figure 213 Basic 2N-2P adiabatic circuit 17
Figure 214 2N-2P timing 17
Figure 215 Adiabatic driver 18
Figure 216 Stepwise charging generator 19
Figure 217 1N power clock generator 20
Figure 218 Design flow 22
Figure 219 Inverter in Virtuoso schematic 22
Figure 220 Inverter in Virtuoso layout 22
Figure 31 Isolated 2D wire capacitance 24
Figure 32 TSV parasitic capacitance 25
Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25
Figure 34 CBCM pseudo-inverters 27
Figure 35 Current as a function of frequency in CBCM 28
Figure 36 CBCM pseudo-inverter pair in layout 29
Figure 37 PRC1 test pad module (layout) 30
Figure 38 PRC1 test pad module (micrograph) 30
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34
Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35
viii
List of Figures ix
Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35
Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36
Figure 316 Measurement setup 36
Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39
Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40
Figure 319 TSV capacitance measurement results onwafer D11 40
Figure 320 TSV capacitance measurement results onwafer D18 41
Figure 321 Simulated oxide liner thickness variation 42
Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42
Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44
Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44
Figure 41 TSV thermomechanical stress simulation [40] 47
Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48
Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49
Figure 44 PTP1 test pad module (layout) 50
Figure 45 PTP1 test pad module (micrograph) 50
Figure 46 MOSFET dimensions and rotations 51
Figure 47 1 TSV MOSFET array configuration 52
Figure 48 4 diagonal TSVs MOSFET array configuration52
Figure 49 4 lateral TSVs MOSFET array configuration 53
Figure 410 9 TSVs MOSFET array configuration 53
Figure 411 MOSFET array detail 55
Figure 412 Digital selection logic 56
Figure 413 MOSFET array with digital selection logic 57
Figure 414 4-bit Gate-enableArray-enable decoder 58
Figure 415 Transmission gate for MOSFET Gate terminals59
Figure 416 Transmission gates for MOSFET Drain-Source terminals 59
Figure 417 2-bit Drain-select decoder 60
Figure 418 Processing of measurement results 62
Figure 419 Interpolation of measurement results 63
Figure 420 Measurement setup 63
Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65
List of Figures x
Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66
Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66
Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67
Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67
Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68
Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69
Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69
Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70
Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70
Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71
Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71
Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72
Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72
Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76
Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77
Figure 53 Adiabatic driver 78
Figure 54 A simple pulse-to-level converter (P2LC) 78
Figure 55 Resonant pulse generator with dual-rail dataencoding 79
Figure 56 Energy-recovery scheme for TSV interconnects79
Figure 57 Energy-recovery timing constraints 83
Figure 58 Test case for the evaluation of the theoret-ical model 84
Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85
Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85
Figure 511 Test case for comparison of energy-recoveryto CMOS 86
Figure 512 Improved P2LC design 87
List of Figures xi
Figure 513 Simulated energy dissipation of improvedP2LC design 87
Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88
Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88
Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89
Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =
80 f F f = 500MHz) 89
Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90
Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90
Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93
Figure 62 Energy-recovery IO circuit schematic dia-gram 94
Figure 63 Adiabatic driver (layoutschematic view) 94
Figure 64 P2LC (layoutschematic view) 95
Figure 65 Data generator (layoutschematic view) 96
Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96
Figure 67 Pulse generator (layoutschematic view) 97
Figure 68 Pulse generator InputOutput signals 97
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98
Figure 610 Planar square spiral inductor designed inASITIC 99
Figure 611 Resonant pulse distribution tree 100
Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101
Figure 613 Energy-recovery IO circuit in layout view 102
Figure 614 CMOS IO circuit schematic diagram 103
Figure 615 CMOS driver schematic 103
Figure 616 CMOS IO circuit in layout view 103
Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105
Figure 618 Energy-recovery IO circuit micrograph 107
List of Figures xii
Figure 619 CMOS IO circuit micrograph 107
Figure 620 Measurement setup 107
Figure 621 Oscilloscope output for signals S1B and OC-MOS 109
Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109
Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110
Figure 624 Oscilloscope output for signals S1A and OER110
Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111
Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111
Figure 627 Simulation of circuit with fault hypothesis 112
Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113
Figure A1 3D65 300mm wafer (on chuck) 118
Figure A2 Probe card 118
Figure A3 3D130 200mm wafer (on chuck) 119
Figure A4 Probe head 119
L I S T O F TA B L E S
Table 21 3D interconnect technologies comparison 8
Table 22 Technology parameters for Imec 3D130 process11
Table 23 Technology parameters for Imec 3D65 process11
Table 24 Comparison energy-recoveryconventionalCMOS 16
Table 31 Test structures on the PRC1 module 31
Table 32 PRC1 test pad instrument connections 37
Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38
Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39
Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43
Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45
Table 41 Coefficients of thermal expansion 47
Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54
Table 43 MOSFET array configurations selected forexperimental evaluation 61
Table 44 PTP1PTP2 test pad module instrumentconnections 64
Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73
Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84
Table 61 Energy-recovery IO circuit parameters 93
Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100
Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104
Table 64 Post-layout simulated energy dissipation Area requirements 106
Table 65 Energy-recovery circuit instrument connec-tions 108
Table 66 CMOS circuit instrument connections 108
xiii
A C R O N Y M S
2D two-dimentional
2N-2P 2 nMOS2 pMOS adiabatic logic
3D-IC three-dimentional integrated circuit
3D-SIC 3D stacked-IC
3D-SIP 3D system-in-package
3D three-dimentional
3D-WLP 3D wafer-level-packaging
BCB benzocyclobutene
BEOL back-end-of-line
CAL clocked adiabatic logic
CBCM charge-based capacitance measurement
CMOS complementary metal-oxide-semiconductor
CMP chemical-mechanical planarization
CTE coefficient of thermal expansion
Cu copper
CVD chemical vapor deposition
D2D die-to-die
D2W die-to-wafer
DCVSL differential cascode voltage switch logic
DRC design rule check
DUT device under test
ECRL efficient charge recovery logic
EDA electronic design automation
FEM finite element method
FEOL front-end-of-line
GPIB general purpose interface bus
IC integrated circuit
xiv
acronyms xv
IO inputoutput
KOZ keep-out-zone
LC inductance-capacitance
LVS layout versus schematic
MEMS microelectromechanical systems
MIM metal-insulator-metal
MOSFET metal-oxide-semiconductor field-effect transistor
MOS metal-oxide-semiconductor
MPU microprocessor unit
MuGFET multiple gate field-effect transistor
NMOS n-channel MOSFET
P2LC pulse-to-level converter
PAL pass-transistor adiabatic logic
PCB printed circuit board
PCG power-clock generator
PMD pre-metal dielectric
PMOS p-channel MOSFET
PVT process voltage temperature
QSERL quasi-static energy recovery logic
RC resistance-capacitance
RDL re-distribution layer
RLC resistance-inductance-capacitance
RMS root mean square
SIP system-in-a-package
Si silicon
SMU source measurement unit
SoC system-on-chip
TaN tantalum nitride
TSV through-silicon via
W tungsten
acronyms xvi
W2W wafer-to-wafer
WLP wafer-level-packaging
1I N T R O D U C T I O N
Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965
chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]
As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects
In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)
The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and
1 International Technology Roadmap for Semiconductors
1
11 research goals and contribution 2
Figure 11 Scaling trend of logic and interconnect delay [30]
innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]
Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D
interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density
11 research goals and contribution
As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-
ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]
The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future
12 thesis outline 3
3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection
The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution
The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs
are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process
The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements
12 thesis outline
This thesis is organised into seven Chapters and two Appendixesas follows
2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)
12 thesis outline 4
Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described
Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods
Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process
Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design
Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements
Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work
Appendix A includes photos used as reference in various partsof this work
Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data
2B A C K G R O U N D
The work presented in this thesis contributes in the fields of 3D
integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters
21 3d integrated circuits
CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package
There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy
Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of
ABCDEFBA ABCDEFBA
Figure 21 Comparison of 2D3D Integration
5
21 3d integrated circuits 6
ABCDEFBC
E
Figure 22 System-in-a-package
ABCDEFBC
E
Figure 23 Package-on-package
a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities
The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor
211 3D integration approaches
3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure
bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages
bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)
bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion
The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)
3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps
21 3d integrated circuits 7
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]
A
BCDEAFE
FFDC
CFE
Figure 25 3D wafer-level-packaging
take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]
3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]
The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-
SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost
A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21
21 3d integrated circuits 8
ABCD
EFFDB
F
Figure 26 3D-stacked IC
3D interconnect
technologyAdvantages Disadvantages
3D system-in-package (3D-SIP)
Simple existingtechnology bestheterogeneous
integration
Performance(speed power) low
densitymanufacturing costin high-quantities
3D wafer-level-packaging (3D-
WLP)
Existingtechnology good
compromisebetween
performance-manufacturing
cost
Averageperformance and
density
3D stacked-IC (3D-SIC)
Performance(speed power)
high densitymanufacturing costin high-quantities
Manufacturing costin low-quantities
not ready formanufacturing
Table 21 3D interconnect technologies comparison
21 3d integrated circuits 9
ABBCDE
FBACACDA
ACCD F
Figure 27 TSV process steps
212 TSV interconnects
Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection
In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie
Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively
As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]
21 3d integrated circuits 10
AAAB
CAAAB
ABC
DEF
Figure 28 Before thinningafter thinning
ABC ABC
DE DE
F
DEE
B
CBA
Figure 29 TSV stacking
21 3d integrated circuits 11
Minimum devicechannel length (nm)
130
Supply voltage (V) 12
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 22
TSV resistance (mΩ) ~20
TSV capacitance( f F)
~40
Wafer diameter(mm)
200
Table 22 Technology paramet-ers for Imec 3D130
process
Minimum devicechannel length (nm)
70
Supply voltage (V) 10
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 40
TSV resistance (mΩ) ~40
TSV capacitance( f F)
~80
Wafer diameter(mm)
300
Table 23 Technology paramet-ers for Imec 3D65 pro-cess
Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing
Design tools
Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs
Thermomechanical stress
TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur
21 3d integrated circuits 12
The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance
Thermal analysis
The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink
Electrical characteristics
The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]
Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV
structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process
22 energy-recovery logic 13
Testing
Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems
22 energy-recovery logic
TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology
1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections
2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them
Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that
22 energy-recovery logic 14
Figure 210 Conventional AND gate
were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6
Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded
Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output
Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to
22 energy-recovery logic 15
perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]
In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]
The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =
CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]
E = QVDD = CV2DD (21)
From the total transferred energy only half ( 12 CV2
DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C
The adiabatic charging of an equivalent node capacitance C
is modelled in Figure 212 In contrast to the conventional CMOS
circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to
EAdiabatic prop
(
RC
T
)
CV2DD (22)
The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters
22 energy-recovery logic 16
A
B
CC
D
E
D
Figure 211 CMOS switching
AA
B
C
B
Figure 212 Adiabatic switching
Conventional CMOS Energy-recovery CMOS
2 states True False 3 states True False Off
Energy of information bitdissipated as heat on the
pull-down network
Energy of information bitrecovered by the
power-supply
High speed Lowmedium speed
High energy dissipation Very low energy dissipation
Area efficient Some area overhead
Table 24 Comparison energy-recoveryconventional CMOS
Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]
A comparison between energy-recovery and conventional CMOSis summarised in Table 24
22 energy-recovery logic 17
ABCABC
Figure 213 Basic 2N-2P adiabaticcircuit
ABC
ABC
DEF
EE
DEF
Figure 214 2N-2P timing
221 Energy-recovery architectures
Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency
A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P
circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern
The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel
22 energy-recovery logic 18
Figure 215 Adiabatic driver
MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network
The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]
The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle
Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]
222 Power-clock generators
Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and
22 energy-recovery logic 19
A
A
B
C
C
Figure 216 Stepwise charging generator
recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators
Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow
Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at
f =1
2πradic
LCLOAD
(23)
As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is
23 full-custom design flow 20
A
BCD
E
DFF
CD
Figure 217 1N power clock generator
partially recovered and stored in the inductor for later use RLC
oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip
23 full-custom design flow
All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal
In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available
The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case
1 Cadence Design Systems Inc (httpwwwcadencecom)
24 summary 21
the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step
Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics
24 summary
In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process
2 Mentor Graphics Inc (httpwwwmentorcom)
24 summary 22
ABCDEA
FCCAAE
ABCDEAEC
DE
E
F
CEA
EEAEA
EDE
CAC
E
EC
CAC
Figure 218 Design flow
Figure 219 Inverter in Virtuososchematic
Figure 220 Inverter in Virtuosolayout
3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E
31 introduction
In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology
In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]
From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system
Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated
32 tsv parasitic capacitance
A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric
1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments
23
33 integrated capacitance measurement techniques 24
ABC
Figure 31 Isolated 2D wire capacitance
material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is
C =εox
hwl (31)
In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material
In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)
The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)
33 integrated capacitance measurement techniques
Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV
33 integrated capacitance measurement techniques 25
ABCDE
F
FD
F
F
FD
Figure 32 TSV parasitic capacitance
ABCDE FACDE
ECDE
BCFACDE
FCCDED DBCDE
D
DAB
Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor
33 integrated capacitance measurement techniques 26
characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories
1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]
2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]
The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer
On-chip techniques can achieve better accuracy when the DUT
capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]
In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows
CDUT =Im minus Ire f
VDD middot f(32)
The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of
34 electrical characterization 27
AB
AC
DE
FE
DE
FE
AB
AC
B
Figure 34 CBCM pseudo-inverters
Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot
The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET
capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz
should be adequate to avoid the MOS behaviour of the TSV
34 electrical characterization
A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented
34 electrical characterization 28
AB
CADE
F
AB
E
CAD
B
Figure 35 Current as a function of frequency in CBCM
on the same chip that went beyond the core objectives of thiswork
The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed
341 Physical implementation
In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result
Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics
34 electrical characterization 29
Poly dummy fills
Metal2Metal1PolysiliconActive
1μm 5μm
Figure 36 CBCM pseudo-inverter pair in layout
Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38
The CBCM pseudo-inverters were implemented on either TOP
or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below
Structure rsquoArsquo Single TSV capacitance (TOP die)
The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to
34 electrical characterization 30
DTOP DBTM
B10 A10
BREF AREF
B1 A1
R1 R2
SET VDD
RESET GND
C15 C10
C50 C20
SET VDD
ERC SET
EREF EC Metal2 (TOP)Metal1 (TOP)
Metal1 (BTM)Metal2 (BTM)
Figure 37 PRC1 test pad module(layout)
Figure 38 PRC1 test pad module(micrograph)
34 electrical characterization 31
Structure A B
TSVs - 1 2x5 - 1 2x5
Inverterlocation
TOP TOP TOP BTM BTM BTM
TSVconnection
TOP TOP TOP BTM BTM BTM
Multi-TSVconnection
- - TOP - - BTM
Multi-TSVpitch (microm)
- - 15 - - 15
Pad name AREF A1 A10 BREF B1 B10
Structure C D
TSVs 2x2 2x2 2x2 2x2 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP TOP
TSVconnection
TOP TOP TOP TOP TOP TOP
Multi-TSVconnection
TOP TOP TOP TOP TOP BTM
Multi-TSVpitch (microm)
10 15 20 50 20 20
Pad name C10 C15 C20 C50 DTOP DBTM
Structure R E
TSVs 1 1 - 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP
TSVconnection
TOP TOP - TOP TOP
Multi-TSVconnection
- - - TOP BTM
Multi-TSVpitch (microm)
- - - 20 20
Pad name R1 R2 EREF EC ERC
Table 31 Test structures on the PRC1 module
34 electrical characterization 32
Pseudo-inverters
2x5 TSVs
TSV
Reference
Metal2 (TOP)Metal1 (TOP)
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)
the single TSV DUT the structure also contains 10 parallel TSVs
arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude
Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)
The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side
Structure rsquoCrsquo Variable pitch TSV capacitance
The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the
34 electrical characterization 33
2x5 TSVs
Reference
TSV
Pseudo-inverters
Metal2 (BTM)Metal1 (BTM)
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)
literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process
The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative
Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-
TOM die
The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking
34 electrical characterization 34
2x2 TSVs - 10μm
2x2 TSVs - 15μm
2x2 TSVs - 20μm
2x2 TSVs - 50μm
Metal2 (TOP)Metal1 (TOP)
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die
34 electrical characterization 35
Metal2 (TOP)Metal1 (TOP)
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Driver
Reference
Driver
Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)
Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy
Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability
Structure rsquoErsquo TSV capacitance effect on signal propagation delay
Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult
34 electrical characterization 36
ABC
D
ECBFC
ABC
Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)
ABC
DEF
CD
A
CA
A
Figure 316 Measurement setup
342 Measurement setup
The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316
The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-
2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup
34 electrical characterization 37
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
SET InputPulse generator
(Non-overlapping with RESET)
RESET InputPulse generator
(Non-overlapping with SET)
A[REF 1 10] InputSMU (Inverter source - 1V
supply)
B[REF 1 10] InputSMU (Inverter source - 1V
supply)
C[10 15 2050]
InputSMU (Inverter source - 1V
supply)
D[TOP BTM] InputSMU (Inverter source - 1V
supply)
R[1 2] InputSMU (Inverter source - 1V
supply)
E[REF C RC] Output Oscilloscope (high-impedance)
Table 32 PRC1 test pad instrument connections
sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32
343 Experimental results
There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods
When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy
34 electrical characterization 38
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 454
Mean capacitance 709 f F
Standard deviation (σ) 19 f F
Coefficient of variation (3σmicro) 80
Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo
The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability
Die to die (D2D) variability
The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)
On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid
On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691
34 electrical characterization 39
-60 -40 -20 00 20 40 60 80
σ-σ
706
Deviation from mean (fF)
0
005
01
015
02
025
Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 414
Mean capacitance 895 f F
Standard deviation (σ) 22 f F
Coefficient of variation (3σmicro) 74
Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo
of the experimental data within plusmnσ of the mean capacitancevalue
Correlation between TSV capacitance and oxide liner thickness vari-
ation
The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers
Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication
34 electrical characterization 40
σ-σ
691
Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70
0
002
004
006
008
01
012
014
016
018
02
Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function
075
080
085
090
095
100
105
110
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
075
080
085
090
095
100
105
Wafer area
Figure 319 TSV capacitance measurement results on wafer D11
34 electrical characterization 41
086088090092094096098100102104106108
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
086
088
090
092
094
096
098
100
102
104
106
108
Wafer area
Figure 320 TSV capacitance measurement results on wafer D18
tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo
Wafer to wafer (W2W) variability
Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters
34 electrical characterization 42
109
100
095Nor
mal
ized
oxi
de th
ickn
ess
-5-10 -15 -20 -25 -30
510
1520
2530
0Die Y
Die X
Figure 321 Simulated oxide liner thickness variation
60 65 70 75 80 85 90 95 1000
005
01
015
02
025D11 D18
plusmnσ
plusmnσ
Capacitance (fF)
Figure 322 TSV capacitance W2W variability - Probability density func-tions
34 electrical characterization 43
Structure rsquoCrsquo (Fig 311)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 424
Experiment C10 C50
Number of TSVs 4 4
Mean capacitance 4073 f F 4125 f F
Standard deviation (σ) 84 f F 74 f F
Population within σ 695 653
Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo
Variable pitch TSV capacitance
The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm
were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires
As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50
are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability
To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]
Measurement accuracy of CBCM
The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we
3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)
35 summary and conclusions 44
376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320
001
002
003
004
005
006C10 C50
Capacitance (fF)
plusmnσC10
plusmnσC50
695 653
Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function
-80 -60 -40 -20 00 20 40 60 80 1000
002
004
006
008
01
012
014
016
018σ-σ
688
Deviation from mean (fF)
Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function
observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs
connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance
35 summary and conclusions
In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement
35 summary and conclusions 45
Method LCR CBCM
Total dies 472 472
Processed dies (excluding outliers) 393 414
Mean capacitance 820 f F 895 f F
Standard deviation (σ) 23 f F 22 f F
Coefficient of variation (3σmicro) 86 74
Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo
results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]
The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration
The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV
parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM
In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process
4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E
41 introduction
The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV
filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance
Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models
In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]
42 tsv-induced thermomechanical stress
Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or
46
42 tsv-induced thermomechanical stress 47
Material CTE at 20 ˚C (10minus6K)
Si 3
Cu 17
W 45
Table 41 Coefficients of thermal expansion
Figure 41 TSV thermomechanical stress simulation [40]
polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate
Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction
It is well known that stress can affect carrier mobility in MOS-
FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance
Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type
43 test structure implementation 48
Figure 42 TSV thermomechanical stress interaction between two TSVs[40]
proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative
The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections
43 test structure implementation
TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible
43 test structure implementation 49
Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]
The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET
devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types
Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP
die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45
43 test structure implementation 50
A0
B0
C0
D0
E0
F0
G0
H0
I0
J0
A90
B90
E90
F90
G90
J90
DS3 DF3
DF2 DS0
DS2 DF0
DF1 DS1
SS SF
DEC6 VDD
DEC7 GND
DEC8 VG
DEC4 DEC9
DEC5 GND
DEC2 DEC3
DEC0 DEC1
Arraydecoder
Draindecoder
Figure 44 PTP1 test pad mod-ule (layout)
Figure 45 PTP1 test pad mod-ule (micrograph)
43 test structure implementation 51
Metal2Metal1PolyActive
TSV
MOSFETs
07μm007μm 0deg
07μm007μm 90deg
05μm05μm 90deg
05μm05μm 0deg
Figure 46 MOSFET dimensions and rotations
431 MOSFET arrays
Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)
The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs
lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44
The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)
Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution
43 test structure implementation 52
TSV
MOSFETs Substrate taps
Figure 47 1 TSV MOSFET array configuration
MOSFETs Substrate taps
TSV TSV
TSV TSV
Figure 48 4 diagonal TSVs MOSFET array configuration
43 test structure implementation 53
MOSFETs Substrate taps
TSV
TSV TSV
TSV
Figure 49 4 lateral TSVs MOSFET array configuration
MOSFETs Substrate taps
TSV TSV TSV
TSV TSV TSV
TSV TSV TSV
Figure 410 9 TSVs MOSFET array configuration
43 test structure implementation 54
Configuration RotationFET
width(microm)
FETlength(microm)
Name
0 0deg 07 007 A0
1 0deg 07 007 B0
4diagonal 0deg 07 007 C0
4lateral 0deg 07 007 D0
9 0deg 07 007 E0
0 0deg 05 05 F0
1 0deg 05 05 G0
4diagonal 0deg 05 05 H0
4lateral 0deg 05 05 I0
9 0deg 05 05 J0
0 90deg 07 007 A90
1 90deg 07 007 B90
9 90deg 07 007 E90
0 90deg 05 05 F90
1 90deg 05 05 G90
9 90deg 05 05 J90
Table 42 MOSFET array configurations for test pad modulesPTP1PTP2
43 test structure implementation 55
2μm 1μm
Metal2Metal1PolyActive
FET
DUMMY
TAP
TSV
Figure 411 MOSFET array detail
432 Digital selection logic
Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement
The procedure for selecting a single MOSFET device for meas-urement was as follows
bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled
bull MOSFET Gate terminal (G0 to G15) was connected to IO
pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates
bull MOSFET Drain terminal (D0 to D15) was connected to IO
pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder
43 test structure implementation 56
D0
D15
DF[0-3]
DS[0-3]
Drain-select decoder (2-bit)
SSF
SS
Array-select decoder (4-bit)
VG
Gate-select decoder (4-bit)
G0 G15
MOSFET Array
Figure 412 Digital selection logic
and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit
Drain-select decoder was adequate for accessing all 16Drain terminals in the array
When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)
A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating
43 test structure implementation 57
Gatedecoder
Metal2Metal1PolyActive
Transmissiongates (DS)
Transmission gates (G)
MOSFETArray
Figure 413 MOSFET array with digital selection logic
43 test structure implementation 58
BLK
A[0] A[1] A[2] A[3]
DEC[0-15]
Figure 414 4-bit Gate-enableArray-enable decoder
43 test structure implementation 59
DEC
VGGVDD
Transmission gate
Pull-downtransistor
Figure 415 Transmission gate for MOSFET Gate terminals
Transmission gates
FORCE
SENSE
DRAINSOURCE
EN
VDD
Figure 416 Transmission gates for MOSFET DrainSource terminals
43 test structure implementation 60
A[0] A[1]
DEC[0-3]
Figure 417 2-bit Drain-select decoder
44 experimental measurements 61
Configuration Type Name Wafer
1 No-TSVPMOS
(05microm times 05microm)F0 D08
2 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D08
3 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D20
4 1-TSV 0ordm 60 ordmCPMOS
(05microm times 05microm)G0 D08
54-lateral TSV 0ordm 25
ordmCPMOS
(05microm times 05microm)I0 D08
6 1-TSV 90ordm 25 ordmCPMOS
(05microm times 05microm)G90 D08
7 1-TSV 0ordm 25 ordmCPMOS (07microm times
007microm)B0 D08
Table 43 MOSFET array configurations selected for experimental eval-uation
44 experimental measurements
Test modules PTP1 and PTP2 implementing the presented MOS-
FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43
441 Data processing methodology
For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress
Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing
44 experimental measurements 62
ABCDEFBBC
FCDDFBCFFC
FCB
A
BAACADE
CECBBC
CFFCDE
B$BCFFBDBCBF
amp(BFC)B
BC
FCDDFA
CAF
Figure 418 Processing of measurement results
the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (
radic98) The described procedure for
processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not
placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)
442 Measurement setup
For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO
signal generation and measurement was provided by standard
44 experimental measurements 63
ABCDECDBF
ABECBF
AB
AB
Figure 419 Interpolation of measurement results
laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420
The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44
ABCCDE
F
AA
A
DBCB
Figure 420 Measurement setup
44 experimental measurements 64
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
DEC[0-9] InputDigital Output
(GATEDRAINARRAY SelectDecoders Input)
DF[0-3] Input SMU (Drain Force terminal)
DS[0-3] Output SMU (Drain Sense terminal)
SF Input SMU (Source Force terminal)
SS Output SMU (Source Sense terminal)
VG Input Digital Output (Gate voltage)
Table 44 PTP1PTP2 test pad module instrument connections
443 PMOS (long channel)
No-TSV configuration
This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance
As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure
1 TSV configuration
In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo
44 experimental measurements 65
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
096
097
098
099
100
101
Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)
and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements
To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm
The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph
44 experimental measurements 66
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
105
110
115
120TSV
Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)
44 experimental measurements 67
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
13
TSV
Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)
we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV
even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after
increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
09
1
11
12
13
14
Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)
44 experimental measurements 68
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115TSV
Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)
at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)
4 lateral TSV configuration
In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430
respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)
90˚ rotation TSV configuration
In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has
44 experimental measurements 69
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
085
090
095
100
105
110
115
TSV
TSV TSV
TSV
Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)
44 experimental measurements 70
2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309
095
1
105
11
115
12
X16
Distance (μm)
Normalized
Ion
TSV TSV
Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)
3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098
1102
Y14
Distance (μm)
Normalized
Ion
TSV TSV
Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)
44 experimental measurements 71
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115
120TSV
Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)
been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm
444 PMOS (short channel)
In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
08509
0951
10511
11512
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)
44 experimental measurements 72
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
094
096
098
100
102
104
106
108
110
TSV
Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)
measurements for X = 16 and Y = 14 in Figure 434 At at 1microm
distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)
45 summary and conclusions 73
Size
Waf
er
Con
figu
rati
on
Rot
atio
n
Eff
ect
1micro
mla
t
Eff
ect
1micro
mlo
ng
KO
Z
L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm
L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm
L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm
L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm
L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm
S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm
Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices
45 summary and conclusions
In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET
device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors
The structure was implemented in Imecrsquos experimental 65nm
3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations
In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits
Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short
45 summary and conclusions 74
channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress
Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure
1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex
2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy
3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away
4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic
The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated
In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs
5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S
51 introduction
TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2
DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs
increases linearly with the number of tiers and interconnections(Figure 51)
The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm
node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)
In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]
52 energy-recovery tsv interconnects
As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-
75
52 energy-recovery tsv interconnects 76
1XTSV
Tier1
TSV
Tier0LOGIC 0
LOGIC 0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n Tier1
Tier0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n
Tier(m-1)
Tier(m)
mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-
tion density
recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]
EER prop
(
RCL
T
)
CLV2DD (51)
In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]
ECMOS =12
CLV2DD (52)
Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits
EER
ECMOSprop
2RCL
T(53)
From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small
In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC
52 energy-recovery tsv interconnects 77
Tier2Tier1TSV
LOGIC 1
LOGIC 2
driver
TSV
LOGIC 1
LOGIC 2
CMOS
driver
Convert
Energy-recovery
PCG
Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery
can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV
interconnectsThe differences between a typical CMOS driver in a TSV-based
3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form
Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values
52 energy-recovery tsv interconnects 78
IN IN
ININ
OUT OUT
IN
IN
OUT
OUT
Figure 53 Adiabatic driver
Figure 54 A simple pulse-to-level converter (P2LC)
An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded
The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)
For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with
53 analysis 79
VDD2
Rind
RTG
AdiabaticDriver
CL
M1
T=1f
L ind
RTG
CL
Figure 55 Resonant pulse generator with dual-rail data encoding
Tier2
Tier1
f-fPULSE
VDD2CLK IN
Delay1CLK1
CLK0
Delay2CLK2OUT1
Data In
Data Out
CLKRES
TSV TSV TSV
CLKBufferAdiabatic
Driver
P2LC
f-f
Data1
M1
Figure 56 Energy-recovery scheme for TSV interconnects
a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at
f =1
2πradic
LindCL(54)
A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56
53 analysis
The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy
53 analysis 80
losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme
The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be
EER = EAD + Eind + EM1 + EP2L (55)
Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs
architecture used and could be easily extracted from simulationdata as will be shown later in this chapter
531 Adiabatic driver
There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS
dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances
The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle
ETG = 2ξ
(
RTGCL12 T
)
CLV2DD =
π2
2(RTGCL f )CLV2
DD (56)
The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient
53 analysis 81
Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV
capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be
CL = CTSV + Cn + 6CD (57)
The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience
κTG = RTGCn rArr RTG =κTG
Cn(58)
Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2
DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver
EAD = CnV2DD +
π2
2κTG
Cnf [CTSV + Cn + 6CD]
2 V2DD (59)
The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2
2 κTG f to a variable (y = π2
2 κTG f ) Equation 59then becomes
EAD = D middot CnV2DD +
y
Cn[CTSV + (6b + 1)Cn]
2 V2DD
=
[
Cn
(
D
y+ (6b + 1)2
)
+1
CnC2
TSV
]
middot yV2DD
+(12b + 2)CTSV middot yV2DD (510)
Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when
53 analysis 82
they become equal Solving equation Cn
(
Dy + (6b + 1)2
)
= 1Cn
C2TSV
for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV
Cn(opt) =
radic
[D
y+ (6b + 1)2]minus1CTSV (511)
Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver
EAD(min) =
[radic
D
(6b + 1)2y+ 1 + 1
]
middot (12b+ 2)yCTSVV2DD (512)
532 Inductorrsquos parasitic resistance
The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as
Qind =1
Rind
radic
Lind
CLrArr Rind =
1QindCL2π f
(513)
Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513
Eind =π2
2(RindCL f )CLV2
DD =π
4CL
QindV2
DD (514)
533 Switch M1
Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1
Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as
EM1(min) = 2IM1(rms)VGM1
radic
mκM1
f(515)
54 simulations 83
Delay2
CLK1
CLK2Delay1
CLKW
PULSE
CLK0T
PD
OUT1
PULSE
P2L
RES
CLK
Figure 57 Energy-recovery timing constraints
534 Timing Constraints
The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57
The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)
54 simulations
To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the
54 simulations 84
M1IN IN
IN
OUT OUT
IN
Figure 58 Test case for the evaluation of the theoretical model
Q250MHz 500MHz 750MHz
T S e () T S e () T S e ()
6 360 409 -120 476 516 -78 578 613 -57
8 308 347 -114 417 449 -70 515 543 -52
10 276 312 -115 382 411 -70 477 498 -43
12 255 289 -119 358 386 -71 451 472 -44
Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)
adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58
The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51
The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations
In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency
54 simulations 85
5 6 7 8 9 10 11 12 130
10
20
30
40
50
60
70
T250 S250 T500 S500 T750 S750
Q factor
Energydissipation(fF)
Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)
5 6 7 8 9 10 11 12 13
-14
-12
-10
-8
-6
-4
-2
0
250MHz 500MHz 750MHz
Q factor
Error()
Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator
The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation
54 simulations 86
M1
TSV
TSV
TSV
Data in OutCMOS
OutER
Energy-recovery
CMOS
P2LC
Adiabaticdriver
Figure 511 Test case for comparison of energy-recovery to CMOS
541 Comparison to CMOS
When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F
for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is
described by Equation 52 Appending the data switching activity(D)
ECMOS = D middot 12
CLV2DD (516)
Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS
circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison
Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range
54 simulations 87
Figure 512 Improved P2LC design
0 100 200 300 400 500 600 700 800 9000
2
4
6
8
10
12
14
Frequency (MHz)
Energy
dissipation(fJ)
Figure 513 Simulated energy dissipation of improved P2LC design
Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current
The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor
54 simulations 88
4 6 8 10 12 14 16 18 20 22-20
-10
0
10
20
30
40
50
60
800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor
Energyred uction()
Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)
4 6 8 10 12 14 16 18 20 220050
100150200250300350400450
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)
values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors
In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies
In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q
factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total
In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS
circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in
54 simulations 89
4 6 8 10 12 14 16 18 20 2200
100
200
300
400
500
600
700
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)
20 40 60 80 100 120 140 160 180 200 220-100
00
100
200
300
400
500
Q6 Q10 Q14 Q18
TSV capacitance (fF)
Energy
red u
ction(
)
Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)
Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value
Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz
Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm
node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS
54 simulations 90
4 6 8 10 12 14 16 18 20 22-40
-20
0
20
40
60
80
D=05 D=07 D=09 D=1
Q factor
Energyred uction()
Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)
4 6 8 10 12 14 16 18 20 220
10
20
30
40
50
60
130nm 65nm
Q factor
Energyred uction()
Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)
55 summary and conclusions 91
55 summary and conclusions
The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process
The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement
Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme
The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process
6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R
61 introduction
In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance
Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements
62 physical implementation
The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5
The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61
The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz
were calculated using the methodology developed in Chapter 5
and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery
circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)
92
62 physical implementation 93
AB
AB
CDEFD
C
ABCD
EFD
C
Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)
Adiabatic driver 80-bit circuit
Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )
RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)
CL 207 f F (Eq57) Lind 613nH (Eq54)
Table 61 Energy-recovery IO circuit parameters
62 physical implementation 94
IntegratedInductor
VINDOER
PulseGenerator
S1AS2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
Figure 62 Energy-recovery IO circuit schematic diagram
Metal2Metal1PolyActiveVDD
GND
RES_CLK
DATA
OUT
OUT_B
DATA
OUT_B OUT
RES_CLK
GND
VDD
Figure 63 Adiabatic driver (layoutschematic view)
the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs
621 Adiabatic driver
The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63
62 physical implementation 95
Metal2Metal1PolyActive
IN
OUT_B
VDD
IN_B
OUT
GND
OUT_B
IN
GND
VDD
OUT
IN_B
Figure 64 P2LC (layoutschematic view)
622 Pulse-to-level converter
The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals
623 Data generator
A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1
As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE
624 Pulse generator
The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the
62 physical implementation 96
Metal2Metal1PolyActive EN
CLK
VDD
Q
GND
D
CLKGND
ENVDD
QF-F
D
GND
VDDCLK
D Q
Figure 65 Data generator (layoutschematic view)
Delay1
WPULSE
PULSE
CLKTCLK
DATA
CLK_RES
Figure 66 Timing constraints for the energy-recovery demonstratorcircuit
62 physical implementation 97
PULSES1A
S2
GND
VDD
Metal2Metal1PolyActive
VDD
S1
S2
GND
PULSE
Figure 67 Pulse generator (layoutschematic view)
S2ΔPhase
WPULSE
PULSE
S1A
Figure 68 Pulse generator InputOutput signals
gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control
For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68
625 Integrated inductor
The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-
1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)
62 physical implementation 98
100 200 300 400 500 600 700 8005
7
9
11
13
15
17
19
21
10
15
20
25
30
35
40
45
Q (Simulated) Energy Reduction ()
Frequency (MHz)
Qfactor
Energyreduction()
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS
lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)
As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited
For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610
626 Top level design
The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)
For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously
62 physical implementation 99
500μ
m
500μm
Figure 610 Planar square spiral inductor designed in ASITIC
shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance
The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG
80 = 16880 = 21Ω However as
the current flows to lower branch levels in the tree the perceivedresistance increases to RTG
16 RTG4 and on the last one to RTG Any
additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512
Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20
4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit
To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of
62 physical implementation 100
Adiabatic drivers
RTG
16
RTG
80
RTG
4
RTG
Level 1Level 2Level 3Level 4
Figure 611 Resonant pulse distribution tree
Branch level Wire resistance
1st 5 middot 16880 = 105mΩ
2nd 5 middot 16816 = 525mΩ
3rd 5 middot 1684 = 21Ω
4th 5 middot 168Ω = 84Ω
Total load 168+8480 + 21
20 + 05255 + 0105 = 252Ω
Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree
63 post-layout simulation and comparison 101
+-
+-Metal2 Metal1
180μ
m
100μm
Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)
the energy-recovery IO circuit in layout view is illustrated inFigure 613
627 CMOS IO circuit
The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance
The top level layout view of the CMOS IO circuit can be seenin Figure 616
63 post-layout simulation and comparison
The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed
63 post-layout simulation and comparison 102
22mm
12mm
GND GNDVBUF1 VDR
GND
VIND
S1A
S2
EN
OER
Figure 613 Energy-recovery IO circuit in layout view
63 post-layout simulation and comparison 103
OCMOS
S1B DataGenerator
TSV
CMOSDriver
CMOSDriver
TSV Buffer
1
80
EN
Figure 614 CMOS IO circuit schematic diagram
GND
VCMOS
DATA
078u
039u
105u
053u
316u
158u
950u
475u
OUT
Figure 615 CMOS driver schematic
12mm
12mm
OCMOS S1B GND VBUF2 VCMOSGND
GND GNDEN VCMOS
Figure 616 CMOS IO circuit in layout view
63 post-layout simulation and comparison 104
Energy dissipation (80-bits cycle)
Inductor 11pJ (Eq514)
Switch M1 092pJ [ ]
P2LCs 098pJ (Simulated)
Adiabatic drivers 315pJ (Eq512)
Total 615pJ
Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz
After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency
Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme
As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the
63 post-layout simulation and comparison 105
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14RES_CLK
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14DATA
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14OER
Time (ns)
V(V
)
Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit
64 experimental measurements 106
AreaEnergy
dissipation(Theory)
Energydissipation
(Post-layout)
CMOS 144mm2922pJ
(115 f Jbit)
(Eq52)
1265pJ
(158 f Jbit)
Energy recovery 264mm2 615pJ
(77 f Jbit)
768pJ
(96 f Jbit)
Difference +83 minus333 minus393
Table 64 Post-layout simulated energy dissipation Area require-ments
theoretically calculated 333 up to 393 in the post-layoutsimulations
64 experimental measurements
The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively
641 Measurement setup
For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620
The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66
642 Results
CMOS IO circuit
In Figure 621 the input signal S1B and the output signal OCMOS
are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency
64 experimental measurements 107
Figure 618 Energy-recovery IO circuit micrograph
Figure 619 CMOS IO circuit micrograph
AABC
DEFF
D
EEBA
EBBA
Figure 620 Measurement setup
64 experimental measurements 108
Test pad Direction Instrument connection
VIND Power Power supply - gt06V
GND Power Power supply - 0V
S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)
EN InputDigital Output (enable outputdata switching)
OER Output Oscilloscope
VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V
VBUF1 Power Power supply (Buffers) - 12V
Table 65 Energy-recovery circuit instrument connections
Test pad Direction Instrument connection
VCMOS PowerPower supply (CMOS drivers) -12V
GND Power Power supply - 0V
VBUF2 Power Power supply (Buffers) - 12V
S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
OCMOS Output Oscilloscope
EN InputDigital Output (enable outputdata switching)
Table 66 CMOS circuit instrument connections
64 experimental measurements 109
S1B
OCMOS
Figure 621 Oscilloscope output for signals S1B and OCMOS
100 200 300 400 500 600 700 800012345678910
Die1 Die2 Simulation
Frequency (MHz)
Powerdissipation(mW)
Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit
that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected
The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations
Energy-recovery IO circuit
Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER
64 experimental measurements 110
100 200 300 400 500 600 700 800024681012141618
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 623 Measuredsimulated energy dissipation for the CMOScircuit
S1A
OER
Figure 624 Oscilloscope output for signals S1A and OER
shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected
The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)
Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate
65 summary and conclusions 111
200 250 300 350 400 450 500 550 600 65002468101214161820
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit
IntegratedInductor
VINDOER
PulseGenerator
S1A
S2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
200pF
Figure 626 Energy-recovery IO circuit with fault hypothesis
properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER
switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation
65 summary and conclusions
In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them
The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-
65 summary and conclusions 112
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050
055
060
065
070
075
080RES_CLK
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500
02
04
06
08
10
12
14OER
Time (ns)
V(V
)
Figure 627 Simulation of circuit with fault hypothesis
65 summary and conclusions 113
200 250 300 350 400 450 500 550 600 6500
5
10
15
20
25
Die1 Die2 Simulation CL=200pF
Frequency (MHz)
Ene
rgy
diss
ipat
ion
(pJ)
Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis
ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS
circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated
CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
7C O N C L U S I O N S
Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density
As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV
devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs
Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-
FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process
Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process
The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work
114
71 main contributions 115
71 main contributions
From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data
The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations
In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design
72 future work 116
approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances
The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
72 future work
The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV
capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]
In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes
1 The transistors should be in a regular grid in the array forsimple processing of the measurement data
72 future work 117
2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy
3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV
4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues
Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility
In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory
AA P P E N D I X A P H O T O S
Figure A1 3D65 300mm wafer (on chuck)
Figure A2 Probe card
118
appendix a photos 119
Figure A3 3D130 200mm wafer (on chuck)
ProbeHead
Microscope
Figure A4 Probe head
BA P P E N D I X B S TAT I S T I C S
mean (micro)
In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as
micro =1n
n
sumi=1
xi
median (m)
The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean
standard deviation (σ)
Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation
A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability
standard deviation of the mean (σmicro )
The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated
Assuming there is a data set X with n values xi of mean micro
120
appendix b statistics 121
variance(X) = σ2X
variance(micro) = variance( 1n sum
ni=1 xi) =
1n2 variance(sumn
i=1 xi) =1n2 sum
ni=1 variance(xi) =
nn2 variance(X) =1n variance(X) =
σ2Xn rarr
σmicro = σXradicn
coefficient of variation
The coefficient of variation is the ratio of the standard deviationσ to the mean micro
CV =σ
micro
It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage
outlier
Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set
kruskalndash wallis [34]
KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal
B I B L I O G R A P H Y
[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac
currentsourcebroadpurposemn=2602
[2] Max-3d URL httpwwwmicromagiccom
[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet
[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet
[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001
[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI
Design Conf pages 171ndash174 2005 doi 101109ICVD200564
[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf
3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581
[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009
[9] C H Bennett Logical reversibility of computation IBM
Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525
[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038
scientificamerican0785-48
[11] E Beyne 3d system integration technologies In Proc Int
VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113
122
bibliography 123
[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs
22nm-Details_Presentationpdf
[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327
[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534
[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices
Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124
[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942
[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc
IEEE Int Interconnect Technology Conf and 2011 Materials for
Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326
[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf
(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823
[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS
rsquo04 volume 2 2004 doi 101109ISCAS20041329255
[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE
Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260
[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-
bibliography 124
asitics for 3-dimensional integrated circuits In Workshop
Notes Design Automation and Test in Europe (DATE) 2009
[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT
2008 pages 98ndash103 2008 doi 101109IDT20084802475
[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec
beScientificReportSR2009HTML1213307html
[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905
[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE
Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329
[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int
Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311
[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508
[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712
[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii
B9780750677295500446
[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455
bibliography 125
[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591
[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test
Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889
[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on
Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10
1145224081224115
[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical
Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779
[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom
pressroompdfkkuhnKuhn_IWCE_invited_textpdf
[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5
(3)183ndash191 1961 doi 101147rd530183
[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827
[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190
[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf
(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839
[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-
nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079
bibliography 126
[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System
level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm
org101145966747966750
[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453
[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf
PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725
[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE
Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793
[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629
[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp
Exhibition (DATE) pages 1689ndash1694 2010
[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573
[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190
[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi
bibliography 127
A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices
Meeting (IEDM) 2010 doi 101109IEDM20105703278
[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727
[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10
1109JPROC1998658762
[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test
Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103
[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc
56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676
[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443
[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88
(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom
sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on
[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303
bibliography 128
[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium
on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145
10576611057669
[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872
[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004
[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-
sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888
[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156
[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc
IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522
[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501
[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667
[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477
[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-
tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609
bibliography 129
[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology
Conf 2006 doi 101109ECTC20061645678
[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power
Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220
[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc
Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786
[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-
Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338
[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088
[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044
[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int
Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763
[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-
tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http
dxdoiorg101007BF01788562 101007BF01788562
bibliography 130
[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-
electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww
sciencedirectcomsciencearticleB6V44-4829XDJ-2N
2f5b2c221dcf240c38cd13b7b3432f913
[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182
[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541
[78] NHE Weste and DF Harris CMOS VLSI design a cir-
cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775
[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite
printsteve_wozniak_interview
[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-
tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716
[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect
Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710
[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746
[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764
[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915
bibliography 131
[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610
- Abstract
- Publications
- Acknowledgments
- Contents
- List of Figures
- List of Tables
- Acronyms
- 1 Introduction
-
- 11 Research goals and contribution
- 12 Thesis outline
-
- 2 Background
-
- 21 3D integrated circuits
-
- 211 3D integration approaches
- 212 TSV interconnects
-
- 22 Energy-recovery logic
-
- 221 Energy-recovery architectures
- 222 Power-clock generators
-
- 23 Full-custom design flow
- 24 Summary
-
- 3 Characterization of TSV capacitance
-
- 31 Introduction
- 32 TSV parasitic capacitance
- 33 Integrated capacitance measurement techniques
- 34 Electrical characterization
-
- 341 Physical implementation
- 342 Measurement setup
- 343 Experimental results
-
- 35 Summary and conclusions
-
- 4 TSV proximity impact on MOSFET performance
-
- 41 Introduction
- 42 TSV-induced thermomechanical stress
- 43 Test structure implementation
-
- 431 MOSFET arrays
- 432 Digital selection logic
-
- 44 Experimental measurements
-
- 441 Data processing methodology
- 442 Measurement setup
- 443 PMOS (long channel)
- 444 PMOS (short channel)
-
- 45 Summary and conclusions
-
- 5 Evaluation of energy-recovery for TSV interconnects
-
- 51 Introduction
- 52 Energy-recovery TSV interconnects
- 53 Analysis
-
- 531 Adiabatic driver
- 532 Inductors parasitic resistance
- 533 Switch M1
- 534 Timing Constraints
-
- 54 Simulations
-
- 541 Comparison to CMOS
-
- 55 Summary and conclusions
-
- 6 Energy-recovery 3D-IC demonstrator
-
- 61 Introduction
- 62 Physical implementation
-
- 621 Adiabatic driver
- 622 Pulse-to-level converter
- 623 Data generator
- 624 Pulse generator
- 625 Integrated inductor
- 626 Top level design
- 627 CMOS IO circuit
-
- 63 Post-layout simulation and comparison
- 64 Experimental measurements
-
- 641 Measurement setup
- 642 Results
-
- 65 Summary and conclusions
-
- 7 Conclusions
-
- 71 Main contributions
- 72 Future work
-
- A Appendix A Photos
- B Appendix B Statistics
- Bibliography
-
contents vii
55 Summary and conclusions 91
6 Energy-recovery 3D-IC demonstrator 92
61 Introduction 92
62 Physical implementation 92
621 Adiabatic driver 94
622 Pulse-to-level converter 95
623 Data generator 95
624 Pulse generator 95
625 Integrated inductor 97
626 Top level design 98
627 CMOS IO circuit 101
63 Post-layout simulation and comparison 101
64 Experimental measurements 106
641 Measurement setup 106
642 Results 106
65 Summary and conclusions 111
7 Conclusions 114
71 Main contributions 115
72 Future work 116
a Appendix A Photos 118
b Appendix B Statistics 120
bibliography 122
L I S T O F F I G U R E S
Figure 11 Scaling trend of logic and interconnect delay[30] 2
Figure 21 Comparison of 2D3D Integration 5
Figure 22 System-in-a-package 6
Figure 23 Package-on-package 6
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor[67] 7
Figure 25 3D wafer-level-packaging 7
Figure 26 3D-stacked IC 8
Figure 27 TSV process steps 9
Figure 28 Before thinningafter thinning 10
Figure 29 TSV stacking 10
Figure 210 Conventional AND gate 14
Figure 211 CMOS switching 16
Figure 212 Adiabatic switching 16
Figure 213 Basic 2N-2P adiabatic circuit 17
Figure 214 2N-2P timing 17
Figure 215 Adiabatic driver 18
Figure 216 Stepwise charging generator 19
Figure 217 1N power clock generator 20
Figure 218 Design flow 22
Figure 219 Inverter in Virtuoso schematic 22
Figure 220 Inverter in Virtuoso layout 22
Figure 31 Isolated 2D wire capacitance 24
Figure 32 TSV parasitic capacitance 25
Figure 33 Capacitance vs Gate voltage (CV) diagramof a MOS Capacitor 25
Figure 34 CBCM pseudo-inverters 27
Figure 35 Current as a function of frequency in CBCM 28
Figure 36 CBCM pseudo-inverter pair in layout 29
Figure 37 PRC1 test pad module (layout) 30
Figure 38 PRC1 test pad module (micrograph) 30
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOPdie) 32
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOT-TOM die) 33
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance34
Figure 312 Structure rsquoDrsquo - TSV capacitance measure-ment with interconnection on TOP or BOT-TOM die 34
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacit-ance measurement accuracy 35
viii
List of Figures ix
Figure 314 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (layout) 35
Figure 315 Structure rsquoErsquo - TSV capacitance effect onsignal propagation delay (schematic) 36
Figure 316 Measurement setup 36
Figure 317 TSV capacitance D2D variability on waferrsquoD11rsquo - Probability density function 39
Figure 318 TSV capacitance D2D variability on waferrsquoD18rsquo - Probability density function 40
Figure 319 TSV capacitance measurement results onwafer D11 40
Figure 320 TSV capacitance measurement results onwafer D18 41
Figure 321 Simulated oxide liner thickness variation 42
Figure 322 TSV capacitance W2W variability - Probab-ility density functions 42
Figure 323 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo - Probability densityfunction 44
Figure 324 LCR experimental data from wafer rsquoD18rsquo -Probability density function 44
Figure 41 TSV thermomechanical stress simulation [40] 47
Figure 42 TSV thermomechanical stress interactionbetween two TSVs [40] 48
Figure 43 Estimated mobility shift on PMOS (a) andNMOS (b) transistors in between two TSVsseparated by distance d [65] 49
Figure 44 PTP1 test pad module (layout) 50
Figure 45 PTP1 test pad module (micrograph) 50
Figure 46 MOSFET dimensions and rotations 51
Figure 47 1 TSV MOSFET array configuration 52
Figure 48 4 diagonal TSVs MOSFET array configuration52
Figure 49 4 lateral TSVs MOSFET array configuration 53
Figure 410 9 TSVs MOSFET array configuration 53
Figure 411 MOSFET array detail 55
Figure 412 Digital selection logic 56
Figure 413 MOSFET array with digital selection logic 57
Figure 414 4-bit Gate-enableArray-enable decoder 58
Figure 415 Transmission gate for MOSFET Gate terminals59
Figure 416 Transmission gates for MOSFET Drain-Source terminals 59
Figure 417 2-bit Drain-select decoder 60
Figure 418 Processing of measurement results 62
Figure 419 Interpolation of measurement results 63
Figure 420 Measurement setup 63
Figure 421 Normalised on-current for PMOS structureldquoF0rdquo (D08) 65
List of Figures x
Figure 422 Normalised on-current for PMOS structureldquoG0rdquo (D08) 66
Figure 423 Normalised on-current for PMOS structureldquoG0rdquo (D08Y=14X=16) 66
Figure 424 Normalised on-current for PMOS structureldquoG0rdquo (D20) 67
Figure 425 Normalised on-current for PMOS structureldquoG0rdquo (D20Y=14X=16) 67
Figure 426 Normalised on-current for PMOS structureldquoG0rdquo (D08 60˚C) 68
Figure 427 Normalised on-current for PMOS structureldquoG0rdquo (D08 Y=14 X=16 60˚C) 69
Figure 428 Normalised on-current for PMOS structureldquoI0rdquo (D08) 69
Figure 429 Normalised on-current for PMOS structureldquoI0rdquo (D08 X=16) 70
Figure 430 Normalised on-current for PMOS structureldquoI0rdquo (D08 Y=14) 70
Figure 431 Normalised on-current for PMOS structureldquoG90rdquo (D08) 71
Figure 432 Normalised on-current for PMOS structureldquoG90rdquo (D08 Y=14 X=16) 71
Figure 433 Normalised on-current for PMOS structureldquoB0rdquo (D08) 72
Figure 434 Normalised on-current for PMOS structureldquoB0rdquo (D08 Y=14 X=16) 72
Figure 51 TSV energy dissipation per cycle increasewith 3D integration density 76
Figure 52 Comparison of a typical CMOS TSV driverto the proposed energy-recovery 77
Figure 53 Adiabatic driver 78
Figure 54 A simple pulse-to-level converter (P2LC) 78
Figure 55 Resonant pulse generator with dual-rail dataencoding 79
Figure 56 Energy-recovery scheme for TSV interconnects79
Figure 57 Energy-recovery timing constraints 83
Figure 58 Test case for the evaluation of the theoret-ical model 84
Figure 59 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulator(S) 85
Figure 510 Energy dissipation estimation error of thetheoretical model compared to the SPICEsimulator 85
Figure 511 Test case for comparison of energy-recoveryto CMOS 86
Figure 512 Improved P2LC design 87
List of Figures xi
Figure 513 Simulated energy dissipation of improvedP2LC design 87
Figure 514 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable Q factor (C = 80 f F) 88
Figure 515 Energy contribution of energy-recovery cir-cuit components at 200MHz (C = 80 f F) 88
Figure 516 Energy contribution of energy-recovery cir-cuit components at 800MHz (C = 80 f F) 89
Figure 517 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS with variable TSV capacitance (C =
80 f F f = 500MHz) 89
Figure 518 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS (C = 80 f F f = 200MHz) 90
Figure 519 Energy dissipation reduction achieved bythe energy-recovery circuit compared toCMOS for variable technology (C = 80 f Ff = 500MHz) 90
Figure 61 Demonstrator circuit structure (showing 1-bit inputoutput (IO) channel face-downview) 93
Figure 62 Energy-recovery IO circuit schematic dia-gram 94
Figure 63 Adiabatic driver (layoutschematic view) 94
Figure 64 P2LC (layoutschematic view) 95
Figure 65 Data generator (layoutschematic view) 96
Figure 66 Timing constraints for the energy-recoverydemonstrator circuit 96
Figure 67 Pulse generator (layoutschematic view) 97
Figure 68 Pulse generator InputOutput signals 97
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energy dissipation reduction percycle over standard CMOS 98
Figure 610 Planar square spiral inductor designed inASITIC 99
Figure 611 Resonant pulse distribution tree 100
Figure 612 Interdigitated metal-insulator-metal capa-citor (fringe capacitor) 101
Figure 613 Energy-recovery IO circuit in layout view 102
Figure 614 CMOS IO circuit schematic diagram 103
Figure 615 CMOS driver schematic 103
Figure 616 CMOS IO circuit in layout view 103
Figure 617 Post-layout SPICE simulation of the energy-recovery circuit 105
Figure 618 Energy-recovery IO circuit micrograph 107
List of Figures xii
Figure 619 CMOS IO circuit micrograph 107
Figure 620 Measurement setup 107
Figure 621 Oscilloscope output for signals S1B and OC-MOS 109
Figure 622 Measuredsimulated power dissipation forthe CMOS circuit 109
Figure 623 Measuredsimulated energy dissipation forthe CMOS circuit 110
Figure 624 Oscilloscope output for signals S1A and OER110
Figure 625 Measuredsimulated energy dissipation forthe energy-recovery circuit 111
Figure 626 Energy-recovery IO circuit with fault hy-pothesis 111
Figure 627 Simulation of circuit with fault hypothesis 112
Figure 628 Measured energy dissipation and simula-tion of circuit with fault hypothesis 113
Figure A1 3D65 300mm wafer (on chuck) 118
Figure A2 Probe card 118
Figure A3 3D130 200mm wafer (on chuck) 119
Figure A4 Probe head 119
L I S T O F TA B L E S
Table 21 3D interconnect technologies comparison 8
Table 22 Technology parameters for Imec 3D130 process11
Table 23 Technology parameters for Imec 3D65 process11
Table 24 Comparison energy-recoveryconventionalCMOS 16
Table 31 Test structures on the PRC1 module 31
Table 32 PRC1 test pad instrument connections 37
Table 33 TSV capacitance D2D variability on waferrsquoD11rsquo 38
Table 34 TSV capacitance D2D variability on waferrsquoD18rsquo 39
Table 35 Variable pitch TSV capacitance experimentaldata from wafer rsquoD18rsquo 43
Table 36 Comparison between CBCMLCR measure-ments on wafer rsquoD18rsquo 45
Table 41 Coefficients of thermal expansion 47
Table 42 MOSFET array configurations for test padmodules PTP1PTP2 54
Table 43 MOSFET array configurations selected forexperimental evaluation 61
Table 44 PTP1PTP2 test pad module instrumentconnections 64
Table 45 Summary of experimental measurementson long (L) and short (S) channel PMOSdevices 73
Table 51 Energy dissipation results ( f Jcycle) the-oretical model (T) versus SPICE simulation(S) 84
Table 61 Energy-recovery IO circuit parameters 93
Table 62 Calculated resistance tolerances for eachwire branch in the resonant clock distribu-tion tree 100
Table 63 Energy-dissipation theoretical calculationsfor Q = 17 f = 500MHz 104
Table 64 Post-layout simulated energy dissipation Area requirements 106
Table 65 Energy-recovery circuit instrument connec-tions 108
Table 66 CMOS circuit instrument connections 108
xiii
A C R O N Y M S
2D two-dimentional
2N-2P 2 nMOS2 pMOS adiabatic logic
3D-IC three-dimentional integrated circuit
3D-SIC 3D stacked-IC
3D-SIP 3D system-in-package
3D three-dimentional
3D-WLP 3D wafer-level-packaging
BCB benzocyclobutene
BEOL back-end-of-line
CAL clocked adiabatic logic
CBCM charge-based capacitance measurement
CMOS complementary metal-oxide-semiconductor
CMP chemical-mechanical planarization
CTE coefficient of thermal expansion
Cu copper
CVD chemical vapor deposition
D2D die-to-die
D2W die-to-wafer
DCVSL differential cascode voltage switch logic
DRC design rule check
DUT device under test
ECRL efficient charge recovery logic
EDA electronic design automation
FEM finite element method
FEOL front-end-of-line
GPIB general purpose interface bus
IC integrated circuit
xiv
acronyms xv
IO inputoutput
KOZ keep-out-zone
LC inductance-capacitance
LVS layout versus schematic
MEMS microelectromechanical systems
MIM metal-insulator-metal
MOSFET metal-oxide-semiconductor field-effect transistor
MOS metal-oxide-semiconductor
MPU microprocessor unit
MuGFET multiple gate field-effect transistor
NMOS n-channel MOSFET
P2LC pulse-to-level converter
PAL pass-transistor adiabatic logic
PCB printed circuit board
PCG power-clock generator
PMD pre-metal dielectric
PMOS p-channel MOSFET
PVT process voltage temperature
QSERL quasi-static energy recovery logic
RC resistance-capacitance
RDL re-distribution layer
RLC resistance-inductance-capacitance
RMS root mean square
SIP system-in-a-package
Si silicon
SMU source measurement unit
SoC system-on-chip
TaN tantalum nitride
TSV through-silicon via
W tungsten
acronyms xvi
W2W wafer-to-wafer
WLP wafer-level-packaging
1I N T R O D U C T I O N
Extending integrated circuit (IC) functionality while keepingcost per silicon area approximately constant has historicallybeen the fundamental factor behind the growth of the semi-conductor industry Ever since Moorersquos observation [51] in 1965
chip functionality has been doubling every 15-2 years advancingfrom the Intel 4004 with 2000 transistors to modern generationmicroprocessor units (MPUs) that integrate over 1 billion tran-sistors [35] While there are indications that this trend might beslowing down due to planar metal-oxide-semiconductor field-effect transistor (MOSFET) physical limitations [22] innovativesolutions such as multiple gate field-effect transistors (MuGFETs)continue to drive the transistor roadmap forward as was recentlydemonstrated by Intel on their 22nm node MPU [12] It can bereasonably expected that IC transistor density will continue toincrease well into the next decade reaching 10 or even 100 billiontransistors per chip [4]
As transistor gate length dielectric thickness and junctiondepth are scaled performance in terms of speed and power im-proves as well The performance requirements for ICs have beensteadily increasing for succeeding technology generations drivenby market demand Although transistor scaling has traditionallybeen the main technology focus determining both cost and per-formance of ICs in recent years the performance gains enabled byscaling have been gradually challenged by on-chip interconnects
In an IC interconnects are the conducting wires connectingdevices and functional blocks in the circuit With smaller tran-sistors and increased functionality in modern processes intercon-nects have been progressively becoming more complex occupyingup to 12 metal layers in the chip As wire length increased sodid the signal delay time and power consumption of intercon-nects which are typically determined by wire intrinsic parasiticsin terms of resistance-capacitance (RC) It was estimated that a130nm microprocessor consumed over 50 of dynamic poweron interconnects [41] with a projection that in future designsup to 80 of microprocessor power would be consumed byinterconnects [3] Likewise signal propagation delay of globalinterconnects as a portion of the total system delay has beenrapidly increasing for succeeding technology nodes (Figure 11)
The rapid increase of the interconnect impact on IC perform-ance prompted ITRS1 to appoint the realisation of novel and
1 International Technology Roadmap for Semiconductors
1
11 research goals and contribution 2
Figure 11 Scaling trend of logic and interconnect delay [30]
innovative interconnect technologies as one of the ldquoGrand Chal-lengesrdquo in 2007 [3] Since then several technologies have been in-vestigated to address the interconnect bottleneck such as copperlow-k interconnects optical interconnects and three-dimentional (3D)interconnects Although copperlow-k interconnects are morecompatible with current processes and remain the most attractiveoption into the near term 3D interconnects are probably the mostrealistic option for meeting future technology trends [53]
Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect per-formance bottleneck along with reduced fabrication cost andheterogeneous integration TSVs are electrical connections passingthrough the silicon substrate and interconnecting separate diesstacked vertically in a 3D package [69] While there are several 3D
interconnect technologies [11] TSVs have the potential to offer thebest all around 3D integration technology with their advantagesincluding reduced parasitics and high interconnection density
11 research goals and contribution
As TSV is a relatively recent 3D interconnect technology there arestill challenges to be overcome before TSVs can be widely adoptedand commercialised Fabrication issues such as etching linerdeposition metallization and chemical-mechanical planarization(CMP) steps are in the process of being resolved or solutions havealready been found [55] Consequently the next logical step isthe implementation of three-dimentional integrated circuits (3D-
ICs) with more complex functionalities than simple test chipsHowever complex circuit design requires design methodologiesand tools that for 3D-ICs are still at their infancy stage [18]
The motivation behind this project was to accelerate the de-velopment of such design methodologies and tools for future
12 thesis outline 3
3D-IC designs by addressing key concerns of the TSV technologyprimarily from the circuit designerrsquos perspective The researcheffort was aimed at developing characterization structures forevaluating TSV characteristics affecting signal propagation delaypower consumption and circuit robustness as well as proposingdesign techniques for overcoming limitations Although TSVs areinterconnects their behaviour resembles more to MOSFET devicesthan the commonly used back-end-of-line (BEOL) metal wireswhile their integration into silicon introduces new challenges forICs Consequently under certain conditions conventional designapproaches are sufficient neither for TSV characterization nor forTSV interconnection
The TSV parasitic capacitance and thermomechanical stress dis-tribution are critical TSV characteristics The parasitic capacitanceaffects both delay and power consumption of signals passingthrough the TSV interconnect while thermomechanical stress af-fects MOSFET device mobility in the vicinity of TSVs Accuratelydetermining these parameters is necessary for generating designrules and models that can be used in TSV-aware design tools Aspart of this project sophisticated test structures were developedand fabricated on a 65nm 3D process for extracting the TSV para-sitic capacitance and thermomechanical stress distribution
The TSV parasitic capacitance is typically small however it maystill become an important source of power consumption as TSVs
are gradually implemented in large densely interconnected 3D-system-on-chips (SoCs) Although conventional low-power designtechniques based on voltage scaling can be used this representsa technical challenge in modern technology nodes In this worka novel TSV interconnection scheme is proposed which is basedon energy-recovery logic and shows frequency-dependent powerconsumption The scheme is analysed using theoretical modellingwhile a demonstrator IC was designed based on the developedtheory and fabricated on a 130nm 3D process
The work presented in this thesis was part of a larger researchproject undertaken at Imec2 with the goal of improving andoptimising TSV-based 3D integration technology The authorrsquoscontribution within this project was to propose design and sim-ulate the circuits presented in the following chapters as well asperform and analyse the experimental measurements
12 thesis outline
This thesis is organised into seven Chapters and two Appendixesas follows
2 Imec is a micro- and nanoelectronics research center located in Leuven Belgium(httpwwwimecbe)
12 thesis outline 4
Chapter 2 (Background) introduces the basic concepts com-prising the fields of 3D integration and energy-recovery logic Inaddition the design flow used for implementing the design partof this work is described
Chapter 3 (Characterization of TSV capacitance) presents agroup of test structures for characterizing the TSV parasitic ca-pacitance under various conditions The test structures are fab-ricated on a 65nm 3D process and the measurement data areanalysed using statistical methods
Chapter 4 (TSV proximity impact on MOSFET performance)presents a test structure for monitoring the impact of TSV-inducedthermomechanical stress on MOSFET device performance Thetest structure uses proven design techniques for implementinga large number of test-cases and accessing numerous deviceswith maximum precision The structure is fabricated and usedfor characterizing thermomechanical stress on a 65nm 3D process
Chapter 5 (Evaluation of energy-recovery for TSV intercon-nects) investigates the potential of the energy-recovery techniquefor reducing the energy dissipation of TSV interconnects in 3D-ICsThe total energy dissipation per cycle and optimum device sizingare extracted for the proposed scheme using theoretical model-ling while the configuration is evaluated against conventionalstatic complementary metal-oxide-semiconductor (CMOS) design
Chapter 6 (Energy-recovery 3D-IC demonstrator) elaborates onthe design of a 3D demonstrator circuit based on the energy-recovery scheme under realistic physical and electrical con-straints The demonstrator is compared to a CMOS circuit de-signed with identical specifications using post-layout simulationsBoth circuits are fabricated on a 130nm 3D technology processand evaluated based on the experimental measurements
Chapter 7 (Conclusions) summarises the contributions of thisthesis as presented in the previous chapters and discusses areasof future work
Appendix A includes photos used as reference in various partsof this work
Appendix B elaborates on the statistical descriptors used inthis work for analysing experimental measurement data
2B A C K G R O U N D
The work presented in this thesis contributes in the fields of 3D
integration and energy-recovery logic This chapter introduces thebasic concepts comprising these two fields not in extensive detailbut enough that the reader can understand the motivation andthe choices made in the context of this work Furthermore thefull-custom design flow is discussed that was used to implementthe design part of this thesis as will be presented in the followingchapters
21 3d integrated circuits
CMOS integration technologies are traditionally based on two-dimentional (2D) planar architectures Each die is individuallypackaged and placed on a printed circuit board (PCB) which isused to provide mechanical support and interconnection betweendifferent dies and electronic components A 3D-IC is a stack ofmultiple dies (Figure 21) vertically integrated and interconnec-ted sharing a common package
There are several motivations for the development of 3D-ICsPerhaps the most significant advantage of 3D integration is thereduced interconnection delay and energy dissipation In generalglobal interconnect delay and energy is linearly related to wirelength in an IC By using 3D integration modelling of simple3D-IC architectures has demonstrated that up to 50 reduction inglobal interconnect length may be achieved [24] with comparablereductions in delay and energy
Another advantage of 3D integration is heterogeneous integra-tion Presently 2D planar architectures limit designers to a singlefabrication technology for both the analog and digital parts of
ABCDEFBA ABCDEFBA
Figure 21 Comparison of 2D3D Integration
5
21 3d integrated circuits 6
ABCDEFBC
E
Figure 22 System-in-a-package
ABCDEFBC
E
Figure 23 Package-on-package
a circuit The common practice is to use an inexpensive digitalprocess for the complete design which is usually inefficient forimplementing analog circuits In contrast a 3D-IC allows for themost suitable process technology to be used for implementingeach part of the design In addition other components may beintegrated in the same circuit such as microelectromechanicalsystems (MEMS) [80] and passive devices [66] (resistors capacitorsinductors) enabling ICs with novel functionalities
The replacement of long horizontal global interconnects withshort vertical ones allows for reduced die size as well which isdirectly related to the fabrication cost Also the vertical integ-ration of dies in the same package removes the need for thespacing overhead between components that is necessary in PCBsincreasing the area efficiency and reducing the form factor
211 3D integration approaches
3D integration may be achieved using several microelectronictechnologies which result in different densities and capabilitiesThe levels of 3D integration can be reasonably classified into threemajor technology platforms [11 4] based on the interconnectwiring hierarchy and the industrial infrastructure
bull 3D system-in-package (3D-SIP) Stacking of individual pack-ages
bull 3D wafer-level-packaging (3D-WLP) Stacking of embeddeddies by wafer-level-packaging (WLP)
bull 3D stacked-IC (3D-SIC) TSV-based vertical system integra-tion
The 3D-SIP integration is based on traditional packaging intercon-nect technologies Each sub-system is integrated in a system-in-a-package (SIP) fashion (Figure 22) using wire-bonds for inter-connection and subsequently each SIP subsystem is assembledvertically to create package-on-package 3D stacks (Figure 23)
3D-SIP is currently the most mature technology and has alreadyreached volume production It provides with increased function-ality in a smaller form factor and low cost However interconnectdensity is limited since both wire bond pads and solder bumps
21 3d integrated circuits 7
Figure 24 Imecrsquos 3-D SIP wireless bio-electronic sensor [67]
A
BCDEAFE
FFDC
CFE
Figure 25 3D wafer-level-packaging
take up a large area on silicon due to their large pitch The longbond wire length can also significantly distort signals due toits intrinsic parasitics Hence this approach offers limited per-formance when compared to other 3D interconnect solutions Anexample of 3D-SIP application is the wireless bio-electronic sensor(Figure 24) developed at Imec [67]
3D-WLP integration is based on WLP technology directly stack-ing individual dies face-to-face using flip-chip micro-bumps forinterconnection The 3D interconnects are processed after IC pas-sivation using WLP techniques and may also be realised usingTSVs (Figure 25) The TSVs if existing are much larger than theequivalent interconnections in a 3D-SIC typically in the range of20 minus 40microm (1 minus 5microm in 3D-SIC) [11]
The 3D-SIC technology [69] is similar to a 2D SoC approachexcept that circuits exist physically on different layers and verticalinterconnects replace the global interconnections of the SoC 3D-
SICs (Figure 26) are based on TSV interconnections and have thepotential to offer the best all around3D integration technology Theadvantages of TSV-based 3D-SICs include the short interconnectlength (reduced parasitics) and the high interconnection densityHowever the additional process steps required for implementingthe TSVs increase manufacturing cost
A comparison between the discussed 3D interconnect technolo-gies is attempted in Table 21
21 3d integrated circuits 8
ABCD
EFFDB
F
Figure 26 3D-stacked IC
3D interconnect
technologyAdvantages Disadvantages
3D system-in-package (3D-SIP)
Simple existingtechnology bestheterogeneous
integration
Performance(speed power) low
densitymanufacturing costin high-quantities
3D wafer-level-packaging (3D-
WLP)
Existingtechnology good
compromisebetween
performance-manufacturing
cost
Averageperformance and
density
3D stacked-IC (3D-SIC)
Performance(speed power)
high densitymanufacturing costin high-quantities
Manufacturing costin low-quantities
not ready formanufacturing
Table 21 3D interconnect technologies comparison
21 3d integrated circuits 9
ABBCDE
FBACACDA
ACCD F
Figure 27 TSV process steps
212 TSV interconnects
Of particular interest to this PhD thesis is the 3D-SIC approachdeveloped at Imec and specifically the TSVs used for verticalinterconnection
In Imecrsquos 3D-SIC process [69 73] TSVs are created prior to theBEOL processing The TSV process steps are illustrated in Fig-ure 27 After processing of the CMOS front-end-of-line (FEOL) andpre-metal dielectric (PMD) stack the TSV hole is plasma-etchedwith a diameter of 5microm and a minimum pitch of 10microm A thinchemical vapor deposition (CVD) oxide layer (sim 200nm) is thendeposited for insulating the TSVs from the substrate The metal-lization sequence consists of a tantalum nitride (TaN) barrier de-position and filling of the TSV hole with electroplated copper (Cu)Finally the Cu overburden is removed in a top-side CMP step anda standard 2-metal layer BEOL process is applied to finalise thedie
Before stacking the wafer is mounted on a temporary carrierand thinned down to a silicon (Si) thickness of sim 25microm leavingthe TSV Cu exposed on the back-side of the wafer (Figure 28)After dicing the dies are stacked by Cu-Cu thermocompressionin a die-to-die (D2D) or die-to-wafer (D2W) fashion (Figure 29)using a tacky polymer as an intermediate glue layer such asbenzocyclobutene (BCB) (ǫr = 27) Target technology parametersfor Imecrsquos 3D-SIC 130nm and 65nm processes are summarised inTables 22 and 23 respectively
As TSV is a relatively new 3D interconnect technology thereare several challenges to be overcome before TSV technology canbe widely adopted and commercialised Issues in the fabricationlike etching liner deposition metallization and CMP steps arein the process of being resolved or solutions have already beenfound [55] On the contrary design methodologies for TSV-basedICs are still at their infancy stage [18]
21 3d integrated circuits 10
AAAB
CAAAB
ABC
DEF
Figure 28 Before thinningafter thinning
ABC ABC
DE DE
F
DEE
B
CBA
Figure 29 TSV stacking
21 3d integrated circuits 11
Minimum devicechannel length (nm)
130
Supply voltage (V) 12
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 22
TSV resistance (mΩ) ~20
TSV capacitance( f F)
~40
Wafer diameter(mm)
200
Table 22 Technology paramet-ers for Imec 3D130
process
Minimum devicechannel length (nm)
70
Supply voltage (V) 10
Routing layers 2
TSV diameter (microm) 5
TSV pitch (microm) 10
TSV length (microm) 40
TSV resistance (mΩ) ~40
TSV capacitance( f F)
~80
Wafer diameter(mm)
300
Table 23 Technology paramet-ers for Imec 3D65 pro-cess
Approaching the technology challenges from the designerrsquosperspective we can emphasise on the following issues designtools thermomechanical stress thermal analysis electrical char-acteristics and testing
Design tools
Although it is possible to design 3D-ICs using existing tools byutilising ad hoc techniques as 3D implementations become moreadvanced ldquonativerdquo 3D design and analysis tools will becomenecessary A few physical layout tools are currently availablefor TSV-based design [2] however they do not offer automaticpartitioning placement and routing Moreover there are noavailable commercial tools for evaluating performance (speedpower) reliability and manufacturability in 3D-IC designs
Thermomechanical stress
TSV fill materials may include copper (Cu) tungsten (W) orpolysilicon[20] The large mismatch in the coefficient of thermalexpansion (CTE) between the fill material of the TSV (17 middot 10minus6K)and the Si surface (3 middot 10minus6K) induces mechanical stress aroundthe vias As Cu contracts much faster than Si it pulls the surface ofthe surrounding bulk while cooling down causing tensile stressin the area [25] Since even low stress values affect carrier mobilityin MOSFET devices [71] performance shift of the transistors inproximity to TSVs may occur
21 3d integrated circuits 12
The intensity of TSV-induced stress depends on various condi-tions such as TSV geometry device proximity and process char-acteristics Defining design rules such as keep-out-zones (KOZs)for digital and analog circuits and generating stress-effect modelsis an important milestone for implementing TSV-aware circuitswith predictable performance Simulations predict the complexdistribution of thermomechanical stress [40] which emphasisesthe need for sophisticated characterization structures that canmonitor stress impact with precision and verify simulation mod-els In Chapter 4 a characterization structure is presented forevaluating the impact of TSV-induced thermomechanical stresson MOSFET device performance
Thermal analysis
The close proximity of dies in a 3D-IC vertical stack increasesthe heat density proportionally to the number of stacked diesAs temperature affects transistor parameters such as leakagecurrent accurate thermal analysis is required for 3D-ICs to ensurethat design specifications are met Presently tools for thermalanalysis of planar ICs are available however they have to beadapted for use with complex 3D systems so as to reduce runtime and improve the accuracy of TSV models [18] In additionldquothermally-awarerdquo design tools are required which can partitionthe design in such a way that high heat generating componentsare placed on layers in proximity to the heat-sink
Electrical characteristics
The electrical characteristics of TSVs such as delay crosstalk sig-nal integrity (SI) and power integrity (PI) dictate many aspects ofa 3D-IC design Proper evaluation of TSV characteristics typicallythrough simulation means is critical for accurately predicting theperformance of a 3D system [57]
Several attempts have been made to generate models of theTSV structure [77 27] using its intrinsic resistance-inductance-capacitance (RLC) parasitics in an equivalent circuit model TheRLC parasitics depend on the materials and geometry of the TSV
structure while prior work has indicated that these parametersare relatively small with the capacitance assuming the dominantrole on TSV electrical properties [27] For verifying the accur-acy and calibrating the models experimental measurements arerequired using compatible characterization structures Such struc-tures are presented in Chapter 3 which were used for extractingthe TSVs parasitic capacitance of an experimental 3D process
22 energy-recovery logic 13
Testing
Testing ICs for defects is crucial for guaranteeing a sufficient man-ufacturing quality The unique processing steps of TSVs such aswafer thinning alignment and bonding introduce new types ofdefects that should be considered during testing [46] Further-more finding and diagnosing defects in TSV interconnections isconsiderably more challenging than 2D interconnects consideringthat it is not possible to gain physical access to the vias after thetop die has been placed All these factors emphasise the need fornew testing structures and methodologies for testing 3D systems
22 energy-recovery logic
TSV 3D interconnect technology aims to reduce delay and en-ergy dissipation when compared to the traditionally used planarBEOL interconnections A key TSV characteristic is its parasiticcapacitance discussed in Chapter 3 which is typically small(Tables 22 and 23) influencing proportionally both delay andenergy dissipation Although TSV technology offers significantperformance improvements to previous solutions a low-powerinterconnection scheme may still be desirable under certain con-ditions due to specific characteristics of TSV technology
1 The TSV capacitance may become a concern in future largedensely interconnected 3D-SoCs as it increases linearly withthe number of tiers and interconnections
2 Since interconnects do not follow transistor scaling tran-sistor gate capacitance will always be smaller than intercon-nect capacitance With the gap tending to widen in futuretechnologies it can be reasonably expected that 3D intercon-nects will become a performance bottleneck in ICs just asthe BEOL interconnections preceding them
Furthermore conventional CMOS low-power design techniquesbased on voltage scaling represent a technical challenge in mod-ern technology nodes as process voltage temperature (PVT) vari-ations and increased leakage [61] decrease in practice the rangeover which voltages may be scaled An interesting low-powerdesign approach is energy-recovery logic which demonstratesfrequency-dependent energy dissipation Energy-recovery logicis a design technique for minimising the energy dissipation ofcomputing devices well below typical CMOS limits Its major lim-itation is the requirement for passive components to generatetime-variable signals which cannot be efficiently integrated intocurrent generation ICs TSV-based 3D stacking and heterogeneousintegration of high-quality passive devices [66] may enable theuse of novel design techniques such as energy-recovery that
22 energy-recovery logic 14
Figure 210 Conventional AND gate
were not practically feasible in the past due to technology lim-itations In this PhD thesis energy-recovery is used for drivingTSV-based interconnections in a low-power scheme discussed inChapters 5 and 6
Historically energy-recovery logic has its roots in reversiblecomputing Reversibility is a concept from thermodynamics de-scribing physical processes that after they have taken place canbe reversed without loss or dissipation of energy Reversibleprocesses cause no change in the entropy of the system or itssurroundings (do not generate any waste heat) therefore theyachieve maximum energy efficiency The concept of reversibilityin thermodynamics may be extended to logical reversibility ifa computation can be implemented in such a way that it is al-ways possible to determine the previous state of the computationgiven a description of its current state In practical terms logicalreversibility would require that all information about the state ofthe computation in any stage is retained and never discarded
Yet the typical computing device is not reversible (irreversible)and as Landauer [36] has shown there is a fundamental min-imum amount of energy that must dissipated in such devicesThis minimum energy dissipation is known as the Von Neu-mann - Landauer limit and is at least kT ln 2 per irreversible bitcomputation The irreversible operation of conventional comput-ing can be demonstrated using the AND gate as an example(Figure 210) The AND gate has two input lines and one outputWhen both inputs are at logical 1 the output is 1 however theoutput is 0 in all of the remaining three combinations that have alogical 0 on the inputs Hence every time the gatersquos output be-comes 0 there is loss of information since we cannot determineexactly which input state caused the logical 0 at the output
Computations do not necessarily have to be irreversible aswas shown by Bennett [9] and thus it is still possible to breakthe kT ln 2 limit A reversible computing device would have to
22 energy-recovery logic 15
perform each computation twice the first time in the forwarddirection applying the inputs to a logic function and obtainingthe result and the second time backwards computing the inversefunction and returning the system to its initial state If a com-putation could be performed in such a way and implementedon non-dissipative hardware its energy requirements could bepotentially reduced to zero [10]
In practice even if a circuit is designed to be logically reversibleachieving zero-energy computations utilising existing electricalcircuits is limited considering that any process merely involvingcharge transfer will result in dissipated energy Nevertheless it isstill possible to reduce the energy dissipation during charge trans-ferring to ldquoasymptotically zerordquo levels applying circuit designtechniques generally known as adiabatic switching [8 31]
The operating principle of adiabatic switching can be bestexplained by comparing it to conventional CMOS switching In aconventional CMOS circuit (Figure 211) when the switch closes(typically the pull-up network) a certain amount of charge Q =
CVDD is pulled out of the positive power supply rail that chargesthe load capacitance C up to VDD The energy transferred fromthe power supply during the charging period (T) is [59]
E = QVDD = CV2DD (21)
From the total transferred energy only half ( 12 CV2
DD) is storedon the capacitor C while the other half is dissipated on theswitch and interconnect resistance R As can be derived fromEquation 21 to reduce the energy dissipation in conventionalCMOS either the supply voltage VDD has to be reduced or theload capacitance C
The adiabatic charging of an equivalent node capacitance C
is modelled in Figure 212 In contrast to the conventional CMOS
circuit the supply voltage in this case is not constant but time-variable with a slow rising time (T ≫ RC) If the charging periodof the capacitor (or rising time of the voltage supply) is T then itcan be proven [82 60] that the energy dissipated on the resistanceR would be proportional to
EAdiabatic prop
(
RC
T
)
CV2DD (22)
The distinct advantage of adiabatic switching is that the slowrising time of the supply voltage keeps the voltage difference(and thus current flow) on the resistance R low at all times Asthe charge transfer is spread more evenly over the entire chargingperiod the energy dissipation (E = I2R) is greatly reduced Inaddition as can be derived from Equation 22 increasing thecharging period (T) can potentially reduce the energy dissipationto arbitrary low levels irrespective of the VDD or C parameters
22 energy-recovery logic 16
A
B
CC
D
E
D
Figure 211 CMOS switching
AA
B
C
B
Figure 212 Adiabatic switching
Conventional CMOS Energy-recovery CMOS
2 states True False 3 states True False Off
Energy of information bitdissipated as heat on the
pull-down network
Energy of information bitrecovered by the
power-supply
High speed Lowmedium speed
High energy dissipation Very low energy dissipation
Area efficient Some area overhead
Table 24 Comparison energy-recoveryconventional CMOS
Combining reversible computations with adiabatic switchingtechniques allows for circuit implementations with extremely lowenergy dissipation compared to conventional designs [37 38]However fully reversible circuits in CMOS technology sufferfrom high area overhead as the inverse logic function has tobe implemented and intermediate logic states must be stored toenable reversibility of the computation [8] In cases where theoverhead of full reversibility is too high it can be significantlyreduced by allowing some information to be occasionally dis-carded without sacrificing much of the energy benefits Thesecircuits that partially apply the principles of reversible comput-ing and adiabatic switching to achieve low but not zero energydissipation are called energy or charge recovery [31]
A comparison between energy-recovery and conventional CMOSis summarised in Table 24
22 energy-recovery logic 17
ABCABC
Figure 213 Basic 2N-2P adiabaticcircuit
ABC
ABC
DEF
EE
DEF
Figure 214 2N-2P timing
221 Energy-recovery architectures
Several energy-recovery circuit architectures have been proposedin the literature over the past years Examples of logic families are2 nMOS2 pMOS adiabatic logic (2N-2P) [33] efficient charge re-covery logic (ECRL) [50] pass-transistor adiabatic logic (PAL) [54]clocked adiabatic logic (CAL) [45] quasi-static energy recoverylogic (QSERL) [83] and boost logic [62] These architectures applythe principles of reversible computing and adiabatic switchingto a variable degree while their differences are mainly relatedto characteristics such as singledual-rail logic style number ofclock phases chargingdischarging path and energy efficiency
A simple configuration is the 2N-2P family of logic circuits(Figure 213) which is based on the differential cascode voltageswitch logic (DCVSL) circuit [59] The timing diagram of the 2N-2P
circuit is presented in Figure 214 using an idealised adiabaticvoltage supply Initially the voltage supply is in the WAIT phase(LOW) keeping the outputs in the LOW state Then the inputs areset complementary to each other and the supply voltage rampsup As the inputs are evaluated the output that was enabled bythe pull-down network follows the power supply until it reachesVDD At that moment the inputs return to the LOW state andafter a certain period of time in the HOLD ldquo1rdquo phase the voltagesupply ramps down discharging the output As can be observedthe average voltage supply during the computation cycle is wellbelow VDD which can be advantageous in modern technologynodes where leakage currents are of a concern
The adiabatic chargingdischarging takes place during theramping updown of the EVALUATERESET phases while thereverse computation occurs during the RESET phase when thetransferred charge is returned to the power supply Howeverthe reversibility of the circuit is only partial since the p-channel
22 energy-recovery logic 18
Figure 215 Adiabatic driver
MOSFET (PMOS) transistor that was turned-on by the pull-downnetwork will be forced to turn-off when the output is approxim-ately a threshold voltage above zero This will cause some chargeto be trapped on the output which will be dissipated as heat on asubsequent cycle Using the 2N-2P circuit arbitrary logic functionsmay be implemented in the place of the pull-down network
The energy efficient characteristics of energy-recovery logic canbe also beneficial for driving large load capacitances A circuitthat may be used in such case is the adiabatic driveramplifier(Figure 215) [8]
The input of the adiabatic driver is dual-rail encoded as well asthe output The operation principle is straightforward First theinputs are set to valid values and then an adiabatic time-variablesignal goes through the pass-gate charging the load capacitanceat the output to its maximum value The output is now validand can be used by other stages to perform computations Sub-sequently the output slowly discharges returning the charge backto the power-supply Energy is saved by two methods Firstlythe slow-rising adiabatic signal forces a low voltage differencebetween the inputs of the pass-gate resulting in low currentand thus low energy dissipation Secondly the energy that isnot dissipated as heat during the transfer is returned back tothe power-supply which can be re-used in the next computationcycle
Circuits utilising the energy-recovery technique have been suc-cessfully implemented in the past while they have demonstratedgreat potential whilst driving highly-parasitic global intercon-nects such as clock distribution networks [42 63]
222 Power-clock generators
Energy-recovery circuits require specialised power supplies thatcan generate multiple-phase time-variable adiabatic signals and
22 energy-recovery logic 19
A
A
B
C
C
Figure 216 Stepwise charging generator
recover charge Since the power supply in an energy-recoverycircuit provides with both the power and clock signal they arereferred to as power-clock generators (PCGs) [6] The power-clockgenerators fall into two categories staircase and resonant gener-ators
Staircase generators use the stepwise charging technique [68]The basic circuit (Figure 216) is composed of N number of tankcapacitors (CT) that are successively discharged in N steps un-til the load capacitance (CLOAD) is charged to the maximumvoltage level This technique attempts to ldquoemulaterdquo the adiabaticchargingdischarging signal while its linearity depends on thenumber of charging steps N The advantage of staircase generat-ors is that they can be easily integrated on-chip since they do notrequire inductors as is the case with resonant generators How-ever both speed performance and energy efficiency are relativelylow
Resonant generators [44] are based on inductance-capacitance(LC) oscillators A simple circuit is the 1N single-phase powerclock generator of Figure 217 If R is the resistance of the adia-batic driver in Figure 215 then the load capacitance (CLOAD)along with a resonant-tank inductor (L) form an RLC oscillatorresonating at
f =1
2πradic
LCLOAD
(23)
As typical for RLC oscillators the shape of the power-clockis a sinewave which approximates the idealised waveform ofFigure 214 Energy is transferred adiabatically during the risinghalf of the sinewave while on the falling half the energy is
23 full-custom design flow 20
A
BCD
E
DFF
CD
Figure 217 1N power clock generator
partially recovered and stored in the inductor for later use RLC
oscillators are very energy efficient power-clock generators andcan operate at high speeds However inductors may be difficultto implement on-chip
23 full-custom design flow
All circuits presented in this thesis were designed using the full-custom approach since the target fabrication technology was anexperimental 3D process for which neither standard cells nordedicated 3D design tools were in our disposal
In the full-custom design methodology each individual tran-sistor and associated connectivity in the integrated circuit ismanually specified in the layout level by the designer Full-customis considered to be superior to standard-cell design when themain concern is circuit performance and area efficiency Further-more full-custom design approaches are essential in cases wherespecialised circuitry are to be designed or when standard celllibraries are not available
The electronic design automation (EDA) environment that wasused was based on Cadence1 applications and tools as integ-rated in the Design Framework II (DFII version 5141) CadenceDFII consists of tools for design management (library manager)schematic entry (Virtuoso schematic) physical layout (Virtuosolayout) verification (Assura) and simulation (Spectre) Thesetools are completely customisable supporting various processtechnologies through foundry-specific design kits In our case
1 Cadence Design Systems Inc (httpwwwcadencecom)
24 summary 21
the Assura tool was replaced by Mentor Graphics2 Calibre forthe verification step
Our customised design flow is illustrated in Figure 218 Ini-tially the circuit is captured at the transistor-level using theschematic editor (Figure 219) and the completed design is simu-lated and modified until it conforms to the design specificationsThe same circuit is re-created in the layout editor (Figure 220)where the detailed geometries and positioning of each fabricationmask layer is described The layout must conform to a series ofprocess design rules and any violations of these rules can be de-tected during the design rule check (DRC) step Subsequently thelayout design is translated to a circuit description (netlist) whichis compared to the initial circuit in the schematic editor duringthe layout versus schematic (LVS) step If the two designs matchthe interconnectivity and transistor parasitics are estimated fromthe layout design and another netlist is created which includesthese parasitics This detailed netlist can be simulated to get amore accurate estimation of the circuit behaviour including theinfluence of parasitics
24 summary
In this chapter the reader was introduced to the basic conceptsand terminology that are relevant in the context of this thesisThe following chapter begins the discussion of the research partof this work presenting a group of test structures that were usedfor the electrical characterization of the TSV parasitic capacitancein an experimental 3D process
2 Mentor Graphics Inc (httpwwwmentorcom)
24 summary 22
ABCDEA
FCCAAE
ABCDEAEC
DE
E
F
CEA
EEAEA
EDE
CAC
E
EC
CAC
Figure 218 Design flow
Figure 219 Inverter in Virtuososchematic
Figure 220 Inverter in Virtuosolayout
3C H A R A C T E R I Z AT I O N O F T S V C A PA C I TA N C E
31 introduction
In a semiconductor process technology electrical characterizationis crucial for testing device performance and reliability after goingthrough the process fabrication steps Electrical characterizationverifies the electrical properties of devices to specifications andprovides feedback to the wafer fabrication process engineers forimproving and optimising the process technology
In a 3D system the electrical behaviour of the TSV can bemodelled in terms of its resistance-inductance-capacitance (RLC)parasitics as discussed in 212 (page 12) Parasitic models are typ-ically used for simulating circuit behaviour and they are criticalfor accurately predicting the performance of a 3D system duringdesign verification [57]
From the RLC parasitics associated with a TSV capacitanceassumes the most significant role on its electrical properties suchas energy dissipation and delay [27] The capacitancersquos role is sodominant that for latency and signal integrity (SI) analysis theTSV behaviour can be modelled by considering the capacitancealone [21] Thus accurate characterization of the TSV capacitanceis highly desirable when designing a 3D system
Typical laboratory capacitance measurement techniques basedon LCR1 meters require well controlled process technologies forrealising a high number of devices without defects which mightnot be possible at the current maturity level of TSV process tech-nology In this chapter an alternative measurement technique isproposed for accurate on-chip characterization of the TSV parasiticcapacitance Test structures based on this technique are fabricatedon a 65nm 3D process (Table 23 page 11) and used for charac-terizing the TSV parasitic capacitance under various conditionsThe experimental measurements are analysed using statisticalmethods and the proposed test structures are evaluated
32 tsv parasitic capacitance
A typical isolated 2D back-end-of-line (BEOL) metal wire can bemodelled as a conductor over a ground plane (Figure 31) Sincethe conductor is separated from the ground plane by a dielectric
1 LCR meter is a measurement equipment used to determine the inductance (L)capacitance (C) and resistance (R) of devices based on impedance measure-ments
23
33 integrated capacitance measurement techniques 24
ABC
Figure 31 Isolated 2D wire capacitance
material (εox) its electrical behaviour closely resembles that ofa capacitor The basic formula for calculating a parallel platecapacitance is
C =εox
hwl (31)
In practice calculating the parasitic capacitance of a BEOL metalwire is significantly more complex since fringing fields increasethe total capacitance value [78] Nevertheless whether a 2D metalwire capacitance is calculated using Equation 31 or more com-plex formulas [85] its value is only dependent on geometricalcharacteristics and the dielectric material
In that context 3D interconnects in the form of TSVs do notbehave as simple wires but as devices The TSV copper (Cu) issurrounded by silicon (Si) and these two materials are separatedby a dielectric oxide liner (Figure 32) This structure resemblesa metal-oxide-semiconductor (MOS) capacitor and as has beenshown by modelling and measurements [27] exhibits distinctaccumulation depletion and inversion regions (Figure 33) TheTSV capacitance attains its maximum value (Cox) in the accumula-tion when Vbias lt VFB (flatband voltage) and inversion regions(Vbias gt Vth f lt 1kHz)
The bias voltage and frequency dependency of the TSV capacit-ance complicates both modelling and the experimental measure-ment of its parasitic capacitance Furthermore the MOS behaviourof TSVs is not desirable in circuits using 3D interconnections Forthat reason special techniques are employed [28] that shift theTSV C-V characteristic in such a way that for typical circuit oper-ating voltages (0 minus VDD) the TSV capacitance is constant and atits minimum value (Figure 33)
33 integrated capacitance measurement techniques
Extracting the TSV parasitic capacitance requires the use of meas-urement techniques that are compatible with the particular TSV
33 integrated capacitance measurement techniques 25
ABCDE
F
FD
F
F
FD
Figure 32 TSV parasitic capacitance
ABCDE FACDE
ECDE
BCFACDE
FCCDED DBCDE
D
DAB
Figure 33 Capacitance vs Gate voltage (CV) diagram of a MOS Capa-citor
33 integrated capacitance measurement techniques 26
characteristics as indicated in the previous section The tech-niques commonly used for measuring integrated capacitances ina laboratory environment can be classified into two categories
1 Measurements based on external instruments these can bedirect capacitance measurements LCR meter based [52] orRF S-parameter based measurements [39]
2 On-chip capacitance measurements examples are the ref-erence capacitor technique [32] ring-oscillator based [26]and charge-based [15]
The disadvantage of using external instruments directly connec-ted to the device under test (DUT) is that the measurement setupintroduces additional parasitics that have to be subtracted fromthe measurement result Using standard measurement setupssignificant noise is added when attempting to determine capa-citances well below 1pF due to the influence of the instrumentparasitics [64] In cases where the DUT capacitance is small suchas the femto-Farad TSV capacitance (Table 23) a large number ofidentical devices (typically in the hundreds) can be connected inparallel to increase the total capacitance value [75] However thismethod is only usable with process technologies that are wellcontrolled and allow the realisation of a high number of deviceswithout defects In general the extracted capacitance of the wholepopulation may not be representative of a single device [56] andas a result individual device measurement accuracy may suffer
On-chip techniques can achieve better accuracy when the DUT
capacitance is small Although DUT yield is not critical the pro-cess technology should allow the integration of active devicesand their parameters should be known to a reasonable degreeA simple yet high resolution technique is the charge-based ca-pacitance measurement (CBCM) which can be used to extractcapacitances down to the atto-Farad range [15]
In the CBCM technique the DUT capacitance (CDUT) is chargeddis-charged at frequency f using a pseudo-inverter (Figure 34) whilethe average charging current (Im) is measured with a laboratoryammeter Since transistor and wiring parasitic capacitances are inparallel with the DUT capacitance a second (reference) inverter isused to measure these parasitics (Ire f ) which are then subtractedfrom the calculated result The DUT capacitance is calculated asfollows
CDUT =Im minus Ire f
VDD middot f(32)
The accuracy of the CBCM technique can be further increasedby measuring the charging current over a range of frequenciesThe DUT capacitance can be then calculated from the slope of
34 electrical characterization 27
AB
AC
DE
FE
DE
FE
AB
AC
B
Figure 34 CBCM pseudo-inverters
Inet (Im minus Ire f ) (Figure 35) The accuracy is greatly improved aspseudo-inverter leakage current is removed from the extractedresult (slope is not affected by constant biases) and also defectiveDUTs can be detected by inspecting the linearity of the current-frequency plot
The CBCM technique has been previously used for measuringBEOL metal interconnect parasitic capacitances [15] and MOSFET
capacitances [64] In this work CBCM is proposed for the on-chipmeasurement of the TSV parasitic capacitance CBCM is compat-ible with the TSV parasitic capacitance on the premise that thevoltage and frequency ranges of CBCM are selected such thatthe TSV is forced into a region where its capacitance is constantAs was shown in the previous section (Figure 33) TSV capacit-ance is constant for typical circuit operating voltages while thefrequency dependency of the capacitance becomes significantonly in the multi-MHzGHz range [39] Therefore operating theCBCM pseudo-inverters in the 0 minus VDD range and below 1MHz
should be adequate to avoid the MOS behaviour of the TSV
34 electrical characterization
A group of test structures based on the CBCM technique wereimplemented for on-chip characterization of the TSV capacitanceon Imecrsquos 65nm 3D process technology (Table 23) As it is commonin IC design maximising the utilisation of the available chiparea is one of the priorities of the design effort For that reasonsupplementary test structures were designed and implemented
34 electrical characterization 28
AB
CADE
F
AB
E
CAD
B
Figure 35 Current as a function of frequency in CBCM
on the same chip that went beyond the core objectives of thiswork
The complete design consisted of 6 test structures 5 for ex-tracting the TSV capacitance under various conditions based onCBCM and 1 additional experiment for direct measurement ofthe RC constant All test structures are presented in this sectionhowever due to equipment and material availability only themost significant were measured and analysed
341 Physical implementation
In the CBCM technique the extraction accuracy of the TSV capacit-ance is highly dependant on the design of the pseudo-invertersThe rsquoreferencersquo pseudo-inverter in CBCM is used for measuringboth the wire parasitic capacitance and the inverter parasitics inthe form of diffusion and gate-drain overlap capacitances There-fore it is critical that the two pseudo-inverters are identical interms of parasitics as any mismatches will appear in the extrac-ted measurement result
Transistor mismatch is mostly influenced by process technologyparameters however it is possible to minimise process variationssuch as in the poly gate length by applying special design tech-niques A major cause of gate length variation is the polysiliconpitch to neighbours and one way to mitigate that effect is to fillunused space between transistor gates with polysilicon dummies[13] This technique was widely used in the CBCM pseudo-inverterdesign presented in this work (Figure 36) Concurrently theinverter transistors were designed with relatively large dimen-sions for the implementation process technology (P10microm1micromN5microm1microm) so as to minimise the effect of variation as a per-centage of the transistor parasitics
34 electrical characterization 29
Poly dummy fills
Metal2Metal1PolysiliconActive
1μm 5μm
Figure 36 CBCM pseudo-inverter pair in layout
Imecrsquos 65nm 3D process allows for 2-tier vertical stacking ofdies (TOP and BOTTOM) with the TSV passing through theTOP die Si substrate as illustrated in Figure 32 The IO foreach test structure was accessible through a standard 2 times 12 testpad module on the TOP die compatible with our wafer probetest system The 2 times 12 test pad module (PRC1) along with thecomplete test structure design is illustrated in Figure 37 while amicrograph image of the fabricated chip can be seen in Figure 38
The CBCM pseudo-inverters were implemented on either TOP
or BOTTOM tier while the TSVs were connected to the invert-ers from TOP or BOTTOM and arranged in 4 configurations (11 times 2 2 times 2 2 times 5) with different pitch distances The 6 test struc-tures available on the PRC1 test pad module are summarised inTable 31 A detailed description of the test structures (rsquoArsquo rsquoBrsquo rsquoCrsquorsquoDrsquo rsquoRrsquo and rsquoErsquo) follows in the paragraphs below
Structure rsquoArsquo Single TSV capacitance (TOP die)
The purpose of test structure rsquoArsquo (Figure 39) is to extract the capa-citance of a single TSV when it is electrically accessed through theTOP die The structure is composed of 3 pseudo-inverters con-nected to the TSVs using identical geometry Metal-2 layer wireswhose capacitance is also measured (Reference) In addition to
34 electrical characterization 30
DTOP DBTM
B10 A10
BREF AREF
B1 A1
R1 R2
SET VDD
RESET GND
C15 C10
C50 C20
SET VDD
ERC SET
EREF EC Metal2 (TOP)Metal1 (TOP)
Metal1 (BTM)Metal2 (BTM)
Figure 37 PRC1 test pad module(layout)
Figure 38 PRC1 test pad module(micrograph)
34 electrical characterization 31
Structure A B
TSVs - 1 2x5 - 1 2x5
Inverterlocation
TOP TOP TOP BTM BTM BTM
TSVconnection
TOP TOP TOP BTM BTM BTM
Multi-TSVconnection
- - TOP - - BTM
Multi-TSVpitch (microm)
- - 15 - - 15
Pad name AREF A1 A10 BREF B1 B10
Structure C D
TSVs 2x2 2x2 2x2 2x2 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP TOP
TSVconnection
TOP TOP TOP TOP TOP TOP
Multi-TSVconnection
TOP TOP TOP TOP TOP BTM
Multi-TSVpitch (microm)
10 15 20 50 20 20
Pad name C10 C15 C20 C50 DTOP DBTM
Structure R E
TSVs 1 1 - 1x2 1x2
Inverterlocation
TOP TOP TOP TOP TOP
TSVconnection
TOP TOP - TOP TOP
Multi-TSVconnection
- - - TOP BTM
Multi-TSVpitch (microm)
- - - 20 20
Pad name R1 R2 EREF EC ERC
Table 31 Test structures on the PRC1 module
34 electrical characterization 32
Pseudo-inverters
2x5 TSVs
TSV
Reference
Metal2 (TOP)Metal1 (TOP)
Figure 39 Structure rsquoArsquo - Single TSV capacitance (TOP die)
the single TSV DUT the structure also contains 10 parallel TSVs
arranged in a 2 times 5 array with a 20microm pitch Since the suitabilityof the CBCM method for extracting the small TSV capacitance wasundetermined during design time the 2times 5 array was included asa precaution effectively increasing the DUT capacitance an orderof magnitude
Structure rsquoBrsquo Single TSV capacitance (BOTTOM die)
The design of structure rsquoBrsquo is similar to the structure rsquoArsquo asdescribed above The only difference is that the CBCM pseudo-inverters are implemented on the BOTTOM die (Figure 310)while connectivity to the TSVs is provided using a wire on theMetal-2 layer of the BOTTOM die The purpose of this structureis to examine whether the perceived TSV capacitance varies whenit is accessed from the BOTTOM side
Structure rsquoCrsquo Variable pitch TSV capacitance
The use of minimum TSV pitch in designs is highly desirable soas to minimise total chip area The purpose of test structure rsquoCrsquo(Figure 39) is to investigate whether TSV pitch has an effect onthe parasitic capacitance It has been previously reported in the
34 electrical characterization 33
2x5 TSVs
Reference
TSV
Pseudo-inverters
Metal2 (BTM)Metal1 (BTM)
Figure 310 Structure rsquoBrsquo - Single TSV capacitance (BOTTOM die)
literature that TSV proximity to other TSVs may affect its parasiticcapacitance probably through an interaction with the oxide liner[23] Test structure rsquoCrsquo aims to verify the presence of the pitcheffect on the 65nm 3D process
The CBCM pseudo-inverters are implemented on the TOP die(Figure 311) while the TSVs are connected in parallel in groups of4 with variable pitch distances at 10microm 15microm 20microm and 50micromAll groups connect to the pseudo-inverters using identical lengthwires Since wire parasitics are not measured their value is notextracted from the measured TSV capacitances thus the resultsare relative
Structure rsquoDrsquo TSV capacitance with interconnection on TOP or BOT-
TOM die
The purpose of structure rsquoDrsquo (Figure 312) is similar to structurersquoBrsquo which is to examine whether the perceived TSV capacitancechanges when it is accessed from the BOTTOM die However inthis structure the CBCM pseudo-inverters are implemented on theTOP die in the case the front-end-of-line (FEOL) active devices ofthe BOTTOM die were not capable of operating properly afterfabrication and stacking
34 electrical characterization 34
2x2 TSVs - 10μm
2x2 TSVs - 15μm
2x2 TSVs - 20μm
2x2 TSVs - 50μm
Metal2 (TOP)Metal1 (TOP)
Figure 311 Structure rsquoCrsquo - Variable pitch TSV capacitance
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Figure 312 Structure rsquoDrsquo - TSV capacitance measurement with inter-connection on TOP or BOTTOM die
34 electrical characterization 35
Metal2 (TOP)Metal1 (TOP)
Figure 313 Structure rsquoRrsquo - Evaluation of TSV capacitance measurementaccuracy
Metal2 (TOP)Metal1 (TOP)Metal1 (BTM)
Driver
Reference
Driver
Figure 314 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (layout)
Structure rsquoRrsquo Evaluation of TSV capacitance measurement accuracy
Structure rsquoRrsquo (Figure 313) aims to evaluate the accuracy withwhich CBCM can extract the capacitance of two nearby TSVs Thetwo pseudo-inverters wires and TSVs are identical thus theor-etically they should give identical results Any variance in theresult should be mainly due to transistor variability
Structure rsquoErsquo TSV capacitance effect on signal propagation delay
Structure rsquoErsquo (Figure 314) allows the direct measurement ofthe delay caused by the TSV on signal propagation The signalpropagation delay is measured under two conditions When thesignal passes through the TSV (which includes the effect of resist-ance on delay) and when only the TSV capacitance is in the pathof the current (Figure 315) The contribution of wiretransistorparasitics is also measured so as to be subtracted from the finalresult
34 electrical characterization 36
ABC
D
ECBFC
ABC
Figure 315 Structure rsquoErsquo - TSV capacitance effect on signal propagationdelay (schematic)
ABC
DEF
CD
A
CA
A
Figure 316 Measurement setup
342 Measurement setup
The presented test structures were fabricated on 300mm wafers(Figure A1) each containing 472 identical dies For IO accessa general purpose on-wafer measurement system was used thatallowed direct contact with the wafer and the 2 times 12 test padmodule through probecard pins (Figure A2) All IO signals wereprovided by standard laboratory instruments controlled throughthe general purpose interface bus (GPIB) port of a PC runninga customised LabVIEW2 program The measurement setup isillustrated in Figure 316
The voltage source of the CBCM pseudo-inverters was set at1V and the CBCM pulse frequencies were swept in the range100kHz minus 1MHz For each measured test structure the DC cur-rent measurement was stored in non-volatile memory and sub-
2 LabVIEW is a visual programming language from National Instruments forautomating measurement equipment in a laboratory setup
34 electrical characterization 37
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
SET InputPulse generator
(Non-overlapping with RESET)
RESET InputPulse generator
(Non-overlapping with SET)
A[REF 1 10] InputSMU (Inverter source - 1V
supply)
B[REF 1 10] InputSMU (Inverter source - 1V
supply)
C[10 15 2050]
InputSMU (Inverter source - 1V
supply)
D[TOP BTM] InputSMU (Inverter source - 1V
supply)
R[1 2] InputSMU (Inverter source - 1V
supply)
E[REF C RC] Output Oscilloscope (high-impedance)
Table 32 PRC1 test pad instrument connections
sequently the prober progressed to the next die measuringall dies on the wafer The IO test pads of the PRC1 module(Figure 37) and the instrument connections are summarised inTable 32
343 Experimental results
There were two wafers available for measurements named in thistext rsquoD11rsquo and rsquoD18rsquo which were selected from the same lot Eachwafer contained 472 identical dies of the TOP 3D tier measuredusing the measurement setup described in 342 Measurementdata were extracted only from test structures rsquoArsquo (Figure 39) andrsquoCrsquo (Figure 311) which were analysed using standard statisticalmethods
When utilising statistical methods to analyse data acquired ina laboratory environment it is common for some data pointsto be significantly away from the other measurements Thesedata points are termed outliers (Appendix B) and usually resultfrom measurement errors such as bad probe pin contacts ormalfunctioning dies on the wafer Since the following analysisis focused on TSV capacitance variability these few outliers areexcluded from the data set so as to improve analysis accuracy
34 electrical characterization 38
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 454
Mean capacitance 709 f F
Standard deviation (σ) 19 f F
Coefficient of variation (3σmicro) 80
Table 33 TSV capacitance D2D variability on wafer rsquoD11rsquo
The statistical descriptors used for analysing the measurementdata are elaborated in Appendix B and include calculations ofmean standard deviation and coefficient of variation Theseparameters are used to evaluate the TSV capacitance in terms ofdie-to-die (D2D) wafer-to-wafer (W2W) and TSV pitch variabilityThe accuracy of the CBCM results is evaluated by comparing themto standard LCR measurements Furthermore the correlationbetween the extracted TSV capacitance and simulated oxide linerthickness is shown signifying the predominant role of the oxideliner on the observed TSV capacitance variability
Die to die (D2D) variability
The die-to-die (D2D) variability is evaluated by measuring thecapacitance of the single TSV of structure rsquoArsquo (Figure 39) acrossall dies on the wafer The measurements are repeated on bothavailable wafers (rsquoD11rsquo and rsquoD18rsquo)
On wafer rsquoD11rsquo 18 dies showed a capacitance value severalstandard deviations away from the calculated mean of the totalpopulation These outliers indicated a measurement error or mal-functioning on the die and they were removed from the statisticalanalysis In the remaining 454 dies the mean capacitance of theexperimental data was 709 f F with a standard deviation (σ) of19 f F (Table 33) Standard deviation is an important indicator al-lowing us to determine whether the variance in the experimentaldata is expected or if there is a random error affecting the meas-urements As can be seen in Figure 317 706 of the measuredvalues are within plusmnσ of the mean capacitance value Since mostof the experimental data are within a standard deviation of themean value the variability observed is insignificant and expectedthus the measurement results can be considered valid
On wafer rsquoD18rsquo from the 472 total dies 58 were removed asoutliers The mean capacitance of the experimental data was895 f F with a standard deviation (σ) of 22 f F (Table 34) Theprobability density function can be seen in Figure 318 with 691
34 electrical characterization 39
-60 -40 -20 00 20 40 60 80
σ-σ
706
Deviation from mean (fF)
0
005
01
015
02
025
Figure 317 TSV capacitance D2D variability on wafer rsquoD11rsquo - Probabil-ity density function
Structure rsquoArsquo (Fig 39)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 414
Mean capacitance 895 f F
Standard deviation (σ) 22 f F
Coefficient of variation (3σmicro) 74
Table 34 TSV capacitance D2D variability on wafer rsquoD18rsquo
of the experimental data within plusmnσ of the mean capacitancevalue
Correlation between TSV capacitance and oxide liner thickness vari-
ation
The TSV capacitance as extracted from the experimental measure-ments is plotted for wafers rsquoD11rsquo and rsquoD18rsquo (Figures 319 and 320)The capacitance value in the figures is normalised to the calcu-lated mean while xy axis indicate die coordinates on the waferWe observe that on both wafers the capacitance is near the meanvalue at the periphery of the wafer and reduces as we approachto the centre indicating a common cause of variability on bothwafers
Parameters affecting the TSV capacitance are the geometry ofthe TSV (length diameter) and oxide liner characteristics (dielec-tric constant thickness) The variation of the oxide liner thicknessacross the wafer can be predicted by simulation means Oxideliner simulation results for the specific process and fabrication
34 electrical characterization 40
σ-σ
691
Deviation from mean (fF)-70 -50 -30 -10 10 30 50 70
0
002
004
006
008
01
012
014
016
018
02
Figure 318 TSV capacitance D2D variability on wafer rsquoD18rsquo - Probabil-ity density function
075
080
085
090
095
100
105
110
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
075
080
085
090
095
100
105
Wafer area
Figure 319 TSV capacitance measurement results on wafer D11
34 electrical characterization 41
086088090092094096098100102104106108
0
5
10
15
20
25
30
-30-25
-20-15
-10-5
Norm
aliz
ed
capacitance
Die X
Die Y
086
088
090
092
094
096
098
100
102
104
106
108
Wafer area
Figure 320 TSV capacitance measurement results on wafer D18
tools can be seen in Figure 321 The liner thickness in the graphis normalised to the mean value and xy axis indicate die coordin-ates on the wafer As can be observed the oxide liner thickness isnear the mean value at the periphery of the wafer and increasesto a peak value when approaching the centre Since the TSV ca-pacitance is inversely proportional to the oxide liner thicknesstherersquos a strong correlation between the two variations suggestingthe oxide liner thickness to be the major cause of die-to-die (D2D)variability on wafers rsquoD11rsquo and rsquoD18rsquo
Wafer to wafer (W2W) variability
Probability density functions for both wafers rsquoD11rsquo and rsquoD18rsquo areplotted in Figure 322 Comparing the two experimental data sets(Tables 33 and 34) we can see that although the main statisticaldescriptors have not changed much (standard deviation is slightlylarger in D18) the mean capacitance is much larger in rsquoD18rsquo(895 f F) when compared to rsquoD11rsquo (709 f F) The difference is 262capacitance increase from wafer to wafer Taking into accountthat both wafers are from the same lot this result indicates theneed for better controllability on process parameters
34 electrical characterization 42
109
100
095Nor
mal
ized
oxi
de th
ickn
ess
-5-10 -15 -20 -25 -30
510
1520
2530
0Die Y
Die X
Figure 321 Simulated oxide liner thickness variation
60 65 70 75 80 85 90 95 1000
005
01
015
02
025D11 D18
plusmnσ
plusmnσ
Capacitance (fF)
Figure 322 TSV capacitance W2W variability - Probability density func-tions
34 electrical characterization 43
Structure rsquoCrsquo (Fig 311)
TSV dimensions (ld) 40microm5microm
Oxide liner thickness 200nm
Total dies 472
Processed dies (excluding outliers) 424
Experiment C10 C50
Number of TSVs 4 4
Mean capacitance 4073 f F 4125 f F
Standard deviation (σ) 84 f F 74 f F
Population within σ 695 653
Table 35 Variable pitch TSV capacitance experimental data from waferrsquoD18rsquo
Variable pitch TSV capacitance
The variable pitch TSV capacitance was extracted using structurersquoCrsquo (Figure 311) TSVs with pitch distances at 10microm and 50microm
were measured (experiments C10 and C50) on wafer rsquoD18rsquo Fromthe 472 total dies 424 were functional and the measurementresults are summarised in Table 35 The extracted capacitancevalues represent the 4 parallel-connected TSVs which also includethe parasitics of Metal 2 connectivity wires
As can be observed in the probability distribution functions(Figure 323) the measurement data for experiments C10 and C50
are very close to each other and there is an overlap between theirstandard deviations Even though there is an observable changein mean capacitance between the two experiments the overlap ofstandard deviations does not allow to conclude whether the capa-citance change is due to difference in pitch or process variability
To decide whether the change observed in capacitance is signi-ficant we run a KruskalndashWallis (Appendix B) one-way statisticalanalysis of variance in Minitab3 Assuming that the null hypo-thesis is true we calculate a p-value of 0 indicating that the twodata sets are different and that the pitch has indeed an affect theTSVs capacitance value The pitch dependence of the capacitancewith the same direction has been also previously observed onImecrsquos 130nm 3D process [23]
Measurement accuracy of CBCM
The capacitance measurements extracted using the CBCM tech-nique were compared to conventional LCR measurements (Fig-ure 324) on wafer rsquoD18rsquo Comparing the two measurements we
3 Minitab is a program that implements common statistical functions It is de-veloped by Minitab Inc (httpwwwminitabcom)
35 summary and conclusions 44
376 380 384 388 392 396 400 404 408 412 416 420 424 428 4320
001
002
003
004
005
006C10 C50
Capacitance (fF)
plusmnσC10
plusmnσC50
695 653
Figure 323 Variable pitch TSV capacitance experimental data fromwafer rsquoD18rsquo - Probability density function
-80 -60 -40 -20 00 20 40 60 80 1000
002
004
006
008
01
012
014
016
018σ-σ
688
Deviation from mean (fF)
Figure 324 LCR experimental data from wafer rsquoD18rsquo - Probability dens-ity function
observe in Table 36 that the standard deviations of the two datasets are similar while mean capacitance is sim 8 smaller in theLCR method However the LCR method produced more out-liers than CBCM That was expected as LCR cannot extract thecapacitance of a single TSV but can only measure groups of TSVs
connected in parallel (32 in our example) so as to increase thetotal measured capacitance This increases the chance of measur-ing TSVs with defects which in turn affects the accuracy of theLCR extracted capacitance
35 summary and conclusions
In this chapter various test structures based on the CBCM tech-nique were presented for the electrical characterization of theTSV parasitic capacitance The test structures were implementedon a 65nm 3D process technology and measured using a gen-eral purpose on-wafer measurement system The measurement
35 summary and conclusions 45
Method LCR CBCM
Total dies 472 472
Processed dies (excluding outliers) 393 414
Mean capacitance 820 f F 895 f F
Standard deviation (σ) 23 f F 22 f F
Coefficient of variation (3σmicro) 86 74
Table 36 Comparison between CBCMLCR measurements on waferrsquoD18rsquo
results were statistically analysed in terms of TSV capacitancedie-to-die (D2D) and die-to-wafer (D2W) variability The calculatedD2D variability was found to be reasonable while a comparisonto simulated data of the oxide liner thickness variation revealedthe oxide liner as a major contributor to the observed TSV capa-citance variability However D2W variability was extensive andfeedback was provided to the fabrication engineering team forfurther improvement of the process The results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process [23]
The comparison of CBCM-based measurements to conventionalLCR gave similar results while in addition the CBCM techniquewas shown to produce less outliers (less defective samples) in themeasurement data Considering the simplicity of CBCM and thatit has acceptable accuracy for extracting the capacitance of singleTSVs it can be a valuable tool for process characterization in theearly research efforts of 3D integration
The ability of CBCM for measuring single-ended TSVs could alsoprove beneficial for testing TSVs prior to die stacking which is ancritical step for keeping 3D-IC yield acceptable [47] Extracting TSV
parasitic capacitance can provide important information about theTSV device fabrication quality since slight variations in processparameters will greatly affect its parasitic capacitance as wasdemonstrated in this chapter Techniques based on RC constantmeasurement and sense amplification have been proposed in theliterature for a similar role [16] however they are significantlymore complex to implement than CBCM
In the following chapter another critical TSV characteristic ther-momechanical stress is evaluated using a test structure designedand implemented on the same 65nm 3D process
4T S V P R O X I M I T Y I M PA C T O N M O S F E TP E R F O R M A N C E
41 introduction
The integration of through-silicon via (TSV) 3D interconnects inCMOS technology has raised concerns over the impact on activedevices implemented on the same silicon (Si) substrate The largecoefficient of thermal expansion (CTE) mismatch between the TSV
filling material and Si substrate has been shown to induce ther-momechanical stress and deform the surface in proximity to theTSV Since thermomechanical stress can have an effect on activedevice carrier mobility [71] it is essential that the stress impacton MOSFET performance is kept under control Monitoring theTSV-induced thermomechanical stress effect on MOSFET deviceperformance and generating keep-out-zone (KOZ) guidelines fordigital and analog circuits is an important milestone for design-ing TSV-aware circuits with predictable performance
Simulations using finite element method (FEM) modelling [40]predict that TSV-induced thermomechanical stress has a com-plex distribution that depends on numerous conditions such asTSV geometry device type channel orientation and proximityHowever currently published experimental measurements onthermomechanical stress investigate just a small number of test-cases typically limited to a single TSVchannel orientationaxis[53 70 81] The complex distribution of thermomechanical stressemphasises the need for sophisticated characterization structuresthat can monitor stress impact with precision and verify simula-tion models
In this chapter a test structure is presented for monitoring theimpact of TSV-induced thermomechanical stress on MOSFET deviceperformance The test structure uses proven design techniquesfor implementing a large number of test-cases and accessingnumerous devices with maximum precision Furthermore thetest structure is fabricated on an experimental 65nm 3D process(Table 23) and used for characterizing that process Some of thepresented circuits and results have previously appeared in thefollowing publications [48 49 58]
42 tsv-induced thermomechanical stress
Individual TSV integration technologies use different conductivefill materials which may include copper (Cu) tungsten (W) or
46
42 tsv-induced thermomechanical stress 47
Material CTE at 20 ˚C (10minus6K)
Si 3
Cu 17
W 45
Table 41 Coefficients of thermal expansion
Figure 41 TSV thermomechanical stress simulation [40]
polysilicon [20] The coefficient of thermal expansion (CTE) mis-match (Table 41) between the fill material and the silicon (Si) caninduce strain on the surface in proximity to the TSV This deform-ation is caused by thermomechanical stress along the XY-planeduring the cooling-down phase [25] due to the filling materialrsquoshigher-than-Si contraction rate
Simulations using finite element method (FEM) modelling showthat the thermomechanical stress exhibits two-fold rotationalsymmetry with tensile and compressive stresses concentratedon separate axis [40] The thermomechanical stress distributiondepends on TSV geometrical characteristics as well as the inter-action with other TSVs Figure 41 illustrates the simulated stressdistribution of a single TSV while in Figure 42 the interactionbetween two near-by TSVs is shown As can be observed whenmultiple TSVs are in proximity to each other stress-free areas canbe generated as a result of destructive interaction
It is well known that stress can affect carrier mobility in MOS-
FET devices In fact strained-silicon technology processes havebeen in use for years as means to enhance device performance inseveral logic families [71] However in TSV-based 3D integrationthe induced strain is not intentional since it is a characteristic in-trinsic to the TSV process technology and thus the resulting shiftin device performance might not be always desirable Character-ization of the TSV impact on device performance and definition ofdesign rules such as keep-out-zones (KOZs) for digital and ana-log circuits is an important milestone for implementing TSV-awarecircuits with predictable performance
Simulation results [65] have shown that the mobility shift de-pends on various conditions such as TSV geometry device type
43 test structure implementation 48
Figure 42 TSV thermomechanical stress interaction between two TSVs[40]
proximity and channel orientation In Figure 43 the mobilityshift of transistors between two TSVs at variable distances is simu-lated As can be observed the effect is positive and more intenseon PMOS devices while for n-channel MOSFET (NMOS) the mobil-ity shift is mostly negative
The mobility shift due to TSV proximity has been also con-firmed by experimental data [53 70 81] and in most cases theperformance shift was found to be within acceptable limits How-ever currently published experimental results investigate a smallnumber of test-cases typically limited to a single TSVdevice ori-entationaxis As was shown in Figures 41 and 42 and discussedin the text the thermomechanical stress distribution is complexand depends on various factors In order to evaluate the impactof TSV thermomechanical stress accurately and under differentconditions more sophisticated test structures are required Suchapproach is attempted in the test structure which is presented inthe following sections
43 test structure implementation
TSV-induced thermomechanical stress can be compressive in somedirections tensile in others and itrsquos effect on device mobilitymay vary depending on MOSFET typeorientationdistance orinteraction with other TSVs To accurately characterize the effectof TSV proximity on MOSFET device performance a test structurewas designed and fabricated on Imecrsquos 65nm 3D process whichattempted to cover as many test-cases as possible
43 test structure implementation 49
Figure 43 Estimated mobility shift on PMOS (a) and NMOS (b) tran-sistors in between two TSVs separated by distance d [65]
The test structure consisted of MOSFET arrays superimposedover TSVs such that MOSFET devices surrounded the TSVs in alldirections at variable distances The purpose of these experimentswas to indirectly observe the thermomechanical stress effect ontransistor mobility by measuring the saturation current of MOSFET
devices at various distances and directions in relation to the TSVsSeveral configurations of TSVs and MOSFET arrays were investig-ated so as to characterize the impact of single and multiple TSVsas well as to observe the effect on different device types
Stacked silicon dies were not fabricated thus all devices wereimplemented on the TOP die Si substrate as illustrated in Fig-ure 32 (page 25) The test structure inputoutput (IO) was ac-cessible through a standard 2 times 12 test pad module on the TOP
die compatible with our wafer probe test system In total two testpad modules were designed for this experiment PTP1 and PTP2implementing NMOS-type and PMOS-type arrays respectively Testpad module PTP1 along with the test structure layout is illus-trated in Figure 44 while a micrograph image of the fabricatedchip can be seen in Figure 45
43 test structure implementation 50
A0
B0
C0
D0
E0
F0
G0
H0
I0
J0
A90
B90
E90
F90
G90
J90
DS3 DF3
DF2 DS0
DS2 DF0
DF1 DS1
SS SF
DEC6 VDD
DEC7 GND
DEC8 VG
DEC4 DEC9
DEC5 GND
DEC2 DEC3
DEC0 DEC1
Arraydecoder
Draindecoder
Figure 44 PTP1 test pad mod-ule (layout)
Figure 45 PTP1 test pad mod-ule (micrograph)
43 test structure implementation 51
Metal2Metal1PolyActive
TSV
MOSFETs
07μm007μm 0deg
07μm007μm 90deg
05μm05μm 90deg
05μm05μm 0deg
Figure 46 MOSFET dimensions and rotations
431 MOSFET arrays
Each MOSFET array was a separate experiment consisting of 37 times37 transistors which could be either NMOS-type or PMOS-typeshort channel (07microm times 007microm) or long channel (05microm times 05microm)as well as at 0ordm or 90ordm rotation in relation to TSVs (Figure 46)
The MOSFET arrays enclosed TSVs in one of five different config-urations 1 TSV (Figure 47) 4 TSVs diagonal (Figure 48) 4 TSVs
lateral (Figure 49) 9 TSVs (Figure 410) and no-TSVs In total therewere 16 configurations for each MOSFET array type (NMOSPMOS)as listed in Table 42 The location of each MOSFET array in thetop-level design is illustrated in Figure 44
The number of available IO test pads on each module was2 times 12 which imposed a limitation for accessing all 21904 (16 times37 times 37) MOSFETs To overcome that restriction only a subset ofthe transistors in each MOSFET array were wired out (16times 16) andthus electrically accessed using digital selection logic The restof the transistors in the 37 times 37 MOSFET array were inserted asdummy devices placed in the gaps to ensure uniform boundaryconditions and good matching between transistors (Figure 411)
Since thermomechanical stress exhibits two-fold rotational sym-metry as discussed in 42 the stress effect is the same for tran-sistors on the same axis whether they are placed eastwest ofthe TSV on the latitudinal axis or northsouth of the TSV on thelongitudinal axis The symmetry of the stress distribution wasexploited in the design of the MOSFET arrays so as to increase theresolution of the experiment despite the limited number of tran-sistors As can be seen in Figure 411 the transistors in the arrayare placed in an irregular grid with a 2microm pitch Some transistorsare placed on odd-number distances (1microm 3microm) in relation tothe TSV while others on even-number distances (2microm 4microm)Nevertheless rotational symmetry allows the evaluation of thethermomechanical stress at all directions with a 1microm resolution
43 test structure implementation 52
TSV
MOSFETs Substrate taps
Figure 47 1 TSV MOSFET array configuration
MOSFETs Substrate taps
TSV TSV
TSV TSV
Figure 48 4 diagonal TSVs MOSFET array configuration
43 test structure implementation 53
MOSFETs Substrate taps
TSV
TSV TSV
TSV
Figure 49 4 lateral TSVs MOSFET array configuration
MOSFETs Substrate taps
TSV TSV TSV
TSV TSV TSV
TSV TSV TSV
Figure 410 9 TSVs MOSFET array configuration
43 test structure implementation 54
Configuration RotationFET
width(microm)
FETlength(microm)
Name
0 0deg 07 007 A0
1 0deg 07 007 B0
4diagonal 0deg 07 007 C0
4lateral 0deg 07 007 D0
9 0deg 07 007 E0
0 0deg 05 05 F0
1 0deg 05 05 G0
4diagonal 0deg 05 05 H0
4lateral 0deg 05 05 I0
9 0deg 05 05 J0
0 90deg 07 007 A90
1 90deg 07 007 B90
9 90deg 07 007 E90
0 90deg 05 05 F90
1 90deg 05 05 G90
9 90deg 05 05 J90
Table 42 MOSFET array configurations for test pad modulesPTP1PTP2
43 test structure implementation 55
2μm 1μm
Metal2Metal1PolyActive
FET
DUMMY
TAP
TSV
Figure 411 MOSFET array detail
432 Digital selection logic
Each MOSFET array was composed of 256 active transistors (16 times16) Test modules PTP1 and PTP2 implemented 16 array con-figurations each with 4096 devices in total (16 con f igurations times256 transistors) which could be accessed individually throughthe 24 IO pads of the 2 times 12 test pad modules Individual tran-sistor Gate Drain and Source terminals were multiplexed tothe IO pads using a combination of decoders and transmission-gates (Figure 412) The transmission-gates were used as analogswitches for current flow while the decoders enabled the corres-ponding transmission-gates for accessing the externally selectedtransistor for measurement
The procedure for selecting a single MOSFET device for meas-urement was as follows
bull Array (0 to 15) was enabled using the 4 minus bit Array-selectdecoder The MOSFET Source terminal (S) common to alltransistors in a single array was electrically connected toIO pads ldquoSFrdquoldquoSSrdquo when enabled
bull MOSFET Gate terminal (G0 to G15) was connected to IO
pad ldquoVGrdquo using the 4 minus bit Gate-select decoder and corres-ponding transmission-gates
bull MOSFET Drain terminal (D0 to D15) was connected to IO
pads ldquoDFnrdquoldquoDSnrdquo using the 2 minus bit Drain-select decoder
43 test structure implementation 56
D0
D15
DF[0-3]
DS[0-3]
Drain-select decoder (2-bit)
SSF
SS
Array-select decoder (4-bit)
VG
Gate-select decoder (4-bit)
G0 G15
MOSFET Array
Figure 412 Digital selection logic
and corresponding transmission-gates Drain terminals werealways enabled in groups of 4 and connected sequentiallyto IO pads ldquoDF0rdquoldquoDS0rdquo to ldquoDF3rdquoldquoDS3rdquo thus the 2 minus bit
Drain-select decoder was adequate for accessing all 16Drain terminals in the array
When a MOSFET is switched on a voltage drop is expectedbetween the IO pads and the transistor DrainSource terminalsdue to the resistance of wires To guarantee accurate voltagedelivery transistor DrainSource terminals are ldquosensedrdquo on theldquoDSnrdquoldquoSSrdquo IO pads so as to ensure that the applied voltageon IO pads ldquoDFnrdquoldquoSFrdquo supplies the desired voltage values onthe transistor terminals This procedure known as four-terminalsensing [29] is typically automated when using laboratory meas-urement equipment such as a source measurement unit (SMU)
A single MOSFET array with its digital selection logic can beseen in layout view in Figure 413 Transistor Gate terminals areselected using a 4minus bit decoder (Figure 414) while the externallyapplied voltage on IO pad ldquoVGrdquo passes through transmissiongates (Figure 415) The transmission gates schematic used forthe DrainSource terminals can be seen in Figure 416 The Drainterminal transmission gates are enabled in groups of 4 usinga 2 minus bit decoder (Figure 417) If a MOSFET array is not in theselected state by the Array-select decoder Gate terminals areforced at GND voltage level (for PTP1) or to VDD voltage (forPTP2) so that non-selected MOSFETs remain turned-off insteadof floating
43 test structure implementation 57
Gatedecoder
Metal2Metal1PolyActive
Transmissiongates (DS)
Transmission gates (G)
MOSFETArray
Figure 413 MOSFET array with digital selection logic
43 test structure implementation 58
BLK
A[0] A[1] A[2] A[3]
DEC[0-15]
Figure 414 4-bit Gate-enableArray-enable decoder
43 test structure implementation 59
DEC
VGGVDD
Transmission gate
Pull-downtransistor
Figure 415 Transmission gate for MOSFET Gate terminals
Transmission gates
FORCE
SENSE
DRAINSOURCE
EN
VDD
Figure 416 Transmission gates for MOSFET DrainSource terminals
43 test structure implementation 60
A[0] A[1]
DEC[0-3]
Figure 417 2-bit Drain-select decoder
44 experimental measurements 61
Configuration Type Name Wafer
1 No-TSVPMOS
(05microm times 05microm)F0 D08
2 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D08
3 1-TSV 0ordm 25 ordmCPMOS
(05microm times 05microm)G0 D20
4 1-TSV 0ordm 60 ordmCPMOS
(05microm times 05microm)G0 D08
54-lateral TSV 0ordm 25
ordmCPMOS
(05microm times 05microm)I0 D08
6 1-TSV 90ordm 25 ordmCPMOS
(05microm times 05microm)G90 D08
7 1-TSV 0ordm 25 ordmCPMOS (07microm times
007microm)B0 D08
Table 43 MOSFET array configurations selected for experimental eval-uation
44 experimental measurements
Test modules PTP1 and PTP2 implementing the presented MOS-
FET arrays were fabricated on 300mm wafers (Figure A1) eachcontaining 472 identical dies Two wafers were selected for meas-urements referred to as ldquoD08rdquo and ldquoD20rdquo both originating fromthe same lot The test module used for experimental evaluationwas PTP2 containing the PMOS-device arrays which were expec-ted to have the highest performance impact from thermomech-anical stress as was discussed in 42 The specific MOSFET arraysthat were selected for evaluation are listed in Table 43
441 Data processing methodology
For each MOSFET array evaluated the drain-to-source current(IDS) of all transistors in the array was measured in the saturationregion (VDS = 1V VGS = 1V) Subsequently the saturation cur-rent was divided with the median current value of all transistorsin the array generating IDS values normalised to the median Themedian of a data set represents the arithmetic value that the ma-jority of values tend (Appendix B) Since only a small percentageof the 256 transistors in the array are in proximity to the TSVsthe median IDS value represented typical transistor saturationcurrent not affected by thermomechanical stress
Measurement precision is increased with repeated measure-ments as random noise is removed from the experiment reducing
44 experimental measurements 62
ABCDEFBBC
FCDDFBCFFC
FCB
A
BAACADE
CECBBC
CFFCDE
B$BCFFBDBCBF
amp(BFC)B
BC
FCDDFA
CAF
Figure 418 Processing of measurement results
the standard deviation of the sampled mean (Appendix B) Con-sequently each MOSFET array was measured across 98 identicaldies on the wafer and their results were averaged Since the satur-ation current shift due to thermomechanical stress is systematicacross all dies averaging the saturation currents only removedthe random noise effectively increasing the measurement pre-cision by a factor of sim 10 (
radic98) The described procedure for
processing the measurement results is illustrated in Figure 418Each MOSFET array contains 256 transistors which are not
placed in a regular 16times 16 grid as can be seen in Figure 411 Thisapproach was chosen so as to minimise the number of transistorsin an array while allowing for 1microm resolution in the evaluation ofthe thermomechanical stress by taking advantage of its two-foldrotational symmetry as discussed in 431 However illustratingmeasurements in a contour graph requires a regular grid of dataThus for illustration purposes the experimental measurementswere reconstructed to a 33 times 31 regular grid (1023 transistors)with the intermediate values estimated using linear interpolation(Figure 419)
442 Measurement setup
For IO access a general purpose on-wafer measurement sys-tem was used that allowed direct contact with the wafer and the2times 12 test pad modules through probecard pins (Figure A2) IO
signal generation and measurement was provided by standard
44 experimental measurements 63
ABCDECDBF
ABECBF
AB
AB
Figure 419 Interpolation of measurement results
laboratory instruments controlled through the GPIB port of a PCrunning a customised LabVIEW program Four-terminal sens-ing for MOSFET DrainSource terminals was automated using aKeithley 2602 dual-channel SMU [1] The measurement setup isillustrated in Figure 420
The IO test pads for the PTP1PTP2 modules can be seen inFigure 44 while the instrument connections used for each testpad are summarised in Table 44
ABCCDE
F
AA
A
DBCB
Figure 420 Measurement setup
44 experimental measurements 64
Test pad Direction Instrument connection
VDD Power Power supply - 1V
GND Power Power supply - 0V
DEC[0-9] InputDigital Output
(GATEDRAINARRAY SelectDecoders Input)
DF[0-3] Input SMU (Drain Force terminal)
DS[0-3] Output SMU (Drain Sense terminal)
SF Input SMU (Source Force terminal)
SS Output SMU (Source Sense terminal)
VG Input Digital Output (Gate voltage)
Table 44 PTP1PTP2 test pad module instrument connections
443 PMOS (long channel)
No-TSV configuration
This MOSFET array does not contain any TSVs thus transistormobility is not affected by thermomechanical stress The arraywas measured on 98 identical dies of wafer ldquoD08rdquo the saturationcurrents normalised and then averaged over all measured diesusing the procedure elaborated in 441 In Figure 421 the norm-alised saturation current for each transistor in the array can beseen after interpolation to a 33 times 31 grid Each coordinate in thefigure represents a 1microm distance
As can be observed the majority of transistors in the array havealmost identical saturation currents as expected However there isa distinctive stripe running from north to south where the currentvalue is reduced up to sim 5 This revealed a flaw in the designof the MOSFET arrays that can be best observed in Figure 411The transistors that are located adjacent to the substrate tap donot perceive the same boundary conditions as the rest of thetransistors in the array That difference is enough to affect thephysical characteristics of these transistors during fabricationwhich in turn determine transistor parameters such as saturationcurrent This flaw stresses out the significance of careful boundarycondition design when targeting for good transistor matching ina test structure
1 TSV configuration
In this configuration long-channel MOSFETs (05microm times 05microm) areplaced in the periphery of a single TSV (Figure 47) The transistorsin the array were measured over 98 identical dies of wafer ldquoD08rdquo
44 experimental measurements 65
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
096
097
098
099
100
101
Figure 421 Normalised on-current for PMOS structure ldquoF0rdquo (D08)
and processed according to 441 The normalised saturation cur-rent for each transistor in the array can be seen in Figure 422 afterinterpolation to a 33 times 31 grid It can be clearly observed that thetransistors in proximity to the TSV show a distinctive variationin their saturation current in comparison to the other transist-ors in the array which is the direct consequence of TSV-inducedthermomechanical stress These results also demonstrate that theproposed test structure design has enough measurement preci-sion to distinguish thermomechanical stress from backgroundnoise and variability The north to south stripe observed in theprevious configuration due to boundary condition mismatch isalso present in these measurements
To observe the TSV effect on nearby transistors more accuratelythe normalised saturation current versus distance is plotted foraxis X = 16 and Y = 14 (Figure 423) which cross the TSV almostat its the centre In the graph only actual measurements are plot-ted prior to interpolation We observe that the saturation currentof a transistor at 1microm distance from the TSV increases sim 20 onthe latitudinal axis and decreases sim 15 on the longitudinalwhile the effect is almost negligible after 8microm
The same measurement is repeated on wafer ldquoD20rdquo In Fig-ure 424 the normalised saturation current for each measuredtransistor in the array can be seen after interpolation to a 33 times 31grid Also the normalised saturation current versus distance isplotted for axis X = 16 and Y = 14 (Figure 425) In the graph
44 experimental measurements 66
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
105
110
115
120TSV
Figure 422 Normalised on-current for PMOS structure ldquoG0rdquo (D08)
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 423 Normalised on-current for PMOS structure ldquoG0rdquo(D08Y=14X=16)
44 experimental measurements 67
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
09
10
11
12
13
TSV
Figure 424 Normalised on-current for PMOS structure ldquoG0rdquo (D20)
we observe that the saturation current of a transistor at 1microm dis-tance from the TSV increases sim 30 on the latitudinal axis anddecreases sim 18 on the longitudinal while the effect is almostnegligible after 8microm These results show that thermomechanicalstress can be quite different at distances very close to the TSV
even in between wafers originating from the same lotFinally the measurement is repeated on wafer ldquoD08rdquo after
increasing the temperature to 60 ordmC Normalised current contourplot can be seen in Figure 426 and measurements for X = 16and Y = 14 in Figure 427 The saturation current of a transistor
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
09
1
11
12
13
14
Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 425 Normalised on-current for PMOS structure ldquoG0rdquo(D20Y=14X=16)
44 experimental measurements 68
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115TSV
Figure 426 Normalised on-current for PMOS structure ldquoG0rdquo (D0860˚C)
at 1microm distance from the TSV increases sim 16 on the latitudinalaxis (minus4 in comparison to 25 ordmC) and decreases sim 11 on thelongitudinal (minus4 in comparison to 25 ordmC) while the effect isalmost negligible after 6microm (minus2microm in relation to 25 ordmC)
4 lateral TSV configuration
In this configuration the thermomechanical stress is evaluatedin the lateral direction between 4 TSVs The array is measuredover 98 identical dies of wafer ldquoD08rdquo at 25 ordmC The contour plot ofthe normalised saturation current can be seen in Figure 428 andmeasurements for X = 16 and Y = 14 in Figures 429 and 430
respectively As can be observed the saturation current of atransistor at 1microm distance from the TSV increases sim 18 on thelatitudinal axis (minus2 in relation to 1 TSV) and decreases sim 12 onthe longitudinal (minus3 in relation to 1 TSV) while TSV interactionreduces the effect range to 5microm (minus3microm in relation to 1 TSV)
90˚ rotation TSV configuration
In this configuration the orientation of the transistor channel inrelation to the thermomechanical stress is rotated to 90˚ Thearray is measured over 98 identical dies of wafer ldquoD08rdquo at 25 ordmCThe contour plot of the normalised saturation current can beseen in Figure 431 and measurements for X = 16 and Y = 14in Figure 432 As was expected the stress effect on mobility has
44 experimental measurements 69
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115
12Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 427 Normalised on-current for PMOS structure ldquoG0rdquo (D08Y=14 X=16 60˚C)
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
085
090
095
100
105
110
115
TSV
TSV TSV
TSV
Figure 428 Normalised on-current for PMOS structure ldquoI0rdquo (D08)
44 experimental measurements 70
2 1 1 2 3 4 5 6 7 7 6 5 4 3 2 1 1 2 309
095
1
105
11
115
12
X16
Distance (μm)
Normalized
Ion
TSV TSV
Figure 429 Normalised on-current for PMOS structure ldquoI0rdquo (D08X=16)
3 2 1 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 1 208208408608809092094096098
1102
Y14
Distance (μm)
Normalized
Ion
TSV TSV
Figure 430 Normalised on-current for PMOS structure ldquoI0rdquo (D08Y=14)
44 experimental measurements 71
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
090
095
100
105
110
115
120TSV
Figure 431 Normalised on-current for PMOS structure ldquoG90rdquo (D08)
been reversed between the two axis The saturation current of atransistor at 1microm distance from the TSV decreases sim 13 on thelatitudinal axis and increases sim 20 on the longitudinal whilethe effect is almost negligible after 6microm
444 PMOS (short channel)
In this configuration short-channel MOSFETs (07microm times 007microm) areplaced around a single TSV (Figure 47) The contour plot of thenormalised saturation current can be seen in Figure 433 and
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
08509
0951
10511
11512
125Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 432 Normalised on-current for PMOS structure ldquoG90rdquo (D08Y=14 X=16)
44 experimental measurements 72
X
0 5 10 15 20 25 30
Y
0
5
10
15
20
25
30
094
096
098
100
102
104
106
108
110
TSV
Figure 433 Normalised on-current for PMOS structure ldquoB0rdquo (D08)
measurements for X = 16 and Y = 14 in Figure 434 At at 1microm
distance from the TSV the saturation current decreases 7 onthe longitudinal axis (minus8 in relation to long channel) while onthe latitudinal axis no stress effect could be detected That couldbe attributed either to short-channel effects or malfunctioningof the specific transistor due to proximity to the TSV and theexperimental nature of the manufacturing process On both axisthe effect is almost negligible after 5 minus 6microm
14 12 10 8 6 4 2 1 1 3 5 7 9 11 1308
085
09
095
1
105
11
115Y14X16
Distance (μm)
Nor
mal
ized
Ion
TSV
Figure 434 Normalised on-current for PMOS structure ldquoB0rdquo (D08Y=14 X=16)
45 summary and conclusions 73
Size
Waf
er
Con
figu
rati
on
Rot
atio
n
Eff
ect
1micro
mla
t
Eff
ect
1micro
mlo
ng
KO
Z
L D08 1-TSV (25 ordmC) 0ordm +20 minus15 8microm
L D20 1-TSV (25 ordmC) 0ordm +30 minus18 8microm
L D08 1-TSV (60 ordmC) 0ordm +16 minus11 6microm
L D08 4-lat TSV (25 ordmC) 0ordm +18 minus12 5microm
L D08 1-TSV (25 ordmC) 90ordm minus13 +20 6microm
S D08 1-TSV (25 ordmC) 0ordm sim 1 minus7 8microm
Table 45 Summary of experimental measurements on long (L) andshort (S) channel PMOS devices
45 summary and conclusions
In this chapter a test structure was presented for monitoringthe impact of TSV-induced thermomechanical stress on MOSFET
device performance The test structure implemented arrays ofMOSFET devices superimposed with TSVs in various configura-tions Digital selection logic allowed access to the MOSFET deviceswhich could be used to indirectly monitor thermomechanicalstress by measuring the saturation current of transistors
The structure was implemented in Imecrsquos experimental 65nm
3D process and used to characterize the process technology whileevaluating the test structure Experimental measurements werecarried on PMOS devices and the thermomechanical stress wasfound to be on par with simulation estimations
In Figure 45 the measurements are summarised for the latit-udinal and longitudinal axis at 1microm distance from the TSV alongwith the recommended keep-out-zone (KOZ) The KOZ is indic-ative and represents the distance where the thermomechanicalstress effect on saturation current is almost negligible (lt 1) Ina practical case KOZs will depend on the tolerated saturation cur-rent mismatch between devices which is a direct consequence oftiming and other circuit constraints Typically digital circuits cantolerate and thus function properly with much more transistorvariability than analog circuits
Comparing the experimental results of this work to other pub-lished data in the literature reveals that the thermomechanicalstress effect varies dramatically with different process character-istics Experimental measurements reported by Cho et al [17]confirm that long channel devices are more sensitive than short
45 summary and conclusions 74
channel however both the saturation current shift and KOZs areconsiderably reduced to 2 and lt 2microm respectively In additionthe dependency of the stress effect on TSV position was not ob-served by the authors in contrast to the measurements presentedin this work and predictions by theoretical models Contradictoryresults between theory and experiments and between differentexperiments emphasise the need for better understanding of thespecific mechanisms involved in the TSV-induced thermomech-anical stress
Evaluation of the measurements results allowed for some con-clusions to be made on the effectiveness of the presented teststructure
1 It is desirable for transistors in the array to be in a regu-lar grid so as processing of the measurement data is lesscomplex
2 The boundary conditions of transistors have an consider-able effect on transistor matching and thus measurementaccuracy
3 Distances of interest are generally lt 8microm therefore thearrays could be smaller or more dense in the area in prox-imity to the TSV than further away
4 The experimental process used to implement the test struc-ture had low yield which stressed out the need for mech-anisms to detect faults in the digital selection logic
The experiences learned from this project can be potentially usedfor designing improved thermomechanical stress characterizationstructures for next generation TSV processes In a future workit would be interesting to perform experimental measurementson NMOS devices which were not evaluated in this project Fur-thermore the cause of the glitch measured on the short-channeltransistor nearest to the TSV should be further investigated
In the following chapter the potential of the energy-recoverytechnique is investigated for reducing the energy dissipation ofTSV interconnects in future 3D-SoC designs implementing a largenumber of TSVs
5E VA L U AT I O N O F E N E R G Y- R E C O V E RY F O R T S VI N T E R C O N N E C T S
51 introduction
TSV-based 3D-ICs enable low-parasitic direct connections betweenfunctional blocks in a SoC by replacing traditional long horizontal2D global interconnects with short vertical ones (21 page 5)Compared to 2D interconnects the reduced resistance-capacitance(RC) parasitics of TSVs improve both speed (RC) and energy(12CV2
DD) performance of circuits However the TSV parasiticcapacitance may still become an important source of energydissipation in large densely interconnected 3D-SoCs since thecombined capacitance and thus the energy required to drive TSVs
increases linearly with the number of tiers and interconnections(Figure 51)
The principal technique for reducing energy dissipation inCMOS has traditionally been based on supply voltage scaling[14] though this represents a technical challenge below the 65nm
node since PVT variations and physical limits decrease in practicethe range over which voltages may be scaled An alternativelow-power design technique is energy-recovery logic [31] whichdemonstrates frequency-dependent energy dissipation and thusdoes not suffer from the limitations of voltage scaling The originsand operating principle of energy-recovery logic were discussedin detail in 22 (page 13)
In this chapter the potential of the energy-recovery techniqueis investigated for reducing the energy dissipation of TSV inter-connects in 3D-ICs The total energy dissipation per cycle andoptimum device sizing are extracted for the proposed schemeusing theoretical modelling while the configuration is evaluatedagainst conventional static CMOS Some of the presented methodsand results have appeared previously in Asimakopoulos et al[7]
52 energy-recovery tsv interconnects
As was discussed in 22 (page 13) energy-recovery circuits con-serve energy by restricting current-flow across resistances andsubsequently recovering part of the supplied energy using time-variable power sources The energy dissipated in an energy-
75
52 energy-recovery tsv interconnects 76
1XTSV
Tier1
TSV
Tier0LOGIC 0
LOGIC 0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n Tier1
Tier0
TSV
LOGIC 0
LOGIC 0
TSV
LOGIC n
LOGIC n
Tier(m-1)
Tier(m)
mXnXTSVFigure 51 TSV energy dissipation per cycle increase with 3D integra-
tion density
recovery circuit becomes proportional to the charging period (T)of the capacitive load (CL) [82 60]
EER prop
(
RCL
T
)
CLV2DD (51)
In comparison a conventional CMOS circuit driving an equi-valent capacitive load would dissipate during the same chargingperiod T [59]
ECMOS =12
CLV2DD (52)
Combining Equations 51 and 52 we calculate the energydissipation relation between the two circuits
EER
ECMOSprop
2RCL
T(53)
From Equation 53 we observe that the conditions under whichenergy-recovery conserves energy when compared to conven-tional CMOS are when either the charging period (T) is large orthe resistancecapacitance (RCL) of the load is small
In a 3D-IC TSV parasitics in terms of resistance-capacitance (RC)are inherently low by design as was discussed in 21 (page 5)In fact as was demonstrated by Katti et al [27] resistance isnegligible and thus the electrical behaviour of a TSV in a 3D-IC
52 energy-recovery tsv interconnects 77
Tier2Tier1TSV
LOGIC 1
LOGIC 2
driver
TSV
LOGIC 1
LOGIC 2
CMOS
driver
Convert
Energy-recovery
PCG
Figure 52 Comparison of a typical CMOS TSV driver to the proposedenergy-recovery
can be modelled as a single capacitance due to its predomin-ant impact on energydelay Therefore and based exclusivelyon Equation 53 energy-recovery has an advantage over con-ventional static CMOS when used for driving low-parasitic TSV
interconnectsThe differences between a typical CMOS driver in a TSV-based
3D-IC and the energy-recovery scheme proposed in this chapterare illustrated in Figure 52 A logic block in Tier1 generates adigital output which is regenerated by a driver and receivedby a second logic block in Tier2 after passing through the TSVIn the proposed energy-recovery scheme as an intermediatestep the digital output is converted to an adiabatic signal beforepassing through the TSV so as to take advantage of the low-power characteristics of energy-recovery Compatibility betweenthe receiving digital logic block in Tier2 and energy-recovery isretained by restoring the adiabatic signal back to its digital form
Converting the digital signal to an adiabatic as depicted inthe simplified diagram of Figure 52 requires the use of a spe-cialised driver that can combine (mix) the time-variable signalof the power-clock generator (PCG) with a digital output Signalsgenerated by a PCG return to zero at the end of a cycle resem-bling the operation of a digital clock (222 page 18) In contrastdigital outputs are typically allowed to have arbitrary values fromone cycle to the next thus a special structure is required thatcan encode the PCG generated signal to arbitrary binary values
52 energy-recovery tsv interconnects 78
IN IN
ININ
OUT OUT
IN
IN
OUT
OUT
Figure 53 Adiabatic driver
Figure 54 A simple pulse-to-level converter (P2LC)
An example of such structure is the adiabatic driver [8 84] (Fig-ure 53) Adiabatic drivers are composed of a pair of transmissiongates switching onoff alternately based on the value of a digitalinput On each cycle the time-variable adiabatic signal is redirec-ted either to the driverrsquos inverting or non-inverting output andas a result the data carried by the adiabatic signal are dual-railencoded
The signals at the outputs of the adiabatic driver can be restoredto their digital form after passing through the TSV by convertingthe dual-rail encoded sinusoidal pulses back to level signals usinga pulse-to-level converter (P2LC) [72] (Figure 54)
For generation of the time-variable adiabatic signal a resonantPCG may be used such as the one discussed in 222 (page 18)A resonant PCG generates continuous sinusoidal pulses approx-imating the idealised (linear) adiabatic signal of Figure 52 Theadiabatic driverrsquos resistance (RTG) along with the capacitive load(CL) assume a crucial role in the generation of the resonant si-nusoidal pulse as their presence in the signal path along with
53 analysis 79
VDD2
Rind
RTG
AdiabaticDriver
CL
M1
T=1f
L ind
RTG
CL
Figure 55 Resonant pulse generator with dual-rail data encoding
Tier2
Tier1
f-fPULSE
VDD2CLK IN
Delay1CLK1
CLK0
Delay2CLK2OUT1
Data In
Data Out
CLKRES
TSV TSV TSV
CLKBufferAdiabatic
Driver
P2LC
f-f
Data1
M1
Figure 56 Energy-recovery scheme for TSV interconnects
a resonant-tank inductor form an RLC oscillator [44] (Figure 55)resonating at
f =1
2πradic
LindCL(54)
A detailed block diagram of the proposed energy-recoveryscheme including the PCG and dual-rail encoding is illustratedin Figure 56
53 analysis
The energy dissipation relation depicted in Equation 51 is basedon the assumption that an idealised (linear) adiabatic signalis used for charging the TSV capacitance Furthermore energy
53 analysis 80
losses occurring in the additional components used in the realisticenergy-recovery circuit illustrated in Figure 56 are not consideredin Equation 51 The purpose of this section is to analyse andmodel all possible sources of energy dissipation in the proposedenergy-recovery scheme
The bulk of the energy dissipation in the energy-recovery cir-cuit of Figure 56 will occur in the adiabatic driver (EAD) theinductorrsquos parasitic resistance (Rind) transistor M1 (EM1) and theP2LC (EP2L) Therefore the total energy dissipated per cycle inthe energy-recovery circuit would be
EER = EAD + Eind + EM1 + EP2L (55)
Parameters EAD Eind and EM1 are theoretically derived in thefollowing subsections while EP2L depends on the specific P2LCs
architecture used and could be easily extracted from simulationdata as will be shown later in this chapter
531 Adiabatic driver
There are two sources of energy dissipation in the adiabaticdriver of Figure 53 adiabatic dissipation caused by current flow-ing through the transmission gate resistance (RTG) and CMOS
dissipation caused by external stages driving the transmissiongate and pull-down NMOS transistor input capacitances
The sinusoidal pulses generated by the PCG chargedischargethe load capacitance (CL) through the transmission gate resistance(RTG) The full chargedischarge cycle of the load capacitanceconcludes during a period T = 1 f while the energy dissipatedon the transmission gate resistance (ETG) during that period isadiabatic and thus proportional to Equation 22 Appending acorrection factor ξ equal to π28 for a sine-shaped current [8 76]and multiplying the equation by 2 to calculate energy dissipationfor the full cycle
ETG = 2ξ
(
RTGCL12 T
)
CLV2DD =
π2
2(RTGCL f )CLV2
DD (56)
The cross-coupled PMOS pair of Figure 53 reduces the transmis-sion gate input capacitance by 12 when compared to the simpleadiabatic driver of Figure 215 (page 18) External CMOS stagesare not required to drive the PMOS pair however the PMOS gatecapacitance will appear as an additional capacitive load at theoutput of the driver Energy will still be dissipated for driving thePMOS pair but will be adiabatic and thus more energy efficient
53 analysis 81
Transmission gates are typically sized to have equal NMOS andPMOS parts as increasing the size of the PMOS only slightly im-proves the gate resistance while significantly increasing the inputcapacitance [78] Thus assuming both NMOSPMOS transistorsare equally sized to Wn the additional capacitive load appearingat the driverrsquos output due to the PMOS pair would be one gatecapacitance Cn In addition drainsource diffusion capacitance(CD) will be an important portion of the load since in each cycle6CD will be present in the current flow path (4 contributed bythe ON and 2 by the OFF transmission gate) Including the TSV
capacitance at the output of the adiabatic driver the combinedload capacitance seen by the PCG in any cycle would be
CL = CTSV + Cn + 6CD (57)
The transmission gate resistance (RTG) can be related to thegate capacitance by a ldquodevice technology factorrdquo (κTG) [43] whichwe can define for calculation convenience
κTG = RTGCn rArr RTG =κTG
Cn(58)
Assuming that the pull-down NMOS transistors of Figure 53are small CMOS dissipation is primarily the result of driving theNMOS part of the transmission gates which is CnV2
DD Combiningthe CMOS dissipation with Equations 56 57 and 58 gives thetotal dissipated energy per cycle in the adiabatic driver
EAD = CnV2DD +
π2
2κTG
Cnf [CTSV + Cn + 6CD]
2 V2DD (59)
The second term of Equation 59 has a consistent contributionon the energy dissipation in every cycle while the first term isdependent on data switching activity (D) We can also furthersimplify this equation by defining the diffusion capacitance asa fraction of the transistors input capacitance (CD = bCn) andassigning term π2
2 κTG f to a variable (y = π2
2 κTG f ) Equation 59then becomes
EAD = D middot CnV2DD +
y
Cn[CTSV + (6b + 1)Cn]
2 V2DD
=
[
Cn
(
D
y+ (6b + 1)2
)
+1
CnC2
TSV
]
middot yV2DD
+(12b + 2)CTSV middot yV2DD (510)
Since in Equation 510 Cn is the free parameter the first twoterms are inversely proportional and thus EAD is minimised when
53 analysis 82
they become equal Solving equation Cn
(
Dy + (6b + 1)2
)
= 1Cn
C2TSV
for Cn the optimum transmission gate input capacitance is calcu-lated for a given CTSV
Cn(opt) =
radic
[D
y+ (6b + 1)2]minus1CTSV (511)
Combining Equations 510 and 511 gives the energy dissipationof an optimally sized adiabatic driver
EAD(min) =
[radic
D
(6b + 1)2y+ 1 + 1
]
middot (12b+ 2)yCTSVV2DD (512)
532 Inductorrsquos parasitic resistance
The inductorrsquos (Lind) parasitic resistance Rind is proportional tothe Qind factor which typically depends on the inductorrsquos imple-mentation technology For the purposes of this analysis it can beestimated as
Qind =1
Rind
radic
Lind
CLrArr Rind =
1QindCL2π f
(513)
Energy dissipation in Rind is adiabatic and can be estimatedcombining Equations 56 and 513
Eind =π2
2(RindCL f )CLV2
DD =π
4CL
QindV2
DD (514)
533 Switch M1
Transistor M1 is switched-on briefly to recover the energy dissip-ated each cycle in the Rtotal = RTG + Rind Its energy dissipationis a trade-off between dissipation in its on-resistance RM1 andinput capacitance CM1
Since M1 is a fairly large transistor previous ratioed stages willhave a significant contribution to the energy consumption as wellFor that reason a m factor is used to compensate for the additionallosses IM1(rms) is the root mean square (RMS) current passingthrough the transistor while turned-on and VGM1 is the peak gatevoltage A methodology for deriving optimum values for boththese parameters is proposed in [44] and the total dissipatedenergy in M1 can be estimated as
EM1(min) = 2IM1(rms)VGM1
radic
mκM1
f(515)
54 simulations 83
Delay2
CLK1
CLK2Delay1
CLKW
PULSE
CLK0T
PD
OUT1
PULSE
P2L
RES
CLK
Figure 57 Energy-recovery timing constraints
534 Timing Constraints
The gate of transistor M1 in Figure 55 is controlled by a pulsewith period TCLK and width WPULSE For optimum energy ef-ficiency the digital inputs of the adiabatic driver in Figure 53should switch during the time period defined by WPULSE whilethe input values must be valid for the complete resonant clockcycle as shown in Figure 57
The adiabatic driverrsquos input data latching is controlled by clocksignal CLK1 (Figure 56) and thus its positive edge should takeplace during WPULSE (Delay1 lt WPULSE) If these constraints arenot met energy that could be otherwise recovered by the inductorwill be dissipated on M1 during WPULSE The clock signal at thereceiving end CLK2 does not have the same constraint howeveritrsquos positive edge should occur before the P2LC outputs new data(Delay2 lt WPULSE + PDP2L)
54 simulations
To verify and evaluate the proposed theoretical models a testcase was designed for which energy dissipation was calculatedusing both theory and SPICE simulations The test case assumed1 minus bit of information per cycle was transferred between 2 tiersof a 3D-IC in which the TSV capacitance (CTSV) was 80 f F TheTSV was driven using the energy-recovery technique proposedin this chapter implemented with a single adiabatic driver fordata encoding and an inductor (Lind Rind) for generation of the
54 simulations 84
M1IN IN
IN
OUT OUT
IN
Figure 58 Test case for the evaluation of the theoretical model
Q250MHz 500MHz 750MHz
T S e () T S e () T S e ()
6 360 409 -120 476 516 -78 578 613 -57
8 308 347 -114 417 449 -70 515 543 -52
10 276 312 -115 382 411 -70 477 498 -43
12 255 289 -119 358 386 -71 451 472 -44
Table 51 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulation (S)
adiabatic signal and energy recovery The circuit used for the testcase is illustrated in Figure 58
The energy dissipation on the adiabatic driver inductor andtransistor M1 were calculated using Equations 512 514 and 515with the Q factor and operating frequency acting as free variablesTechnology-specific parameters κTG κM1 and CD were extractedusing simulation models for Imecrsquos 130nm 3D process (Table 22page 11) The same circuit was simulated on Cadence VirtuosoSpectre simulator under identical conditions The energy dissipa-tion results for various Q factors and operating frequencies arelisted in Table 51
The calculated and simulated energy dissipation results arealso plotted in Figure 59 As can be observed theoretical es-timations are lower than than the equivalent results reported bythe simulator in all the cases examined That can be expected asthe calculations performed by the simulator are thorough andconsider much more details that are skipped in the theoreticalmodels Nevertheless there is good correlation between theoryand simulations
In Figure 510 the error in the estimation of the energy dissip-ation is plotted when using the theoretical model The Q factordoes not seem to have a significant effect on the error howeverthere is a considerable dependence on the operating frequency
54 simulations 85
5 6 7 8 9 10 11 12 130
10
20
30
40
50
60
70
T250 S250 T500 S500 T750 S750
Q factor
Energydissipation(fF)
Figure 59 Energy dissipation results ( f Jcycle) theoretical model (T)versus SPICE simulator (S)
5 6 7 8 9 10 11 12 13
-14
-12
-10
-8
-6
-4
-2
0
250MHz 500MHz 750MHz
Q factor
Error()
Figure 510 Energy dissipation estimation error of the theoretical modelcompared to the SPICE simulator
The parameters κTG κM1 and CD that were extracted for thespecific technology even though they were used as constants inthe models in practice they are frequency dependent Thereforeit could be said that based on the parameters chosen the theor-etical results are optimised for a specific frequency Increasingfrequency doesnrsquot necessary result in reduced error as seemsto be suggested by the graphs in Figure 510 In fact it is ex-pected that the estimation error will show a positive sign forlarger frequencies As was discussed previously in the text it isa reasonable expectation for the theoretical models to slightlyunderestimate the energy-dissipation
54 simulations 86
M1
TSV
TSV
TSV
Data in OutCMOS
OutER
Energy-recovery
CMOS
P2LC
Adiabaticdriver
Figure 511 Test case for comparison of energy-recovery to CMOS
541 Comparison to CMOS
When comparing different circuits it is typically desirable to takea holistic approach evaluating various aspects such as powerarea speed cost etc The purpose of this section is to comparethe energy-recovery scheme to a conventional static CMOS circuitusing just the theoretical models presented in this chapter Thisbasically limits the comparison to the energy dissipation andspeed parameters while on Chapter 6 (page 92) area is alsoconsidered The test case used for the comparison is illustratedin Figure 511 The TSV load capacitance was assumed to be 80 f F
for both circuitsThe energy dissipation of the CMOS circuit in Figure 511 is
described by Equation 52 Appending the data switching activity(D)
ECMOS = D middot 12
CLV2DD (516)
Equation 516 obviously calculates the theoretical minimumenergy dissipation since no parasitics and previous CMOS stagesare included In practice the energy dissipation of the CMOS
circuit can be much larger however Equation 516 is consideredadequate for the purposes of this comparison
Energy dissipation of the energy-recovery circuit is calculatedusing Equation 55 The energy contribution of each sub-componentthe adiabatic driver (EAD) the inductorrsquos parasitic resistance(Rind) and transistor M1 (EM1) were theoretically derived pre-viously in the text The P2LC energy was simulated in SPICEfor convenience The simple circuit topology of Figure 54 wasslightly improved adding the PMOS transistors seen in Figure 512so as to reduce short-circuit current when the input is in the range
54 simulations 87
Figure 512 Improved P2LC design
0 100 200 300 400 500 600 700 800 9000
2
4
6
8
10
12
14
Frequency (MHz)
Energy
dissipation(fJ)
Figure 513 Simulated energy dissipation of improved P2LC design
Vth rarr (VDD minus Vth) The energy dissipation results extracted us-ing SPICE can be seen in Figure 513 As can be observed theenergy dissipation is frequency dependent something not expec-ted in a CMOS circuit This behaviour is caused by the slow risingsinusoidal inputs which for low frequencies extend the periodduring which both PMOS and NMOS are on and thus increaseshort-circuit current
The Q factor is a critical parameter in the energy efficiency ofthe energy-recovery scheme since the entire circuit is basically anRLC oscillator A wide range of inductors are available as discretecomponents for use in circuits with Q factors depending on thesize and material On-chip integrated inductors have more re-stricted performance however their compact size is an advantageIntegrated Q factors are typically in the neighbourhood of 10for nH range inductors with sim 100microm radius [5] The Q factor
54 simulations 88
4 6 8 10 12 14 16 18 20 22-20
-10
0
10
20
30
40
50
60
800MHz 700MHz 600MHz 500MHz 400MHz 300MHz 200MHzQ factor
Energyred uction()
Figure 514 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable Q factor(C = 80 f F)
4 6 8 10 12 14 16 18 20 220050
100150200250300350400450
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 515 Energy contribution of energy-recovery circuit componentsat 200MHz (C = 80 f F)
values in the following simulations were deliberately chosen tobe in realistic ranges for integrated inductors
In Figure 514 the percentage reduction in energy dissipationachieved by the energy-recovery circuit is plotted when comparedto CMOS As expected energy efficiency of the energy-recoverycircuit improves with high Q factors and low frequencies
In Figures 515 and 516 the energy dissipation contributionof each component in the energy-recovery circuit can be seenat 200MHz and 800MHz respectively In both cases the energydissipated on the inductorrsquos resistance is reduced with higher Q
factors At lower frequencies the P2LC energy dissipation becomesa significant part of the total
In Figure 517 the percentage reduction in energy dissipationis plotted when the TSV capacitance (CTSV) is variable The en-ergy dissipation in the adiabatic driver (Equation 512) inductor(Equation 514) and transistor M1 (Equation 515) are linearlyrelated to the TSV capacitance Since the same applies to the CMOS
circuit (Equation 516) relative energy dissipation of both circuitsshould be constant with TSV capacitance The change observed in
54 simulations 89
4 6 8 10 12 14 16 18 20 2200
100
200
300
400
500
600
700
Inductor M1 Adiabatic driver P2LC
Q factor
Energy
con trib
ution(
)
Figure 516 Energy contribution of energy-recovery circuit componentsat 800MHz (C = 80 f F)
20 40 60 80 100 120 140 160 180 200 220-100
00
100
200
300
400
500
Q6 Q10 Q14 Q18
TSV capacitance (fF)
Energy
red u
ction(
)
Figure 517 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS with variable TSVcapacitance (C = 80 f F f = 500MHz)
Figure 517 is caused by the constant energy dissipation contribu-tion of the P2LC However as the load capacitance increases theenergy dissipation on the P2LC becomes insignificant in relationto the rest of the circuit components thus the reduction in energyis expected to asymptotically approach a maximum value
Switching activity (D) can also be a significant factor affect-ing energy performance Since in the energy-recovery circuit thesinusoidal oscillation may not be halted all capacitances in thecurrent flow path will charge and discharge on each cycle regard-less of data activity In contrast static CMOS ideally dissipatesenergy only when switching and thus the energy-recovery circuithas a disadvantage at low switching activities In Figure 518 theestimated effect of the switching activity on energy performanceis plotted for an operating frequency of 200MHz
Since the technology factors κTG and κM1 were extracted for a130nm process reducing their value by 12 could also provide uswith an estimation of the circuitrsquos energy performance for a 65nm
node The result in plotted in Figure 519 and as can be observedtechnology scaling has a positive effect on energy dissipationwhen compared to conventional static CMOS
54 simulations 90
4 6 8 10 12 14 16 18 20 22-40
-20
0
20
40
60
80
D=05 D=07 D=09 D=1
Q factor
Energyred uction()
Figure 518 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS (C = 80 f F f =200MHz)
4 6 8 10 12 14 16 18 20 220
10
20
30
40
50
60
130nm 65nm
Q factor
Energyred uction()
Figure 519 Energy dissipation reduction achieved by the energy-recovery circuit compared to CMOS for variable technology(C = 80 f F f = 500MHz)
55 summary and conclusions 91
55 summary and conclusions
The work presented in this chapter investigated the potential ofthe energy-recovery technique as used in adiabatic logic andresonant clock distribution networks for reducing the energy dis-sipation of TSV interconnections in 3D-ICs The proposed energy-recovery scheme for TSVs was analysed using theoretical model-ling generating equations for energy dissipation and optimumdevice sizing The model accuracy was evaluated against SPICEsimulations which showed good correlation with the theoreticalestimations on a 130nm technology process
The energy-recovery scheme was compared to conventionalstatic CMOS when both circuits driving equivalent TSV load ca-pacitances under similar conditions The analysis revealed theenergy performance dependence on multiple circuit parametersthe Q factor operating frequency switching activity and TSV capa-citance The results demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitancesFurthermore an estimation was provided on the energy perform-ance of the proposed scheme in an advanced technology nodewhich predicted further energy improvement
Encoding the PCG signal to arbitrary digital values using thedual-rail adiabatic driver (Figure 53) is only one of the possiblesolutions In fact energy could be saved if instead of dual-railanother encoding method is used such as 1-out-of-4 [74] In 1-out-of-4 encoding 4 outputs are used to represent 2-bits however onlyone of the outputs is charged at any given cycle In comparisondual-rail charges two outputs per cycle for the same number ofbits and thus 1-out-of-4 could potentially be saving up to 50energy when compared to dual-rail 1-out-of-4 encoding couldprove an interesting extension to the proposed energy-recoveryscheme
The theoretical models developed in this work can be used asa quick alternative to SPICE simulations for design-space explor-ation This is shown in the following chapter where the theoryestablished on the energy-recovery TSV scheme is used for design-ing a low-power 3D demonstrator circuit that was fabricated on a130nm process
6E N E R G Y- R E C O V E RY 3 D - I C D E M O N S T R AT O R
61 introduction
In the previous chapter a theoretical approach was taken toinvestigate the potential of the energy-recovery technique forreducing the energy dissipation of TSV interconnects in 3D-ICs Thetotal energy dissipation per cycle and optimum device sizing wereextracted for the proposed scheme using theoretical modellingwhile analysis revealed the design parameters affecting energyperformance
Practical design issues such as wire parasitics signal distri-bution area etc were not considered hence in this chapter a3D demonstrator circuit based on the energy-recovery scheme isdesigned under realistic physical and electrical constraints Thedemonstrator is compared to a CMOS circuit designed with thesame specifications using post-layout simulations Both circuitsare fabricated on a 130nm 3D technology process and evaluatedbased on the experimental measurements
62 physical implementation
The demonstrator circuit implemented a 500MHz 80-bit parallelIO channel on a 5-tier TSV-based 3D-IC The TSVs interconnectionswere driven using the energy-recovery scheme as discussed inChapter 5
The target process technology allowed for a low-resistivitybackside re-distribution layer (RDL) to be deposited which wasused to implement a high-Q integrated inductor Stacking of dieswas not possible in the target process so the 5-tier 3D-IC wasemulated by series-connecting TSVs in groups of 4 as is illustratedin Figure 61
The circuit consisted of a total of 640 TSVs (2 times 4 times 80bits) ofwhich half were electrically connected to the integrated inductorduring a single clock cycle The TSV capacitance for the fabrica-tion technology process had already been characterized and wasestimated at sim 40 f F (Table 22 page 11) Optimal device sizingand parameters for the energy-recovery IO circuit at 500MHz
were calculated using the methodology developed in Chapter 5
and are listed in Table 61In Figure 62 the schematic diagram of the energy-recovery
circuit is illustrated The main components comprising the circuitare the adiabatic drivers the pulse-to-level converters (P2LCs)
92
62 physical implementation 93
AB
AB
CDEFD
C
ABCD
EFD
C
Figure 61 Demonstrator circuit structure (showing 1-bit IO channelface-down view)
Adiabatic driver 80-bit circuit
Wn 461microm (Eq511) Rtotal 21Ω ( RTG80 )
RTG 168Ω (Eq58) Ctotal 165pF (80 middot Cload)
CL 207 f F (Eq57) Lind 613nH (Eq54)
Table 61 Energy-recovery IO circuit parameters
62 physical implementation 94
IntegratedInductor
VINDOER
PulseGenerator
S1AS2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
Figure 62 Energy-recovery IO circuit schematic diagram
Metal2Metal1PolyActiveVDD
GND
RES_CLK
DATA
OUT
OUT_B
DATA
OUT_B OUT
RES_CLK
GND
VDD
Figure 63 Adiabatic driver (layoutschematic view)
the data generator the pulse generator transistor M1 and theintegrated inductor All sub-components are discussed in detailin the following paragraphs
621 Adiabatic driver
The adiabatic driver was based on the configuration theoreticallyanalysed in 531 (page 80) The adiabatic driver had 2 inputsfor the resonant pulse (RES_CLK) and the digital data (DATA)signals and 2 outputs for the inverted (OUT_B) and non-inverted(OUT) signals The layout and schematic views of the adiabaticdriver are illustrated in Figure 63
62 physical implementation 95
Metal2Metal1PolyActive
IN
OUT_B
VDD
IN_B
OUT
GND
OUT_B
IN
GND
VDD
OUT
IN_B
Figure 64 P2LC (layoutschematic view)
622 Pulse-to-level converter
The pulse-to-level converter (P2LC) was based on the simple circuitof Figure 54 (page 78) The design in layout and schematic viewsis illustrated in Figure 64 The circuit had 2 inputs for the inverted(IN_B) and non-inverted (IN) resonant pulse and 2 outputs forthe inverted (OUT_B) and non-inverted (OUT) digital signals
623 Data generator
A data generator was used for generating the digital data sig-nal required at the adiabatic driverrsquos input (DATA) The circuitwhich is illustrated in Figure 65 was a simple D-type flip-flopconfigured as a toggle The data generator had 1 input a digitalclock signal (CLK) operating at the same frequency as the res-onant clock and the digital data output (Q) with a switchingactivity of 1
As discussed in 534 (page 83) there are certain timing con-straints that must be respected for the energy-recovery circuitto operate efficiently The digital data received at the adiabaticdriverrsquos input (DATA) should switch during the period definedby WPULSE (Figure 66) In the demonstrator circuit the distribu-tion network carrying the DATA signal to the adiabatic driverswas carefully designed so that its delay impact was less thanWPULSE
624 Pulse generator
The gate of transistor M1 in Figure 62 is controlled by a pulsehaving the same frequency as the resonant clock (RES_CLK)and width WPULSE which is calculated using the models in 533(page 82) This digital pulse can be generated externally of thechip using a highly accurate laboratory signal generator Howeverwire parasitics in the path from the external signal source to the
62 physical implementation 96
Metal2Metal1PolyActive EN
CLK
VDD
Q
GND
D
CLKGND
ENVDD
QF-F
D
GND
VDDCLK
D Q
Figure 65 Data generator (layoutschematic view)
Delay1
WPULSE
PULSE
CLKTCLK
DATA
CLK_RES
Figure 66 Timing constraints for the energy-recovery demonstratorcircuit
62 physical implementation 97
PULSES1A
S2
GND
VDD
Metal2Metal1PolyActive
VDD
S1
S2
GND
PULSE
Figure 67 Pulse generator (layoutschematic view)
S2ΔPhase
WPULSE
PULSE
S1A
Figure 68 Pulse generator InputOutput signals
gate of on-chip transistor M1 affect the rising and falling times ofthe pulse resulting in pulse widths which are difficult to control
For that reason the pulse generator was implemented on-chip(Figure 67) and could be used to generate pulses with arbitrarywidths and frequencies The pulse generator had 2 sinusoidalinputs (S1A S2) supplied externally of the chip and generateda digital pulse with the same frequency as S1AS2 The pulsewidth was defined by the phase difference between S1AS2 asillustrated in Figure 68
625 Integrated inductor
The integrated inductor was designed in ASITIC1 as a planarsquare spiral with fixed dimensions (500microm times 500microm) and vari-able inductance while aiming for optimum Q factor The simu-
1 ASITIC is CAD tool that aids in the optimization and modelling of spiralinductors (httprficeecsberkeleyedu~niknejadasitichtml)
62 physical implementation 98
100 200 300 400 500 600 700 8005
7
9
11
13
15
17
19
21
10
15
20
25
30
35
40
45
Q (Simulated) Energy Reduction ()
Frequency (MHz)
Qfactor
Energyreduction()
Figure 69 Inductorrsquos Q factor simulation in ASITIC Estimated energydissipation reduction per cycle over standard CMOS
lated Q factor of the inductor for each frequency was then usedin Equation 55 to estimate the energy performance of the energy-recovery circuit when compared to standard CMOS (Equation 52)
As can be observed in Figure 69 the maximum performance isnot reached at the point where the maximum Q factor is achievedsince energy-recovery is also dependent on the operating fre-quency As a result the largest reduction in energy dissipationwas 40 attained at 300MHz while the maximum Q factor re-ported by ASITIC was 20 at 625MHz The high Q factors werea consequence of the large area allocated for the inductor andthe availability of the backside re-distribution layer (RDL) on thetarget process technology for the implementation of the inductorwhere thick low-resistance copper paths could be deposited
For an operating frequency of 500MHz the inductor had a Qfactor of sim 17 which resulted in an estimated energy reductionof 33 in comparison to the CMOS circuit The inductor had avalue of 613nH at that frequency and its design in layout viewcan be seen in Figure 610
626 Top level design
The various sub-components of the circuit were interconnectedusing Metal1 and Metal2 layers In the design there were twoglobal signals that assumed a critical role in the performance ofthe circuit The digital data signal (DATA) which was discussedin 623 and the resonant clock (RES_CLK)
For the distribution of the data signal buffers and interconnectswere properly sized so that the timing constraints of Figure 66were respected A sinusoidal signal such as RES_CLK does notrequire regeneration as it carries just one frequency harmonichowever the path itself has to be properly sized so as to make surethat any parasitic resistancescapacitances are not significantlyaffecting the target performance of the design As was previously
62 physical implementation 99
500μ
m
500μm
Figure 610 Planar square spiral inductor designed in ASITIC
shown by Chueh et al [19] when the driven load capacitancein a resonant distribution network is much larger than wireparasitic capacitance then wire resistance is more significant fordetermining both skew and energy performance
The resonant pulse distribution topology used in this designwas an asymmetric tree as shown in Figure 611 in which theinductor forces a current at the root branch terminating on alladiabatic drivers in parallel As such and considering that theresonant pulse supplies 80 adiabatic drivers with transmissiongate resistance of RTG = 168Ω the total load resistance seen atthe output node of the inductor is RTG
80 = 16880 = 21Ω However as
the current flows to lower branch levels in the tree the perceivedresistance increases to RTG
16 RTG4 and on the last one to RTG Any
additional resistance from wires is going to add in series with theadiabatic driver resistance which in turn will induce an almostlinear effect on the energy performance as can be estimated fromEquation 512
Assuming that the total driven resistance is not desirable toincrease more than 20 and that there are 4 branch levels inthe design each branch can tolerate wire resistance equal to20
4 = 5 of its value Starting from the inductor output nodeand calculating allowed wire resistivity for each branch level thecalculated tolerances for each branch are listed in Table 62 Thesecalculated limits were used for optimally sizing the resonantclock distribution tree in the demonstrator circuit
To ensure a stable voltage supply at the input of the inductor(VIND) a 1pF interdigitated metal-insulator-metal (MIM) de-coupling capacitor [78] was designed (Figure 612) which wasinserted between VIND and ground The top level design of
62 physical implementation 100
Adiabatic drivers
RTG
16
RTG
80
RTG
4
RTG
Level 1Level 2Level 3Level 4
Figure 611 Resonant pulse distribution tree
Branch level Wire resistance
1st 5 middot 16880 = 105mΩ
2nd 5 middot 16816 = 525mΩ
3rd 5 middot 1684 = 21Ω
4th 5 middot 168Ω = 84Ω
Total load 168+8480 + 21
20 + 05255 + 0105 = 252Ω
Table 62 Calculated resistance tolerances for each wire branch in theresonant clock distribution tree
63 post-layout simulation and comparison 101
+-
+-Metal2 Metal1
180μ
m
100μm
Figure 612 Interdigitated metal-insulator-metal capacitor (fringe capa-citor)
the energy-recovery IO circuit in layout view is illustrated inFigure 613
627 CMOS IO circuit
The CMOS IO circuit which was designed for comparison pur-poses is illustrated in the schematic diagram of Figure 614 Thecircuit used half the number of TSVs than the energy-recoverycircuit and was composed of two basic sub-components The datagenerator which was identical to the one discussed in 623 andthe CMOS drivers (Figure 615) which were properly sized to drivethe 160 f F TSV capacitance
The top level layout view of the CMOS IO circuit can be seenin Figure 616
63 post-layout simulation and comparison
The theoretically calculated parameters were used for the designof the demonstrator circuit in Imecrsquos 3D130nm process technologyIn Table 63 the estimated energy dissipation of each circuit sub-component is listed
63 post-layout simulation and comparison 102
22mm
12mm
GND GNDVBUF1 VDR
GND
VIND
S1A
S2
EN
OER
Figure 613 Energy-recovery IO circuit in layout view
63 post-layout simulation and comparison 103
OCMOS
S1B DataGenerator
TSV
CMOSDriver
CMOSDriver
TSV Buffer
1
80
EN
Figure 614 CMOS IO circuit schematic diagram
GND
VCMOS
DATA
078u
039u
105u
053u
316u
158u
950u
475u
OUT
Figure 615 CMOS driver schematic
12mm
12mm
OCMOS S1B GND VBUF2 VCMOSGND
GND GNDEN VCMOS
Figure 616 CMOS IO circuit in layout view
63 post-layout simulation and comparison 104
Energy dissipation (80-bits cycle)
Inductor 11pJ (Eq514)
Switch M1 092pJ [ ]
P2LCs 098pJ (Simulated)
Adiabatic drivers 315pJ (Eq512)
Total 615pJ
Table 63 Energy-dissipation theoretical calculations for Q = 17 f =500MHz
After parasitics extraction the top-level design was simulatedin SPICE The post-layout simulation results for a selection ofoutput signals are plotted in Figure 617 As can be observedthe timing constraints discussed in 623 are respected since theadiabatic driverrsquos input signal (DATA) optimally switches duringthe 327ps PULSE width It can also be seen that the resonantclock (CLK_RES) does not fully complete its discharge cyclebefore it is being forced to do so by the PULSE signal since theadditional capacitive load induced by the wire parasitics reducesthe oscillatorrsquos natural frequency
Post-layout simulated energy dissipation as well as area re-quirements for the energy-recovery and CMOS IO circuits arelisted in Table 64 The energy dissipation of the energy-recoveryimplementation is 25 more at the post-layout level in compar-ison to the theoretical estimations The energy overhead can beattributed to the effect of the wire parasitics as discussed in626 and the transistor parasitics that are not included in thetheoretical equations Therefore the accuracy of the theoreticalmodels can be considered acceptable for designing circuits basedon the proposed energy-recovery scheme
As was expected the dual-rail energy-recovery circuit occupiesalmost double the area of the equivalent CMOS circuit and eventhough both implementations have increased energy dissipationin comparison to the theoretical calculations the CMOS circuitenergy increases relatively more The latter was anticipated sincethe theoretical estimation of the CMOS energy dissipation wascalculated using the simple Equation 52 which does not accountfor any additional driver stages and parasitics The end result isthat the comparative performance favours the energy-recoveryimplementation which improves its energy savings from the
63 post-layout simulation and comparison 105
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14RES_CLK
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14DATA
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
8 9 10 11 12 13 14 15 160
02
04
06
08
1
12
14OER
Time (ns)
V(V
)
Figure 617 Post-layout SPICE simulation of the energy-recovery cir-cuit
64 experimental measurements 106
AreaEnergy
dissipation(Theory)
Energydissipation
(Post-layout)
CMOS 144mm2922pJ
(115 f Jbit)
(Eq52)
1265pJ
(158 f Jbit)
Energy recovery 264mm2 615pJ
(77 f Jbit)
768pJ
(96 f Jbit)
Difference +83 minus333 minus393
Table 64 Post-layout simulated energy dissipation Area require-ments
theoretically calculated 333 up to 393 in the post-layoutsimulations
64 experimental measurements
The presented energy-recovery and CMOS IO circuits were fabric-ated on a 200mm wafer (Figure A3) which contained 47 identicaldies Micrograph images of the energy recovery and CMOS circuitscan be seen in Figures 618 and 619 respectively
641 Measurement setup
For IO access a manual on-wafer measurement system was usedthat allowed direct contact with the wafer and the IO test padsthrough RF probe heads (Figure A4) IO signal generation andmeasurement was provided by laboratory instruments whichwere manually controlled The measurement setup is illustratedin Figure 620
The IO test pads for the energy-recovery and CMOS circuitsand their equivalent instrument connections used for each testpad are summarised in Tables 65 and 66
642 Results
CMOS IO circuit
In Figure 621 the input signal S1B and the output signal OCMOS
are shown as captured by the oscilloscope Signal S1B whichhas a frequency of 500MHz is used by the data generator (Fig-ure 614) to generate a data signal switching at the same frequency
64 experimental measurements 107
Figure 618 Energy-recovery IO circuit micrograph
Figure 619 CMOS IO circuit micrograph
AABC
DEFF
D
EEBA
EBBA
Figure 620 Measurement setup
64 experimental measurements 108
Test pad Direction Instrument connection
VIND Power Power supply - gt06V
GND Power Power supply - 0V
S1A InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
S2 InputSine-wave generator (VariablePhase 500MHznon-50Ω-terminated)
EN InputDigital Output (enable outputdata switching)
OER Output Oscilloscope
VDR PowerPower supply (adiabaticdrivers+P2LCs) - 12V
VBUF1 Power Power supply (Buffers) - 12V
Table 65 Energy-recovery circuit instrument connections
Test pad Direction Instrument connection
VCMOS PowerPower supply (CMOS drivers) -12V
GND Power Power supply - 0V
VBUF2 Power Power supply (Buffers) - 12V
S1B InputSine-wave generator (0deg Phase500MHz non-50Ω-terminated)
OCMOS Output Oscilloscope
EN InputDigital Output (enable outputdata switching)
Table 66 CMOS circuit instrument connections
64 experimental measurements 109
S1B
OCMOS
Figure 621 Oscilloscope output for signals S1B and OCMOS
100 200 300 400 500 600 700 800012345678910
Die1 Die2 Simulation
Frequency (MHz)
Powerdissipation(mW)
Figure 622 Measuredsimulated power dissipation for the CMOS cir-cuit
that passes through the TSVs and is output on OCMOS Thereforethe measured result on the oscilloscope suggests that the circuitis operating as expected
The proper operation of the circuit is also verified by measuringpower and energy per cycle dissipation on pad VCMOS Thepower dissipation of the two dies measured is approximatelylinear as expected by simulations (Figure 622) while energydissipation approximately constant (Figure 623) In additionabsolute power and energy dissipation values closely match post-layout simulations
Energy-recovery IO circuit
Input signal S1A and output signal OER can be seen in Fig-ure 624 as captured by the oscilloscope The output at OER
64 experimental measurements 110
100 200 300 400 500 600 700 800024681012141618
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 623 Measuredsimulated energy dissipation for the CMOScircuit
S1A
OER
Figure 624 Oscilloscope output for signals S1A and OER
shows a data signal switching at 500MHz as expected The slowrising edge of the data signal suggests higher than anticipatedparasitics at the OER IO pad though the functionality of thecircuit is not affected
The energy dissipation is calculated by measuring the inputcurrent on IO test pads VIND and VDR The results for the twodies measured can be seen in Figure 625 The energy dissipationshows an approximately constant relation to frequency similar tothe CMOS circuit which does not correlate with post-layout simu-lations A properly operating energy-recovery circuit would beexpected to show minimum energy dissipation at the resonancefrequency (sim 500MHz)
Since the energy-recovery circuit shows CMOS-type energy dis-sipation it was estimated that the inductor could not resonate
65 summary and conclusions 111
200 250 300 350 400 450 500 550 600 65002468101214161820
Die1 Die2 Simulation
Frequency (MHz)
Energydissipation(pJ)
Figure 625 Measuredsimulated energy dissipation for the energy-recovery circuit
IntegratedInductor
VINDOER
PulseGenerator
S1A
S2
DataGenerator
TSV
TSV
P2LCAdiabaticDriver
P2LCAdiabaticDriver TSV
TSV
Buffer
Buffer
1
80
1
80
EN
DecouplingCapacitor
200pF
Figure 626 Energy-recovery IO circuit with fault hypothesis
properly and a hypothesis was made that the cause was a largerthan anticipated parasitic capacitance on the integrated inductorTo determine whether the hypothesis is valid a 200pF capacitoris inserted in the circuit schematic connected at the output ofthe inductor as shown in Figure 626 The new circuit is simu-lated and as can be seen in Figure 627 the output signal at OER
switches properly even though the resonant clock (RES_CLK) isnot oscillating The simulated energy dissipation of the circuitwith the fault hypothesis can be seen in Figure 628 and as canbe observed it is very similar to the experimental measurementsboth in terms of absolute values and frequency relation
65 summary and conclusions
In this chapter the design approach of a demonstrator circuitwas presented for evaluating the energy-recovery 3D interconnect-ing scheme Expanding on the theoretical analysis proposed inChapter 5 practical design issues were discussed and methodswere suggested for overcoming them
The circuit was designed and implemented on an experimental130nm 3D process technology and post-layout simulations res-
65 summary and conclusions 112
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15050
055
060
065
070
075
080RES_CLK
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 150
02
04
06
08
1
12
14PULSE
Time (ns)
V(V
)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1500
02
04
06
08
10
12
14OER
Time (ns)
V(V
)
Figure 627 Simulation of circuit with fault hypothesis
65 summary and conclusions 113
200 250 300 350 400 450 500 550 600 6500
5
10
15
20
25
Die1 Die2 Simulation CL=200pF
Frequency (MHz)
Ene
rgy
diss
ipat
ion
(pJ)
Figure 628 Measured energy dissipation and simulation of circuit withfault hypothesis
ults including parasitics were shown and interpreted The post-layout results revealed that the demonstrator circuit dissipated25 more energy per cycle than the value estimated by the-ory however the energy overhead was attributed to the effectof the wiretransistor parasitics which are not included in thetheoretical models Nevertheless the energy-recovery IO circuitdissipated 39 less energy per cycle than the equivalent CMOS
circuit implemented on the same process technologyExperimental measurements demonstrated that the fabricated
CMOS IO circuit closely matched the energy-dissipation estim-ated by post-layout simulations However the energy-recoveryIO circuit did not operate as expected and as a result its wasnot possible to evaluate the energy-recovery 3D interconnectingscheme on-chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
7C O N C L U S I O N S
Planar on-chip interconnects have become a major concern inmodern processes as their impact on IC performance has beenprogressively increasing with each technology node In recentyears 3D interconnects based on vertical TSVs have been proposedas a possible solution to the interconnect bottleneck TSVs havethe potential to offer the best all around technology for meet-ing future technology trends with their advantages includingreduced parasitics and high interconnection density
As TSV is a relatively recent 3D interconnect technology thereare still challenges to be overcome before TSVs can be widelyadopted and commercialised Issues in the fabrication of TSV
devices are in the process of being resolved or solutions havealready been found For implementing TSV-based 3D-ICs withmore complex functionalities than simple test chips 3D designmethodologies and tools are needed which are still at their in-fancy stage Accurate perception of the TSV characteristics thatinfluence circuit performance is a major step towards the devel-opment of design methodologies and tools for 3D-ICs
Key TSV characteristics are the TSV parasitic capacitance andthermomechanical stress distribution The parasitic capacitanceaffects delay and power consumption of signals passing throughthe TSV interconnect while thermomechanical stress affects MOS-
FET device mobility in the vicinity of TSVs For extracting theseparameters accurately and under different conditions sophist-icated test structures were presented in this work which werefabricated and evaluated on an experimental 3D process
Although the TSV parasitic capacitance is small it may stillbecome an important source of power consumption as TSVs areimplemented in large densely interconnected 3D-SoCs Since con-ventional low-power design techniques based on voltage scalingrepresent a technical challenge in modern technology nodesa novel TSV interconnection scheme with frequency-dependentpower consumption was proposed in this work based on energy-recovery logic The proposed scheme was analysed using the-oretical modelling while a demonstrator IC was designed andfabricated on a 130nm 3D process
The contributions of this thesis as presented in the previouschapters are summarised in the following sections as well aspossible directions for future work
114
71 main contributions 115
71 main contributions
From the parasitics associated with a TSV capacitance assumesthe most significant role on its electrical properties such as en-ergy dissipation and delay thus accurate characterization of theTSV capacitance is highly desirable when designing a 3D systemIn Chapter 3 a group of test structures were presented for theelectrical characterization of the TSV parasitic capacitance un-der various conditions The test structures were based on theCBCM technique which has been used in the past for measuringBEOL metal interconnect parasitic capacitances and MOSFET ca-pacitances The presented test structures were fabricated on a65nm 3D process and the measurement results were statisticallyanalysed in terms of TSV capacitance die-to-die (D2D) and die-to-wafer (D2W) variability Comparison of the measurement data tosimulated oxide liner thickness variation for the specific processrevealed the oxide liner as a major contributor to the observedTSV capacitance variability Measurement results also confirmedthe effect of TSV pitch on the parasitic capacitance as has beenpreviously reported in the literature for a different process Theproposed CBCM-based TSV capacitance measurement techniquewas shown to be superior to conventional LCR measurementsproducing both accurate measurement results as well as lessoutliers (less defective samples) in the measurement data
The integration of TSV 3D interconnects in CMOS technology hasbeen shown to induce thermomechanical stress on the silicon (Si)substrate Thermomechanical stress can have an impact on activedevice carrier mobility implemented on the same Si substratethus monitoring and controlling stress is crucial for designing TSV-aware circuits with predictable performance Simulations usingfinite element method (FEM) modelling predict a complex ther-momechanical stress distribution which emphasises the need forsophisticated characterization structures that can monitor stressimpact with precision and verify simulation models In Chapter4 a test structure was presented for indirectly monitoring of theTSV-induced thermomechanical stress distribution by measuringits effect on transistor saturation current The structure implemen-ted arrays of MOSFET devices superimposed with TSVs in variousconfigurations and enabled accurate measurement of individualdevices by combining digital selection logic with four-terminalsensing Imecrsquos experimental 65nm 3D process was characterizedusing the presented test structure extracting the thermomechan-ical stress effect on PMOS devices which was found to be on parwith simulation estimations
In future large densely interconnected 3D-SoCs the TSV parasiticcapacitance may become an important source of energy dissip-ation Energy-recovery logic is an interesting low-power design
72 future work 116
approach as it is not limited by voltage scaling which representsa technical challenge in modern technology nodes In Chapter5 the potential of the energy-recovery technique was investig-ated for reducing the energy dissipation of TSV interconnectsin 3D-ICs The proposed energy-recovery scheme was analysedusing theoretical modelling generating equations for energy dis-sipation and optimum device sizing The model accuracy wasevaluated against SPICE simulations which showed good cor-relation with the theoretical estimations on a 130nm technologyprocess Comparison of the energy-recovery scheme to conven-tional static CMOS demonstrated favourable energy performancefor the energy-recovery scheme for low operating frequenciesand high Q factors switching activities and TSV capacitances
The theoretical models developed in Chapter 5 were used fordesigning a energy-recovery 3D demonstrator circuit in Chapter6 under realistic physical and electrical constraints The energy-recovery circuit and its CMOS equivalent were fabricated on a130nm 3D process technology and post-layout simulations showedthat the energy-recovery circuit dissipated 39 less energy percycle than the CMOS circuit on the same process technology Exper-imental measurements and comparison to simulations confirmedthe proper operation of the fabricated CMOS circuit however theenergy-recovery circuit did not operate as expected and as a res-ult its was not possible to evaluate the proposed energy-recoveryscheme on a chip Using simulation data the fault was attributedto higher than anticipated integrated inductor parasitics
72 future work
The CBCM-based test structures in Chapter 3 were shown to be atleast as accurate as external instruments for characterizing TSV
capacitance Considering that CBCM can evaluate single-endedTSVs it could prove a valuable tool for testing TSVs prior to diestacking which is a critical step for keeping 3D-IC yield acceptableMeasuring the TSV parasitic capacitance can provide with import-ant information about the quality of the fabricated TSV devicesince even slight variations in process parameters will affect itsparasitic capacitance as was demonstrated in Chapter 3 CBCM issimpler and potentially more accurate than other techniques thathave been proposed in the literature for the same role such asRC constant measurement using sense amplification [16]
In Chapter 4 the experiences learned from evaluation of theexperimental measurements results could be potentially used fordesigning improved thermomechanical stress characterizationstructures for next generation TSV processes
1 The transistors should be in a regular grid in the array forsimple processing of the measurement data
72 future work 117
2 The boundary conditions of transistors should be carefullydesigned as they have an considerable effect on transistormatching and thus measurement accuracy
3 Distances of interest are generally lt 8microm thus the arrayscould be smaller and more dense in the area in proximityto the TSV
4 Mechanisms are required to detect faults in the digitalselection logic when the implementation process is expectedto have low-yield issues
Furthermore in a future work the thermomechanical stress effecton NMOS devices should be measured which were not evaluatedin this project Simulations predict that NMOS devices are muchless affected by thermomechanical stress than PMOS devices thusthe approach used in this work might not be sufficiently accurateto extract stress effect on NMOS mobility
In Chapter 5 the energy-recovery scheme proposed could use adifferent encoding technique than dual-rail as a possible improve-ment A suggested encoding is 1-out-of-4 [74] since only chargesa single output at any given cycle which can potentially save upto 50 energy in comparison to dual-rail Also the demonstra-tion of the energy-recovery scheme feasibility on a chip is stillpending since the energy-recovery 3D demonstrator discussedin Chapter 6 failed to operate properly The fault was attributedto higher than anticipated integrated inductor parasitics thusthe same circuit should be implemented in the future with animproved inductor design for verifying the proposed theory
AA P P E N D I X A P H O T O S
Figure A1 3D65 300mm wafer (on chuck)
Figure A2 Probe card
118
appendix a photos 119
Figure A3 3D130 200mm wafer (on chuck)
ProbeHead
Microscope
Figure A4 Probe head
BA P P E N D I X B S TAT I S T I C S
mean (micro)
In a given data set the mean is the sum of the values divided bythe number of values The mean represents the average value in apopulation If xi is a value in the data set and n the total numberof values the mean micro can be calculated as
micro =1n
n
sumi=1
xi
median (m)
The median is useful for skewed distributions and represents thearithmetic value that the majority of the values in a data set tendThe median of a normal distribution with mean micro and varianceσ2 is the same as the mean
standard deviation (σ)
Standard deviation is used for measuring variability in a dataset It shows how much variation there is from the mean valueTo calculate the standard deviation of a data set first the squareof the difference of each value from the mean is calculated andthen the square root of the mean gives the standard deviation
A normal distribution contains 682 of its values within plusmn1σOther distributions can have a different spread and typically alow standard deviation indicates low variability whereas a highstandard deviation increased variability
standard deviation of the mean (σmicro )
The standard deviation of the sampled mean (σmicro) provides in-formation on the precision of the calculated mean of a data setThis is especially useful in experimental measurements wherethe certainty of the data being an accurate representation of themeasured physical quantity increases the more the measurementis repeated
Assuming there is a data set X with n values xi of mean micro
120
appendix b statistics 121
variance(X) = σ2X
variance(micro) = variance( 1n sum
ni=1 xi) =
1n2 variance(sumn
i=1 xi) =1n2 sum
ni=1 variance(xi) =
nn2 variance(X) =1n variance(X) =
σ2Xn rarr
σmicro = σXradicn
coefficient of variation
The coefficient of variation is the ratio of the standard deviationσ to the mean micro
CV =σ
micro
It can be more useful in determining variability in a populationthan standard deviation as in contrast to standard deviationcoefficient of variation is a dimensionless number often expressedas percentage
outlier
Outliers are values in a data set that are numerically distant fromthe rest of the data Observing and sometimes removing theoutliers is crucial in experimental measurement as these valuesusually indicate a measurement error or other malfunction De-termining outliers is generally subjective to the observer as thereis no specific mathematical definition of what constitutes outliersTypically the outliers are located by observing the probabilitydistribution function of a data set
kruskalndash wallis [34]
KruskalndashWallis one-way analysis of variance compares the medi-ans of two or more data sets to determine if the samples originatefrom separate populations This analysis tests the null-hypothesisthat there is no relationship between the data sets calculating theprobability of the data sets originating from the same populationand rejecting the null-hypothesis if it is less than a significancelevel The KruskalndashWallis test can be used on data sets that donot have a normal distribution and their variances do not have tobe equal
B I B L I O G R A P H Y
[1] Model 2602 dual-channel system sourcemeter instru-ment URL httpwwwkeithleycomproductsdcac
currentsourcebroadpurposemn=2602
[2] Max-3d URL httpwwwmicromagiccom
[3] The international technology roadmap for semiconductors2007 URL httppublicitrsnet
[4] The international technology roadmap for semiconductors2009 URL httppublicitrsnet
[5] Jaime Aguilera Joaquin de No Andres Garcia-Alonso FrankOehler Heiko Hein and Josef Sauerer A guide for on-chip inductor design in a conventional cmos process forrf applications In Applied Microwave amp Wireless Magazinepages 56ndash65 October 2001
[6] M Arsalan and M Shams Charge-recovery power clockgenerators for adiabatic logic circuits In Proc 18th Int VLSI
Design Conf pages 171ndash174 2005 doi 101109ICVD200564
[7] P Asimakopoulos G Van der Plas A Yakovlev andP Marchal Evaluation of energy-recovering interconnectsfor low-power 3d stacked ics In Proc IEEE Int Conf
3D System Integration 3DIC 2009 pages 1ndash5 2009 doi1011093DIC20095306581
[8] W C Athas L J Svensson J G Koller N Tzartzanis andE Ying-Chin Chou Low-power digital systems based onadiabatic-switching principles 2(4)398ndash407 1994 doi 10110992335009
[9] C H Bennett Logical reversibility of computation IBM
Journal of Research and Development 17(6)525ndash532 1973 doi101147rd1760525
[10] Charles H Bennett The fundamental physical lim-its of computation Scientific American 253(1)48ndash561985 URL httpwwwnaturecomdoifinder101038
scientificamerican0785-48
[11] E Beyne 3d system integration technologies In Proc Int
VLSI Technology Systems and Applications Symp pages 1ndash92006 doi 101109VTSA2006251113
122
bibliography 123
[12] Mark Bohr and Kaizad Mistry Intelrsquos revolu-tionary 22 nm transistor technology 2011 URLhttpdownloadintelcomnewsroomkits22nmpdfs
22nm-Details_Presentationpdf
[13] Ke Cao S Dobre and Jiang Hu Standard cell charac-terization considering lithography induced variations InProc 43rd ACMIEEE Design Automation Conf pages 801ndash8042006 doi 101109DAC2006229327
[14] A P Chandrakasan S Sheng and R W Brodersen Low-power cmos digital design 27(4)473ndash484 1992 doi 1011094126534
[15] J C Chen B W McGaughy D Sylvester and Chenming HuAn on-chip attofarad interconnect charge-based capacitancemeasurement (cbcm) technique In Proc Int Electron Devices
Meeting IEDM rsquo96 pages 69ndash72 1996 doi 101109IEDM1996553124
[16] Po-Yuan Chen Cheng-Wen Wu and Ding-Ming Kwai On-chip tsv testing for 3d ic before bonding using sense ampli-fication In Proc ATS rsquo09 Asian Test Symp pages 450ndash4552009 doi 101109ATS200942
[17] Sungdong Cho Sinwoo Kang Kangwook Park Jaechul KimKiyoung Yun Kisoon Bae Woon Seob Lee Sangwook JiEunji Kim Jangho Kim Y L Park and E S Jung Impact oftsv proximity on 45nm cmos devices in wafer level In Proc
IEEE Int Interconnect Technology Conf and 2011 Materials for
Advanced Metallization (IITCMAM) pages 1ndash3 2011 doi101109IITC20115940326
[18] Kyu-Myung Choi An industrial perspective of 3d ic inte-gration technology from the viewpoint of design technologyIn Proc 15th Asia and South Pacific Design Automation Conf
(ASP-DAC) pages 544ndash547 2010 doi 101109ASPDAC20105419823
[19] Juang-Ying Chueh C H Ziesler and M C PapaefthymiouEmpirical evaluation of timing and power in resonant clockdistribution In Proc Int Symp Circuits and Systems ISCAS
rsquo04 volume 2 2004 doi 101109ISCAS20041329255
[20] Thuy Dao D H Triyoso M Petras and M CanonicoThrough silicon via stress characterization In Proc IEEE
Int Conf IC Design and Technology ICICDT rsquo09 pages 39ndash412009 doi 101109ICICDT20095166260
[21] M Grange R Weerasekera D Pamunuwa and H Ten-hunen Exploration of through silicon via interconnect par-
bibliography 124
asitics for 3-dimensional integrated circuits In Workshop
Notes Design Automation and Test in Europe (DATE) 2009
[22] N Z Haron and S Hamdioui Why is cmos scaling comingto an end In Proc 3rd Int Design and Test Workshop IDT
2008 pages 98ndash103 2008 doi 101109IDT20084802475
[23] Imec 200mm 3d-sic tsv line 2009 URL httpwwwimec
beScientificReportSR2009HTML1213307html
[24] J W Joyner R Venkatesan P Zarkesh-Ha J A Davis andJ D Meindl Impact of three-dimensional architectures oninterconnects in gigascale integration 9(6)922ndash928 2001doi 10110992974905
[25] A P Karmarkar Xiaopeng Xu and V Moroz Perfor-manace and reliability analysis of 3d-integration struc-tures employing through silicon via (tsv) In Proc IEEE
Int Reliability Physics Symp pages 682ndash687 2009 doi101109IRPS20095173329
[26] G Katti A Mercha M Stucchi Z Tokei D VelenisJ Van Olmen C Huyghebaert A Jourdain M RakowskiI Debusschere P Soussan H Oprins W DehaeneK De Meyer Y Travaly E Beyne S Biesemans andB Swinnen Temperature dependent electrical character-istics of through-si-via (tsv) interconnections In Proc Int
Interconnect Technology Conf (IITC) pages 1ndash3 2010 doi101109IITC20105510311
[27] G Katti M Stucchi K De Meyer and W Dehaene Electricalmodeling and characterization of through silicon via forthree-dimensional ics 57(1)256ndash262 2010 doi 101109TED20092034508
[28] G Katti M Stucchi J Van Olmen K De Meyer and W De-haene Through-silicon-via capacitance reduction techniqueto benefit 3-d ic performance 31(6)549ndash551 2010 doi101109LED20102046712
[29] Walt Kester Sensor signal conditioning In Jon SWilson editor Sensor Technology Handbook pages 31 ndash136 Newnes Burlington 2005 ISBN 978-0-75-067729-5 doi DOI101016B978-075067729-550044-6 URLhttpwwwsciencedirectcomsciencearticlepii
B9780750677295500446
[30] Jonggab Kil Jie Gu and C H Kim A high-speed variation-tolerant interconnect technique for sub-threshold circuitsusing capacitive boosting 16(4)456ndash465 2008 doi 101109TVLSI2007915455
bibliography 125
[31] Suhwan Kim C H Ziesler and M C PapaefthymiouCharge-recovery computing on silicon 54(6)651ndash659 2005doi 101109TC200591
[32] C Kortekaas On-chip quasi-static floating-gate capacitancemeasurement method In Proc Int Conf Microelectronic Test
Structures ICMTS 1990 pages 109ndash113 1990 doi 101109ICMTS199067889
[33] A Kramer J S Denker B Flower and J Moroney 2ndorder adiabatic computation with 2n-2p and 2n-2n2p logiccircuits In Proceedings of the 1995 international symposium on
Low power design ISLPED rsquo95 pages 191ndash196 New York NYUSA 1995 ACM ISBN 0-89791-744-8 doi httpdoiacmorg101145224081224115 URL httpdoiacmorg10
1145224081224115
[34] William H Kruskal and W Allen Wallis Use of ranks in one-criterion variance analysis Journal of the American Statistical
Association 47(260)583ndash621 1952 ISSN 01621459 doi 1023072280779 URL httpdxdoiorg1023072280779
[35] Kelin J Kuhn Moorersquos law past 32nm Future challengesin device scaling 2009 URL httpdownloadintelcom
pressroompdfkkuhnKuhn_IWCE_invited_textpdf
[36] R Landauer Irreversibility and heat generation in the com-puting process IBM Journal of Research and Development 5
(3)183ndash191 1961 doi 101147rd530183
[37] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae A 16-bitcarry-lookahead adder using reversible energy recoverylogic for ultra-low-energy systems 34(6)898ndash903 1999 doi1011094766827
[38] Joonho Lim Dong-Gyu Kim and Soo-Ik Chae nmos re-versible energy recovery logic for ultra-low-energy applica-tions 35(6)865ndash875 2000 doi 1011094845190
[39] F Liu X Gu K A Jenkins E A Cartier Y Liu P Song andS J Koester Electrical characterization of 3d through-silicon-vias In Proc 60th Electronic Components and Technology Conf
(ECTC) pages 1100ndash1105 2010 doi 101109ECTC20105490839
[40] K H Lu Xuefeng Zhang Suk-Kyu Ryu J Im Rui Huangand P S Ho Thermo-mechanical reliability of 3-d ics con-taining through silicon vias In Proc 59th Electronic Compo-
nents and Technology Conf ECTC 2009 pages 630ndash634 2009doi 101109ECTC20095074079
bibliography 126
[41] Nir Magen Avinoam Kolodny Uri Weiser and NachumShamir Interconnect-power dissipation in a microprocessorIn Proceedings of the 2004 international workshop on System
level interconnect prediction SLIP rsquo04 pages 7ndash13 New YorkNY USA 2004 ACM ISBN 1-58113-818-0 doi httpdoiacmorg101145966747966750 URL httpdoiacm
org101145966747966750
[42] H Mahmoodi V Tirumalashetty M Cooke and K RoyUltra low-power clocking scheme using energy recoveryand clock gating 17(1)33ndash44 2009 doi 101109TVLSI20082008453
[43] D Maksimovic A mos gate drive with resonant transitionsIn Proc nd Annual IEEE Power Electronics Specialists Conf
PESC rsquo91 Record pages 527ndash532 1991 doi 101109PESC1991162725
[44] D Maksimovic and V G Oklobdzija Integrated power clockgenerators for low energy logic In Proc th Annual IEEE
Power Electronics Specialists Conf PESC rsquo95 Record volume 1pages 61ndash67 1995 doi 101109PESC1995474793
[45] D Maksimovic V G Oklobdzija B Nikolic and K WCurrent Clocked cmos adiabatic logic with integratedsingle-phase power-clock supply 8(4)460ndash463 2000 doi10110992863629
[46] E J Marinissen Testing tsv-based three-dimensional stackedics In Proc Design Automation amp Test in Europe Conf amp
Exhibition (DATE) pages 1689ndash1694 2010
[47] E J Marinissen and Y Zorian Testing 3d chips containingthrough-silicon vias In Proc Int Test Conf ITC 2009 pages1ndash11 2009 doi 101109TEST20095355573
[48] A Mercha A Redolfi M Stucchi N Minas J Van Ol-men S Thangaraju D Velenis S Domae Y Yang G KattiR Labie C Okoro M Zhao P Asimakopoulos I De WolfT Chiarella T Schram E Rohr A Van Ammel A Jour-dain W Ruythooren S Armini A Radisic H PhilipsenN Heylen M Kostermans P Jaenen E Sleeckx D Sabun-cuoglu Tezcan I Debusschere P Soussan D PerryG Van der Plas J H Cho P Marchal Y Travaly E BeyneS Biesemans and B Swinnen Impact of thinning andthrough silicon via proximity on high-k metal gate firstcmos performance In Proc Symp VLSI Technology (VLSIT)pages 109ndash110 2010 doi 101109VLSIT20105556190
[49] A Mercha G Van der Plas V Moroz I De Wolf P Asi-makopoulos N Minas S Domae D Perry M Choi
bibliography 127
A Redolfi C Okoro Y Yang J Van Olmen S ThangarajuD S Tezcan P Soussan J H Cho A Yakovlev P Mar-chal Y Travaly E Beyne S Biesemans and B SwinnenComprehensive analysis of the impact of single and arraysof through silicon vias induced stress on high-k metalgate cmos performance In Proc IEEE Int Electron Devices
Meeting (IEDM) 2010 doi 101109IEDM20105703278
[50] Yong Moon and Deog-Kyoon Jeong An efficient chargerecovery logic circuit 31(4)514ndash522 1996 doi 1011094499727
[51] G E Moore Cramming more components onto inte-grated circuits Electronics 38(8)114ndash117 April 1965 doi101109JPROC1998658762 URL httpdxdoiorg10
1109JPROC1998658762
[52] O S Nakagawa S-Y Oh T Hsu and S Habu Benchmarkmethodology of interconnect capacitance simulation usinginter-digitated capacitors In Proc Int Microelectronic Test
Structures ICMTS 1998 Conf pages 235ndash237 1998 doi 101109ICMTS1998688103
[53] M W Newman S Muthukumar M SchueleinT Dambrauskas P A Dunaway J M Jordan S KulkarniC D Linde T A Opheim R A Stingel W WorwagL A Topic and J M Swan Fabrication and electricalcharacterization of 3d vertical interconnects In Proc
56th Electronic Components and Technology Conf 2006 doi101109ECTC20061645676
[54] V G Oklobdzija D Maksimovic and Fengcheng Lin Pass-transistor adiabatic logic using single power-clock supply44(10)842ndash846 1997 doi 10110982633443
[55] J Van Olmen C Huyghebaert J Coenen J Van AelstE Sleeckx A Van Ammel S Armini G Katti J VaesW Dehaene E Beyne and Y Travaly Integration chal-lenges of copper through silicon via (tsv) metallization for3d-stacked ic integration Microelectronic Engineering 88
(5)745 ndash 748 2011 ISSN 0167-9317 doi DOI101016jmee201006026 URL httpwwwsciencedirectcom
sciencearticlepiiS0167931710002121 The 2010 Inter-national workshop on - MAM 2010 The 2010 Internationalworkshop on
[56] J J Paulos and D A Antoniadis Measurement of minimum-geometry mos transistor capacitances 20(1)277ndash283 1985doi 101109JSSC19851052303
bibliography 128
[57] Vasilis F Pavlidis and Eby G Friedman Interconnect de-lay minimization through interlayer via placement in 3-dics In Proceedings of the 15th ACM Great Lakes symposium
on VLSI GLSVLSI rsquo05 pages 20ndash25 New York NY USA2005 ACM ISBN 1-59593-057-4 doi httpdoiacmorg10114510576611057669 URL httpdoiacmorg101145
10576611057669
[58] Dan Perry Jonghoon Cho Shinichi Domae Panagiotis Asi-makopoulos Alex Yakovlev Pol Marchal Geert Van derPlas and Nikolaos Minas An efficient array structure tocharacterize the impact of through silicon vias on fet devicesIn Proc IEEE Int Microelectronic Test Structures (ICMTS) Confpages 118ndash122 2011 doi 101109ICMTS20115976872
[59] Jan M Rabaey Anantha Chandrakasan and Borivoje NikolicDigital integrated circuits- A design perspective Prentice Hall2ed edition 2004
[60] K Roy and S Prasad Low-power CMOS VLSI circuit de-
sign A Wiley-Interscience publication Wiley 2000 ISBN9780471114888
[61] K Roy S Mukhopadhyay and H Mahmoodi-MeimandLeakage current mechanisms and leakage reduction tech-niques in deep-submicrometer cmos circuits 91(2)305ndash3272003 doi 101109JPROC2002808156
[62] V S Sathe M C Papaefthymiou and C H Ziesler Boostlogic a high speed energy recovery circuit family In Proc
IEEE Computer Society Annual Symp VLSI pages 22ndash27 2005doi 101109ISVLSI200522
[63] V S Sathe J C Kao and M C Papaefthymiou Resonant-clock latch-based design 43(4)864ndash873 2008 doi 101109JSSC2008917501
[64] B Sell A Avellan and W H Krautschneider Charge-basedcapacitance measurements (cbcm) on mos devices 2(1)9ndash122002 doi 101109TDMR20021014667
[65] C S Selvanayagam Xiaowu Zhang R Rajoo and D PinjalaModelling stress in silicon with tsvs and its effect on mobilityIn Proc 11th Electronics Packaging Technology Conf EPTC rsquo09pages 612ndash618 2009 doi 101109EPTC20095416477
[66] D Shariff P C Marimuthu K Hsiao L Asoy Chia LaiYee Aung Kyaw Oo K Buchanan K Crook T Wilby andS Burgess Integration of fine-pitched through-silicon viasand integrated passive devices In Proc IEEE 61st Elec-
tronic Components and Technology Conf (ECTC) pages 844ndash848 2011 doi 101109ECTC20115898609
bibliography 129
[67] S Stoukatch C Winters E Beyne W De Raedt andC Van Hoof 3d-sip integration for autonomous sensornodes In Proc 56th Electronic Components and Technology
Conf 2006 doi 101109ECTC20061645678
[68] L J Svensson and J G Koller Driving a capacitive loadwithout dissipating fcv2 In Proc IEEE Symp Low Power
Electronics Digest of Technical Papers pages 100ndash101 1994 doi101109LPE1994573220
[69] B Swinnen W Ruythooren P De Moor L Bogaerts L Car-bonell K De Munck B Eyckens S Stoukatch D S TezcanZ Tokei J Vaes J Van Aelst and E Beyne 3d integrationby cu-cu thermo-compression bonding of extremely thinnedbulk-si die containing 10 IcircŒm pitch through-si vias In Proc
Int Electron Devices Meeting IEDM rsquo06 pages 1ndash4 2006 doi101109IEDM2006346786
[70] N Tanaka M Kawashita Y Yoshimura T Uematsu M Fuji-sawa H Shimokawa N Kinoshita T Naito T Kikuchi andT Akazawa Characterization of mos transistors after tsvfabrication and 3d-assembly In Proc 2nd Electronics System-
Integration Technology Conf ESTC 2008 pages 131ndash134 2008doi 101109ESTC20084684338
[71] S E Thompson Guangyu Sun Youn Sung Choi andT Nishida Uniaxial-process-induced strained-si extend-ing the cmos roadmap 53(5)1010ndash1020 2006 doi 101109TED2006872088
[72] N Tzartzanis and W C Athas Clock-powered cmos ahybrid adiabatic logic style for energy-efficient computingIn Proc 20th Anniversary Conf Advanced Research in VLSIpages 137ndash151 1999 doi 101109ARVLSI1999756044
[73] J Van Olmen A Mercha G Katti C HuyghebaertJ Van Aelst E Seppala Zhao Chao S Armini J VaesR C Teixeira M Van Cauwenberghe P Verdonck K Ver-hemeldonck A Jourdain W Ruythooren M de Potterde ten Broeck A Opdebeeck T Chiarella B Parvais I De-busschere T Y Hoffmann B De Wachter W DehaeneM Stucchi M Rakowski P Soussan R Cartuyvels E BeyneS Biesemans and B Swinnen 3d stacked ic demonstrationusing a through silicon via first approach In Proc IEEE Int
Electron Devices Meeting IEDM 2008 pages 1ndash4 2008 doi101109IEDM20084796763
[74] Tom Verhoeff Delay-insensitive codes - an overview Dis-
tributed Computing 31ndash8 1988 ISSN 0178-2770 URL http
dxdoiorg101007BF01788562 101007BF01788562
bibliography 130
[75] P Vitanov T Dimitrova and I Eisele Direct capacitancemeasurements of small geometry mos transistors Micro-
electronics Journal 22(7-8)77 ndash 89 1991 ISSN 0026-2692doi DOI1010160026-2692(91)90016-G URL httpwww
sciencedirectcomsciencearticleB6V44-4829XDJ-2N
2f5b2c221dcf240c38cd13b7b3432f913
[76] B Voss and M Glesner A low power sinusoidal clockIn Proc IEEE Int Symp Circuits and Systems ISCAS 2001volume 4 pages 108ndash111 2001 doi 101109ISCAS2001922182
[77] R Weerasekera M Grange D Pamunuwa H Tenhunenand Li-Rong Zheng Compact modelling of through-siliconvias (tsvs) in three-dimensional (3-d) integrated circuits InProc IEEE Int Conf 3D System Integration 3DIC 2009 pages1ndash8 2009 doi 1011093DIC20095306541
[78] NHE Weste and DF Harris CMOS VLSI design a cir-
cuits and systems perspective PearsonAddison-Wesley 2005ISBN 9780321269775
[79] Steve Wozniak Failure magazine interview (by jason zasky)July 2000 URL httpfailuremagcomindexphpsite
printsteve_wozniak_interview
[80] Hyung Suk Yang and Muhannad S Bakir 3d integrationof cmos and mems using mechanically flexible intercon-nects (mfi) and through silicon vias (tsv) In Proc 60th Elec-
tronic Components and Technology Conf (ECTC) pages 822ndash828 2010 doi 101109ECTC20105490716
[81] Yu Yang G Katti R Labie Y Travaly B Verlinden andI De Wolf Electrical evaluation of 130-nm mosfets withtsv proximity in 3d-sic structure In Proc Int Interconnect
Technology Conf (IITC) pages 1ndash3 2010 doi 101109IITC20105510710
[82] Yibin Ye and K Roy Energy recovery circuits using re-versible and partially reversible logic 43(9)769ndash778 1996doi 10110981536746
[83] Yibin Ye and K Roy Qserl quasi-static energy recoverylogic 36(2)239ndash248 2001 doi 1011094902764
[84] C C Yeh J H Lou and J B Kuo 15 v cmos full-swingenergy efficient logic (eel) circuit suitable for low-voltageand low-power vlsi applications Electronics Letters 33(16)1375ndash1376 1997 doi 101049el19970915
bibliography 131
[85] C P Yuan and T N Trick A simple formula for the esti-mation of the capacitance of two-dimensional interconnectsin vlsi circuits 3(12)391ndash393 1982 doi 101109EDL198225610
- Abstract
- Publications
- Acknowledgments
- Contents
- List of Figures
- List of Tables
- Acronyms
- 1 Introduction
-
- 11 Research goals and contribution
- 12 Thesis outline
-
- 2 Background
-
- 21 3D integrated circuits
-
- 211 3D integration approaches
- 212 TSV interconnects
-
- 22 Energy-recovery logic
-
- 221 Energy-recovery architectures
- 222 Power-clock generators
-
- 23 Full-custom design flow
- 24 Summary
-
- 3 Characterization of TSV capacitance
-
- 31 Introduction
- 32 TSV parasitic capacitance
- 33 Integrated capacitance measurement techniques
- 34 Electrical characterization
-
- 341 Physical implementation
- 342 Measurement setup
- 343 Experimental results
-
- 35 Summary and conclusions
-
- 4 TSV proximity impact on MOSFET performance
-
- 41 Introduction
- 42 TSV-induced thermomechanical stress
- 43 Test structure implementation
-
- 431 MOSFET arrays
- 432 Digital selection logic
-
- 44 Experimental measurements
-
- 441 Data processing methodology
- 442 Measurement setup
- 443 PMOS (long channel)
- 444 PMOS (short channel)
-
- 45 Summary and conclusions
-
- 5 Evaluation of energy-recovery for TSV interconnects
-
- 51 Introduction
- 52 Energy-recovery TSV interconnects
- 53 Analysis
-
- 531 Adiabatic driver
- 532 Inductors parasitic resistance
- 533 Switch M1
- 534 Timing Constraints
-
- 54 Simulations
-
- 541 Comparison to CMOS
-
- 55 Summary and conclusions
-
- 6 Energy-recovery 3D-IC demonstrator
-
- 61 Introduction
- 62 Physical implementation
-
- 621 Adiabatic driver
- 622 Pulse-to-level converter
- 623 Data generator
- 624 Pulse generator
- 625 Integrated inductor
- 626 Top level design
- 627 CMOS IO circuit
-
- 63 Post-layout simulation and comparison
- 64 Experimental measurements
-
- 641 Measurement setup
- 642 Results
-
- 65 Summary and conclusions
-
- 7 Conclusions
-
- 71 Main contributions
- 72 Future work
-
- A Appendix A Photos
- B Appendix B Statistics
- Bibliography
-